[
https://issues.apache.org/jira/browse/SENTRY-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256448#comment-16256448
]
Na Li commented on SENTRY-1964:
-------------------------------
[~akolb] the only problem is the scenario where the table location is initially a
prefix of the partition's location, and the table location then changes to
another location while the partition's location does not change.
How often does this happen? Can we add a configuration option to control
whether we send partitions to HDFS, with sending them as the default behavior?
That way, customers who really want the performance improvement and do not run
into the above scenario can enjoy the benefit of not sending partitions to
HDFS.
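To make the proposal concrete, here is a minimal Python sketch of the decision
(not Sentry's actual Java code). The option name, the dict-based config, and
the function names are all hypothetical, and the check compares only the path
part of each location, ignoring scheme and authority.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical option name; this is not an existing Sentry configuration key.
SEND_PARTITIONS_KEY = "sentry.hdfs.sync.send-partitions"


def _path(location):
    """Return the path part of a location such as hdfs://nn/warehouse/db/t."""
    return PurePosixPath(urlparse(location).path)


def is_under(table_location, partition_location):
    """True if the partition path equals or lies below the table path.

    The comparison is per path component, so /warehouse/t1 is not treated
    as a prefix of /warehouse/t10/ds=1.
    """
    table, part = _path(table_location), _path(partition_location)
    return part == table or table in part.parents


def should_send_partition(config, table_location, partition_location):
    """Decide whether a partition location is sent to HDFS."""
    if config.get(SEND_PARTITIONS_KEY, True):
        return True  # default: keep the current behavior and send everything
    # Opt-in optimization: send only partitions outside the table directory.
    return not is_under(table_location, partition_location)
```

With the option left at its default every partition is still sent; with it set
to false, only partitions outside the table directory are sent.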
Later on, the component that the user uses to change the table location can be
smarter to avoid this situation. For example, when changing the table location,
it could ask the user to choose either 1) change the partition locations as well
so that they are sub-directories of the table location, or 2) enable sending
partitions to HDFS.
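As a rough illustration of the check such a component could run before
prompting the user, the Python sketch below lists the partitions that would
fall outside a new table location. The function names are hypothetical and
only the path part of each location is compared.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def _path(location):
    """Return the path part of a location such as hdfs://nn/warehouse/db/t."""
    return PurePosixPath(urlparse(location).path)


def stranded_partitions(new_table_location, partition_locations):
    """Return the partition locations that are not under the new table location."""
    table = _path(new_table_location)
    return [
        loc for loc in partition_locations
        if _path(loc) != table and table not in _path(loc).parents
    ]
```

If the returned list is non-empty, the component would offer the two choices
above; if it is empty, the table location can change without sending
partitions to HDFS.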
> HDFS sync does not need partition locations (usually)
> -----------------------------------------------------
>
> Key: SENTRY-1964
> URL: https://issues.apache.org/jira/browse/SENTRY-1964
> Project: Sentry
> Issue Type: Improvement
> Components: Sentry
> Affects Versions: 2.0.0
> Reporter: Na Li
> Assignee: Na Li
> Priority: Critical
> Attachments: SENTRY-1964.001.patch, SENTRY-1964.001.patch,
> SENTRY-1964.002.patch
>
>
> Right now, Sentry saves partition info from HMS and sends it to HDFS. HDFS
> only needs database and table info, and does not need partition info for ACLs
> unless the partition location does not share the prefix of its table location.
> The amount of partition data is huge and causes performance issues. We can
> optimize this by not saving and not sending partition info when the partition
> location is under the path of its table.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)