[
https://issues.apache.org/jira/browse/DRILL-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422030#comment-15422030
]
ASF GitHub Bot commented on DRILL-4846:
---------------------------------------
GitHub user amansinha100 opened a pull request:
https://github.com/apache/drill/pull/569
DRILL-4846: Fix a few performance issues for metadata access:
- Create a MetadataContext that can be shared among multiple invocations
of the Metadata APIs.
- Check directory modification time only if not previously checked.
- Remove a redundant call for metadata read.
- Add more logging.
- Consolidate a couple of metadata methods.
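The first two items above can be illustrated with a minimal sketch. It is not the code from this pull request: the class name MetadataContextSketch, its methods, and the modification-time callback are all hypothetical and chosen only for illustration. The idea shown is a context object that is passed to every metadata call for a query and remembers which directories have already had their modification time compared against the cache file, so a later pruning phase can reuse the earlier result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical sketch only; not Drill's actual MetadataContext. */
class MetadataContextSketch {
    // Directory path -> result of the first up-to-date check for that directory.
    private final Map<String, Boolean> checked = new HashMap<>();
    // Number of times the (potentially expensive) modification-time lookup ran.
    private int lookups = 0;

    /**
     * Returns true if the directory was not modified after the cache file was written.
     * The modification-time lookup runs at most once per directory per context.
     * The dirMtime callback stands in for a file system call (an assumption).
     */
    boolean isCacheUpToDate(String dir, long cacheFileMtime, ToLongFunction<String> dirMtime) {
        Boolean previous = checked.get(dir);
        if (previous != null) {
            return previous; // already checked: skip the file system call
        }
        lookups++;
        boolean upToDate = dirMtime.applyAsLong(dir) <= cacheFileMtime;
        checked.put(dir, upToDate);
        return upToDate;
    }

    int lookupCount() {
        return lookups;
    }

    public static void main(String[] args) {
        MetadataContextSketch ctx = new MetadataContextSketch();
        ToLongFunction<String> mtime = d -> 100L; // stand-in for a file system lookup

        ctx.isCacheUpToDate("/data/2016", 200L, mtime); // first pruning phase
        ctx.isCacheUpToDate("/data/2016", 200L, mtime); // second phase reuses the result

        System.out.println(ctx.lookupCount()); // prints 1
    }
}
```

Sharing one such object across both pruning phases is what keeps the per-directory cost at a single lookup; creating a fresh context per call would bring the repeated checks back.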
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/amansinha100/incubator-drill DRILL-4846
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/569.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #569
----
commit 1f8ae8f6b88b96b5861b81a2468d8f99d3f8b1c2
Author: Aman Sinha <[email protected]>
Date: 2016-08-03T16:00:51Z
DRILL-4846: Fix a few performance issues for metadata access:
- Create a MetadataContext that can be shared among multiple invocations
of the Metadata APIs.
- Check directory modification time only if not previously checked.
- Remove a redundant call for metadata read.
- Add more logging.
- Consolidate a couple of metadata methods.
----
> Eliminate extra operations during metadata cache pruning
> --------------------------------------------------------
>
> Key: DRILL-4846
> URL: https://issues.apache.org/jira/browse/DRILL-4846
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.7.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Fix For: 1.8.0
>
>
> While doing performance testing for DRILL-4530 using a new data set and
> queries, we found two potential performance issues: (a) the metadata cache
> was being read twice in some cases and (b) the directory modification time
> was being checked twice, once during the first phase of directory-based
> pruning and again after the second phase. This check gets expensive for a
> large number of directories. Creating this JIRA to track fixes for these
> issues.