[ 
https://issues.apache.org/jira/browse/DRILL-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420379#comment-15420379
 ] 

Aman Sinha commented on DRILL-4846:
-----------------------------------

For the directory modification checks (part (b) of the issue description), it 
turns out that maintaining cached state within the Metadata class does not 
work, because the Metadata APIs are static methods, 
e.g. Metadata.readBlockMeta() (see 
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java#L132).
In general, no single Metadata instance exists for the duration of the entire 
query, so there is nowhere inside Metadata to hold per-query state.  

I have implemented an alternative approach in which a separate MetadataContext 
is created for the duration of the query; it keeps track of per-query state 
such as which directory checks have already been done.  In the future we may 
use it for other purposes as well.  This is working functionally.  I still need 
to run some performance tests and clean up the code, and will post a PR after 
that.  
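
A minimal sketch of the per-query context idea follows. It is illustrative 
only and is not the actual patch: apart from the MetadataContext name, every 
class, method and parameter here (MetadataSketch, isCacheStale, checkOnce, the 
Predicate standing in for the file system modification time check) is a 
hypothetical name chosen for the example. The static method holds no state of 
its own; the caller passes in a context that remembers the result of each 
directory check for the lifetime of the query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical per-query context: created once per query and passed to the
// static metadata methods, which cannot hold per-query state themselves.
final class MetadataContext {
  // Directory path -> remembered result of its modification check.
  private final Map<String, Boolean> dirCheckResults = new HashMap<>();

  // Runs the expensive check only the first time a directory is seen in this
  // query; later calls return the remembered result.
  boolean checkOnce(String dirPath, Predicate<String> expensiveCheck) {
    return dirCheckResults.computeIfAbsent(dirPath, expensiveCheck::test);
  }
}

// Stand-in for a class with static-only APIs (assumed shape, not Drill code).
final class MetadataSketch {
  private MetadataSketch() {}

  // Returns true if the directory changed after the metadata cache was
  // written. The Predicate is a placeholder for the real modification time
  // comparison against the file system.
  static boolean isCacheStale(String dirPath,
                              MetadataContext ctx,
                              Predicate<String> modifiedSinceCache) {
    return ctx.checkOnce(dirPath, modifiedSinceCache);
  }
}
```

With this shape, both pruning phases can call the same static method with the 
same context, and the second phase reuses the result from the first instead of 
checking each directory again.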

> Eliminate extra operations during metadata cache pruning
> --------------------------------------------------------
>
>                 Key: DRILL-4846
>                 URL: https://issues.apache.org/jira/browse/DRILL-4846
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.7.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>             Fix For: 1.8.0
>
>
> While doing performance testing for DRILL-4530 using a new data set and 
> queries, we found two potential performance issues: (a) the metadata cache 
> was being read twice in some cases and (b) the check of directory 
> modification time was being done twice, once as part of the first phase of 
> directory-based pruning and again after the second phase of pruning.   
> This check gets expensive for a large number of directories.   Creating this 
> JIRA to track fixes for these issues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
