[ 
https://issues.apache.org/jira/browse/HIVE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049879#comment-17049879
 ] 

Aditya Shah commented on HIVE-21225:
------------------------------------

[~gopalv] I also noticed HIVE-22001. It seems we swallow the 
FileNotFoundException when we do the listing to populate the cache. We could 
have done the same for the multiple-listing case as well, since the snapshot is 
consistent once the valid transaction write ID list has been built. Given the 
performance loss I already pointed out, should we have avoided this?

> ACID: getAcidState() should cache a recursive dir listing locally
> -----------------------------------------------------------------
>
>                 Key: HIVE-21225
>                 URL: https://issues.apache.org/jira/browse/HIVE-21225
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21225.1.patch, HIVE-21225.10.patch, 
> HIVE-21225.11.patch, HIVE-21225.12.patch, HIVE-21225.13.patch, 
> HIVE-21225.14.patch, HIVE-21225.15.patch, HIVE-21225.15.patch, 
> HIVE-21225.16.patch, HIVE-21225.17.patch, HIVE-21225.2.patch, 
> HIVE-21225.3.patch, HIVE-21225.4.patch, HIVE-21225.4.patch, 
> HIVE-21225.5.patch, HIVE-21225.6.patch, HIVE-21225.7.patch, 
> HIVE-21225.7.patch, HIVE-21225.8.patch, HIVE-21225.9.patch, async-pid-44-2.svg
>
>
> Currently getAcidState() makes three calls into the FS API, which could be 
> answered by making a single recursive listDir call and reusing the same data 
> to check for isRawFormat() and isValidBase().
> All delta operations for a single partition can go against a single listed 
> directory snapshot instead of interacting with the NameNode or ObjectStore 
> within the inner loop.
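
The idea in the description above can be sketched roughly as follows. This is 
not the Hive implementation and does not use the Hadoop FileSystem API: it is 
a minimal illustration on the local file system with java.nio.file. The class 
DirSnapshot and its methods list(), filesUnder() and containsFile() are 
hypothetical names, and the two lookups only stand in for checks such as 
isRawFormat() and isValidBase(), whose actual logic is not reproduced here. 
The directory names base_5 and delta_6_6 are example values.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Illustrative only, not Hive code: one recursive listing is taken up front,
// and later checks are answered from that listing instead of the file system.
final class DirSnapshot {
    // Regular files found by the listing, stored relative to the listed root.
    private final List<Path> files;

    private DirSnapshot(List<Path> files) {
        this.files = files;
    }

    // The only step that touches the file system: walk the directory once.
    static DirSnapshot list(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {
            walk.filter(Files::isRegularFile)
                .forEach(p -> found.add(root.relativize(p)));
        }
        return new DirSnapshot(found);
    }

    // Files below the given top-level directory, answered from the snapshot.
    List<Path> filesUnder(String dir) {
        List<Path> out = new ArrayList<>();
        for (Path p : files) {
            if (p.getNameCount() > 1 && p.getName(0).toString().equals(dir)) {
                out.add(p);
            }
        }
        return out;
    }

    // Hypothetical stand-in for a per-directory check: is a file with this
    // name present below the directory? No file-system call is made here.
    boolean containsFile(String dir, String fileName) {
        for (Path p : filesUnder(dir)) {
            if (p.getFileName().toString().equals(fileName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny example partition layout in a temporary directory.
        Path part = Files.createTempDirectory("partition");
        Files.createDirectories(part.resolve("base_5"));
        Files.createFile(part.resolve("base_5").resolve("bucket_00000"));
        Files.createDirectories(part.resolve("delta_6_6"));
        Files.createFile(part.resolve("delta_6_6").resolve("bucket_00000"));

        DirSnapshot snap = DirSnapshot.list(part);
        System.out.println(snap.filesUnder("base_5").size());
        System.out.println(snap.containsFile("delta_6_6", "bucket_00000"));
    }
}
```

Because every lookup after list() reads only the in-memory list, a loop over 
many delta directories costs one listing in total rather than one or more 
file-system calls per iteration. The trade-off is that the answers reflect 
the state at listing time, not the current state of the directory.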



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
