[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593627#comment-15593627 ]
Rajesh Balamohan commented on HIVE-14953:
-----------------------------------------

[~sershe] - It was in FileSinkOperator.handleMMTable (getMmDirectoryCandidates) specifically. I do not see that codepath in the latest code on the branch now.

globStatus with a pattern has to be replaced with {{listStatus(path, boolean recursive)}}, and any additional filtering pattern has to be applied on the client side. On cloud storage systems, the recursive listing can be served by prefix listing, which reduces the number of calls significantly compared to globStatus, which iterates through the files one at a time on the client side.

> don't use globStatus on S3 in MM tables
> ---------------------------------------
>
>                 Key: HIVE-14953
>                 URL: https://issues.apache.org/jira/browse/HIVE-14953
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Rajesh Balamohan
>            Assignee: Sergey Shelukhin
>             Fix For: hive-14535
>
>         Attachments: HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might
> suffice because MM code handles directory structure in a more definite manner
> than old code; so it knows where the files of interest are to be found.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
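To illustrate the suggestion in the comment above (one recursive prefix listing, then filtering on the client side instead of a glob), here is a minimal, self-contained Java sketch. It deliberately does not use the Hadoop FileSystem API or any Hive code: PrefixListingSketch, its methods, and the "mm_N" directory names are hypothetical stand-ins invented for this example, and the "object store" is just an in-memory sorted set of keys. In Hadoop itself, the recursive listing call that exists is FileSystem.listFiles(Path, boolean recursive); whether that is the right replacement in the MM table code path is an assumption, not something the patch is known to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical stand-in for an object store: a flat, sorted set of keys.
// This is NOT the Hadoop FileSystem API; it only models the access pattern.
final class PrefixListingSketch {
    private final NavigableSet<String> keys = new TreeSet<>();
    private int listCalls = 0;

    void put(String key) {
        keys.add(key);
    }

    // Number of simulated listing requests issued so far.
    int listCalls() {
        return listCalls;
    }

    // One simulated request: return every key under the prefix, at any depth.
    List<String> listRecursive(String prefix) {
        listCalls++;
        List<String> out = new ArrayList<>();
        for (String key : keys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // keys are sorted, so nothing later can match
            }
            out.add(key);
        }
        return out;
    }

    // Client-side filtering takes the place of the glob pattern.
    static List<String> filter(List<String> listed, Pattern wanted) {
        List<String> out = new ArrayList<>();
        for (String key : listed) {
            if (wanted.matcher(key).matches()) {
                out.add(key);
            }
        }
        return out;
    }
}
```

With keys such as "tbl/mm_1/000000_0" and "tbl/_tmp/junk" in the store, listRecursive("tbl/") followed by filter(..., Pattern.compile("tbl/mm_\\d+/[^/]+")) keeps only the files under the mm_N directories while issuing a single listing request, regardless of how many directories exist under the prefix.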