[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-14953:
--
    Labels: TODOC3.0  (was: )

> don't use globStatus on S3 in MM tables
> ---
>
>          Key: HIVE-14953
>          URL: https://issues.apache.org/jira/browse/HIVE-14953
>      Project: Hive
>   Issue Type: Sub-task
>     Reporter: Rajesh Balamohan
>     Assignee: Sergey Shelukhin
>       Labels: TODOC3.0
>      Fix For: hive-14535
>
>  Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate if recursive get is faster. Also, normal listStatus might
> suffice because MM code handles directory structure in a more definite manner
> than old code; so it knows where the files of interest are to be found.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to branch
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Attachment:     (was: HIVE-14953.01.patch)
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Attachment: HIVE-14953.01.patch
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Attachment: HIVE-14953.01.patch

Updated patch. Unfortunately, the local logic for S3 is ugly, given what it returns.
[~rajesh.balamohan] does this make sense?
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Attachment: HIVE-14953.patch

Small patch. [~rajesh.balamohan] is the insert path the one where you wanted to avoid globStatus?
I added listStatus for the simple case where there's no recursion. However, it seems that any
recursion (DP or LB) would result in a large number of listStatus calls: one for each directory,
then one for each subdirectory, and so on. Are you sure that's better than globStatus?
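
To make the recursion concern in the comment above concrete, here is a toy Python model. It is not the Hadoop FileSystem API and not the code in the patch. It assumes that DP and LB mean dynamic partitioning and list bucketing, that a non-recursive listing costs one call per directory visited, and that a "recursive get" can return every file under a prefix in one logical listing. The table layout, the file names, and both function names are hypothetical.

```python
# Toy model only: a directory is a dict (name -> child); a file is None.

def count_list_status_calls(tree):
    """Count per-directory listings: one for this directory,
    plus one for every nested directory below it."""
    calls = 1
    for child in tree.values():
        if isinstance(child, dict):
            calls += count_list_status_calls(child)
    return calls


def flat_listing(tree, prefix=""):
    """Return all file paths under the tree, modelling the result of a
    single recursive (prefix-style) listing."""
    paths = []
    for name, child in sorted(tree.items()):
        path = prefix + "/" + name if prefix else name
        if isinstance(child, dict):
            paths.extend(flat_listing(child, path))
        else:
            paths.append(path)
    return paths


# Hypothetical layout: two partitions, each with two sub-directories.
table = {
    "ds=1": {"lb=a": {"f0": None}, "lb=b": {"f1": None}},
    "ds=2": {"lb=a": {"f2": None}, "lb=b": {"f3": None}},
}

print("per-directory listing calls:", count_list_status_calls(table))
print("files found by one recursive listing:", len(flat_listing(table)))
```

In this model the per-directory approach issues one call for the root, two for the partitions, and four for their sub-directories, and the count grows with every extra level. Whether that is actually slower than globStatus on S3 is the open question in this thread; the sketch only shows how the call count scales.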
[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables
[ https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14953:
    Fix Version/s: hive-14535
           Status: Patch Available  (was: Open)