[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2017-11-13 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14953:
--
Labels: TODOC3.0  (was: )

> don't use globStatus on S3 in MM tables
> ---
>
> Key: HIVE-14953
> URL: https://issues.apache.org/jira/browse/HIVE-14953
> Project: Hive
> Issue Type: Sub-task
> Reporter: Rajesh Balamohan
> Assignee: Sergey Shelukhin
> Labels: TODOC3.0
> Fix For: hive-14535
>
> Attachments: HIVE-14953.01.patch, HIVE-14953.patch
>
>
> Need to investigate whether a recursive get is faster. Also, a plain listStatus
> might suffice, because the MM code handles the directory structure in a more
> definite manner than the old code, so it knows where the files of interest are.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to the branch.


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Attachment: (was: HIVE-14953.01.patch)


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Attachment: HIVE-14953.01.patch


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Attachment: HIVE-14953.01.patch

Updated the patch. Unfortunately, the local logic for S3 is ugly, given what it 
returns. [~rajesh.balamohan], does this make sense?


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Attachment: HIVE-14953.patch

Small patch.
[~rajesh.balamohan], is the insert path the one where you wanted to avoid 
globStatus? I added listStatus for the simple case where there is no recursion.
However, it seems that any recursion (DP, i.e. dynamic partitioning, or LB, 
i.e. list bucketing) would result in a large number of listStatus calls: one 
for each directory, then one for each subdirectory, and so on. Are you sure 
that is better than globStatus?
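
The concern about recursion can be illustrated with a small sketch. This is
not the patch and does not use the Hadoop API; it is a hypothetical in-memory
model of an object store, written in plain Java, that only counts list
requests. The class name ListingCostSketch, its methods, and the table layout
in main are all assumptions made for illustration. listDir stands in for a
per-directory call such as FileSystem.listStatus, and listPrefix stands in for
a single recursive listing such as FileSystem.listFiles(path, true) on a store
that can enumerate every key under a prefix. The cost of globStatus itself is
not modeled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for an object store; not Hive or Hadoop code.
final class ListingCostSketch {
    // Flat, sorted key space: "directories" exist only as key prefixes.
    private final TreeSet<String> keys = new TreeSet<>();
    // Number of simulated list requests issued so far.
    int listCalls = 0;

    void put(String key) {
        keys.add(key);
    }

    // One request: immediate children of dir (dir must end with '/').
    // Subdirectories come back as prefixes ending in '/'.
    List<String> listDir(String dir) {
        listCalls++;
        TreeSet<String> children = new TreeSet<>();
        for (String key : keys.tailSet(dir)) {
            if (!key.startsWith(dir)) {
                break; // sorted order: we are past the end of this prefix
            }
            if (key.length() == dir.length()) {
                continue; // skip a marker key equal to the directory itself
            }
            int slash = key.indexOf('/', dir.length());
            children.add(slash < 0 ? key : key.substring(0, slash + 1));
        }
        return new ArrayList<>(children);
    }

    // One request: every key under the prefix, however deeply nested.
    List<String> listPrefix(String prefix) {
        listCalls++;
        List<String> out = new ArrayList<>();
        for (String key : keys.tailSet(prefix)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    // Directory-by-directory walk: one listDir request per directory visited.
    List<String> walk(String dir) {
        List<String> files = new ArrayList<>();
        for (String child : listDir(dir)) {
            if (child.endsWith("/")) {
                files.addAll(walk(child));
            } else {
                files.add(child);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        ListingCostSketch store = new ListingCostSketch();
        // Assumed layout: 3 partition dirs x 2 nested dirs x 2 files each.
        for (int p = 0; p < 3; p++) {
            for (int b = 0; b < 2; b++) {
                for (int f = 0; f < 2; f++) {
                    store.put("tbl/p=" + p + "/b=" + b + "/file_" + f);
                }
            }
        }
        int walkFiles = store.walk("tbl/").size();
        int walkCalls = store.listCalls;
        store.listCalls = 0;
        int flatFiles = store.listPrefix("tbl/").size();
        int flatCalls = store.listCalls;
        System.out.println("walk: " + walkFiles + " files, " + walkCalls + " requests");
        System.out.println("flat: " + flatFiles + " files, " + flatCalls + " requests");
    }
}
```

For the layout in main, both approaches find the same 12 files, but the walk
issues one request per directory (10 here: the root, 3 partition directories,
and 6 nested directories) while the prefix listing issues one. Whether a real
store behaves this way depends on the FileSystem implementation in use, so
this only shows why the request count grows with directory depth and fan-out.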


[jira] [Updated] (HIVE-14953) don't use globStatus on S3 in MM tables

2016-10-20 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14953:

Fix Version/s: hive-14535
   Status: Patch Available  (was: Open)
