[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

2011-03-10 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005358#comment-13005358
 ] 

He Yongqiang commented on HIVE-2030:


Running tests with the new patch.

> isEmptyPath() to use ContentSummary cache
> -
>
> Key: HIVE-2030
> URL: https://issues.apache.org/jira/browse/HIVE-2030
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Priority: Minor
> Attachments: HIVE-2030.1.patch, HIVE-2030.2.patch, HIVE-2030.3.patch
>
>
> addInputPaths() calls isEmptyPath() for every input path, and each call is
> currently a DFS namenode call. By making isEmptyPath() use the cached
> ContentSummary, we should be able to avoid some namenode calls and reduce
> latency when there are multiple partitions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

2011-03-10 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005234#comment-13005234
 ] 

He Yongqiang commented on HIVE-2030:


Siying, can you update the patch?

> isEmptyPath() to use ContentSummary cache
> -
>
> Key: HIVE-2030
> URL: https://issues.apache.org/jira/browse/HIVE-2030
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Priority: Minor
> Attachments: HIVE-2030.1.patch, HIVE-2030.2.patch
>
>
> addInputPaths() calls isEmptyPath() for every input path, and each call is
> currently a DFS namenode call. By making isEmptyPath() use the cached
> ContentSummary, we should be able to avoid some namenode calls and reduce
> latency when there are multiple partitions.



[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

2011-03-08 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004317#comment-13004317
 ] 

He Yongqiang commented on HIVE-2030:


Okay, will test and commit.



[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

2011-03-08 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004260#comment-13004260
 ] 

Siying Dong commented on HIVE-2030:
---

Yongqiang, I don't quite understand your comment. If there is a cache miss, we 
call the original method. We never make things worse.
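
A minimal sketch of the fallback described above, not the code in the attached patches. `Summary`, `SummarySource`, and `EmptyPathCheck` are hypothetical stand-ins (for Hadoop's ContentSummary, the file system call that reaches the namenode, and the helper that owns isEmptyPath()), and the emptiness test (zero bytes and zero files) is an assumption. On a cache hit no remote call is made; on a miss the check does exactly what it did before.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's ContentSummary: only the two values used here.
final class Summary {
    final long length;
    final long fileCount;

    Summary(long length, long fileCount) {
        this.length = length;
        this.fileCount = fileCount;
    }
}

// Hypothetical stand-in for the file system call that contacts the namenode.
interface SummarySource {
    Summary fetch(String path);
}

final class EmptyPathCheck {
    private final Map<String, Summary> cache;
    private final SummarySource source;
    int remoteCalls = 0; // counts lookups that had to go to the (simulated) namenode

    EmptyPathCheck(Map<String, Summary> cache, SummarySource source) {
        this.cache = cache;
        this.source = source;
    }

    boolean isEmptyPath(String path) {
        Summary cached = cache.get(path);
        if (cached != null) {
            // Cache hit: answer from the cached summary, no remote call.
            return isEmpty(cached);
        }
        // Cache miss: fall back to the original, uncached lookup.
        remoteCalls++;
        return isEmpty(source.fetch(path));
    }

    // Assumed emptiness criterion for this sketch: no bytes and no files.
    private static boolean isEmpty(Summary s) {
        return s.length == 0 && s.fileCount == 0;
    }

    public static void main(String[] args) {
        Map<String, Summary> cache = new HashMap<>();
        cache.put("/warehouse/t/ds=1", new Summary(0, 0));
        // The simulated namenode reports one 128-byte file for every path.
        EmptyPathCheck check = new EmptyPathCheck(cache, p -> new Summary(128, 1));
        System.out.println(check.isEmptyPath("/warehouse/t/ds=1")); // served from cache
        System.out.println(check.isEmptyPath("/warehouse/t/ds=2")); // falls back
        System.out.println("remote calls: " + check.remoteCalls);
    }
}
```

If the cache is never populated, every lookup takes the fallback branch and the cost is the same as before the change, which is the "never make things worse" argument.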



[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

2011-03-08 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004253#comment-13004253
 ] 

He Yongqiang commented on HIVE-2030:


The ContentSummary is not guaranteed to be populated. Even if it is, it seems 
this information is not passed to the child process. (So it is non-empty only 
when executing in local mode.)

> isEmptyPath() to use ContentSummary cache
> -
>
> Key: HIVE-2030
> URL: https://issues.apache.org/jira/browse/HIVE-2030
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Priority: Minor
> Attachments: HIVE-2030.1.patch
>
>
> addInputPaths() calls isEmptyPath() for every input path, and each call is
> currently a DFS namenode call. By making isEmptyPath() use the cached
> ContentSummary, we should be able to avoid some namenode calls and reduce
> latency when there are multiple partitions.
