[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Attachment: HIVE-13860-java8.patch > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch, HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results
[ https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303648#comment-15303648 ] Vaibhav Gumashta commented on HIVE-11527: - [~tasanuma0829] Thanks a lot for the work. I'll post my comments (if any) by tomorrow. > bypass HiveServer2 thrift interface for query results > - > > Key: HIVE-11527 > URL: https://issues.apache.org/jira/browse/HIVE-11527 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Sergey Shelukhin >Assignee: Takanobu Asanuma > Attachments: HIVE-11527.WIP.patch > > > Right now, HS2 reads query results and returns them to the caller via its > thrift API. > There should be an option for HS2 to return some pointer to results (an HDFS > link?) and for the user to read the results directly off HDFS inside the > cluster, or via something like WebHDFS outside the cluster > Review board link: https://reviews.apache.org/r/40867 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
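[Editor's note] The HIVE-11527 proposal - HS2 returning a pointer to result files that the client reads directly, e.g. via WebHDFS outside the cluster - can be sketched as below. All names here (the class, method, host, port, and path) are illustrative assumptions, not Hive's actual API; only the WebHDFS URL convention (`/webhdfs/v1/<path>?op=OPEN`) is real.

```java
// Hypothetical sketch of the proposed flow: instead of streaming rows through
// the Thrift API, HiveServer2 would hand the client a pointer to the result
// files, and the client would read them directly over WebHDFS.
public class ResultPointerSketch {

    /** Build a WebHDFS OPEN URL for a result file path returned by HS2. */
    static String webHdfsReadUrl(String nameNodeHost, int port, String resultPath) {
        // WebHDFS REST convention: /webhdfs/v1/<path>?op=OPEN
        return "http://" + nameNodeHost + ":" + port
                + "/webhdfs/v1" + resultPath + "?op=OPEN";
    }

    public static void main(String[] args) {
        // Hypothetical host and result path, for illustration only.
        System.out.println(webHdfsReadUrl("nn.example.com", 50070,
                "/tmp/hive/results/query123/000000_0"));
    }
}
```

A client inside the cluster could read the same path through the HDFS client API instead; the pointer indirection is what removes the Thrift round-trips.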
[jira] [Commented] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303642#comment-15303642 ] Matt McCline commented on HIVE-13870: - LGTM +1 > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Status: Patch Available (was: Open) > Tighten up EOF checking in Fast DeserializeRead classes; display better > exception information; add new Unit Tests > - > > Key: HIVE-13874 > URL: https://issues.apache.org/jira/browse/HIVE-13874 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13874.01.patch > > > Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond > stated row end are never read. > Display more detailed information when an exception is thrown by > DeserializeRead classes. > Add Unit Tests, including some that catch the error in HIVE-13818. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
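[Editor's note] The kind of EOF tightening HIVE-13874 describes - never reading bytes beyond the stated row end, and reporting useful offsets on failure - can be sketched as follows. This is a minimal illustration, not Hive's actual LazyBinaryDeserializeRead code.

```java
import java.io.EOFException;

// Minimal sketch of tightened EOF checking: every read from the row buffer
// is validated against the stated row end before any bytes are touched, and
// the exception message carries the offset and row bounds for diagnosis.
public class BoundedReader {
    private final byte[] bytes;
    private final int end;   // exclusive end of the current row
    private int offset;

    BoundedReader(byte[] bytes, int start, int length) {
        this.bytes = bytes;
        this.offset = start;
        this.end = start + length;
    }

    /** Read a 4-byte big-endian int, refusing to read past the row end. */
    int readInt() throws EOFException {
        if (offset + 4 > end) {
            throw new EOFException("need 4 bytes at offset " + offset
                    + " but row ends at " + end);
        }
        int v = ((bytes[offset] & 0xFF) << 24)
              | ((bytes[offset + 1] & 0xFF) << 16)
              | ((bytes[offset + 2] & 0xFF) << 8)
              |  (bytes[offset + 3] & 0xFF);
        offset += 4;
        return v;
    }

    public static void main(String[] args) throws Exception {
        BoundedReader r = new BoundedReader(new byte[]{0, 0, 0, 5}, 0, 4);
        System.out.println(r.readInt()); // 5; a second readInt() would throw EOFException
    }
}
```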
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Summary: Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests (was: Tighten up EOF checking in Fast DeserializeRead classes; display better exception information)
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information; add new Unit Tests
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Attachment: HIVE-13874.01.patch
[jira] [Updated] (HIVE-13874) Tighten up EOF checking in Fast DeserializeRead classes; display better exception information
[ https://issues.apache.org/jira/browse/HIVE-13874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13874: Description: Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond stated row end are never read. Display more detailed information when an exception is thrown by DeserializeRead classes. Add Unit Tests, including some that catch the error in HIVE-13818. was: Tighten up EOF bounds checking in LazyBinaryDeserializeRead so bytes beyond stated row end are never read. Display more detailed information when an exception is thrown by DeserializeRead classes.
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Status: Open (was: Patch Available) Minor change from ":" to "." according to the Oracle timestamp standard.
> current_timestamp() output format is different in some cases
>
> Key: HIVE-13837
> URL: https://issues.apache.org/jira/browse/HIVE-13837
> Project: Hive
> Issue Type: Bug
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13837.01.patch, HIVE-13837.02.patch
>
> As [~jdere] reports: the current_timestamp() UDF returns results in different formats in some cases. A standalone select returns millisecond precision:
> {noformat}
> hive> select current_timestamp();
> OK
> 2016-04-14 18:26:58.875
> Time taken: 0.077 seconds, Fetched: 1 row(s)
> {noformat}
> But the fractional seconds are dropped for the same UDF in a UNION query:
> {noformat}
> hive> select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (Executing on YARN cluster with App id application_1460611908643_0624)
> [Tez vertex progress table elided: Map 1, Map 4 and Reducer 3 all SUCCEEDED; 03/03 vertices, 100%, elapsed time 0.92 s]
> OK
> 2016-04-14 18:29:56
> Time taken: 10.558 seconds, Fetched: 1 row(s)
> {noformat}
> In the explain plan for the standalone query, the constant is folded with millisecond precision:
> {noformat}
> hive> explain extended select current_timestamp();
> OK
> ABSTRACT SYNTAX TREE:
> TOK_QUERY
>   TOK_INSERT
>     TOK_DESTINATION
>       TOK_DIR
>         TOK_TMP_FILE
>     TOK_SELECT
>       TOK_SELEXPR
>         TOK_FUNCTION
>           current_timestamp
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           GatherStats: false
>           Select Operator
>             expressions: 2016-04-14 18:30:57.206 (type: timestamp)
>             outputColumnNames: _col0
>             ListSink
> Time taken: 0.062 seconds, Fetched: 30 row(s)
> {noformat}
> [explain extended output for the UNION query truncated in the original message]
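[Editor's note] The two output shapes reported above correspond to formatting the same timestamp with and without fractional seconds. The snippet below just reproduces the observed formats with standard JDK classes; it is not Hive's internal formatting code.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

// Reproduce the two formats seen in the bug report: one with millisecond
// precision (standalone query) and one without (UNION query).
public class TimestampFormatDemo {

    static String formatWithMillis(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(ts);
    }

    static String formatSecondsOnly(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(ts);
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2016-04-14 18:26:58.875");
        System.out.println(formatWithMillis(ts));   // 2016-04-14 18:26:58.875
        System.out.println(formatSecondsOnly(ts));  // 2016-04-14 18:26:58
    }
}
```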
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Status: Patch Available (was: Open)
[jira] [Comment Edited] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303595#comment-15303595 ] Pengcheng Xiong edited comment on HIVE-13837 at 5/27/16 6:11 AM: - minor change from ":" to "." according to Oracle timestamp standard. Resubmit the patch. was (Author: pxiong): minor change from ":" to "." according to Oracle timestamp standard.
[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13837: --- Attachment: HIVE-13837.02.patch
[jira] [Commented] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE
[ https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303593#comment-15303593 ] Hive QA commented on HIVE-13564: Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12806019/HIVE-13564.01.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 10077 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby6_map.q-join13.q-union14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_complex_types.q-groupby_map_ppr_multi_distinct.q-vectorization_16.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_vc.q-input1_limit.q-join16.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats15
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby5_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union24
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
{noformat}
Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/400/testReport
Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/400/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-400/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12806019 - PreCommit-HIVE-MASTER-Build
> Deprecate HIVE_STATS_COLLECT_RAWDATASIZE
>
> Key: HIVE-13564
> URL: https://issues.apache.org/jira/browse/HIVE-13564
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer, Statistics
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Priority: Minor
> Attachments: HIVE-13564.01.patch
>
> Reasons: (1) It is only used in stats20.q. (2) We already have a "HIVESTATSAUTOGATHER" configuration to tell if we are going to collect rawDataSize and #rows.
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303586#comment-15303586 ] Rajat Khandelwal commented on HIVE-13862: - Just to reaffirm the gravity of this fix, in our production, we had a box with both mysql and hive metastore running. Without this fix, both processes are continuously using 500-600 percent cpu each. After deploying this, the total cpu usage for both processes is around 50. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
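[Editor's note] The stack trace above says the JDO query result arrived as a DataNucleus ForwardQueryResult (a List implementation) where a bare Number was expected, so the cast in extractSqlInt threw and the call fell back to ORM. A plausible defensive fix is to unwrap a single-element List before casting; the sketch below illustrates that idea with a plain List and is not the actual HIVE-13862 patch.

```java
import java.util.List;

// Sketch of the failure mode and a defensive fix: if the query result is a
// List (as DataNucleus's ForwardQueryResult is), unwrap its single element
// before casting to Number instead of casting the List itself.
public class ExtractSqlIntSketch {

    static int extractSqlInt(Object obj) {
        if (obj instanceof List) {
            List<?> result = (List<?>) obj;
            if (result.size() != 1) {
                throw new IllegalStateException("expected a single row, got " + result.size());
            }
            obj = result.get(0);
        }
        return ((Number) obj).intValue();
    }

    public static void main(String[] args) {
        System.out.println(extractSqlInt(java.util.Arrays.asList(7L))); // unwraps, then casts
    }
}
```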
[jira] [Commented] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303498#comment-15303498 ] Ashutosh Chauhan commented on HIVE-5999: I am not working on it. Take it over. > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive > Affects Versions: 0.12.0 > Reporter: Mariano Dominguez > Assignee: Nemon Lou > Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL definition in the Hive Language Manual shows this as a configurable property whereas it is not, which may lead to the misleading assessment that the line delimiter can be freely chosen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
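[Editor's note] The hardcoded limitation the issue points at is, in essence, a validation that rejects any line delimiter other than '\n'. The sketch below illustrates that kind of check; it is not the actual BaseSemanticAnalyzer code (see the linked source for that).

```java
// Sketch of the kind of hardcoded check described in HIVE-5999: the
// LINES TERMINATED BY delimiter is rejected unless it is exactly '\n'.
public class LineDelimCheck {

    static void validateLineDelimiter(String delim) {
        if (!"\n".equals(delim)) {
            throw new IllegalArgumentException(
                "LINES TERMINATED BY only supports newline '\\n' right now");
        }
    }

    public static void main(String[] args) {
        validateLineDelimiter("\n"); // accepted; any other delimiter throws
        System.out.println("newline accepted");
    }
}
```

Loosening the constraint would mean replacing this check with delimiter plumbing through the record readers, which is why it is an Improvement rather than a one-line fix.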
[jira] [Updated] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5999: --- Assignee: Nemon Lou (was: Ashutosh Chauhan)
[jira] [Commented] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303491#comment-15303491 ] Ferdinand Xu commented on HIVE-13873: Thanks [~xuefuz] for reaching out to me about it. I will take a look later. > Column pruning for nested fields > > > Key: HIVE-13873 > URL: https://issues.apache.org/jira/browse/HIVE-13873 > Project: Hive > Issue Type: New Feature > Components: Logical Optimizer > Reporter: Xuefu Zhang > > Some columnar file formats such as Parquet store the fields of struct types column by column as well, using the encoding described in the Google Dremel paper. It is very common in big data for data to be stored in structs while queries need only a subset of the fields in those structs. However, Hive presently still needs to read the whole struct regardless of whether all fields are selected. Therefore, pruning unwanted sub-fields in structs (nested fields) at file-reading time would be a big performance boost for such scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
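[Editor's note] The pruning idea in the description can be illustrated with a toy model: given a struct row and the set of sub-fields a query actually references, keep only those fields instead of materializing the whole struct. In a columnar format like Parquet this would be pushed down to the reader so unreferenced sub-columns are never decoded. This is a conceptual sketch only, not Hive or Parquet code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of nested-field pruning: a struct is a field-name -> value map,
// and pruning keeps only the sub-fields the query references.
public class NestedPruningSketch {

    static Map<String, Object> pruneStruct(Map<String, Object> struct, Set<String> wanted) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : struct.entrySet()) {
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("street", "1 Main St");
        address.put("city", "Springfield");
        address.put("zip", "00000");
        // A query touching only address.city should not pay for the other fields.
        System.out.println(pruneStruct(address, java.util.Collections.singleton("city")));
    }
}
```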
[jira] [Commented] (HIVE-13873) Column pruning for nested fields
[ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303487#comment-15303487 ] Xuefu Zhang commented on HIVE-13873: FYI, [~Ferd]
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303484#comment-15303484 ] Amareshwari Sriramadasu commented on HIVE-13862: Yeah.. seems it was always falling to back to ORM - and never worked with directsql earlier with HIVE-11487. I dont think we have a way to test whether api is answered from directsql vs orm in unit tests. btw, we deployed the above fix in our production environment, and it is working fine. bq. IIRC some methods use a call on the query object that forces a single result, that may be a better option here. Didnt find any. Can you give more pointers? > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > 
~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
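The ClassCastException above indicates the raw JDO query result (a DataNucleus ForwardQueryResult, which is a List) is being cast straight to Number. A minimal sketch of the kind of defensive extraction being discussed - hypothetical, not the actual HIVE-13862 patch, which might instead force a single result on the query object as suggested in the comment:

```java
import java.util.List;

// Hypothetical sketch (not the committed Hive fix): extract an int from a
// JDO query result that may arrive either as a plain Number or wrapped in
// a result list such as DataNucleus's ForwardQueryResult.
public class SqlIntExtractor {
    static int extractSqlInt(Object result) {
        Object value = result;
        if (value instanceof List) {
            List<?> rows = (List<?>) value;
            if (rows.isEmpty()) {
                throw new IllegalStateException("empty result for count query");
            }
            value = rows.get(0); // a COUNT(*) query yields exactly one row
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        throw new ClassCastException("expected a numeric result, got "
            + value.getClass().getName());
    }
}
```

Unwrapping the list before the Number cast avoids the ORM fallback path entirely for the common single-row count case.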
[jira] [Comment Edited] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303480#comment-15303480 ] Gopal V edited comment on HIVE-13872 at 5/27/16 4:54 AM: - AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 2 columns. {code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} was (Author: gopalv): AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 3 columns. 
{code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303480#comment-15303480 ] Gopal V commented on HIVE-13872: AFAIK, the issue is that the column pruner removes nearly all columns from the TableScan, but the VectorizationContext does not realize the needed columns list because there's no SEL operator in the middle to indicate the project of the 3 columns. {code} 2016-05-27T00:52:21,575 INFO [IO-Elevator-Thread-22[attempt_1462788318414_0308_24_00_02_3]]: LlapIoImpl (:()) - Processing data for hdfs://cn108-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpcds_bin_partitioned_orc_200.db/customer_demographics/03_0 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Object inspectors = org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector 2016-05-27T00:52:21,613 WARN [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: vector.VectorReduceSinkOperator (:()) - Projected columns = 0, 1, 2, 3, 4, 5, 6, 7, 8, 2016-05-27T00:52:21,614 ERROR [TezTaskRunner[attempt_1462788318414_0308_24_00_01_3]]: tez.MapRecordSource (:()) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > 
org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} {code} Map 3 Map Operator Tree: TableScan alias: customer_demographics Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE value expressions: cd_demo_sk (type: int), cd_marital_status (type: string) Execution mode: vectorized, llap LLAP IO: all inputs {code} was: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = stor
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} Simplified query {code} set hive.cbo.enable=false; -- explain select count(1) from store_sales ,customer_demographics where ( ( customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' )or ( customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' )) ; {code} was: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303474#comment-15303474 ] Hive QA commented on HIVE-13860: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806538/HIVE-13860-java8.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 56 failed/errored test(s), 9003 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_7.q-orc_merge9.q-tez_union_dynamic_partition.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-cte_4.q-vector_non_string_partition.q-delete_where_non_partitioned.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more - did not produce a TEST-*.xml file 
TestMiniTezCliDriver-script_pipe.q-vector_decimal_aggregate.q-vector_data_types.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorization_13.q-auto_sortmerge_join_13.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucket_num_reducers.q-table_nonprintable.q-scriptfile1.q-and-1-more - did not produce a TEST-*.xml file TestNegativeCliDriver-udf_invalid.q-nopart_insert.q-insert_into_with_schema.q-and-734-more - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join30.q-join2.q-input17.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketmapjoin10.q-join_rc.q-skewjoinopt13.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby10.q-groupby4_noskew.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-ptf_rcfile.q-bucketmapjoin_negative.q-bucket_map_join_spark2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoin_noskew.q-sample2.q-skewjoinopt10.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoin_union_remove_2.q-timestamp_null.q-union32.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt3.q-union27.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-stats13.q-stats2.q-ppd_gby_join.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-vector_distinct_2.q-join15.q-load_dyn_part3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDr
[jira] [Commented] (HIVE-13432) ACID ORC CompactorMR job throws java.lang.ArrayIndexOutOfBoundsException: 7
[ https://issues.apache.org/jira/browse/HIVE-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303470#comment-15303470 ] Qiuzhuang Lian commented on HIVE-13432: --- Hi Matt, Since we are blocked by this issue, can you please help take a look at this? Many thanks. > ACID ORC CompactorMR job throws java.lang.ArrayIndexOutOfBoundsException: 7 > --- > > Key: HIVE-13432 > URL: https://issues.apache.org/jira/browse/HIVE-13432 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 1.2.1 > Environment: Hadoop 2.6.2+Hive 1.2.1 >Reporter: Qiuzhuang Lian >Assignee: Matt McCline > > After initiating HIVE ACID ORC table compaction, the CompactorMR job throws > exception: > Error: java.lang.ArrayIndexOutOfBoundsException: 7 > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:1968) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2368) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:1969) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2368) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:69) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.(RecordReaderImpl.java:202) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.(OrcRawRecordMerger.java:183) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:466) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1308) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:512) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:491) > at 
org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > As a result, we see hadoop exception stack, > 297 failed with state FAILED due to: Task failed > task_1458819387386_11297_m_08 > Job failed as tasks failed. failedMaps:1 failedReduces:0 > 2016-04-06 11:30:57,891 INFO [dn209006-27]: mapreduce.Job > (Job.java:monitorAndPrintJob(1392)) - Counters: 14 > Job Counters > Failed map tasks=16 > Killed map tasks=7 > Launched map tasks=23 > Other local map tasks=13 > Data-local map tasks=6 > Rack-local map tasks=4 > Total time spent by all maps in occupied slots (ms)=412592 > Total time spent by all reduces in occupied slots (ms)=0 > Total time spent by all map tasks (ms)=206296 > Total vcore-seconds taken by all map tasks=206296 > Total megabyte-seconds taken by all map tasks=422494208 > Map-Reduce Framework > CPU time spent (ms)=0 > Physical memory (bytes) snapshot=0 > Virtual memory (bytes) snapshot=0 > 2016-04-06 11:30:57,891 ERROR [dn209006-27]: compactor.Worker > (Worker.java:run(176)) - Caught exception while trying to compact > lqz.my_orc_acid_table. Marking clean to avoid repeated failures, > java.io.IOException: Job failed! 
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:186) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:162) > 2016-04-06 11:30:57,894 ERROR [dn209006-27]: txn.CompactionTxnHandler > (CompactionTxnHandler.java:markCleaned(327)) - Expected to remove at least > one row from completed_txn_components when marking compaction entry as clean! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13872: --- Description: TPC-DS Q13 produces a cross-product without CBO simplifying the query {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 18 more {code} was: TPC-DS Q13 produces a cross-product once CBO runs through {code} Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1 at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) ... 
18 more {code} > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 18 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13490) Change itests to be part of the main Hive build
[ https://issues.apache.org/jira/browse/HIVE-13490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-13490: Attachment: HIVE-13490.03.patch I haven't modified any files in ptest2 or anything... that should work as before; but because this patch enables the execution of all tests without the need to install into the local Maven repo - maybe those installs could be removed; however, it's not entirely clear to me why it installs the artifacts for every {{ADDITIONAL_PROFILES}} entry. I don't have a working ptest2 installation to validate my assumptions, so I think it's better for me to stay on the safe side and not modify them ;) > Change itests to be part of the main Hive build > --- > > Key: HIVE-13490 > URL: https://issues.apache.org/jira/browse/HIVE-13490 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Zoltan Haindrich > Attachments: HIVE-13490.01.patch, HIVE-13490.02.patch, > HIVE-13490.03.patch > > > Instead of having to build Hive, and then itests separately. > With IntelliJ, this ends up being loaded as two separate dependencies, and > there's a lot of hops involved to make changes. > Does anyone know why these have been kept separate ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state
[ https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303395#comment-15303395 ] Xuefu Zhang commented on HIVE-13376: Sounds good to me, [~lirui]. Disabling spark.yarn.submit.waitAppCompletion sounds good. However, I'm not sure if it has any use other than checking app aliveness. Please find out. Thanks. > HoS emits too many logs with application state > -- > > Key: HIVE-13376 > URL: https://issues.apache.org/jira/browse/HIVE-13376 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: 2.1.0 > > Attachments: HIVE-13376.2.patch, HIVE-13376.patch > > > The logs get flooded with something like: > > Mar 28, 3:12:21.851 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:21.912 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:22.853 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:22.913 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:23.855 PMINFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > While this is good information, it is a bit much. > Seems like SparkJobMonitor hard-codes its interval to 1 second. It should be > higher and perhaps made configurable. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303389#comment-15303389 ] Nemon Lou edited comment on HIVE-5999 at 5/27/16 3:05 AM: -- [~ashutoshc] Do you plan to work on this? I have implemented one based on text file. And need some review from the hive community. :) was (Author: nemon): [~ashutoshc] Do you plan to work on this? I have implemented one based on text file.And nee some review from hive community. :) > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive >Affects Versions: 0.12.0 >Reporter: Mariano Dominguez >Assignee: Ashutosh Chauhan >Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL Definition on the Hive Language manual shows this as a configurable > property, whereas it is not. This may lead to a misleading assessment of being > able to choose the field delimiter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5999) Allow other characters for LINES TERMINATED BY
[ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303389#comment-15303389 ] Nemon Lou commented on HIVE-5999: - [~ashutoshc] Do you plan to work on this? I have implemented one based on text file. And need some review from the hive community. :) > Allow other characters for LINES TERMINATED BY > --- > > Key: HIVE-5999 > URL: https://issues.apache.org/jira/browse/HIVE-5999 > Project: Hive > Issue Type: Improvement > Components: Beeline, Database/Schema, Hive >Affects Versions: 0.12.0 >Reporter: Mariano Dominguez >Assignee: Ashutosh Chauhan >Priority: Critical > Labels: Delimiter, Hive, Row, SerDe > > LINES TERMINATED BY only supports newline '\n' right now. > It would be nice to loosen this constraint and allow other characters. > This limitation seems to be hardcoded here: > https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171 > The DDL Definition on the Hive Language manual shows this as a configurable > property, whereas it is not. This may lead to a misleading assessment of being > able to choose the field delimiter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
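The hardcoded restriction the description points at amounts to a check like the following. This is an illustrative simplification, not the exact BaseSemanticAnalyzer source; the class and method names here are hypothetical.

```java
// Illustrative simplification of Hive's line-delimiter restriction:
// anything other than newline (given literally or as ASCII code 10)
// is rejected at analysis time.
public class LineDelimiterCheck {
    // Hypothetical helper; Hive performs an equivalent check inline
    // while analyzing the ROW FORMAT clause.
    public static void validate(String delim) {
        if (!"\n".equals(delim) && !"10".equals(delim)) {
            throw new IllegalArgumentException(
                "LINES TERMINATED BY only supports newline '\\n' right now");
        }
    }
}
```

Loosening the constraint would mean removing this rejection and teaching the text input format to split records on an arbitrary delimiter, which is the part Nemon's text-file-based implementation would cover.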
[jira] [Commented] (HIVE-13376) HoS emits too many logs with application state
[ https://issues.apache.org/jira/browse/HIVE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303379#comment-15303379 ] Rui Li commented on HIVE-13376: --- [~xuefuz], [~szehon] - I just ran more tests on this and want to correct some of my previous comments: # In yarn-cluster mode, {{SparkSubmit}} runs the {{Client}}. The Client keeps checking the app state and printing the logs. On the hive side, we read from SparkSubmit's input and err streams and print to the hive log. # In yarn-client mode, {{SparkSubmit}} runs our {{RemoteDriver}}. RemoteDriver waits for the app to start running and then serves the job requests from hive. It doesn't report the app state after that. # The verbose logging only happens with yarn-cluster mode. # The long interval only affects yarn-client mode. # To avoid the state reports in yarn-cluster mode, we can change the log level (e.g. WARN instead of INFO), or we can set {{spark.yarn.submit.waitAppCompletion=false}} so that {{SparkSubmit}} terminates after it submits the app to the RM. I'd prefer disabling {{spark.yarn.submit.waitAppCompletion}}, if it doesn't cause any other trouble. 
> HoS emits too many logs with application state > -- > > Key: HIVE-13376 > URL: https://issues.apache.org/jira/browse/HIVE-13376 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: 2.1.0 > > Attachments: HIVE-13376.2.patch, HIVE-13376.patch > > > The logs get flooded with something like: > > Mar 28, 3:12:21.851 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:21.912 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:21 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:22.853 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > > Mar 28, 3:12:22.913 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:22 INFO yarn.Client: Application report > > for application_1458679386200_0149 (state: RUNNING) > > Mar 28, 3:12:23.855 PM INFO > > org.apache.hive.spark.client.SparkClientImpl > > [stderr-redir-1]: 16/03/28 15:12:23 INFO yarn.Client: Application report > > for application_1458679386200_0161 (state: RUNNING) > While this is good information, it is a bit much. > Seems like SparkJobMonitor hard-codes its interval to 1 second. It should be > higher and perhaps made configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
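Rui Li's preferred workaround boils down to passing one extra Spark property at submit time. A minimal sketch of merging it into the submit configuration map; the helper name and map plumbing are illustrative, not Hive's actual SparkClientImpl code — only the property name and value come from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class SubmitConfSketch {
    // Hypothetical helper: add the property Rui Li suggests so that
    // SparkSubmit exits once the app is handed off to the RM, instead of
    // polling (and logging) the application state every second.
    public static Map<String, String> withFastExit(Map<String, String> conf) {
        Map<String, String> merged = new HashMap<>(conf);
        // Respect an explicit user setting; only supply the default.
        merged.putIfAbsent("spark.yarn.submit.waitAppCompletion", "false");
        return merged;
    }
}
```

Note the putIfAbsent: if a user deliberately set the property to true (e.g. because something else relies on app-aliveness reporting, per Xuefu's concern), the workaround should not override it.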
[jira] [Commented] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303387#comment-15303387 ] Aaron Fabbri commented on HIVE-13778: - [~sailesh] can you assign this to me please? I will resolve it. > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303380#comment-15303380 ] Aaron Fabbri edited comment on HIVE-13778 at 5/27/16 3:01 AM: -- Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558]. See that issue for my explanation that this is expected behavior. was (Author: fabbri): Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558] > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13778) DROP TABLE PURGE on S3A table with too many files does not delete the files
[ https://issues.apache.org/jira/browse/HIVE-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303380#comment-15303380 ] Aaron Fabbri commented on HIVE-13778: - Note this is the same as [IMPALA-3558|https://issues.cloudera.org/projects/IMPALA/issues/IMPALA-3558] > DROP TABLE PURGE on S3A table with too many files does not delete the files > --- > > Key: HIVE-13778 > URL: https://issues.apache.org/jira/browse/HIVE-13778 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Sailesh Mukil >Priority: Critical > Labels: metastore, s3 > > I've noticed that when we do a DROP TABLE tablename PURGE on a table on S3A > that has many files, the files never get deleted. However, the Hive metastore > logs do say that the path was deleted: > "Not moving [path] to trash" > "Deleted the diretory [path]" > I initially thought that this was due to the eventually consistent nature of > S3 for deletes, however, a week later, the files still exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13837) current_timestamp() output format is different in some cases
[ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303373#comment-15303373 ] Hive QA commented on HIVE-13837: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806011/HIVE-13837.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 9975 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_7.q-orc_merge9.q-tez_union_dynamic_partition.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-auto_join_reordering_values.q-ptf_seqfile.q-auto_join18.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketmapjoin3.q-enforce_order.q-union11.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby2_noskew_multi_distinct.q-vectorization_10.q-list_bucket_dml_2.q-and-12-more - did not produce a TEST-*.xml file 
TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ts org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reflect2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec {noformat} Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/399/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/399/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-399/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 22 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12806011 - PreCommit-HIVE-MASTER-Build > current_timestamp() output format is different in some cases > > > Key: HIVE-13837 > URL: https://issues.apache.org/jira/browse/HIVE-13837 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13837.01.patch > > > As [~jdere] reports: > {code} > current_timestamp() udf returns result with different format in some cases. > select current_timestamp() returns result with decimal precision: > {noformat} > hive> select current_timestamp(); > OK > 2016-04-14 18:26:58.875 > Time taken: 0.077 seconds, Fetched: 1 row(s) > {noformat} > But output format is different for select current_timestamp() from all100k > union select current_timestamp() from over100k limit 5; > {noformat} > hive> select current_timestamp() from all100k union select > current_timestamp() from over100k limit 5; > Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3 > Total jobs = 1 > Launching Job 1 out of 1 > Tez session was closed. Reopening... > Session re-established. > Status: Running (Executing on YARN cluster with App id > application_1460611908643_0624) > -
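The symptom in the HIVE-13837 description can be reproduced outside Hive: java.sql.Timestamp.toString() keeps fractional seconds, while formatting through a pattern that lacks a fractional-seconds field silently drops them. The snippet below only illustrates that formatting mismatch; it is not Hive's actual current_timestamp() code path.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

public class TimestampFormatSketch {
    // Timestamp.toString() retains the fractional seconds...
    public static String full(Timestamp t) {
        return t.toString();
    }

    // ...while a pattern without a fractional-seconds field truncates them,
    // producing the kind of inconsistent output the bug report shows.
    public static String truncated(Timestamp t) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(t);
    }
}
```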
[jira] [Updated] (HIVE-13443) LLAP: signing for the second state of submit (the event)
[ https://issues.apache.org/jira/browse/HIVE-13443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13443: Attachment: HIVE-13443.01.patch Parking the rebase... > LLAP: signing for the second state of submit (the event) > > > Key: HIVE-13443 > URL: https://issues.apache.org/jira/browse/HIVE-13443 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13443.01.patch, HIVE-13443.WIP.nogen.patch, > HIVE-13443.patch, HIVE-13443.wo.13444.13675.nogen.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13870: Status: Patch Available (was: Open) > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13870) Decimal vector is not resized correctly
[ https://issues.apache.org/jira/browse/HIVE-13870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13870: Attachment: HIVE-13870.patch Simple patch. [~prasanth_j] [~mmccline] the same patch that we have discussed before > Decimal vector is not resized correctly > --- > > Key: HIVE-13870 > URL: https://issues.apache.org/jira/browse/HIVE-13870 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13870.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Fix Version/s: 2.1.0 Affects Version/s: 2.1.0 Status: Patch Available (was: Open) > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, > HIVE-13818.1.patch, vector_bug.q, vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
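The issue title carries the key idea: a long-keyed MapJoin hash table must map every integral key type onto the same long key space, or a tinyint probe key will miss a bigint build key holding the same value. A hedged illustration of that normalization — the class and method are invented for this sketch and are not the HIVE-13818 patch:

```java
public class IntegralKeySketch {
    // Normalize any integral boxed type to long so that equal values hash
    // and compare identically regardless of the declared column type.
    public static long toLongKey(Number key) {
        if (key instanceof Byte || key instanceof Short
                || key instanceof Integer || key instanceof Long) {
            return key.longValue();
        }
        // Non-integral keys do not belong in a long-keyed table.
        throw new IllegalArgumentException("not an integral key: " + key);
    }
}
```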
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Attachment: HIVE-13818.1.patch > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, > HIVE-13818.1.patch, vector_bug.q, vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin edited comment on HIVE-13084 at 5/27/16 1:36 AM: -- Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Thanks for the comments, the code appears to do what they say. Didn't review the q files. +1 pending tests was (Author: sershe): Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. 
> e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin commented on HIVE-13084: - Read the patch, it seems to make sense. I didn't find anything which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > 
rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303315#comment-15303315 ] Sergey Shelukhin edited comment on HIVE-13084 at 5/27/16 1:36 AM: -- Read the patch, it seems to make sense. I didn't find anything, which might be evening related ;) Didn't review the q files. +1 pending tests was (Author: sershe): Read the patch, it seems to make sense. I didn't find anything which might be evening related ;) Didn't review the q files. +1 pending tests > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, HIVE-13084.04.patch, HIVE-13084.05.patch, > HIVE-13084.06.patch, HIVE-13084.07.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. 
> e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303313#comment-15303313 ] Mohit Sabharwal commented on HIVE-13860: Updated TestSparkCliDriver.testCliDriver_join0 and TestSparkCliDriver.testCliDriver_outer_join_ppr as well, i.e., regenerated them to bring the java8 version up to date with the java7 version. > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-13860: --- Attachment: HIVE-13860-java8.patch > Fix more json related JDK8 test failures > > > Key: HIVE-13860 > URL: https://issues.apache.org/jira/browse/HIVE-13860 > Project: Hive > Issue Type: Sub-task > Components: Test >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal > Attachments: HIVE-13860-java8.patch, HIVE-13860-java8.patch, > HIVE-13860-java8.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13675) LLAP: add HMAC signatures to LLAPIF splits
[ https://issues.apache.org/jira/browse/HIVE-13675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13675: Attachment: HIVE-13675.02.patch Backing up the rebase for now > LLAP: add HMAC signatures to LLAPIF splits > -- > > Key: HIVE-13675 > URL: https://issues.apache.org/jira/browse/HIVE-13675 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13675.01.patch, HIVE-13675.02.patch, > HIVE-13675.WIP.patch, HIVE-13675.wo.13444.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13818) Fast Vector MapJoin Long hashtable has to handle all integral types
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13818: --- Summary: Fast Vector MapJoin Long hashtable has to handle all integral types (was: Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?) > Fast Vector MapJoin Long hashtable has to handle all integral types > --- > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-13818: -- Assignee: Gopal V (was: Matt McCline) > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Gopal V >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303293#comment-15303293 ] Jeremy Beard commented on HIVE-12721: - I did it to mimic the existing built in functions, which for strings all seemed to return Text. Looking again now I see a couple that return String but Text is much more common. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
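Jeremy's point is about the return-type convention, not the UUID logic itself. A minimal sketch of a UDF-style evaluate method follows — plain String is used here so the snippet stays dependency-free, whereas the convention under discussion would wrap the result in org.apache.hadoop.io.Text like the other string built-ins; the class name is invented for this sketch.

```java
import java.util.UUID;

public class UuidUdfSketch {
    // Each call produces a fresh random (version 4) UUID string, which is
    // what makes it useful for generating surrogate keys in ETL jobs.
    // A real Hive UDF would return this wrapped in org.apache.hadoop.io.Text.
    public static String evaluate() {
        return UUID.randomUUID().toString();
    }
}
```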
[jira] [Updated] (HIVE-11345) Fix formatting of Show Compactions/Transactions/Locks
[ https://issues.apache.org/jira/browse/HIVE-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11345: -- Assignee: (was: Eugene Koifman) > Fix formatting of Show Compactions/Transactions/Locks > > > Key: HIVE-11345 > URL: https://issues.apache.org/jira/browse/HIVE-11345 > Project: Hive > Issue Type: Bug > Components: CLI, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman > > all the columns of the output are variable length (in each row, based on > data), which makes it really difficult to read -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12267) Make Compaction jobs run on Tez
[ https://issues.apache.org/jira/browse/HIVE-12267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12267: -- Assignee: (was: Eugene Koifman) > Make Compaction jobs run on Tez > --- > > Key: HIVE-12267 > URL: https://issues.apache.org/jira/browse/HIVE-12267 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman > > Currently all Compaction jobs run on MR. > They should support running on Tez. > Add hive.compactor.engine, which can be set to mr, tez, or the value of the > hive.execution.engine property. The latter would be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13248) Change date_add/date_sub/to_date functions to return Date type rather than String
[ https://issues.apache.org/jira/browse/HIVE-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303280#comment-15303280 ] Hive QA commented on HIVE-13248: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806234/HIVE-13248.3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/398/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/398/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-398/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: resource batch-exec.vm not found. {noformat} This message is automatically generated. ATTACHMENT ID: 12806234 - PreCommit-HIVE-MASTER-Build > Change date_add/date_sub/to_date functions to return Date type rather than > String > - > > Key: HIVE-13248 > URL: https://issues.apache.org/jira/browse/HIVE-13248 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-13248.1.patch, HIVE-13248.2.patch, > HIVE-13248.3.patch > > > Some of the original "date" related functions return string values rather > than Date values, because they were created before the Date type existed in > Hive. We can try to change these to return Date in the 2.x line. > Date values should be implicitly convertible to String. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock
[ https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303265#comment-15303265 ] Eugene Koifman commented on HIVE-11956: --- Failed tests with age > 1 are not related. [~wzheng], could you review, please? > SHOW LOCKS should indicate what acquired the lock > - > > Key: HIVE-11956 > URL: https://issues.apache.org/jira/browse/HIVE-11956 > Project: Hive > Issue Type: Improvement > Components: CLI, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11956.patch > > > This can be a queryId, Flume agent id, Storm bolt id, etc. This would > dramatically help diagnose issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results
[ https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303262#comment-15303262 ] Sergey Shelukhin commented on HIVE-11527: - +1. [~vgumashta] any comments? Otherwise I will commit soon > bypass HiveServer2 thrift interface for query results > - > > Key: HIVE-11527 > URL: https://issues.apache.org/jira/browse/HIVE-11527 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Sergey Shelukhin >Assignee: Takanobu Asanuma > Attachments: HIVE-11527.WIP.patch > > > Right now, HS2 reads query results and returns them to the caller via its > thrift API. > There should be an option for HS2 to return some pointer to results (an HDFS > link?) and for the user to read the results directly off HDFS inside the > cluster, or via something like WebHDFS outside the cluster > Review board link: https://reviews.apache.org/r/40867 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303261#comment-15303261 ] Nachiket Vaidya commented on HIVE-13836: Thank you, [~sushanth]. I created HIVE-13869 and linked it to this issue. Can you please review the attached patch? Thank you. > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 Python client to create tables/partitions in hive. It was working > fine until I moved to multithreaded scripts which created 8 connections and > ran DDL queries concurrently. > I got the following error: > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock
[ https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303254#comment-15303254 ] Hive QA commented on HIVE-11956: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12805953/HIVE-11956.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 98 failed/errored test(s), 10108 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorization_13.q-auto_sortmerge_join_13.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_tests org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_joins_explain org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstCompa
[jira] [Updated] (HIVE-13444) LLAP: add HMAC signatures to LLAP; verify them on LLAP side
[ https://issues.apache.org/jira/browse/HIVE-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13444: Attachment: HIVE-13444.04.patch Added the test > LLAP: add HMAC signatures to LLAP; verify them on LLAP side > --- > > Key: HIVE-13444 > URL: https://issues.apache.org/jira/browse/HIVE-13444 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13444.01.patch, HIVE-13444.02.patch, > HIVE-13444.03.patch, HIVE-13444.04.patch, HIVE-13444.WIP.patch, > HIVE-13444.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request
[ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303240#comment-15303240 ] Eugene Koifman commented on HIVE-13354: --- +1 pending tests > Add ability to specify Compaction options per table and per request > --- > > Key: HIVE-13354 > URL: https://issues.apache.org/jira/browse/HIVE-13354 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.3.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC2.1 > Attachments: HIVE-13354.1.patch, > HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, HIVE-13354.3.patch > > > Currently the are a few options that determine when automatic compaction is > triggered. They are specified once for the warehouse. > This doesn't make sense - some table may be more important and need to be > compacted more often. > We should allow specifying these on per table basis. > Also, compaction is an MR job launched from within the metastore. There is > currently no way to control job parameters (like memory, for example) except > to specify it in hive-site.xml for metastore which means they are site wide. > Should add a way to specify these per table (perhaps even per compaction if > launched via ALTER TABLE) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13863) Improve AnnotateWithStatistics with support for cartesian product
[ https://issues.apache.org/jira/browse/HIVE-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303227#comment-15303227 ] Ashutosh Chauhan commented on HIVE-13863: - +1. I am assuming it will probably require updating a few other golden files as well. > Improve AnnotateWithStatistics with support for cartesian product > - > > Key: HIVE-13863 > URL: https://issues.apache.org/jira/browse/HIVE-13863 > Project: Hive > Issue Type: Bug > Components: Statistics >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13863.patch > > > Currently cartesian product stats based on cardinality of inputs are not > inferred correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13861) Fix up nullability issue that might be created by pull up constants rules
[ https://issues.apache.org/jira/browse/HIVE-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303212#comment-15303212 ] Ashutosh Chauhan commented on HIVE-13861: - +1 pending tests > Fix up nullability issue that might be created by pull up constants rules > - > > Key: HIVE-13861 > URL: https://issues.apache.org/jira/browse/HIVE-13861 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13861.patch > > > When we pull up constants through Union or Sort operators, we might end up > rewriting the original expression into an expression whose schema has > different nullability properties for some of its columns. > This results in AssertionError of the following kind: > {noformat} > ... > org.apache.hive.service.cli.HiveSQLException: Error running query: > java.lang.AssertionError: Internal error: Cannot add expression of different > type to set: > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink
[ https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303209#comment-15303209 ] Ashutosh Chauhan commented on HIVE-13808: - Can you create a RB with updated golden files for this? > Use constant expressions to backtrack when we create ReduceSink > --- > > Key: HIVE-13808 > URL: https://issues.apache.org/jira/browse/HIVE-13808 > Project: Hive > Issue Type: Sub-task > Components: Parser >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13808.patch > > > Follow-up of HIVE-13068. > When we create a RS with constant expressions as keys/values, and immediately > after we create a SEL operator that backtracks the expressions from the RS. > Currently, we automatically create references for all the keys/values. > Before, we could rely on Hive ConstantPropagate to propagate the constants to > the SEL. However, after HIVE-13068, Hive ConstantPropagate does not get > exercised anymore. Thus, we can simply create constant expressions when we > create the SEL operator instead of a reference. > Ex. 
ql/src/test/results/clientpositive/vector_coalesce.q.out > {noformat} > EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, > cstring1, cint, cfloat, csmallint) as c > FROM alltypesorc > WHERE (cdouble IS NULL) > ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c > LIMIT 10 > {noformat} > Plan: > {noformat} > EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, > cstring1, cint, cfloat, csmallint) as c > FROM alltypesorc > WHERE (cdouble IS NULL) > ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c > LIMIT 10 > POSTHOOK: type: QUERY > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 2641964 Basic stats: > COMPLETE Column stats: NONE > Filter Operator > predicate: cdouble is null (type: boolean) > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: cstring1 (type: string), cint (type: int), > cfloat (type: float), csmallint (type: smallint), > COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string) > outputColumnNames: _col1, _col2, _col3, _col4, _col5 > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: null (type: double), _col1 (type: string), > _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: > string) > sort order: ++ > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE > TopN Hash Memory Usage: 0.1 > Execution mode: vectorized > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 > (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: > float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string) > outputColumnNames: 
_col0, _col1, _col2, _col3, _col4, _col5 > Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE > Column stats: NONE > Limit > Number of rows: 10 > Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE > Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE > Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.SequenceFileInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: 10 > Processor Tree: > ListSink > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13849) Wrong plan for hive.optimize.sort.dynamic.partition=true
[ https://issues.apache.org/jira/browse/HIVE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303190#comment-15303190 ] Ashutosh Chauhan commented on HIVE-13849: - +1 > Wrong plan for hive.optimize.sort.dynamic.partition=true > > > Key: HIVE-13849 > URL: https://issues.apache.org/jira/browse/HIVE-13849 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Critical > Attachments: HIVE-13849.patch > > > To reproduce: > {noformat} > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.optimize.sort.dynamic.partition=true; > CREATE TABLE non_acid(key string, value string) PARTITIONED BY(ds string, hr > int) CLUSTERED BY(key) INTO 2 BUCKETS STORED AS ORC; > explain insert into table non_acid partition(ds,hr) select * from srcpart > sort by value; > {noformat} > CC'ed [~ashutoshc], [~ekoifman] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13617) LLAP: support non-vectorized execution in IO
[ https://issues.apache.org/jira/browse/HIVE-13617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13617: Attachment: HIVE-13617.03.patch Fixing the issue in CliDriver case > LLAP: support non-vectorized execution in IO > > > Key: HIVE-13617 > URL: https://issues.apache.org/jira/browse/HIVE-13617 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13617-wo-11417.patch, HIVE-13617-wo-11417.patch, > HIVE-13617.01.patch, HIVE-13617.03.patch, HIVE-13617.patch, HIVE-13617.patch, > HIVE-15396-with-oi.patch > > > Two approaches - a separate decoding path, into rows instead of VRBs; or > decoding VRBs into rows on a higher level (the original LlapInputFormat). I > think the latter might be better - it's not a hugely important path, and perf > in non-vectorized case is not the best anyway, so it's better to make do with > much less new code and architectural disruption. > Some ORC patches in progress introduce an easy to reuse (or so I hope, > anyway) VRB-to-row conversion, so we should just use that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña resolved HIVE-13868. Resolution: Fixed Fix Version/s: 2.2.0 > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Fix For: 2.2.0 > > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303105#comment-15303105 ] Sergio Peña commented on HIVE-13868: No review is needed for this patch. I need to submit it in order to get the derby.log so that I can debug the HMS errors we're seeing. [~szehon] FYI > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13858) LLAP: A preempted task can end up waiting on completeInitialization if some part of the executing code suppressed the interrupt
[ https://issues.apache.org/jira/browse/HIVE-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-13858: -- Attachment: HIVE-13858.02.patch Updated patch with review comments addressed. > LLAP: A preempted task can end up waiting on completeInitialization if some > part of the executing code suppressed the interrupt > --- > > Key: HIVE-13858 > URL: https://issues.apache.org/jira/browse/HIVE-13858 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth >Priority: Critical > Labels: llap > Attachments: HIVE-13858.01.patch, HIVE-13858.02.patch > > > An interrupt along with a HiveProcessor.abort call is made when attempting to > preempt a task. > In this specific case, the task was in the middle of HDFS IO - which > 'handled' the interrupt by retrying. As a result the interrupt status on the > thread was reset - so instead of skipping the future.get in > completeInitialization - the task ended up blocking there. > End result - a single executor slot permanently blocked in LLAP. Depending on > what else is running - this can cause a cluster level deadlock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
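The failure mode HIVE-13858 describes — I/O code "handling" an interrupt by retrying and thereby clearing the thread's interrupt status, so a later check misses the preemption signal — can be sketched as below. This is a minimal illustration, not the actual LLAP or HDFS code; `ioWithRetry` and its flag are hypothetical names standing in for the retrying I/O path and the conventional fix of re-asserting the status.

```java
// Minimal sketch of the interrupt-suppression pattern described above.
// ioWithRetry() mimics I/O code that catches InterruptedException and
// retries; catching the exception clears the thread's interrupt status.
// The convention that avoids the downstream hang is to re-assert the
// status before returning, so later checks (e.g. before a blocking
// future.get) still see the preemption.
class InterruptDemo {
    static void ioWithRetry(boolean restoreStatus) {
        boolean sawInterrupt = false;
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                Thread.sleep(1);                  // stands in for blocking HDFS I/O
                break;                            // I/O "succeeded", stop retrying
            } catch (InterruptedException e) {
                sawInterrupt = true;              // interrupt status is now cleared
            }
        }
        if (sawInterrupt && restoreStatus) {
            Thread.currentThread().interrupt();   // preserve the signal for callers
        }
    }
}
```

Without the re-assert (the `false` path), a caller that relies on `Thread.isInterrupted()` after the I/O returns sees a clean status and blocks, which is exactly the stuck-executor scenario in the description.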
[jira] [Updated] (HIVE-13868) Include derby.log file in the Hive ptest logs
[ https://issues.apache.org/jira/browse/HIVE-13868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-13868: --- Attachment: HIVE-13868.1.patch > Include derby.log file in the Hive ptest logs > - > > Key: HIVE-13868 > URL: https://issues.apache.org/jira/browse/HIVE-13868 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-13868.1.patch > > > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303089#comment-15303089 ] Sushanth Sowmyan commented on HIVE-13836: - I agree. As long as we don't lose the underlying issue (and when you create a new jira for that, could you link it to this - that way, whoever works on that has an easy reproduction to work against), I'm okay with adding synchronization here to DbNotificationListener. > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 Python client to create tables/partitions in hive. It was working > fine until I moved to multithreaded scripts which created 8 connections and > ran DDL queries concurrently. > I got the following error: > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
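The synchronization approach agreed on above can be sketched as follows. This is a hypothetical simplification, not the actual DbNotificationListener or DataNucleus code: `NotificationLog`, its `txnOpen` flag, and the method names are illustrative stand-ins for a listener whose transaction object is not thread-safe, so concurrent HMS handler threads must serialize their calls into it.

```java
// Hypothetical sketch of the race: a listener shared across metastore
// handler threads wraps transaction state that is not thread-safe, so
// enqueue() fails if two threads overlap. Serializing callers on the
// shared object (the fix discussed in the comments) avoids the
// "Transaction has already started" state error.
import java.util.ArrayList;
import java.util.List;

class NotificationLog {
    private boolean txnOpen = false;               // mimics shared JDO transaction state
    private final List<String> events = new ArrayList<>();

    private void enqueue(String event) {
        if (txnOpen) {                             // second thread arrives mid-transaction
            throw new IllegalStateException("Invalid state. Transaction has already started");
        }
        txnOpen = true;                            // "begin"
        events.add(event);
        txnOpen = false;                           // "commit"
    }

    // Serializing access trades some concurrency for correctness until the
    // underlying per-thread transaction handling is fixed.
    synchronized void enqueueSafely(String event) {
        enqueue(event);
    }

    int size() { return events.size(); }
}
```

With eight concurrent DDL connections, as in the bug report, unsynchronized calls can interleave between the "begin" and "commit" steps; the `synchronized` wrapper guarantees each notification completes before the next begins.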
[jira] [Updated] (HIVE-13867) restore HiveAuthorizer interface changes
[ https://issues.apache.org/jira/browse/HIVE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-13867: - Description: TLDR: Some of the changes to hive authorizer interface made as part of HIVE-13360 are inappropriate and need to be restored. Regarding the move of ip address from the query context object (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place for it.​ In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 , every request for single session does not have to come via a single IP address. Current assumption in hive code base is that the IP address is valid for the entire session. This might not hold true for ever. A limitation in HS2 that it holds state for the session would currently force the user configure proxies and knox to remember which next Host it was using, because they need to have state to remember the HS2 instance to be used! But that is a limitation that ideally goes away some day, and when that happens, HiveAuthzContext would be the right place for keeping the IP address! was: TLDR: Some of the changes to hive authorizer interface made as part of HIVE-13360 are inappropriate and need to be restored. Pasting comments from Thejas in an email: Regarding the plans to move ip address from the query context object (HiveAuthzContext) to HiveAuthenticationProvider. I don't think that is a clear right place for it.​ In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 , every request for single session does not have to come via a single IP address. Current assumption in hive code base is that the IP address is valid for the entire session. This might not hold true for ever. A limitation in HS2 that it holds state for the session would currently force the user configure proxies and knox to remember which next Host it was using, because they need to have state to remember the HS2 instance to be used! 
But that is a limitation that ideally goes away some day, and when that happens, HiveAuthzContext would be the right place for keeping the IP address! > restore HiveAuthorizer interface changes > > > Key: HIVE-13867 > URL: https://issues.apache.org/jira/browse/HIVE-13867 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Priority: Blocker > > TLDR: Some of the changes to hive authorizer interface made as part of > HIVE-13360 are inappropriate and need to be restored. > Regarding the move of ip address from the query context object > (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place > for it.​ > In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 > , every request for single session does not have to come via a single IP > address. > Current assumption in hive code base is that the IP address is valid for the > entire session. This might not hold true for ever. > A limitation in HS2 that it holds state for the session would currently force > the user configure proxies and knox to remember which next Host it was using, > because they need to have state to remember the HS2 instance to be used! But > that is a limitation that ideally goes away some day, and when that happens, > HiveAuthzContext would be the right place for keeping the IP address! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13867) restore HiveAuthorizer interface changes
[ https://issues.apache.org/jira/browse/HIVE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-13867: Assignee: Thejas M Nair > restore HiveAuthorizer interface changes > > > Key: HIVE-13867 > URL: https://issues.apache.org/jira/browse/HIVE-13867 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > > TLDR: Some of the changes to hive authorizer interface made as part of > HIVE-13360 are inappropriate and need to be restored. > Regarding the move of ip address from the query context object > (HiveAuthzContext) to HiveAuthenticationProvider. That isn't the right place > for it.​ > In HS2 HTTP mode, when proxies and knox servers are between end user and HS2 > , every request for single session does not have to come via a single IP > address. > Current assumption in hive code base is that the IP address is valid for the > entire session. This might not hold true for ever. > A limitation in HS2 that it holds state for the session would currently force > the user configure proxies and knox to remember which next Host it was using, > because they need to have state to remember the HS2 instance to be used! But > that is a limitation that ideally goes away some day, and when that happens, > HiveAuthzContext would be the right place for keeping the IP address! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303049#comment-15303049 ] Thejas M Nair commented on HIVE-13749: -- HIVE-3098 fixes it for the metastore. This fix can be dangerous for the embedded metastore use case. bq. I think because of my test being run as a single user. Single user shouldn't matter, as the cache is based on the UGI object as I mentioned earlier. Testing using hive-cli might be better; that would also ensure the creation of a new metastore connection each time. I assume you haven't seen this in other user environments. I suspect there is something unique about their environment that would be triggering this. You might want to check if they are using any specific plugins. Is this with Kerberos enabled? > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13338) Differences in vectorized_casts.q output for vectorized and non-vectorized runs
[ https://issues.apache.org/jira/browse/HIVE-13338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303018#comment-15303018 ] Prasanth Jayachandran commented on HIVE-13338: -- lgtm, +1 > Differences in vectorized_casts.q output for vectorized and non-vectorized > runs > --- > > Key: HIVE-13338 > URL: https://issues.apache.org/jira/browse/HIVE-13338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13338.01.patch, HIVE-13338.02.patch > > > Turn off vectorization and you get different results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303001#comment-15303001 ] Lefty Leverenz commented on HIVE-13269: --- Doc note: This adds *hive.optimize.filter.stats.reduction* to HiveConf.java, so it needs to be documented in the wiki for release 2.1.0. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] Added a TODOC2.1 label. > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302992#comment-15302992 ] Sergio Peña commented on HIVE-12721: Test failures are not related. [~jbeard] Why is {{public Text evaluate}} used instead of {{public String evaluate}} if at the end we convert to String? The patch looks very simple. I just want to know if Text is needed in the class. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
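The value such a UDF produces needs nothing beyond the JDK; a minimal sketch of what the evaluate logic boils down to is below. The class name is illustrative, and the Hadoop Text wrapper discussed above is omitted — returning Text is a common UDF convention for reusing the Writable across rows, which may be why the patch uses it, but that is an assumption here, not something the patch states.

```java
import java.util.UUID;

// Hypothetical stand-in for the core of a UUID UDF; the real Hive UDF under
// discussion wraps this value in org.apache.hadoop.io.Text.
public class UuidSketch {
    public static String evaluate() {
        // Type 4 (random) UUID rendered in the canonical 36-character form
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println(evaluate()); // e.g. 3f2504e0-4f89-41d3-9a0c-0305e82c3301
    }
}
```

Each call yields a fresh value, which is what makes it usable as a surrogate key in ETL jobs.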
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302995#comment-15302995 ] Ashutosh Chauhan commented on HIVE-13857: - +1 pending tests > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-13269: -- Labels: TODOC2.1 (was: ) > Simplify comparison expressions using column stats > -- > > Key: HIVE-13269 > URL: https://issues.apache.org/jira/browse/HIVE-13269 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13269.01.patch, HIVE-13269.02.patch, > HIVE-13269.03.patch, HIVE-13269.04.patch, HIVE-13269.patch, HIVE-13269.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302965#comment-15302965 ] Naveen Gangam commented on HIVE-13749: -- Oops, just posted the patch to RB (https://reviews.apache.org/r/47918/) at the same time as this comment. 1) Isn't the shutdown() called when a HMS request is fulfilled and the executor thread is being released back to the pool? So any new calls would potentially have a new UGI and a new instance of HiveConf. Also, calling closeAll() just removes the cached element. At worst, the FileSystem object is re-cached on a miss. 2) The other fixes are to address a similar issue on the HS2 side where using the FileSystem APIs causes the Cache to grow. This issue is on the HMS side. Regarding reproducing this locally, yes and no. I ran hundreds of iterations of beeline executing a script that creates a table and then drops it while randomly toggling the value of a hive conf property. For 300 iterations, I have gotten it to retain 60 instances, which is not quite the same scale the customer is seeing. I think because of my test being run as a single user. Re-running the test with this fix, I have 8 instances retained but none in this particular cache. I have run with debug around this code, and during the drop table command I can see an element being added to the cache. I am also waiting for logs from this customer, who is running with some instrumentation + fix. I can confirm that from those logs too. Alternatively, in checkTrashPurgeCombination() we could add a close() for this FileSystem. In my testcase, this has been the primary reason for the retained instances. {code} HadoopShims.HdfsEncryptionShim shim = ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf); {code} Thoughts? 
Thanks > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.3.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: (was: HIVE-13857.3.patch) > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.3.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch, > HIVE-13857.3.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302925#comment-15302925 ] Thejas M Nair commented on HIVE-13749: -- Regarding the patch 1. How do you make sure that files created by this ugi are not in use in other parts? We need to do the closing only after we are sure that the ugi object is no longer going to be used. 2. I am not sure if this would fix the leak. As you can see, we have patches that deal with the closing when the UGI object is no longer used. Are you able to reproduce this in your environment? If not, you might want to add some debugging around code that adds entries in the cache, and see if the closing of files generated from those places is happening. You might also want to see if the user is using some plugins that might be creating new UGI objects. > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-13749: - Status: Patch Available (was: Open) > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-13749: - Attachment: HIVE-13749.patch > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-13749.patch, Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore
[ https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302889#comment-15302889 ] Naveen Gangam commented on HIVE-13749: -- perhaps in the HiveMetaStore.shutdown() we clear the cache for the current UGI. Make sense? Could you please review the patch when you have a chance? I have had the customer disable the FileSystem caching by adding {{fs.hdfs.impl.disable.cache=true}} to the HMS configuration, then re-run the workloads. The same site that had 66000+ Configuration instances in their heapdump now has 80 instances, and none of them are in the cache. So it is clear that the FileSystem.CACHE is the problem. Thanks > Memory leak in Hive Metastore > - > > Key: HIVE-13749 > URL: https://issues.apache.org/jira/browse/HIVE-13749 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: Top_Consumers7.html > > > Looking a heap dump of 10GB, a large number of Configuration objects(> 66k > instances) are being retained. These objects along with its retained set is > occupying about 95% of the heap space. This leads to HMS crashes every few > days. > I will attach an exported snapshot from the eclipse MAT. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
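The growth mechanism described in this thread can be sketched without any Hadoop dependencies. The following is a hypothetical stand-in, assuming (as the comments suggest) that the FileSystem cache key includes the UGI and that distinct UGI instances compare by identity, so a fresh UGI per request adds a fresh cache entry even for the same logical user:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the leak: a cache keyed by identity-equal objects.
public class CacheGrowthSketch {
    // Stand-in for UserGroupInformation: no equals()/hashCode() override,
    // so two instances for the same user are distinct keys.
    static class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    static final Map<Ugi, Object> CACHE = new HashMap<>();

    // Stand-in for FileSystem.get(): caches one "filesystem" per key.
    static Object get(Ugi ugi) {
        return CACHE.computeIfAbsent(ugi, k -> new Object());
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            get(new Ugi("hive")); // same logical user, new instance per request
        }
        System.out.println(CACHE.size()); // 100 entries retained, not 1
    }
}
```

This is why disabling the cache (fs.hdfs.impl.disable.cache=true) or closing filesystems per UGI stops the retention: the Configuration objects hang off each cached entry.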
[jira] [Commented] (HIVE-13860) Fix more json related JDK8 test failures
[ https://issues.apache.org/jira/browse/HIVE-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302876#comment-15302876 ] Hive QA commented on HIVE-13860: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806415/HIVE-13860-java8.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 106 failed/errored test(s), 9933 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestMiniTezCliDriver-constprog_dpp.q-dynamic_partition_pruning.q-vectorization_10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-cte_4.q-vector_non_string_partition.q-delete_where_non_partitioned.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-order_null.q-vector_acid3.q-orc_merge10.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not 
produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_cond_pushdown_3.q-groupby7.q-auto_join17.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt15.q-join39.q-avro_joins_native.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autoColumnStats_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_int_type_promotion org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_3 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynp
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302875#comment-15302875 ] Nachiket Vaidya commented on HIVE-13836: [~sushanth] Thank you for the reply. I agree with you that the issue is deep inside. The issue is easy to reproduce. I tried that and I got a different stack trace. {noformat} 2016-05-26 12:32:27,904 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-7]: MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5535) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2308) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) at com.sun.proxy.$Proxy14.add_partitions_req(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partitions_req.getResult(ThriftHiveMetastore.java:9723) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partitions_req.getResult(ThriftHiveMetastore.java:9707) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693) at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1245) at com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:424) at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:372) at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:628) at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:324) at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:194) at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:640) at org.datanucleus.store.query.Query.executeQuery(Query.java:1786) at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) at org.datanucleus.store.query.Query.execute(Query.java:1654) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221) at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7534) at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) at org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAddPartition(DbNotificationListener.java:168) {noformat} 
It is of course a concurrency issue manifesting in a different way. It looks like db notification is using ObjectStore differently. Adding synchronization at db notification solves this issue. Given that there is not much performance implication for using synchronization, it should be ok to fix it in db notification and then file a separate jira to track the ObjectStore issue. Please let me know what you think. > DbNotifications giving an error = Invalid state. Transaction has already > started > > >
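The proposed fix — serializing the listener's access to the shared store — can be sketched as follows. All names here (EventStore, enqueue) are illustrative stand-ins, not Hive's actual classes; the point is only that one lock around the store call keeps two handler threads from interleaving inside a single transaction:

```java
// Hypothetical model of synchronizing the notification enqueue path.
public class SyncedListenerSketch {
    // Stand-in for ObjectStore: not safe for concurrent mutation by itself.
    static class EventStore {
        private int events;
        void addNotificationEvent() { events++; }
        int count() { return events; }
    }

    private final EventStore store = new EventStore();

    // The lock ensures only one thread at a time drives the store's
    // transaction, avoiding "Transaction has already started".
    public synchronized void enqueue() {
        store.addNotificationEvent();
    }

    public int count() { return store.count(); }

    public static void main(String[] args) throws InterruptedException {
        SyncedListenerSketch listener = new SyncedListenerSketch();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) listener.enqueue();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(listener.count()); // 8000 with the lock held
    }
}
```

The cost is serialized enqueues, which matches the comment's judgment that the performance implication is small relative to the rest of the metastore call.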
[jira] [Commented] (HIVE-13818) Fast Vector MapJoin not enhanced to use sortOrder when handling BinarySortable keys for Small Table?
[ https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302844#comment-15302844 ] Gopal V commented on HIVE-13818: The bug is limited to Fast hashtables {code} hive.mapjoin.hybridgrace.hashtable=false; hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true; {code} > Fast Vector MapJoin not enhanced to use sortOrder when handling > BinarySortable keys for Small Table? > > > Key: HIVE-13818 > URL: https://issues.apache.org/jira/browse/HIVE-13818 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13818.01.patch, HIVE-13818.02.patch, vector_bug.q, > vector_bug.q.out > > > Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not > this issue according to Gopal/Rajesh/Nita. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302831#comment-15302831 ] Ashutosh Chauhan commented on HIVE-13857: - There are other callers which are passing in false. Can you also create a RB for this? > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13751) LlapOutputFormatService should have a configurable send buffer size
[ https://issues.apache.org/jira/browse/HIVE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302813#comment-15302813 ] Prasanth Jayachandran commented on HIVE-13751: -- Yeah. This will go into 2.1.0 > LlapOutputFormatService should have a configurable send buffer size > --- > > Key: HIVE-13751 > URL: https://issues.apache.org/jira/browse/HIVE-13751 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13751.1.patch, HIVE-13751.2.patch, > HIVE-13751.3.patch > > > Netty channel buffer size is hard-coded 128KB now. It should be made > configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13611) add jar causes beeline not to output log messages
[ https://issues.apache.org/jira/browse/HIVE-13611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-13611: Assignee: Naveen Gangam > add jar causes beeline not to output log messages > - > > Key: HIVE-13611 > URL: https://issues.apache.org/jira/browse/HIVE-13611 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0 >Reporter: Thomas Scott >Assignee: Naveen Gangam >Priority: Minor > > After adding a jar in beeline warning messages and job log ouptut are no > longer shown. This only occurs if you use short connection strings (e.g. > jdbc:hive2://). Example below: > {code} > 0: jdbc:hive2://nightly55-1.gce.cloudera.com:> !connect jdbc:hive2:// > Connecting to jdbc:hive2:// > Enter username for jdbc:hive2://: hive > Enter password for jdbc:hive2://: > Connected to: Apache Hive (version 1.1.0-cdh5.5.4) > Driver: Hive JDBC (version 1.1.0-cdh5.5.4) > Transaction isolation: TRANSACTION_REPEATABLE_READ > 1: jdbc:hive2://> select count(*) from sample_07 limit 1; > INFO : Number of reduce tasks determined at compile time: 1 > INFO : In order to change the average load for a reducer (in bytes): > INFO : set hive.exec.reducers.bytes.per.reducer= > INFO : In order to limit the maximum number of reducers: > INFO : set hive.exec.reducers.max= > INFO : In order to set a constant number of reducers: > INFO : set mapreduce.job.reduces= > INFO : number of splits:1 > INFO : Submitting tokens for job: job_1461621650734_0020 > INFO : The url to track the job: > http://nightly55-1.gce.cloudera.com:8088/proxy/application_1461621650734_0020/ > INFO : Starting Job = job_1461621650734_0020, Tracking URL = > http://nightly55-1.gce.cloudera.com:8088/proxy/application_1461621650734_0020/ > INFO : Kill Command = /usr/lib/hadoop/bin/hadoop job -kill > job_1461621650734_0020 > INFO : Hadoop job information for Stage-1: number of mappers: 1; number of > reducers: 1 > INFO : 2016-04-26 01:36:04,297 Stage-1 map = 0%, reduce = 
0% > INFO : 2016-04-26 01:36:11,802 Stage-1 map = 100%, reduce = 0%, Cumulative > CPU 1.52 sec > INFO : 2016-04-26 01:36:19,419 Stage-1 map = 100%, reduce = 100%, > Cumulative CPU 3.25 sec > INFO : MapReduce Total cumulative CPU time: 3 seconds 250 msec > INFO : Ended Job = job_1461621650734_0020 > +--+--+ > | _c0 | > +--+--+ > | 823 | > +--+--+ > 1 row selected (25.908 seconds) > 1: jdbc:hive2://> add jar hdfs://some_nn.com/tmp/somedir/some_jar.jar > 1: jdbc:hive2://> ; > converting to local hdfs://some_nn.com/tmp/somedir/some_jar.jar > Added [/tmp/93ca63a2-5019-4f37-b9b4-75f1740b53c8_resources/some_jar.jar] to > class path > Added resources: [hdfs://some_nn.com/tmp/somedir/some_jar.jar] > No rows affected (0.179 seconds) > 1: jdbc:hive2://> select count(*) from sample_07 limit 1; > +--+--+ > | _c0 | > +--+--+ > | 823 | > +--+--+ > 1: jdbc:hive2://> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13862) org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter falls back to ORM
[ https://issues.apache.org/jira/browse/HIVE-13862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302745#comment-15302745 ] Sergey Shelukhin commented on HIVE-13862: - Hmm. I wonder how (and if ;)) it ever worked. Could the list result be DB-specific, or is this the bug for all DBs? IIRC some methods use a call on the query object that forces a single result, that may be a better option here. > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter > falls back to ORM > --- > > Key: HIVE-13862 > URL: https://issues.apache.org/jira/browse/HIVE-13862 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Fix For: 2.1.0 > > Attachments: HIVE-13862.patch > > > We are seeing following exception and calls fall back to ORM which make it > costly : > {noformat} > WARN org.apache.hadoop.hive.metastore.ObjectStore - Direct SQL failed, > falling back to ORM > java.lang.ClassCastException: > org.datanucleus.store.rdbms.query.ForwardQueryResult cannot be cast to > java.lang.Number > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlInt(MetaStoreDirectSql.java:892) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:855) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getNumPartitionsViaSqlFilter(MetaStoreDirectSql.java:405) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2763) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore$5.getSqlResult(ObjectStore.java:2755) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2606) > ~[hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilterInternal(ObjectStore.java:2770) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByFilter(ObjectStore.java:2746) > [hive-exec-2.1.2-inm-SNAPSHOT.jar:2.1.2-inm-SNAPSHOT] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
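The cast failure in the trace above can be illustrated without DataNucleus: the JDO query hands back a result object that is a List wrapper (ForwardQueryResult), not a Number, so the scalar has to be pulled out of the collection (or the query forced to a single result, as Sergey suggests) before casting. A minimal plain-Java sketch, where `extractSqlInt` is a stand-in for the metastore helper and an ArrayList simulates the query result:

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractSqlIntSketch {
    // Broken variant: mirrors the cast that throws ClassCastException
    // when 'result' is the query's List wrapper rather than a scalar.
    static int extractSqlIntBroken(Object result) {
        return ((Number) result).intValue();
    }

    // Fixed variant: unwrap a single-element list before casting,
    // analogous to forcing a unique result on the query object.
    static int extractSqlInt(Object result) {
        if (result instanceof List) {
            List<?> list = (List<?>) result;
            result = list.isEmpty() ? 0L : list.get(0);
        }
        return ((Number) result).intValue();
    }

    public static void main(String[] args) {
        List<Object> queryResult = new ArrayList<>();
        queryResult.add(42L); // e.g. the single COUNT(*) row

        boolean threw = false;
        try {
            extractSqlIntBroken(queryResult);
        } catch (ClassCastException e) {
            threw = true; // same failure as in the stack trace above
        }
        System.out.println("broken variant threw: " + threw);
        System.out.println("fixed variant value: " + extractSqlInt(queryResult));
    }
}
```

Whether the wrapper behaves this way on every database backend is exactly the open question in the comment; the sketch only shows why the direct cast is fragile.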
[jira] [Updated] (HIVE-13844) Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class
[ https://issues.apache.org/jira/browse/HIVE-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Svetozar Ivanov updated HIVE-13844: --- Description: Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. The actual FQ class name is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' {code} public static enum IndexType { AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.AggregateIndexHandler"), COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); private IndexType(String indexType, String className) { indexTypeName = indexType; this.handlerClsName = className; } private final String indexTypeName; private final String handlerClsName; public String getName() { return indexTypeName; } public String getHandlerClsName() { return handlerClsName; } } {code} Because of this, statements like 'SHOW INDEXES ON MY_TABLE' do not work when 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' is configured as the index handler; a java.lang.NullPointerException is observed in the HiveServer log. was: Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. 
The actual FQ class name is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' {code} public static enum IndexType { AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.AggregateIndexHandler"), COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); private IndexType(String indexType, String className) { indexTypeName = indexType; this.handlerClsName = className; } private final String indexTypeName; private final String handlerClsName; public String getName() { return indexTypeName; } public String getHandlerClsName() { return handlerClsName; } } {code} Because all of the above statement like 'SHOW INDEXES ON MY_TABLE' doesn't work as we got java.lang.NullPointerException. > Invalid index handler in org.apache.hadoop.hive.ql.index.HiveIndex class > > > Key: HIVE-13844 > URL: https://issues.apache.org/jira/browse/HIVE-13844 > Project: Hive > Issue Type: Bug > Components: Indexing >Affects Versions: 2.0.0 >Reporter: Svetozar Ivanov >Priority: Minor > Attachments: HIVE-13844.patch > > > Class org.apache.hadoop.hive.ql.index.HiveIndex has invalid handler name > 'org.apache.hadoop.hive.ql.AggregateIndexHandler'. 
The actual FQ class name > is 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' > {code} > public static enum IndexType { > AGGREGATE_TABLE("aggregate", > "org.apache.hadoop.hive.ql.AggregateIndexHandler"), > COMPACT_SUMMARY_TABLE("compact", > "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"), > > BITMAP_TABLE("bitmap","org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler"); > private IndexType(String indexType, String className) { > indexTypeName = indexType; > this.handlerClsName = className; > } > private final String indexTypeName; > private final String handlerClsName; > public String getName() { > return indexTypeName; > } > public String getHandlerClsName() { > return handlerClsName; > } > } > > {code} > Because of this, statements like 'SHOW INDEXES ON MY_TABLE' do not work when > 'org.apache.hadoop.hive.ql.index.AggregateIndexHandler' is configured as the > index handler; a java.lang.NullPointerException is observed in the HiveServer > log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
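The fix the report implies is a one-line change to the AGGREGATE_TABLE constant. A minimal self-contained sketch of the enum with the corrected fully-qualified handler name (class and package names taken from the report; the wrapper class is only there to make the snippet runnable):

```java
public class IndexTypeSketch {
    public enum IndexType {
        // Corrected: the handler lives under org.apache.hadoop.hive.ql.index,
        // not org.apache.hadoop.hive.ql as the current enum constant says.
        AGGREGATE_TABLE("aggregate", "org.apache.hadoop.hive.ql.index.AggregateIndexHandler"),
        COMPACT_SUMMARY_TABLE("compact", "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler"),
        BITMAP_TABLE("bitmap", "org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler");

        private final String indexTypeName;
        private final String handlerClsName;

        IndexType(String indexType, String className) {
            this.indexTypeName = indexType;
            this.handlerClsName = className;
        }

        public String getName() { return indexTypeName; }
        public String getHandlerClsName() { return handlerClsName; }
    }

    public static void main(String[] args) {
        // The lookup that previously produced a class name that cannot be
        // resolved, leading to the NPE on SHOW INDEXES described above.
        System.out.println(IndexType.AGGREGATE_TABLE.getHandlerClsName());
    }
}
```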
[jira] [Commented] (HIVE-13840) Orc split generation is reading file footers twice
[ https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302704#comment-15302704 ] Owen O'Malley commented on HIVE-13840: -- I commented in RB, but this looks fine. +1 > Orc split generation is reading file footers twice > -- > > Key: HIVE-13840 > URL: https://issues.apache.org/jira/browse/HIVE-13840 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13840.1.patch, HIVE-13840.2.patch > > > Recent refactorings to move orc out introduced a regression in split > generation. This leads to reading the orc file footers twice during split > generation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11693) CommonMergeJoinOperator throws exception with tez
[ https://issues.apache.org/jira/browse/HIVE-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302701#comment-15302701 ] Selina Zhang commented on HIVE-11693: - [~rajesh.balamohan], we hit the same issue recently, but I don't think the attached patch fixes the root cause. The issue is that CommonMergeJoinOperator only sets the big-table position when it receives input for the big table. {code:title=CommonMergeJoinOperator.java} @Override public void process(Object row, int tag) throws HiveException { posBigTable = (byte) conf.getBigTablePosition(); ... {code} If the input is empty, the method above is never called. In the query you listed, a subquery is involved: the generated table is tagged 0, while the left table is tagged 1. GenTezWork.java sets the big-table position to 1 for both the reduce work and the CommonJoinOperator. In the reduce phase, when ReduceRecordProcessor runs, it pulls records from the big table: {code:title=ReduceRecordProcessor.java} @Override void run() throws Exception { // run the operator pipeline while (sources[bigTablePosition].pushRecord()) { } } {code} The big-table position here is 1. If the input from the big table is empty, this is the only place pushRecord() is called to read the big table. However, because CommonMergeJoinOperator never set its big-table position, closeOp() thinks tag 1 is a small table, so another pushRecord() is issued to fetch its content, and we see the exception listed in this JIRA. Please let me know if there is a problem with my analysis; if you agree, could you update the patch? Thanks > CommonMergeJoinOperator throws exception with tez > - > > Key: HIVE-11693 > URL: https://issues.apache.org/jira/browse/HIVE-11693 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan > Attachments: HIVE-11693.1.patch > > > Got this when executing a simple query with latest hive build + tez latest > version. 
> {noformat} > Error: Failure while running task: > attempt_1439860407967_0291_2_03_45_0:java.lang.RuntimeException: > java.lang.RuntimeException: Hive Runtime Error while closing operators: > java.lang.RuntimeException: java.io.IOException: Please check if you are > invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) > at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: Hive Runtime Error while closing > operators: java.lang.RuntimeException: java.io.IOException: Please check if > you are invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:316) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) > ... 
14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.io.IOException: Please check if you are > invoking moveToNext() even after it returned false. > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:375) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:482) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:434) > at > org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:384) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:61
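The analysis in the comment above can be condensed into a toy model: if the big-table position is only recorded inside process(), an empty big-table input leaves it at its default, and closeOp() then treats the big-table tag as a small table and fetches from it again. A minimal sketch in plain Java (not Hive classes; the configured big-table position is hardcoded to 1, matching the query described):

```java
public class MergeJoinPositionSketch {
    static class MergeJoin {
        final int confBigTablePosition = 1; // set by GenTezWork in the report
        int posBigTable;                    // defaults to 0

        MergeJoin(boolean setPositionAtInit) {
            // Suggested fix: record the position during initialization,
            // not lazily inside process().
            if (setPositionAtInit) {
                posBigTable = confBigTablePosition;
            }
        }

        void process(Object row) {
            // Current behavior: position is only set when a row arrives.
            posBigTable = confBigTablePosition;
        }

        // True when closeOp would issue the extra pushRecord() on this tag,
        // mistaking the (empty) big table for a small table.
        boolean closeOpFetchesTagAgain(int tag) {
            return tag != posBigTable;
        }
    }

    public static void main(String[] args) {
        // Empty big-table input: process() is never called on either join.
        MergeJoin lazy = new MergeJoin(false);
        MergeJoin eager = new MergeJoin(true);
        System.out.println("lazy refetches tag 1: " + lazy.closeOpFetchesTagAgain(1));
        System.out.println("eager refetches tag 1: " + eager.closeOpFetchesTagAgain(1));
    }
}
```

The "lazy" case models the extra pushRecord() that trips the "invoking moveToNext() even after it returned false" IOException in the trace; the "eager" case models the proposed fix.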
[jira] [Commented] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302698#comment-15302698 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13857: -- [~ashutoshc] where ever recursion=false, this makes sense and I have modified it in patch #2. When recursion=true, I dont think sending the Status object of top level directory will be of much help, so I have retained the behavior there. Thanks Hari > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13857) insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II
[ https://issues.apache.org/jira/browse/HIVE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-13857: - Attachment: HIVE-13857.2.patch > insert overwrite select from some table fails throwing > org.apache.hadoop.security.AccessControlException - II > - > > Key: HIVE-13857 > URL: https://issues.apache.org/jira/browse/HIVE-13857 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-13857.1.patch, HIVE-13857.2.patch > > > HIVE-13810 missed a fix, tracking it here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13566) Auto-gather column stats - phase 1
[ https://issues.apache.org/jira/browse/HIVE-13566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302676#comment-15302676 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-13566: -- The commit for this jira removed the fix for HIVE-13810. I will add them back as part of HIVE-13857 Thanks Hari > Auto-gather column stats - phase 1 > -- > > Key: HIVE-13566 > URL: https://issues.apache.org/jira/browse/HIVE-13566 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-13566.01.patch, HIVE-13566.02.patch, > HIVE-13566.03.patch > > > This jira adds code and tests for auto-gather column stats. Golden file > update will be done in phase 2 - HIVE-11160 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13855) select INPUT__FILE__NAME throws NPE exception
[ https://issues.apache.org/jira/browse/HIVE-13855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302666#comment-15302666 ] Yongzhi Chen commented on HIVE-13855: - The change looks fine. +1 > select INPUT__FILE__NAME throws NPE exception > - > > Key: HIVE-13855 > URL: https://issues.apache.org/jira/browse/HIVE-13855 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13855.1.patch > > > The following query executes successfully > select INPUT__FILE__NAME from src limit 1; > But the following NPE is thrown > {noformat} > 16/05/25 16:49:49 ERROR exec.Utilities: Failed to load plan: null: > java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:407) > at > org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:299) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:315) > at > org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) > at > org.apache.hadoop.hive.ql.exec.FetchOperator$1.doNext(FetchOperator.java:340) > at > org.apache.hadoop.hive.ql.exec.FetchOperator$1.doNext(FetchOperator.java:331) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:484) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:144) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1884) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) > at > 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started
[ https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302652#comment-15302652 ] Sushanth Sowmyan commented on HIVE-13836: - Thanks for the catch, [~vaidyand]. I have a couple of thoughts - firstly, since the notion of a Notification Log Table is dependent on the RawStore implementation, the lock might belong more in implementations of RawStore such as ObjectStore, rather than in DbNotificationListener. Secondly, when I went and looked at ObjectStore, I see that we're correctly calling openTransaction()/commitTransaction()/rollbackTransaction(), which should serve the same purpose as the lock, and if you're getting an error that states "Transaction has already started", we're likely hitting a deeper bug with transaction semantics (with nesting allowed for) in ObjectStore. Adding a lock in DbNotificationListener will fix this bug in DbNotificationListener, but would leave that other issue undiscovered. [~alangates], thoughts? > DbNotifications giving an error = Invalid state. Transaction has already > started > > > Key: HIVE-13836 > URL: https://issues.apache.org/jira/browse/HIVE-13836 > Project: Hive > Issue Type: Bug >Reporter: Nachiket Vaidya >Priority: Critical > Attachments: HIVE-13836.patch > > > I used the pyhs2 python client to create tables/partitions in hive. It was > working fine until I moved to multithreaded scripts which created 8 connections > and ran DDL queries concurrently. > I got the error as > {noformat} > 2016-05-04 17:49:26,226 ERROR > org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: > HMSHandler Fatal error: Invalid state. Transaction has already started > org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started > at > org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) > at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) > at > org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) > at > org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) > at > org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463) > at > org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522) > at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) > at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502) > at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) > at > com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267) 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
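The "Transaction has already started" failure is consistent with a strict transaction begin() being reached while a transaction is already open, which correct openTransaction()/commitTransaction() nesting is supposed to prevent. A minimal plain-Java sketch of both semantics (no DataNucleus; NaiveTxn stands in for the underlying JDO transaction, and the counter mirrors the kind of nesting bookkeeping ObjectStore's open/commit pairing implies):

```java
public class NestedTxnSketch {
    // Strict transaction: rejects begin() while active, like the
    // NucleusTransactionException in the stack trace above.
    static class NaiveTxn {
        boolean active;
        void begin() {
            if (active) {
                throw new IllegalStateException("Invalid state. Transaction has already started");
            }
            active = true;
        }
        void end() { active = false; }
    }

    // Counter-based nesting: only the outermost open actually begins,
    // only the outermost commit actually ends the transaction.
    static class NestedTxn {
        private final NaiveTxn underlying = new NaiveTxn();
        private int openCalls;
        void openTransaction() {
            if (openCalls++ == 0) underlying.begin();
        }
        void commitTransaction() {
            if (--openCalls == 0) underlying.end();
        }
    }

    public static void main(String[] args) {
        boolean threw = false;
        NaiveTxn naive = new NaiveTxn();
        naive.begin();
        try { naive.begin(); } catch (IllegalStateException e) { threw = true; }
        System.out.println("naive double begin threw: " + threw);

        NestedTxn nested = new NestedTxn();
        nested.openTransaction();   // e.g. create_table_core opens
        nested.openTransaction();   // e.g. addNotificationEvent nests inside it
        nested.commitTransaction();
        nested.commitTransaction();
        System.out.println("nested open/commit completed");
    }
}
```

If the nesting bookkeeping is correct, the error should never surface, which is why the comment suspects the real bug is in the transaction-state handling under concurrent connections rather than in the absence of a lock; this sketch only illustrates the two semantics, not the actual ObjectStore code path.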
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302648#comment-15302648 ] Sean Busbey commented on HIVE-12721: can a committer with access to the QA job relaunch it to see if these failures are related? > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12721) Add UUID built in function
[ https://issues.apache.org/jira/browse/HIVE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302638#comment-15302638 ] Jeremy Beard commented on HIVE-12721: - I don't think so but I can't check because the test result pages seem to have been purged. > Add UUID built in function > -- > > Key: HIVE-12721 > URL: https://issues.apache.org/jira/browse/HIVE-12721 > Project: Hive > Issue Type: Improvement > Components: UDF >Reporter: Jeremy Beard >Assignee: Jeremy Beard > Attachments: HIVE-12721.1.patch, HIVE-12721.2.patch, HIVE-12721.patch > > > A UUID function would be very useful for ETL jobs that need to generate > surrogate keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)