[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggressively

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827561#comment-15827561
 ] 

Hive QA commented on HIVE-13014:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847947/HIVE-13014.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 375 failed/errored test(s), 10824 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=148)
org.apache.hadoop.hive.common.TestBlobStorageUtils.testValidAndInvalidFileSystems (batchId=240)
org.apache.hadoop.hive.io.TestHadoopFileStatus.testHadoopFileStatusAclEntries (batchId=190)

[jira] [Updated] (HIVE-15657) publish information to the vertex description field

2017-01-17 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15657:
--
Description: 
This can include information like the tables being processed, specific 
operations, etc. Maybe a sub-part of the explain plan?

This can then show up on the Tez UI.


  was:
This can include information like the tables being processed, specific 
operations, etc. Maybe a sub-part of the explain plan?



> publish information to the vertex description field
> ---
>
> Key: HIVE-15657
> URL: https://issues.apache.org/jira/browse/HIVE-15657
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>
> This can include information like the tables being processed, specific 
> operations, etc. Maybe a sub-part of the explain plan?
> This can then show up on the Tez UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez

2017-01-17 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15269:
--
Attachment: (was: HIVE-15269.13.patch)

> Dynamic Min-Max runtime-filtering for Tez
> -
>
> Key: HIVE-15269
> URL: https://issues.apache.org/jira/browse/HIVE-15269
> Project: Hive
>  Issue Type: New Feature
>Reporter: Jason Dere
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15269.10.patch, HIVE-15269.11.patch, 
> HIVE-15269.12.patch, HIVE-15269.1.patch, HIVE-15269.2.patch, 
> HIVE-15269.3.patch, HIVE-15269.4.patch, HIVE-15269.5.patch, 
> HIVE-15269.6.patch, HIVE-15269.7.patch, HIVE-15269.8.patch, HIVE-15269.9.patch
>
>
> If a dimension table and fact table are joined:
> {noformat}
> select *
> from store join store_sales on (store.id = store_sales.store_id)
> where store.s_store_name = 'My Store'
> {noformat}
> One optimization that can be done is to get the min/max store id values that 
> come out of the scan/filter of the store table, and send this min/max value 
> (via Tez edge) to the task which is scanning the store_sales table.
> We can add a BETWEEN(min, max) predicate to the store_sales TableScan, where 
> this predicate can be pushed down to the storage handler (for example for ORC 
> formats). Pushing a min/max predicate to the ORC reader would allow us to 
> avoid having to read entire row groups during the table scan.
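
A minimal sketch of the idea in the description above, written in Python rather than Hive's Java and using made-up data. The dimension-side scan yields the min/max of the join key, and the fact-side scan skips any row group whose own min/max statistics cannot overlap that range. The RowGroup class, the function names, and the sample values are all hypothetical; this is not how Hive or the ORC reader implement it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class RowGroup:
    """A hypothetical stand-in for an ORC row group with min/max stats."""
    min_id: int
    max_id: int
    rows: List[Tuple[int, float]]  # (store_id, amount)


def key_range(store_ids: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Min/max of the join keys that survive the dimension-table filter."""
    ids = list(store_ids)
    return (min(ids), max(ids)) if ids else None


def scan_with_range(groups, rng):
    """Yield fact rows, skipping row groups that cannot match BETWEEN(min, max)."""
    if rng is None:
        return  # no dimension rows survived, so nothing can join
    lo, hi = rng
    for g in groups:
        if g.max_id < lo or g.min_id > hi:
            continue  # the whole row group is outside the range; never read it
        for store_id, amount in g.rows:
            if lo <= store_id <= hi:
                yield store_id, amount


groups = [
    RowGroup(1, 10, [(3, 9.5), (7, 1.0)]),
    RowGroup(11, 20, [(12, 4.0), (15, 2.5)]),
    RowGroup(21, 30, [(25, 8.0)]),
]
rng = key_range([12, 14])  # ids of stores matching s_store_name = 'My Store'
matched = list(scan_with_range(groups, rng))
```

The range check is only a coarse pre-filter: a store id of 13 would pass it here even though no matching store has that id, so the join itself still has to run on the surviving rows.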





[jira] [Updated] (HIVE-15269) Dynamic Min-Max runtime-filtering for Tez

2017-01-17 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15269:
--
Attachment: HIVE-15269.13.patch

Integrated bloom filters in semi join reduction.
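
For readers unfamiliar with the term, here is a rough Python sketch of how a Bloom filter can be used in semi-join reduction: the small side of the join publishes a compact summary of its keys, and the large side drops rows whose key is definitely absent. The BloomFilter class, its sizing, and the hash scheme are illustrative assumptions and are not taken from the patch.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; the sizes and hashing here are illustrative only."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


# Build side: join keys that survive the filter on the small table.
build_keys = [12, 14, 900]
bf = BloomFilter()
for k in build_keys:
    bf.add(k)

# Probe side: rows whose key is definitely not on the build side are dropped
# early; the real join still removes any false positives.
probe_rows = [(12, "a"), (13, "b"), (500, "c"), (900, "d")]
survivors = [row for row in probe_rows if bf.might_contain(row[0])]
```

Unlike a min/max range, the filter can reject keys that fall between the smallest and largest build-side values, at the cost of occasional false positives.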


> Dynamic Min-Max runtime-filtering for Tez
> -
>
> Key: HIVE-15269
> URL: https://issues.apache.org/jira/browse/HIVE-15269
> Project: Hive
>  Issue Type: New Feature
>Reporter: Jason Dere
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15269.10.patch, HIVE-15269.11.patch, 
> HIVE-15269.12.patch, HIVE-15269.13.patch, HIVE-15269.1.patch, 
> HIVE-15269.2.patch, HIVE-15269.3.patch, HIVE-15269.4.patch, 
> HIVE-15269.5.patch, HIVE-15269.6.patch, HIVE-15269.7.patch, 
> HIVE-15269.8.patch, HIVE-15269.9.patch
>
>
> If a dimension table and fact table are joined:
> {noformat}
> select *
> from store join store_sales on (store.id = store_sales.store_id)
> where store.s_store_name = 'My Store'
> {noformat}
> One optimization that can be done is to get the min/max store id values that 
> come out of the scan/filter of the store table, and send this min/max value 
> (via Tez edge) to the task which is scanning the store_sales table.
> We can add a BETWEEN(min, max) predicate to the store_sales TableScan, where 
> this predicate can be pushed down to the storage handler (for example for ORC 
> formats). Pushing a min/max predicate to the ORC reader would allow us to 
> avoid having to read entire row groups during the table scan.





[jira] [Comment Edited] (HIVE-15627) Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those in supportedGenericUDFs

2017-01-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827545#comment-15827545
 ] 

Matt McCline edited comment on HIVE-15627 at 1/18/17 7:33 AM:
--

TestMiniLlapLocalCliDriver --> orc_llap.q.out needs Q out file update.

However, vector_udf1.q with prior commit of HIVE-15588 has uncovered a bug:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating decode(encode(c2,'US-ASCII'),'US-ASCII')
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:134)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:784)
    ... 19 more
Caused by: java.lang.RuntimeException: Unhandled object type binary
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setOutputCol(VectorUDFAdaptor.java:334)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:210)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:154)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:117)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:132)
    ... 22 more
{code}

vector_if_expr.q didn't fail on my laptop -- assuming it is spurious.
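
The failure mode in the trace can be illustrated with a small sketch, assuming a result-setting routine that dispatches on the output type of an adapted UDF. This is Python, not Hive's Java, and the names (`set_output_col`, `HANDLERS`) are hypothetical. It only shows why an intermediate `binary` result, such as the output of `encode(...)`, fails when no branch exists for that type.

```python
from typing import Any, Callable, Dict, List


def _set_string(col: List[Any], i: int, value: Any) -> None:
    col[i] = str(value)


def _set_long(col: List[Any], i: int, value: Any) -> None:
    col[i] = int(value)


def _set_binary(col: List[Any], i: int, value: Any) -> None:
    col[i] = bytes(value)


# Hypothetical handler table with no entry for "binary".
HANDLERS: Dict[str, Callable[[List[Any], int, Any], None]] = {
    "string": _set_string,
    "bigint": _set_long,
}


def set_output_col(type_name: str, col: List[Any], i: int, value: Any) -> None:
    """Store one result value, failing loudly for types without a handler."""
    handler = HANDLERS.get(type_name)
    if handler is None:
        raise RuntimeError(f"Unhandled object type {type_name}")
    handler(col, i, value)


out: List[Any] = [None]
try:
    # encode(c2, 'US-ASCII') yields binary, which the table does not cover.
    set_output_col("binary", out, 0, "abc".encode("US-ASCII"))
    error = None
except RuntimeError as exc:
    error = str(exc)

# One possible remedy in this toy model: register a handler for the type.
HANDLERS["binary"] = _set_binary
set_output_col("binary", out, 0, "abc".encode("US-ASCII"))
```

Whether the actual fix in Hive adds a branch like this or takes another route is not stated in the comment.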


was (Author: mmccline):
TestMiniLlapLocalCliDriver --> orc_llap.q.out needs Q out file update.

However, vector_udf1.q with prior commit of HIVE-15588 has uncovered a bug:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating decode(encode(c2,'US-ASCII'),'US-ASCII')
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:134)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:784)
    ... 19 more
Caused by: java.lang.RuntimeException: Unhandled object type binary
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setOutputCol(VectorUDFAdaptor.java:334)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:210)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:154)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:117)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:132)
    ... 22 more
{code}

> Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those 
> in supportedGenericUDFs
> -
>
> Key: HIVE-15627
> URL: https://issues.apache.org/jira/browse/HIVE-15627
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15627.01.patch, HIVE-15627.02.patch, 
> HIVE-15627.03.patch, HIVE-15627.04.patch, HIVE-15627.05.patch
>
>
> Missed this when doing HIVE-14336.





[jira] [Commented] (HIVE-15627) Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those in supportedGenericUDFs

2017-01-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827545#comment-15827545
 ] 

Matt McCline commented on HIVE-15627:
-

TestMiniLlapLocalCliDriver --> orc_llap.q.out needs Q out file update.

However, vector_udf1.q with prior commit of HIVE-15588 has uncovered a bug:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating decode(encode(c2,'US-ASCII'),'US-ASCII')
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:134)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:784)
    ... 19 more
Caused by: java.lang.RuntimeException: Unhandled object type binary
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setOutputCol(VectorUDFAdaptor.java:334)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:210)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:154)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:117)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:132)
    ... 22 more
{code}

> Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those 
> in supportedGenericUDFs
> -
>
> Key: HIVE-15627
> URL: https://issues.apache.org/jira/browse/HIVE-15627
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15627.01.patch, HIVE-15627.02.patch, 
> HIVE-15627.03.patch, HIVE-15627.04.patch, HIVE-15627.05.patch
>
>
> Missed this when doing HIVE-14336.





[jira] [Commented] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827517#comment-15827517
 ] 

anishek commented on HIVE-15656:


[~thejas] please review.

> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15656.patch
>
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.
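
As a rough illustration of the distinction at issue, the Python sketch below parses a made-up root pom.xml and reports whether an artifact is declared under `<dependencyManagement>` (which pins a version for child modules without adding the artifact to their classpath) or directly under `<dependencies>` (which every child module inherits). The sample POM, the version string, and the helper name `declared_in` are assumptions for illustration; they are not taken from Hive's build.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical root POM: powermock is only managed, junit is a direct dependency.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.powermock</groupId>
        <artifactId>powermock-module-junit4</artifactId>
        <version>0.0.0-example</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
    </dependency>
  </dependencies>
</project>"""


def declared_in(pom_xml: str, artifact_id: str) -> set:
    """Return which top-level sections of the POM declare the artifact."""
    root = ET.fromstring(pom_xml)
    sections = {
        "dependencyManagement": "m:dependencyManagement/m:dependencies/m:dependency",
        "dependencies": "m:dependencies/m:dependency",
    }
    found = set()
    for name, path in sections.items():
        for dep in root.findall(path, NS):
            if dep.findtext("m:artifactId", namespaces=NS) == artifact_id:
                found.add(name)
    return found


powermock_sections = declared_in(SAMPLE_POM, "powermock-module-junit4")
junit_sections = declared_in(SAMPLE_POM, "junit")
```

A check like this could be run against a real root POM to confirm that a test-only library appears in the managed section only.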





[jira] [Updated] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-15656:
---
Status: Patch Available  (was: Open)

> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15656.patch
>
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.





[jira] [Updated] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-15656:
---
Attachment: HIVE-15656.patch

> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15656.patch
>
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.





[jira] [Updated] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-15656:
---
Affects Version/s: 2.2.0
Fix Version/s: 2.2.0

> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.





[jira] [Commented] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827513#comment-15827513
 ] 

ASF GitHub Bot commented on HIVE-15656:
---

GitHub user anishek opened a pull request:

https://github.com/apache/hive/pull/134

HIVE-15656: Place powermock in correct dependency management section root 
pom.xml

Also correcting the dependency for shims/common/pom.xml which was 
originally getting it from the root pom.xml  -->   
section. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/anishek/hive HIVE-15656

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/134.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #134


commit ff3f05b4c2baa5f7daa4531f828d863a9b2a5365
Author: Anishek Agarwal 
Date:   2017-01-18T05:50:29Z

HIVE-15656: Place powermock in correct dependency management section root 
pom.xml




> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.





[jira] [Commented] (HIVE-15627) Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those in supportedGenericUDFs

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827511#comment-15827511
 ] 

Hive QA commented on HIVE-15627:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847684/HIVE-15627.05.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 376 failed/errored test(s), 10821 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=148)

[jira] [Updated] (HIVE-14707) ACID: Insert shuffle sort-merges on blank KEY

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14707:
--
Attachment: HIVE-14707.22.patch

> ACID: Insert shuffle sort-merges on blank KEY
> -
>
> Key: HIVE-14707
> URL: https://issues.apache.org/jira/browse/HIVE-14707
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Eugene Koifman
> Attachments: HIVE-14707.01.patch, HIVE-14707.02.patch, 
> HIVE-14707.03.patch, HIVE-14707.04.patch, HIVE-14707.05.patch, 
> HIVE-14707.06.patch, HIVE-14707.08.patch, HIVE-14707.09.patch, 
> HIVE-14707.10.patch, HIVE-14707.11.patch, HIVE-14707.13.patch, 
> HIVE-14707.14.patch, HIVE-14707.16.patch, HIVE-14707.17.patch, 
> HIVE-14707.18.patch, HIVE-14707.19.patch, HIVE-14707.19.patch, 
> HIVE-14707.20.patch, HIVE-14707.21.patch, HIVE-14707.22.patch
>
>
> The ACID insert codepath uses a sorted shuffle, while the key used for 
> shuffle is always 0 bytes long.
> {code}
> hive (sales_acid)> explain insert into sales values(1, 2, 
> '3400---009', 1, null);
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: gopal_20160906172626_80261c4c-79cc-4e02-87fe-3133be404e55:2
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: values__tmp__table__2
>   Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: tmp_values_col1 (type: string), 
> tmp_values_col2 (type: string), tmp_values_col3 (type: string), 
> tmp_values_col4 (type: string), tmp_values_col5 (type: string)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4
> Statistics: Num rows: 1 Data size: 28 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order: 
>   Map-reduce partition columns: UDFToLong(_col1) (type: 
> bigint)
>   Statistics: Num rows: 1 Data size: 28 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: string), _col1 (type: 
> string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
> Execution mode: vectorized, llap
> LLAP IO: no inputs
> {code}
> Note the missing "+" / "-" in the Sort Order fields.
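
A small Python sketch of how one might spot this pattern mechanically: scan explain output for `sort order:` lines and flag those that carry no `+` or `-` flags. The helper name `blank_sort_orders` is hypothetical, and the plan text is abbreviated from the output quoted above, with a second, made-up operator added for contrast. The exact layout of explain output may differ between Hive versions.

```python
from typing import List

# Abbreviated plan text; the second operator is invented for contrast.
PLAN = """\
Reduce Output Operator
  sort order: 
  Map-reduce partition columns: UDFToLong(_col1) (type: bigint)
Reduce Output Operator
  key expressions: _col0 (type: int)
  sort order: +
"""


def blank_sort_orders(plan: str) -> List[int]:
    """Return 1-based line numbers of 'sort order:' entries with no +/- flags."""
    hits = []
    for lineno, line in enumerate(plan.splitlines(), start=1):
        label, sep, value = line.strip().partition(":")
        if sep and label == "sort order" and not set(value.strip()) & {"+", "-"}:
            hits.append(lineno)
    return hits


flagged = blank_sort_orders(PLAN)
```

A flagged line indicates a Reduce Output Operator that shuffles through the sort path without any key columns to sort on, which is the situation the issue describes.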





[jira] [Commented] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827473#comment-15827473
 ] 

Hive QA commented on HIVE-15472:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847949/HIVE-15472.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 371 failed/errored test(s), 10814 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=148)
org.apache.hadoop.hive.common.TestBlobStorageUtils.testValidAndInvalidFileSystems (batchId=240)
org.apache.hadoop.hive.io.TestHadoopFileStatus.testHadoopFileStatusAclEntries (batchId=190)

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827470#comment-15827470
 ] 

Xuefu Zhang commented on HIVE-15580:


[~Ferd], Functionally, I don't see anything bad because groupByKey was used in 
Hive for aggregation. Hive's groupby operator is able to process one row at a 
time with this patch. Performance-wise, I'm not sure if this will improve or 
degrade. That depends on the performance difference of groupByKey() + value 
iterator and repartitionAndSortWithinPartitions() + dummy value iterator. It 
would be great if you guys can find out.

The obvious benefit of this change is that Hive on Spark overcomes the 
unbounded memory usage of groupByKey(). The patch also solves the problem in 
HIVE-15527.

Please note that this patch is WIP. We will improve it, for example by getting 
rid of the dummy value iterator created per row.

I manually ran all spark tests with this patch, and there was only one test 
failure which needs investigation.
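
A minimal sketch of the memory argument above, in plain Java collections rather than Spark or Hive code. The class and method names are hypothetical and are not taken from the patch. `sumGrouped` mimics the groupByKey() shape, where all values of a key are materialized before aggregation; `sumSorted` mimics consuming key-sorted rows one at a time, as after repartitionAndSortWithinPartitions(), keeping only a running total per current key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortedStreamAggregation {
    /** groupByKey-style: every value of a key is buffered before it is aggregated. */
    static Map<String, Long> sumGrouped(List<Map.Entry<String, Long>> rows) {
        Map<String, List<Long>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, Long> row : rows) {
            grouped.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }
        Map<String, Long> sums = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> e : grouped.entrySet()) {
            long sum = 0;
            for (long v : e.getValue()) {
                sum += v;
            }
            sums.put(e.getKey(), sum);
        }
        return sums;
    }

    /** Sorted-stream style: rows arrive ordered by key; state is one key and one total. */
    static Map<String, Long> sumSorted(Iterable<Map.Entry<String, Long>> sortedRows) {
        // The result map stands in for "emit downstream"; it is not per-key buffering.
        Map<String, Long> sums = new LinkedHashMap<>();
        String currentKey = null;
        long running = 0;
        for (Map.Entry<String, Long> row : sortedRows) {
            if (currentKey != null && !currentKey.equals(row.getKey())) {
                sums.put(currentKey, running); // key boundary: flush the finished group
                running = 0;
            }
            currentKey = row.getKey();
            running += row.getValue();
        }
        if (currentKey != null) {
            sums.put(currentKey, running); // flush the last group
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> rows = List.of(
            Map.entry("K1", 1L), Map.entry("K1", 2L), Map.entry("K1", 3L),
            Map.entry("K2", 10L), Map.entry("K2", 20L));
        System.out.println(sumGrouped(rows));
        System.out.println(sumSorted(rows));
    }
}
```

Both methods return the same totals for key-sorted input. The difference is that the second never holds more than one row's worth of state per key, which is the bounded-memory property discussed here. Whether it is faster is a separate question that this sketch does not address.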

> Replace Spark's groupByKey operator with something with bounded memory
> --
>
> Key: HIVE-15580
> URL: https://issues.apache.org/jira/browse/HIVE-15580
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-15656:
---
Description: 
As part of committing HIVE-15550 to master, powermock was included in the 
root pom.xml. This should not be the case. This led to build failures fixed in 
HIVE-15648.


  was:
As part of committing HIVE-15550 to master, powermock was included in the 
root pom.xml. This should not be the case. This led to build failures fixed in 
HIVE-15648.
This jira is to remove powermock from the root pom.xml.


> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.





[jira] [Updated] (HIVE-15656) Place powermock in correct dependency management section root pom.xml

2017-01-17 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-15656:
---
Summary: Place powermock in correct dependency management section root 
pom.xml  (was: Remove Powermock from root pom.xml)

> Place powermock in correct dependency management section root pom.xml
> -
>
> Key: HIVE-15656
> URL: https://issues.apache.org/jira/browse/HIVE-15656
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
>
> As part of committing HIVE-15550 to master, powermock was included in the 
> root pom.xml. This should not be the case. This led to build failures fixed 
> in HIVE-15648.
> This jira is to remove powermock from the root pom.xml.





[jira] [Updated] (HIVE-15588) Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse

2017-01-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15588:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master.  Thanks Gopal for nailing the root cause and reviewing the 
change.

> Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc 
> to prevent wrong reuse
> ---
>
> Key: HIVE-15588
> URL: https://issues.apache.org/jira/browse/HIVE-15588
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15588.01.patch, HIVE-15588.02.patch, 
> HIVE-15588.03.patch, HIVE-15588.04.patch, HIVE-15588.05.patch, 
> HIVE-15588.06.patch
>
>
> Make sure we don't deallocate a scratch column too quickly and cause result 
> corruption due to scratch column reuse.





[jira] [Updated] (HIVE-15588) Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse

2017-01-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15588:

Fix Version/s: 2.2.0

> Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc 
> to prevent wrong reuse
> ---
>
> Key: HIVE-15588
> URL: https://issues.apache.org/jira/browse/HIVE-15588
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15588.01.patch, HIVE-15588.02.patch, 
> HIVE-15588.03.patch, HIVE-15588.04.patch, HIVE-15588.05.patch, 
> HIVE-15588.06.patch
>
>
> Make sure we don't deallocate a scratch column too quickly and cause result 
> corruption due to scratch column reuse.





[jira] [Commented] (HIVE-15648) Hive throws compilation error due to $powermock.version not being present in root pom

2017-01-17 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827457#comment-15827457
 ] 

anishek commented on HIVE-15648:


New Jira @ HIVE-15656

> Hive throws compilation error due to $powermock.version not being present in 
> root pom
> -
>
> Key: HIVE-15648
> URL: https://issues.apache.org/jira/browse/HIVE-15648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-15648.1.patch
>
>
> Looks like caused by HIVE-15550





[jira] [Updated] (HIVE-15588) Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse

2017-01-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15588:

Attachment: HIVE-15588.06.patch

Fixup vector_coalesce_3.q.out golden files.

> Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc 
> to prevent wrong reuse
> ---
>
> Key: HIVE-15588
> URL: https://issues.apache.org/jira/browse/HIVE-15588
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15588.01.patch, HIVE-15588.02.patch, 
> HIVE-15588.03.patch, HIVE-15588.04.patch, HIVE-15588.05.patch, 
> HIVE-15588.06.patch
>
>
> Make sure we don't deallocate a scratch column too quickly and cause result 
> corruption due to scratch column reuse.





[jira] [Commented] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client

2017-01-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827447#comment-15827447
 ] 

Thejas M Nair commented on HIVE-15579:
--

Just a small nit: adding " user" to this error message is redundant.
I.e., can you please replace
  LOG.error("Error while setting delegation token for " + proxyUser + " user.", e);
with 
  LOG.error("Error while setting delegation token for " + proxyUser, e);

> Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
> ---
>
> Key: HIVE-15579
> URL: https://issues.apache.org/jira/browse/HIVE-15579
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Nanda kumar
> Attachments: HIVE-15579.000.patch, HIVE-15579.001.patch, 
> HIVE-15579.002.patch
>
>
> Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would 
> be useful to have similar feature for hive metastore client.





[jira] [Commented] (HIVE-15648) Hive throws compilation error due to $powermock.version not being present in root pom

2017-01-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827440#comment-15827440
 ] 

Thejas M Nair commented on HIVE-15648:
--

It turns out the inclusion of powermock in the base pom.xml was an error in 
applying the patch. My guess is that I must have done a 'patch -p1', which 
applied the beeline/pom.xml changes to the base pom.xml while it failed to apply 
the rest of the changes, and later used 'patch -p0' to apply the whole patch.
Thanks for pointing this out [~anishek]. Can you please create a new jira to 
remove it from the base pom.xml?
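
A minimal sketch of why the strip level matters, assuming the patch file listed paths relative to the repository root with no leading `a/` or `b/` component (which is what the guess above implies). The class and method names are hypothetical. It mimics only the documented effect of `patch -pN`, which drops the smallest prefix containing N slashes from each file name; real `patch` has more rules (for example around repeated slashes and missing files) that are not modeled here.

```java
final class PatchStripSketch {
    /** Returns the path that `patch -pN` would look for, under the simplifications above. */
    static String strip(String path, int n) {
        int start = 0;
        for (int i = 0; i < n; i++) {
            int slash = path.indexOf('/', start);
            if (slash < 0) {
                // Simplification: real patch does not fail this way.
                throw new IllegalArgumentException("fewer than " + n + " slashes in " + path);
            }
            start = slash + 1; // drop everything up to and including this slash
        }
        return path.substring(start);
    }

    public static void main(String[] args) {
        // -p0 keeps the path as written, so the hunk targets beeline/pom.xml.
        System.out.println(strip("beeline/pom.xml", 0));
        // -p1 drops "beeline/", so the same hunk targets pom.xml in the current directory.
        System.out.println(strip("beeline/pom.xml", 1));
    }
}
```

Run from the repository root, the `-p1` case resolves to the root pom.xml, which would explain how beeline changes ended up there.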



> Hive throws compilation error due to $powermock.version not being present in 
> root pom
> -
>
> Key: HIVE-15648
> URL: https://issues.apache.org/jira/browse/HIVE-15648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-15648.1.patch
>
>
> Looks like caused by HIVE-15550





[jira] [Comment Edited] (HIVE-15648) Hive throws compilation error due to $powermock.version not being present in root pom

2017-01-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827440#comment-15827440
 ] 

Thejas M Nair edited comment on HIVE-15648 at 1/18/17 5:18 AM:
---

It turns out the inclusion of powermock in the base pom.xml was an error in 
applying the patch. My guess is that I must have done a 'patch -p1', which 
applied the beeline/pom.xml changes to the base pom.xml while it failed to apply 
the rest of the changes, and later used 'patch -p0' to apply the whole patch.
Thanks for pointing this out [~anishek]. 
[~anishek] Can you please create a new jira to remove it from the base pom.xml?




was (Author: thejas):
It turns out the inclusion of powermock in the base pom.xml was an error in 
applying the patch. My guess is that I must have done a 'patch -p1', which 
applied the beeline/pom.xml changes to the base pom.xml while it failed to apply 
the rest of the changes, and later used 'patch -p0' to apply the whole patch.
Thanks for pointing this out [~anishek]. Can you please create a new jira to 
remove it from the base pom.xml?



> Hive throws compilation error due to $powermock.version not being present in 
> root pom
> -
>
> Key: HIVE-15648
> URL: https://issues.apache.org/jira/browse/HIVE-15648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-15648.1.patch
>
>
> Looks like caused by HIVE-15550





[jira] [Commented] (HIVE-15588) Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827434#comment-15827434
 ] 

Hive QA commented on HIVE-15588:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847683/HIVE-15588.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 373 failed/errored test(s), 10805 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_3] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_3]
 (batchId=148)

[jira] [Updated] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client

2017-01-17 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15579:
-
Status: Patch Available  (was: Open)

> Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
> ---
>
> Key: HIVE-15579
> URL: https://issues.apache.org/jira/browse/HIVE-15579
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Nanda kumar
> Attachments: HIVE-15579.000.patch, HIVE-15579.001.patch, 
> HIVE-15579.002.patch
>
>
> Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would 
> be useful to have similar feature for hive metastore client.





[jira] [Commented] (HIVE-15645) Tez session pool may restart sessions in a wrong queue

2017-01-17 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827428#comment-15827428
 ] 

Gunther Hagleitner commented on HIVE-15645:
---

+1

> Tez session pool may restart sessions in a wrong queue
> --
>
> Key: HIVE-15645
> URL: https://issues.apache.org/jira/browse/HIVE-15645
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15645.01.patch, HIVE-15645.patch
>
>






[jira] [Commented] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client

2017-01-17 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827424#comment-15827424
 ] 

Nanda kumar commented on HIVE-15579:


Thanks for the review [~thejas]. 
As per the comment, retained the exception in the log message in 
[^HIVE-15579.002.patch]

> Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
> ---
>
> Key: HIVE-15579
> URL: https://issues.apache.org/jira/browse/HIVE-15579
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Nanda kumar
> Attachments: HIVE-15579.000.patch, HIVE-15579.001.patch, 
> HIVE-15579.002.patch
>
>
> Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would 
> be useful to have similar feature for hive metastore client.





[jira] [Updated] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client

2017-01-17 Thread Nanda kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HIVE-15579:
---
Attachment: HIVE-15579.002.patch

> Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
> ---
>
> Key: HIVE-15579
> URL: https://issues.apache.org/jira/browse/HIVE-15579
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Nanda kumar
> Attachments: HIVE-15579.000.patch, HIVE-15579.001.patch, 
> HIVE-15579.002.patch
>
>
> Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would 
> be useful to have similar feature for hive metastore client.





[jira] [Commented] (HIVE-14707) ACID: Insert shuffle sort-merges on blank KEY

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827406#comment-15827406
 ] 

Hive QA commented on HIVE-14707:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847948/HIVE-14707.21.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 374 failed/errored test(s), 10821 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in]
 (batchId=149)

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-17 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827399#comment-15827399
 ] 

Ferdinand Xu commented on HIVE-15580:
-

Hi [~xuefuz], the main change is about replacing *groupByKey* with 
*repartitionAndSortWithinPartitions*. Please help me understand it better. 
Before this patch:
e.g. GroupByShuffle will lead to the following result:
K1 -> iterator of {V11,V12,V13...}
K2 -> iterator of {V21,V22,V23...}
...

With this patch:
K1 -> V11
K1 -> V12
K1 -> V13
...
K2 -> V21
...

And we process them one by one without fetching the values from an iterator. If 
so, is there any side effect of this change?


> Replace Spark's groupByKey operator with something with bounded memory
> --
>
> Key: HIVE-15580
> URL: https://issues.apache.org/jira/browse/HIVE-15580
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.patch
>
>






[jira] [Assigned] (HIVE-15655) Optimizer: Allow config option to disable n-way JOIN merging

2017-01-17 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-15655:
--

Assignee: Gopal V

> Optimizer: Allow config option to disable n-way JOIN merging 
> -
>
> Key: HIVE-15655
> URL: https://issues.apache.org/jira/browse/HIVE-15655
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
>
> N-way Joins in Tez produce bad runtime plans whenever they are left-outer 
> joins with map-joins.
> This is something which should have a safety setting.





[jira] [Commented] (HIVE-15638) ArrayIndexOutOfBoundsException when output Columns for UDTF are pruned

2017-01-17 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827388#comment-15827388
 ] 

Nemon Lou commented on HIVE-15638:
--

The following query passes (after adding a 'select *' before the UDTF hwrl):
{noformat}
set hive.auto.convert.join=false;
select substring(c.start_time,1,10) create_date,
       tt.data_id, tt.word_type, tt.primary_word, tt.primary_nature,
       tt.primary_offset, tt.related_word, tt.related_nature, tt.related_offset
from (
  select * from (
    select hwrl(data_dt,src,data_id,tag_id,entity_src,pos_tagging)
      as (data_dt,data_src,data_id,word_type,primary_word,primary_nature,
          primary_offset,related_word,related_nature,related_offset)
    from (
      select a.data_dt,a.src,a.data_id,a.tag_id,a.entity_src,b.pos_tagging
      from tb_a a, tb_b b
      where a.key like 'CP%'
        and a.data_dt='20160901'
        and a.data_id=b.data_id
        and b.src='04'
    ) t
  ) ttt
) tt, (select key,start_time from tb_c where data_dt='20160901') c
where tt.data_id=c.key
;
{noformat}

> ArrayIndexOutOfBoundsException when output Columns for UDTF are pruned 
> ---
>
> Key: HIVE-15638
> URL: https://issues.apache.org/jira/browse/HIVE-15638
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Nemon Lou
>
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 151
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:364)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:200)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:186)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:525)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:494)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:180)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:174)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:499)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:160)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException: 151
>   at 
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:416)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:878)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:149)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
>   ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 151
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:183)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:202)
>   at 
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>   at 
> 

[jira] [Updated] (HIVE-15654) TezJobMonitor is not writing some headers to log file

2017-01-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15654:
-
Status: Patch Available  (was: Open)

> TezJobMonitor is not writing some headers to log file
> -
>
> Key: HIVE-15654
> URL: https://issues.apache.org/jira/browse/HIVE-15654
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15654.1.patch
>
>
> Some headers printed by TezJobMonitor go only to the console and not to the
> log files. This results in empty headers when HS2 operation logging is enabled.
> Something like:
> {code}
> INFO  : LLAP IO Summary
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 13010  0  131.78KB 0B
>   0B   0B 0.06s
> INFO  : 
> --
> INFO  : 
> INFO  : FileSystem Counters Summary
> INFO  : 
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 1320B 0  0   
>   1.84KB 0
> INFO  :  Reducer 2  0B 0  0   
>   0B 0
> INFO  : 
> --
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827378#comment-15827378
 ] 

Xuefu Zhang commented on HIVE-15527:


Hi [~Ferd], [~dapengsun], We found a better fix, and the patch here is likely 
abandoned. The new fix is in HIVE-15580 and the patch there addresses both 
issues. Please feel free to try the patch there and provide your feedback. 
Thanks.

> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values
> that have the same key in the shuffled result produced by the Spark
> transformation sortByKey. A large key group can therefore exhaust memory.
> {code}
> @Override
> public Tuple2 next() {
>   // TODO: implement this by accumulating rows with the same key into a list.
>   // Note that this list needs to be improved to prevent excessive memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
>     Tuple2 pair = it.next();
>     if (curKey != null && !curKey.equals(pair._1())) {
>       HiveKey key = curKey;
>       List values = curValues;
>       curKey = pair._1();
>       curValues = new ArrayList();
>       curValues.add(pair._2());
>       return new Tuple2(key, values);
>     }
>     curKey = pair._1();
>     curValues.add(pair._2());
>   }
>   if (curKey == null) {
>     throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   return new Tuple2(key, curValues);
> }
> {code}
> Since the output from sortByKey is already sorted on key, it's possible to
> back the value iterable with the same input iterator.
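
A minimal sketch of the idea in the last sentence of the description, under stated assumptions: it is not the patch attached to this issue or to HIVE-15580. The names SortedGroupIterator and SortedGroupDemo are hypothetical, java.util.Map.Entry stands in for Tuple2, and keys are assumed to be non-null and already sorted. Each group's values are pulled lazily from the shared input iterator, so no per-key list is built.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Groups key-sorted (key, value) pairs; each group's values are read lazily from the input. */
final class SortedGroupIterator<K, V> implements Iterator<Map.Entry<K, Iterator<V>>> {
  private final Iterator<Map.Entry<K, V>> input;
  private Map.Entry<K, V> lookahead; // next unread pair, or null at end of input
  private Iterator<V> current;       // values of the group handed out most recently

  SortedGroupIterator(Iterator<Map.Entry<K, V>> input) {
    this.input = input;
    advance();
  }

  private void advance() {
    lookahead = input.hasNext() ? input.next() : null;
  }

  @Override
  public boolean hasNext() {
    // Skip values the caller left unread so the next group starts at a new key.
    while (current != null && current.hasNext()) {
      current.next();
    }
    return lookahead != null;
  }

  @Override
  public Map.Entry<K, Iterator<V>> next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    final K key = lookahead.getKey(); // keys are assumed to be non-null
    current = new Iterator<V>() {
      @Override
      public boolean hasNext() {
        // The group ends when the input ends or the key changes.
        return lookahead != null && key.equals(lookahead.getKey());
      }

      @Override
      public V next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        V value = lookahead.getValue();
        advance();
        return value;
      }
    };
    return new SimpleEntry<>(key, current);
  }
}

final class SortedGroupDemo {
  /** Renders each group as "key=[values]" for a small, already-sorted sample. */
  static List<String> demo() {
    List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
    pairs.add(new SimpleEntry<>("a", 1));
    pairs.add(new SimpleEntry<>("a", 2));
    pairs.add(new SimpleEntry<>("b", 3));

    List<String> out = new ArrayList<>();
    Iterator<Map.Entry<String, Iterator<Integer>>> groups =
        new SortedGroupIterator<>(pairs.iterator());
    while (groups.hasNext()) {
      Map.Entry<String, Iterator<Integer>> group = groups.next();
      List<Integer> values = new ArrayList<>();
      group.getValue().forEachRemaining(values::add);
      out.add(group.getKey() + "=" + values);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The trade-off in this sketch is that each values iterator is single-pass and is only valid until the outer iterator advances, which may or may not suit the consumers of the real shuffler.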





[jira] [Comment Edited] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827376#comment-15827376
 ] 

Dapeng Sun edited comment on HIVE-15527 at 1/18/17 3:32 AM:


Thanks [~csun] and [~Ferd], here is the detailed log:
{noformat}
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.
{noformat}


was (Author: dapengsun):
Thank [~csun] and [~Ferd], here is the detail log:
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.


> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values
> that have the same key in the shuffled result produced by the Spark
> transformation sortByKey. A large key group can therefore exhaust memory.
> {code}
> @Override
> public Tuple2 next() {
>   // TODO: implement this by accumulating rows with the same key into a list.
>   // Note that this list needs to be improved to prevent excessive memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
>     Tuple2 pair = it.next();
>     if (curKey != null && !curKey.equals(pair._1())) {
>       HiveKey key = curKey;
>       List values = curValues;
>       curKey = pair._1();
>       curValues = new ArrayList();
>       curValues.add(pair._2());
>       return new Tuple2(key, values);
>     }
>     curKey = pair._1();
>     curValues.add(pair._2());
>   }
>   if (curKey == null) {
>     throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   

[jira] [Commented] (HIVE-15623) Use customized version of netty for llap

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827377#comment-15827377
 ] 

Hive QA commented on HIVE-15623:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847397/HIVE-15623.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 370 failed/errored test(s), 10819 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.common.TestBlobStorageUtils.testValidAndInvalidFileSystems
 (batchId=240)
org.apache.hadoop.hive.io.TestHadoopFileStatus.testHadoopFileStatusAclEntries 
(batchId=190)

[jira] [Commented] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827376#comment-15827376
 ] 

Dapeng Sun commented on HIVE-15527:
---

Thanks [~csun] and [~Ferd], here is the detailed log:
17/01/17 xx:xx:xx INFO client.RemoteDriver: Failed to run job 

java.lang.NumberFormatException: null
at java.lang.Long.parseLong(Long.java:552)
at java.lang.Long.parseLong(Long.java:631)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:202)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:141)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:109)
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:335)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/01/17 xx:xx:xx INFO client.RemoteDriver: Shutting down remote driver.


> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values
> that have the same key in the shuffled result produced by the Spark
> transformation sortByKey. A large key group can therefore exhaust memory.
> {code}
> @Override
> public Tuple2 next() {
>   // TODO: implement this by accumulating rows with the same key into a list.
>   // Note that this list needs to be improved to prevent excessive memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
>     Tuple2 pair = it.next();
>     if (curKey != null && !curKey.equals(pair._1())) {
>       HiveKey key = curKey;
>       List values = curValues;
>       curKey = pair._1();
>       curValues = new ArrayList();
>       curValues.add(pair._2());
>       return new Tuple2(key, values);
>     }
>     curKey = pair._1();
>     curValues.add(pair._2());
>   }
>   if (curKey == null) {
>     throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   return new Tuple2(key, curValues);
> }
> {code}
> Since the output from sortByKey is already sorted on key, it's possible to
> back the value iterable with the same input iterator.





[jira] [Commented] (HIVE-15527) Memory usage is unbound in SortByShuffler for Spark

2017-01-17 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827369#comment-15827369
 ] 

Ferdinand Xu commented on HIVE-15527:
-

Hi [~csun], thank you for working on this issue. We can reproduce the OOM caused
by the ArrayList when the data grows to 100TB and is skewed. We're trying your
patch to see whether it works. BTW, there is a minor issue in your patch:
if SPARK_SHUFFLE_BUFFER_SIZE is not set, the default long value will be passed
to the parseLong(String) method.
{noformat}
long maxBufferSize = 
Long.parseLong(this.jobConf.get(HiveConf.ConfVars.SPARK_SHUFFLE_BUFFER_SIZE.varname));
{noformat}
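
For illustration only, one way to guard a lookup like the one above is to fall back to a default before parsing, since Long.parseLong throws NumberFormatException for a null or non-numeric string. This is a sketch under assumptions, not the fix that went into the patch: a plain Map stands in for the job configuration, and the key name and default value are hypothetical. Hadoop's Configuration also offers getLong(name, defaultValue), which may be the simpler route in the real code.

```java
import java.util.HashMap;
import java.util.Map;

final class BufferSizeConfig {
  // Hypothetical key and default, used only for this illustration.
  static final String KEY = "example.shuffle.buffer.size";
  static final long DEFAULT_BUFFER_SIZE = 1024L;

  /** Returns the configured size, or the default when the key is unset or blank. */
  static long maxBufferSize(Map<String, String> conf) {
    String raw = conf.get(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_BUFFER_SIZE;
    }
    // A non-numeric value still fails here, which surfaces a misconfiguration early.
    return Long.parseLong(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(maxBufferSize(conf)); // key unset: default is used
    conf.put(KEY, "4096");
    System.out.println(maxBufferSize(conf)); // key set: parsed value is used
  }
}
```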


> Memory usage is unbound in SortByShuffler for Spark
> ---
>
> Key: HIVE-15527
> URL: https://issues.apache.org/jira/browse/HIVE-15527
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Chao Sun
> Attachments: HIVE-15527.0.patch, HIVE-15527.0.patch, 
> HIVE-15527.1.patch, HIVE-15527.2.patch, HIVE-15527.3.patch, 
> HIVE-15527.4.patch, HIVE-15527.5.patch, HIVE-15527.6.patch, 
> HIVE-15527.7.patch, HIVE-15527.8.patch, HIVE-15527.patch
>
>
> In SortByShuffler.java, an ArrayList is used to back the iterator for values
> that have the same key in the shuffled result produced by the Spark
> transformation sortByKey. A large key group can therefore exhaust memory.
> {code}
> @Override
> public Tuple2 next() {
>   // TODO: implement this by accumulating rows with the same key into a list.
>   // Note that this list needs to be improved to prevent excessive memory usage, but this
>   // can be done in a later phase.
>   while (it.hasNext()) {
>     Tuple2 pair = it.next();
>     if (curKey != null && !curKey.equals(pair._1())) {
>       HiveKey key = curKey;
>       List values = curValues;
>       curKey = pair._1();
>       curValues = new ArrayList();
>       curValues.add(pair._2());
>       return new Tuple2(key, values);
>     }
>     curKey = pair._1();
>     curValues.add(pair._2());
>   }
>   if (curKey == null) {
>     throw new NoSuchElementException();
>   }
>   // if we get here, this should be the last element we have
>   HiveKey key = curKey;
>   curKey = null;
>   return new Tuple2(key, curValues);
> }
> {code}
> Since the output from sortByKey is already sorted on key, it's possible to
> back the value iterable with the same input iterator.





[jira] [Commented] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel

2017-01-17 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827365#comment-15827365
 ] 

Sahil Takiar commented on HIVE-15546:
-

Thanks [~sershe] I'll take a look into those classes.

[~spena] can you take a look at this patch?

> Optimize Utilities.getInputPaths() so each listStatus of a partition is done 
> in parallel
> 
>
> Key: HIVE-15546
> URL: https://issues.apache.org/jira/browse/HIVE-15546
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15546.1.patch, HIVE-15546.2.patch, 
> HIVE-15546.3.patch, HIVE-15546.4.patch
>
>
> When running on blobstores (like S3) where metadata operations (like 
> listStatus) are costly, Utilities.getInputPaths() can add significant 
> overhead when setting up the input paths for an MR / Spark / Tez job.
> The method performs a listStatus on all input paths in order to check if the 
> path is empty. If the path is empty, a dummy file is created for the given 
> partition. This is all done sequentially. This can be really slow when there 
> are a lot of empty partitions. Even when all partitions have input data, this 
> can take a long time.
> We should either:
> (1) Just remove the logic to check if each input path is empty, and handle 
> any edge cases accordingly.
> (2) Multi-thread the listStatus calls
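
Option (2) above can be sketched roughly as follows. This is an assumption-laden illustration, not the attached patch: java.nio.file stands in for the Hadoop FileSystem listStatus call, the class and method names (ParallelEmptyCheck, findEmpty, isEmptyDir) are hypothetical, and the thread count is arbitrary. Each directory is checked in its own task so that slow metadata calls overlap instead of running one after another.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelEmptyCheck {
  /** Stand-in for a listStatus-based emptiness check on one input path. */
  static boolean isEmptyDir(Path dir) throws IOException {
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      return !entries.iterator().hasNext();
    }
  }

  /** Checks all directories concurrently; results keep the input order. */
  static Map<Path, Boolean> findEmpty(List<Path> dirs, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      Map<Path, Future<Boolean>> pending = new LinkedHashMap<>();
      for (Path dir : dirs) {
        Callable<Boolean> task = () -> isEmptyDir(dir);
        pending.put(dir, pool.submit(task));
      }
      Map<Path, Boolean> result = new LinkedHashMap<>();
      for (Map.Entry<Path, Future<Boolean>> e : pending.entrySet()) {
        result.put(e.getKey(), e.getValue().get()); // blocks until that check finishes
      }
      return result;
    } finally {
      pool.shutdown();
    }
  }

  /** Builds one empty and one non-empty temp directory and checks both. */
  static List<Boolean> demo() {
    try {
      Path root = Files.createTempDirectory("parallel-empty-check");
      Path empty = Files.createDirectory(root.resolve("empty"));
      Path full = Files.createDirectory(root.resolve("full"));
      Files.createFile(full.resolve("part-0"));
      List<Path> dirs = new ArrayList<>();
      dirs.add(empty);
      dirs.add(full);
      // Temp files are left in place; a real caller would clean them up.
      return new ArrayList<>(findEmpty(dirs, 2).values());
    } catch (IOException | InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Collecting the futures in submission order keeps the result deterministic, which matters if dummy files are then created per empty partition.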





[jira] [Updated] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel

2017-01-17 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-15546:

Attachment: HIVE-15546.4.patch

> Optimize Utilities.getInputPaths() so each listStatus of a partition is done 
> in parallel
> 
>
> Key: HIVE-15546
> URL: https://issues.apache.org/jira/browse/HIVE-15546
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15546.1.patch, HIVE-15546.2.patch, 
> HIVE-15546.3.patch, HIVE-15546.4.patch
>
>
> When running on blobstores (like S3) where metadata operations (like 
> listStatus) are costly, Utilities.getInputPaths() can add significant 
> overhead when setting up the input paths for an MR / Spark / Tez job.
> The method performs a listStatus on all input paths in order to check if the 
> path is empty. If the path is empty, a dummy file is created for the given 
> partition. This is all done sequentially. This can be really slow when there 
> are a lot of empty partitions. Even when all partitions have input data, this 
> can take a long time.
> We should either:
> (1) Just remove the logic to check if each input path is empty, and handle 
> any edge cases accordingly.
> (2) Multi-thread the listStatus calls





[jira] [Commented] (HIVE-15654) TezJobMonitor is not writing some headers to log file

2017-01-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827359#comment-15827359
 ] 

Prasanth Jayachandran commented on HIVE-15654:
--

logInfo writes only to the log file, whereas printInfo writes to both the
console and the log file. The headers are written only to the console, inside
the reprintLineWithColorAsBold method.
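
To make the distinction concrete, here is a toy model of the three write paths described above. It is only a sketch: MonitorOutput and consoleOnly are hypothetical names, in-memory buffers stand in for the real console and log file, and none of this is the TezJobMonitor code itself.

```java
final class MonitorOutput {
  private final StringBuilder console = new StringBuilder();
  private final StringBuilder logFile = new StringBuilder();

  /** Log-file-only write, modelling the described logInfo behaviour. */
  void logInfo(String line) {
    logFile.append(line).append('\n');
  }

  /** Writes to the console and the log file, modelling printInfo. */
  void printInfo(String line) {
    console.append(line).append('\n');
    logInfo(line);
  }

  /** Console-only write, standing in for the bold-reprint path. */
  void consoleOnly(String line) {
    console.append(line).append('\n');
  }

  String console() {
    return console.toString();
  }

  String logFile() {
    return logFile.toString();
  }

  public static void main(String[] args) {
    MonitorOutput out = new MonitorOutput();
    out.consoleOnly("HEADER"); // never reaches the log file
    out.printInfo("row");      // reaches both sinks
    System.out.print("console:\n" + out.console());
    System.out.print("log file:\n" + out.logFile());
  }
}
```

In this model the header is missing from the log file for the same reason as in the report: it only ever travels the console-only path.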

> TezJobMonitor is not writing some headers to log file
> -
>
> Key: HIVE-15654
> URL: https://issues.apache.org/jira/browse/HIVE-15654
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15654.1.patch
>
>
> Some headers printed by TezJobMonitor go only to the console and not to the
> log files. This results in empty headers when HS2 operation logging is enabled.
> Something like:
> {code}
> INFO  : LLAP IO Summary
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 13010  0  131.78KB 0B
>   0B   0B 0.06s
> INFO  : 
> --
> INFO  : 
> INFO  : FileSystem Counters Summary
> INFO  : 
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 1320B 0  0   
>   1.84KB 0
> INFO  :  Reducer 2  0B 0  0   
>   0B 0
> INFO  : 
> --
> {code}





[jira] [Commented] (HIVE-15654) TezJobMonitor is not writing some headers to log file

2017-01-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827355#comment-15827355
 ] 

Sergey Shelukhin commented on HIVE-15654:
-

+1; what's the difference between logInfo and printInfo?

> TezJobMonitor is not writing some headers to log file
> -
>
> Key: HIVE-15654
> URL: https://issues.apache.org/jira/browse/HIVE-15654
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15654.1.patch
>
>
> Some headers printed by TezJobMonitor go only to the console and not to the
> log files. This results in empty headers when HS2 operation logging is enabled.
> Something like:
> {code}
> INFO  : LLAP IO Summary
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 13010  0  131.78KB 0B
>   0B   0B 0.06s
> INFO  : 
> --
> INFO  : 
> INFO  : FileSystem Counters Summary
> INFO  : 
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 1320B 0  0   
>   1.84KB 0
> INFO  :  Reducer 2  0B 0  0   
>   0B 0
> INFO  : 
> --
> {code}





[jira] [Updated] (HIVE-15654) TezJobMonitor is not writing some headers to log file

2017-01-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15654:
-
Attachment: HIVE-15654.1.patch

> TezJobMonitor is not writing some headers to log file
> -
>
> Key: HIVE-15654
> URL: https://issues.apache.org/jira/browse/HIVE-15654
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15654.1.patch
>
>
> Some headers printed by TezJobMonitor go only to the console and not to the
> log files. This results in empty headers when HS2 operation logging is enabled.
> Something like:
> {code}
> INFO  : LLAP IO Summary
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 13010  0  131.78KB 0B
>   0B   0B 0.06s
> INFO  : 
> --
> INFO  : 
> INFO  : FileSystem Counters Summary
> INFO  : 
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 1320B 0  0   
>   1.84KB 0
> INFO  :  Reducer 2  0B 0  0   
>   0B 0
> INFO  : 
> --
> {code}





[jira] [Updated] (HIVE-15654) TezJobMonitor is not writing some headers to log file

2017-01-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15654:
-
Summary: TezJobMonitor is not writing some headers to log file  (was: 
TezJobMonitor is not some headers to log file)

> TezJobMonitor is not writing some headers to log file
> -
>
> Key: HIVE-15654
> URL: https://issues.apache.org/jira/browse/HIVE-15654
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
>
> Some headers printed by TezJobMonitor go only to the console and not to the
> log files. This results in empty headers when HS2 operation logging is enabled.
> Something like:
> {code}
> INFO  : LLAP IO Summary
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 13010  0  131.78KB 0B
>   0B   0B 0.06s
> INFO  : 
> --
> INFO  : 
> INFO  : FileSystem Counters Summary
> INFO  : 
> INFO  : 
> --
> INFO  : 
> --
> INFO  :  Map 1320B 0  0   
>   1.84KB 0
> INFO  :  Reducer 2  0B 0  0   
>   0B 0
> INFO  : 
> --
> {code}





[jira] [Updated] (HIVE-15645) Tez session pool may restart sessions in a wrong queue

2017-01-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15645:

Attachment: HIVE-15645.01.patch

Updated the logging in the patch.

> Tez session pool may restart sessions in a wrong queue
> --
>
> Key: HIVE-15645
> URL: https://issues.apache.org/jira/browse/HIVE-15645
> Project: Hive
>  Issue Type: Bug
>Reporter: Carter Shanklin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15645.01.patch, HIVE-15645.patch
>
>






[jira] [Commented] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827337#comment-15827337
 ] 

Hive QA commented on HIVE-15649:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847940/HIVE-15649.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 372 failed/errored test(s), 10821 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_stats] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=139)

[jira] [Updated] (HIVE-15591) Hive can not use "," in quoted column name

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15591:
---
Attachment: HIVE-15591.01.patch

> Hive can not use "," in quoted column name
> --
>
> Key: HIVE-15591
> URL: https://issues.apache.org/jira/browse/HIVE-15591
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15591.01.patch
>
>
> As reported by [~cartershanklin]
> hive> create table test (`x,y` int);
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: 
> MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements 
> while columns.types has 1 elements!)
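
The error text ("columns has 2 elements while columns.types has 1 elements") is consistent with column names being stored as one comma-separated string and split again on read. The following is only a minimal sketch of that failure mode, under that assumption; the class and method names are hypothetical and this is not Hive's actual serialization code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not taken from the Hive code base.
final class ColumnListSketch {
    // Joins column names into a single comma-separated property value.
    static String join(List<String> names) {
        return String.join(",", names);
    }

    // Naive parsing: splits on every comma, including commas inside a name.
    static List<String> naiveSplit(String joined) {
        return Arrays.asList(joined.split(","));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("x,y"); // one column, as in `x,y` int
        List<String> types = Arrays.asList("int"); // one type
        List<String> parsed = naiveSplit(join(names));
        // The name list now has more entries than the type list.
        System.out.println("columns=" + parsed.size() + " types=" + types.size());
    }
}
```

Any real fix would need a delimiter or escaping scheme that cannot collide with characters allowed in quoted identifiers; which one the attached patch uses is not stated in this message.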



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15591) Hive can not use "," in quoted column name

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15591:
---
Status: Patch Available  (was: Open)

> Hive can not use "," in quoted column name
> --
>
> Key: HIVE-15591
> URL: https://issues.apache.org/jira/browse/HIVE-15591
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15591.01.patch
>
>
> As reported by [~cartershanklin]
> hive> create table test (`x,y` int);
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: 
> MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements 
> while columns.types has 1 elements!)





[jira] [Commented] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-17 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827334#comment-15827334
 ] 

Alexander Behm commented on HIVE-15653:
---

Note that this problem seems to be specific to unpartitioned tables.
Partitioned tables work ok as far as I can tell.

> Some ALTER TABLE commands drop table stats
> --
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Alexander Behm
>Priority: Critical
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some 
> ALTER TABLE operations, but certainly not for others. Personally, I think 
> ALTER TABLE should only change what was requested by the user without any 
> side effects that may be unclear to users. In particular, collecting stats 
> can be an expensive operation so it's rather inconvenient for users if they 
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_name              data_type               comment
>
> i                       int
>
> # Detailed Table Information
> Database:               default
> Owner:                  abehm
> CreateTime:             Tue Jan 17 18:13:34 PST 2017
> LastAccessTime:         UNKNOWN
> Protect Mode:           None
> Retention:              0
> Location:               hdfs://localhost:20500/test-warehouse/t
> Table Type:             MANAGED_TABLE
> Table Parameters:
>   COLUMN_STATS_ACCURATE   false
>   last_modified_by        abehm
>   last_modified_time      1484705748
>   numFiles                1
>   numRows                 -1
>   rawDataSize             -1
>   test                    test
>   totalSize               2
>   transient_lastDdlTime   1484705748
>
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:            org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:             No
> Num Buckets:            -1
> Bucket Columns:         []
> Sort Columns:           []
> Storage Desc Params:
>   serialization.format    1
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.





[jira] [Updated] (HIVE-15519) BitSet not computed properly for ColumnBuffer subset

2017-01-17 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15519:
--
Attachment: HIVE-15519.6.patch

> BitSet not computed properly for ColumnBuffer subset
> 
>
> Key: HIVE-15519
> URL: https://issues.apache.org/jira/browse/HIVE-15519
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, JDBC
>Reporter: Bharat Viswanadham
>Assignee: Rui Li
>Priority: Critical
> Attachments: data_type_test(1).txt, HIVE-15519.1.patch, 
> HIVE-15519.2.patch, HIVE-15519.3.patch, HIVE-15519.4.patch, 
> HIVE-15519.5-branch-1.patch, HIVE-15519.6.patch
>
>
> Hive decimal type column precision is returned as zero, even though the 
> column has precision set.
> Example: for col67 decimal(18,2), the scale is returned as zero.
> Tried with the program below.
> {code}
> // Requires java.sql.Connection, DatabaseMetaData, DriverManager, ResultSet.
> try {
>   System.out.println("Opening connection");
>   Class.forName("org.apache.hive.jdbc.HiveDriver");
>   Connection con =
>       DriverManager.getConnection("jdbc:hive2://x.x.x.x:1/default");
>   DatabaseMetaData dbMeta = con.getMetaData();
>   ResultSet rs = dbMeta.getColumns(null, "DEFAULT", "data_type_test", null);
>   while (rs.next()) {
>     String name = rs.getString("COLUMN_NAME");
>     // Print size and scale only for the columns of interest.
>     if (name.equalsIgnoreCase("col48") || name.equalsIgnoreCase("col67")
>         || name.equalsIgnoreCase("col68") || name.equalsIgnoreCase("col122")) {
>       System.out.println(name + "\t" + rs.getString("COLUMN_SIZE") + "\t"
>           + rs.getInt("DECIMAL_DIGITS"));
>     }
>   }
>   rs.close();
>   con.close();
> } catch (Exception e) {
>   e.printStackTrace();
> }
> {code}
> Default fetch size is 50. If a decimal column's number is under 50, the 
> precision is returned properly; when the column number is greater than 50, 
> the scale is returned as zero.
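
The issue title points at a BitSet computed for a subset of a ColumnBuffer, and the report says results are correct within the first fetch batch of 50 and wrong beyond it. As a hedged illustration only, assuming a per-column BitSet that marks null rows (an assumption, not something this message confirms), slicing such a BitSet for a later batch has to re-base the indices to the start of the slice. The class and method names below are hypothetical.

```java
import java.util.BitSet;

// Hypothetical sketch; not the actual ColumnBuffer implementation.
final class NullBitSliceSketch {
    // Returns the null markers for rows [start, end), re-based so that
    // bit 0 of the result corresponds to row `start` of the source.
    static BitSet sliceNulls(BitSet nulls, int start, int end) {
        return nulls.get(start, end);
    }

    public static void main(String[] args) {
        BitSet nulls = new BitSet();
        nulls.set(66); // row 66 is marked null in the full buffer
        BitSet slice = sliceNulls(nulls, 50, 100);
        // Row 66 of the source is index 16 of the slice.
        System.out.println(slice.get(16));
    }
}
```

`BitSet.get(int, int)` copies the requested range into a new BitSet starting at index 0, which is why the lookup uses 16 rather than 66. A slice that kept the source indexing, or always copied from offset 0, would only behave correctly for the first batch.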





[jira] [Updated] (HIVE-15519) BitSet not computed properly for ColumnBuffer subset

2017-01-17 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-15519:
--
Attachment: (was: HIVE-15519.6.patch)

> BitSet not computed properly for ColumnBuffer subset
> 
>
> Key: HIVE-15519
> URL: https://issues.apache.org/jira/browse/HIVE-15519
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, JDBC
>Reporter: Bharat Viswanadham
>Assignee: Rui Li
>Priority: Critical
> Attachments: data_type_test(1).txt, HIVE-15519.1.patch, 
> HIVE-15519.2.patch, HIVE-15519.3.patch, HIVE-15519.4.patch, 
> HIVE-15519.5-branch-1.patch, HIVE-15519.6.patch
>
>
> Hive decimal type column precision is returned as zero, even though the 
> column has precision set.
> Example: for col67 decimal(18,2), the scale is returned as zero.
> Tried with the program below.
> {code}
> // Requires java.sql.Connection, DatabaseMetaData, DriverManager, ResultSet.
> try {
>   System.out.println("Opening connection");
>   Class.forName("org.apache.hive.jdbc.HiveDriver");
>   Connection con =
>       DriverManager.getConnection("jdbc:hive2://x.x.x.x:1/default");
>   DatabaseMetaData dbMeta = con.getMetaData();
>   ResultSet rs = dbMeta.getColumns(null, "DEFAULT", "data_type_test", null);
>   while (rs.next()) {
>     String name = rs.getString("COLUMN_NAME");
>     // Print size and scale only for the columns of interest.
>     if (name.equalsIgnoreCase("col48") || name.equalsIgnoreCase("col67")
>         || name.equalsIgnoreCase("col68") || name.equalsIgnoreCase("col122")) {
>       System.out.println(name + "\t" + rs.getString("COLUMN_SIZE") + "\t"
>           + rs.getInt("DECIMAL_DIGITS"));
>     }
>   }
>   rs.close();
>   con.close();
> } catch (Exception e) {
>   e.printStackTrace();
> }
> {code}
> Default fetch size is 50. If a decimal column's number is under 50, the 
> precision is returned properly; when the column number is greater than 50, 
> the scale is returned as zero.





[jira] [Commented] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827226#comment-15827226
 ] 

Hive QA commented on HIVE-15621:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847941/HIVE-15621.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 372 failed/errored test(s), 10819 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=151)

[jira] [Commented] (HIVE-15626) beeline exits on ctrl-c instead of canceling the query

2017-01-17 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827218#comment-15827218
 ] 

Vihang Karajgaonkar commented on HIVE-15626:


Hi [~sershe], can you give an example? I am seeing this happen on the master 
branch as well. Is this issue specific to 1.2.1, or does it affect master too? 
Do you know of a branch where this does not happen?

> beeline exits on ctrl-c instead of canceling the query
> --
>
> Key: HIVE-15626
> URL: https://issues.apache.org/jira/browse/HIVE-15626
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergey Shelukhin
>Assignee: Vihang Karajgaonkar
>
> I am seeing this in 1.2





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Open  (was: Patch Available)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Attachment: (was: HIVE-15160.01.patch)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Attachment: HIVE-15160.01.patch

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Patch Available  (was: Open)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Status: Patch Available  (was: Open)

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Attachment: (was: HIVE-15646.01.patch)

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Status: Open  (was: Patch Available)

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Attachment: HIVE-15646.01.patch

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Updated] (HIVE-15531) Hive breaks Hadoop commons logging with log4j2

2017-01-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15531:
-
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing this for now, as Hive enforces the classpath ordering, which avoids the 
issue. Will revisit later if we run into problems. 

> Hive breaks Hadoop commons logging with log4j2
> --
>
> Key: HIVE-15531
> URL: https://issues.apache.org/jira/browse/HIVE-15531
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Dhiraj Kumar
>Assignee: Dhiraj Kumar
>Priority: Minor
> Attachments: HIVE-15531.patch
>
>
> Hadoop (2.7), which uses commons-logging, is not compatible with log4j2 
> without a bridge. 
> The bridge is missing in Hive. 
> This leads to a problem whereby commons-logging initialises a log4j (1.2) 
> Logger, does not configure it properly because its configuration is 
> missing, and sends logging output to stdout (the default). 





[jira] [Updated] (HIVE-15652) Optimize(reduce) the number of alter calls made to fix repl.last.id

2017-01-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15652:

Status: Patch Available  (was: Open)

> Optimize(reduce) the number of alter calls made to fix repl.last.id
> ---
>
> Key: HIVE-15652
> URL: https://issues.apache.org/jira/browse/HIVE-15652
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15652.patch
>
>
> Per code review from HIVE-15534, we might be doing alters to parent objects 
> to set repl.last.id when it is not necessary, since some future event might 
> make this alter redundant.
> There are 3 cases where this might happen:
> a) After a CREATE_TABLE event - any prior reference to that table does not 
> need an ALTER, since CREATE_TABLE will have a repl.last.id come with it.
> b) After a DROP_TABLE event - any prior reference to that table is 
> irrelevant, and thus, no alter is needed.
> c) After an ALTER_TABLE event, since that dump will itself do a metadata 
> update that will get the latest repl.last.id along with this event.
> In each of these cases, we can remove the alter call needed.
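
The three cases above can be sketched as a single backward pass over the event list. This is only an illustration of the pruning idea as described; the event model, the class and method names, and the placeholder type OTHER_EVENT are assumptions and are not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; only the three event names come from the description.
final class ReplAlterPruningSketch {
    static final class Event {
        final String type;
        final String table;
        Event(String type, String table) { this.type = type; this.table = table; }
    }

    // Event types that, per the description, make an extra alter unnecessary.
    static boolean supersedes(String type) {
        return type.equals("CREATE_TABLE")
            || type.equals("DROP_TABLE")
            || type.equals("ALTER_TABLE");
    }

    // Returns the indices of events that still need an explicit alter
    // to set repl.last.id on the table they touch.
    static List<Integer> eventsNeedingAlter(List<Event> events) {
        Set<String> superseded = new HashSet<>();
        List<Integer> result = new ArrayList<>();
        // Walk backwards so each event sees what happens to its table later.
        for (int i = events.size() - 1; i >= 0; i--) {
            Event e = events.get(i);
            if (supersedes(e.type)) {
                superseded.add(e.table);
            } else if (!superseded.contains(e.table)) {
                result.add(0, i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("OTHER_EVENT", "t1"),
            new Event("OTHER_EVENT", "t2"),
            new Event("ALTER_TABLE", "t1"),
            new Event("OTHER_EVENT", "t2"));
        System.out.println(eventsNeedingAlter(events));
    }
}
```

In this sketch the event on t1 at index 0 is skipped because a later ALTER_TABLE on t1 covers it, while both events on t2 still report that they need an alter, since the description lists no rule for collapsing those.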





[jira] [Updated] (HIVE-15652) Optimize(reduce) the number of alter calls made to fix repl.last.id

2017-01-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15652:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-14841

> Optimize(reduce) the number of alter calls made to fix repl.last.id
> ---
>
> Key: HIVE-15652
> URL: https://issues.apache.org/jira/browse/HIVE-15652
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15652.patch
>
>
> Per code review from HIVE-15534, we might be doing alters to parent objects 
> to set repl.last.id when it is not necessary, since some future event might 
> make this alter redundant.
> There are 3 cases where this might happen:
> a) After a CREATE_TABLE event - any prior reference to that table does not 
> need an ALTER, since CREATE_TABLE will have a repl.last.id come with it.
> b) After a DROP_TABLE event - any prior reference to that table is 
> irrelevant, and thus, no alter is needed.
> c) After an ALTER_TABLE event, since that dump will itself do a metadata 
> update that will get the latest repl.last.id along with this event.
> In each of these cases, we can remove the alter call needed.





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Patch Available  (was: Open)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Updated] (HIVE-15160) Can't order by an unselected column

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15160:
---
Status: Open  (was: Patch Available)

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>,i_category
>,i_class
>,i_current_price
>,sum(cs_ext_sales_price) as itemrevenue
>,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>(partition by i_class) as revenueratio
>  from catalog_sales
>  ,item
>  ,date_dim
>  where cs_item_sk = i_item_sk
>and i_category in ('Jewelry', 'Sports', 'Books')
>and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>  ,i_item_desc
>  ,i_category
>  ,i_class
>  ,i_current_price
>  order by i_category
>  ,i_class
>  ,i_item_id
>  ,i_item_desc
>  ,revenueratio
> limit 100;
> {code}





[jira] [Commented] (HIVE-15578) Simplify IdentifiersParser

2017-01-17 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827160#comment-15827160
 ] 

Pengcheng Xiong commented on HIVE-15578:


[~ashutoshc], could you please review? The failed tests are due to legitimate 
golden file updates.

> Simplify IdentifiersParser
> --
>
> Key: HIVE-15578
> URL: https://issues.apache.org/jira/browse/HIVE-15578
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15578.01.patch, HIVE-15578.02.patch
>
>
> before: 1.72M LOC in IdentifiersParser, after: 1.41M





[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Status: Open  (was: Patch Available)

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Updated] (HIVE-15646) Column level lineage is not available for table Views

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15646:
---
Status: Patch Available  (was: Open)

> Column level lineage is not available for table Views
> -
>
> Key: HIVE-15646
> URL: https://issues.apache.org/jira/browse/HIVE-15646
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15646.01.patch
>
>






[jira] [Resolved] (HIVE-10487) remove non-ISO restriction that projections in a union have identical column names

2017-01-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-10487.

Resolution: Not A Problem

Closed as not a problem. Please feel free to reopen if it still exists. Thanks.

> remove non-ISO restriction that projections in a union have identical column 
> names
> --
>
> Key: HIVE-10487
> URL: https://issues.apache.org/jira/browse/HIVE-10487
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 0.13.1
>Reporter: N Campbell
>Priority: Critical
>
> Although this restriction is documented at
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union
> an application should be able to perform a union query where the projections
> are union compatible. Union compatibility does not require the projected
> column names to be identical, which Hive imposes vs ISO-SQL 20xx.
> For example, this query is rejected:
> select c1 from t1 union all select c2 from t2
> with the error:
> Schema of both sides of union should match. _u1-subquery2
> while this query is accepted:
> select c1 from t1 union all select c2 c1 from t2
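
The positional rule the reporter asks for can be sketched as follows. This is
only an illustration, not Hive's planner code: the function name union_schema
and the (name, type) tuples are hypothetical, and the sketch assumes exact type
equality where a real engine would also allow implicit conversions.

```python
def union_schema(left, right):
    """Check two projections for union compatibility by position.

    left and right are lists of (column_name, type_name) tuples.
    Column names are deliberately ignored; only count and types matter.
    The result takes its column names from the left branch.
    """
    if len(left) != len(right):
        raise ValueError("column counts differ")
    for (_, left_type), (_, right_type) in zip(left, right):
        # Simplifying assumption: types must match exactly.
        if left_type != right_type:
            raise ValueError(
                "incompatible types: %s vs %s" % (left_type, right_type)
            )
    return list(left)
```

Under this rule, "select c1 from t1 union all select c2 from t2" is accepted
without aliasing c2, and the result column is named c1.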





[jira] [Commented] (HIVE-10487) remove non-ISO restriction that projections in a union have identical column names

2017-01-17 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827154#comment-15827154
 ] 

Pengcheng Xiong commented on HIVE-10487:


This is no longer a problem on current Hive master. I tried cbo=false and
cbo=true, and both work. intersect and except also work with cbo=true.

> remove non-ISO restriction that projections in a union have identical column 
> names
> --
>
> Key: HIVE-10487
> URL: https://issues.apache.org/jira/browse/HIVE-10487
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 0.13.1
>Reporter: N Campbell
>Priority: Critical
>
> Although this restriction is documented at
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union
> an application should be able to perform a union query where the projections
> are union compatible. Union compatibility does not require the projected
> column names to be identical, which Hive imposes vs ISO-SQL 20xx.
> For example, this query is rejected:
> select c1 from t1 union all select c2 from t2
> with the error:
> Schema of both sides of union should match. _u1-subquery2
> while this query is accepted:
> select c1 from t1 union all select c2 c1 from t2





[jira] [Commented] (HIVE-15478) Add file + checksum list for create table/partition during notification creation (whenever relevant)

2017-01-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827152#comment-15827152
 ] 

Hive QA commented on HIVE-15478:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847769/HIVE-15478.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 373 failed/errored test(s), 10789 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection
 (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge 
(batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser 
(batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance
 (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput
 (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse
 (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part]
 (batchId=148)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=123)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=97)

[jira] [Updated] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies

2017-01-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-15472:
--
Status: Patch Available  (was: Open)

> JDBC: Standalone jar is missing ZK dependencies
> ---
>
> Key: HIVE-15472
> URL: https://issues.apache.org/jira/browse/HIVE-15472
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-15472.1.patch, HIVE-15472.2.patch
>
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/curator/RetryPolicy
>   at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:514)
>   at org.apache.hive.jdbc.Utils.parseURL(Utils.java:434)
>   at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:132)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at JDBCExecutor.getConnection(JDBCExecutor.java:65)
>   at JDBCExecutor.executeStatement(JDBCExecutor.java:104)
>   at JDBCExecutor.executeSQLFile(JDBCExecutor.java:81)
>   at JDBCExecutor.main(JDBCExecutor.java:183)
> Caused by: java.lang.ClassNotFoundException: org.apache.curator.RetryPolicy
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}
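
One way to confirm a report like this is to check whether the class named in
the trace is packaged in the jar at all. The sketch below is a hypothetical
helper (jar_has_class is not part of Hive or of any JDBC tooling); it relies
only on the fact that a jar is a zip archive whose entries mirror package
paths.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar contains a .class entry for class_name.

    class_name is a fully qualified name such as
    "org.apache.curator.RetryPolicy".
    """
    # A top-level class a.b.C is stored in the jar as a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Running it against the standalone JDBC jar with
"org.apache.curator.RetryPolicy" would show whether the dependency was left
out of the jar or is simply not being found at runtime.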





[jira] [Updated] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies

2017-01-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-15472:
--
Status: Open  (was: Patch Available)

> JDBC: Standalone jar is missing ZK dependencies
> ---
>
> Key: HIVE-15472
> URL: https://issues.apache.org/jira/browse/HIVE-15472
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-15472.1.patch, HIVE-15472.2.patch
>
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/curator/RetryPolicy
>   at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:514)
>   at org.apache.hive.jdbc.Utils.parseURL(Utils.java:434)
>   at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:132)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at JDBCExecutor.getConnection(JDBCExecutor.java:65)
>   at JDBCExecutor.executeStatement(JDBCExecutor.java:104)
>   at JDBCExecutor.executeSQLFile(JDBCExecutor.java:81)
>   at JDBCExecutor.main(JDBCExecutor.java:183)
> Caused by: java.lang.ClassNotFoundException: org.apache.curator.RetryPolicy
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}





[jira] [Updated] (HIVE-15652) Optimize(reduce) the number of alter calls made to fix repl.last.id

2017-01-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15652:

Attachment: HIVE-15652.patch

Patch attached. This reduces the total number of alters issued to update
repl.last.id across TestReplicationScenarios from 60 to 44.

> Optimize(reduce) the number of alter calls made to fix repl.last.id
> ---
>
> Key: HIVE-15652
> URL: https://issues.apache.org/jira/browse/HIVE-15652
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15652.patch
>
>
> Per code review from HIVE-15534, we might be doing alters to parent objects 
> to set repl.last.id when it is not necessary, since some future event might 
> make this alter redundant.
> There are 3 cases where this might happen:
> a) After a CREATE_TABLE event - any prior reference to that table does not 
> need an ALTER, since CREATE_TABLE will have a repl.last.id come with it.
> b) After a DROP_TABLE event - any prior reference to that table is 
> irrelevant, and thus, no alter is needed.
> c) After an ALTER_TABLE event, since that dump will itself do a metadata 
> update that will get the latest repl.last.id along with this event.
> In each of these cases, we can remove the alter call needed.
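
The three cases above amount to a single pass over the event batch. The sketch
below is only an illustration under stated assumptions, not the attached
patch: the function name tables_needing_alter, the (event_type, table) tuple
shape, and the "INSERT" event name are hypothetical.

```python
# Event types that make an explicit repl.last.id alter unnecessary,
# per cases a), b) and c) above.
SELF_UPDATING = {"CREATE_TABLE", "DROP_TABLE", "ALTER_TABLE"}


def tables_needing_alter(events):
    """Return the tables that still need an explicit repl.last.id alter.

    events is a list of (event_type, table) tuples in replay order.
    """
    pending = set()
    for event_type, table in events:
        if event_type in SELF_UPDATING:
            # Any earlier pending alter for this table is now redundant.
            pending.discard(table)
        else:
            # Hypothetical other event: the table needs an update unless
            # a later event in the batch makes it redundant.
            pending.add(table)
    return sorted(pending)
```

Only tables whose last event in the batch is something other than a create,
drop or alter remain in the result.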





[jira] [Updated] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies

2017-01-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-15472:
--
Attachment: HIVE-15472.2.patch

> JDBC: Standalone jar is missing ZK dependencies
> ---
>
> Key: HIVE-15472
> URL: https://issues.apache.org/jira/browse/HIVE-15472
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-15472.1.patch, HIVE-15472.2.patch
>
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/curator/RetryPolicy
>   at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:514)
>   at org.apache.hive.jdbc.Utils.parseURL(Utils.java:434)
>   at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:132)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at JDBCExecutor.getConnection(JDBCExecutor.java:65)
>   at JDBCExecutor.executeStatement(JDBCExecutor.java:104)
>   at JDBCExecutor.executeSQLFile(JDBCExecutor.java:81)
>   at JDBCExecutor.main(JDBCExecutor.java:183)
> Caused by: java.lang.ClassNotFoundException: org.apache.curator.RetryPolicy
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}





[jira] [Updated] (HIVE-15472) JDBC: Standalone jar is missing ZK dependencies

2017-01-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-15472:
--
Attachment: (was: HIVE-15472.2.patch)

> JDBC: Standalone jar is missing ZK dependencies
> ---
>
> Key: HIVE-15472
> URL: https://issues.apache.org/jira/browse/HIVE-15472
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Tao Li
> Attachments: HIVE-15472.1.patch
>
>
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/curator/RetryPolicy
>   at org.apache.hive.jdbc.Utils.configureConnParams(Utils.java:514)
>   at org.apache.hive.jdbc.Utils.parseURL(Utils.java:434)
>   at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:132)
>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at JDBCExecutor.getConnection(JDBCExecutor.java:65)
>   at JDBCExecutor.executeStatement(JDBCExecutor.java:104)
>   at JDBCExecutor.executeSQLFile(JDBCExecutor.java:81)
>   at JDBCExecutor.main(JDBCExecutor.java:183)
> Caused by: java.lang.ClassNotFoundException: org.apache.curator.RetryPolicy
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}





[jira] [Updated] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13014:
--
Attachment: HIVE-13014.04.patch

> RetryingMetaStoreClient is retrying too aggresievley
> 
>
> Key: HIVE-13014
> URL: https://issues.apache.org/jira/browse/HIVE-13014
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13014.01.patch, HIVE-13014.02.patch, 
> HIVE-13014.03.patch, HIVE-13014.04.patch
>
>
> Not all metastore operations are idempotent.  For example, commit_txn() 
> consists of 
> 1. request from client to server
> 2. server action
> 3. ack to client
> If the network connection is broken after (or during) 2 but before 3 happens,
> RetryingMetaStoreClient will retry the operation, thus causing an attempt to
> commit the same txn twice (sometimes concurrently).
> The 2nd attempt is guaranteed to fail and thus return an error to the caller 
> (which doesn't know the operation is being retried), while the first attempt 
> has actually succeeded.  Thus the caller thinks commit failed and will likely 
> attempt to redo the transactions - not what we want in most cases.
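
The failure mode described above can be illustrated with a small sketch. This
is neither Hive's RetryingMetaStoreClient nor the HIVE-13014 patch: the names
call_with_retry, IDEMPOTENT_OPS and FlakyServer are hypothetical, and the
sketch assumes the client knows which operations are safe to repeat. Only
those are retried; a non-idempotent call such as commit_txn surfaces the
connection error to the caller instead.

```python
# Assumed allow-list of operations that are safe to repeat.
IDEMPOTENT_OPS = {"get_table", "get_open_txns"}


class FlakyServer:
    """Toy server that performs the action but loses the ack on the first call."""

    def __init__(self):
        self.committed = set()
        self.calls = 0

    def invoke(self, op, arg):
        self.calls += 1
        if op == "commit_txn":
            if arg in self.committed:
                raise ValueError("txn %d already committed" % arg)
            self.committed.add(arg)  # step 2: the server action succeeds
        if self.calls == 1:
            # Step 3 never reaches the client.
            raise ConnectionError("ack lost")
        return "ok"


def call_with_retry(server, op, arg, retries=3):
    """Retry on ConnectionError only when op is in IDEMPOTENT_OPS."""
    for attempt in range(retries):
        try:
            return server.invoke(op, arg)
        except ConnectionError:
            # Re-raise for non-idempotent ops and on the final attempt.
            if op not in IDEMPOTENT_OPS or attempt == retries - 1:
                raise
```

With this policy the caller of commit_txn sees one connection error and can
check the transaction state, rather than seeing a misleading failure from a
second commit attempt.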





[jira] [Updated] (HIVE-14707) ACID: Insert shuffle sort-merges on blank KEY

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14707:
--
Attachment: HIVE-14707.21.patch

> ACID: Insert shuffle sort-merges on blank KEY
> -
>
> Key: HIVE-14707
> URL: https://issues.apache.org/jira/browse/HIVE-14707
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Eugene Koifman
> Attachments: HIVE-14707.01.patch, HIVE-14707.02.patch, 
> HIVE-14707.03.patch, HIVE-14707.04.patch, HIVE-14707.05.patch, 
> HIVE-14707.06.patch, HIVE-14707.08.patch, HIVE-14707.09.patch, 
> HIVE-14707.10.patch, HIVE-14707.11.patch, HIVE-14707.13.patch, 
> HIVE-14707.14.patch, HIVE-14707.16.patch, HIVE-14707.17.patch, 
> HIVE-14707.18.patch, HIVE-14707.19.patch, HIVE-14707.19.patch, 
> HIVE-14707.20.patch, HIVE-14707.21.patch
>
>
> The ACID insert codepath uses a sorted shuffle, while the key used for
> shuffle is always 0 bytes long.
> {code}
> hive (sales_acid)> explain insert into sales values(1, 2, 
> '3400---009', 1, null);
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: gopal_20160906172626_80261c4c-79cc-4e02-87fe-3133be404e55:2
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: values__tmp__table__2
>   Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: tmp_values_col1 (type: string), 
> tmp_values_col2 (type: string), tmp_values_col3 (type: string), 
> tmp_values_col4 (type: string), tmp_values_col5 (type: string)
> outputColumnNames: _col0, _col1, _col2, _col3, _col4
> Statistics: Num rows: 1 Data size: 28 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order: 
>   Map-reduce partition columns: UDFToLong(_col1) (type: 
> bigint)
>   Statistics: Num rows: 1 Data size: 28 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: string), _col1 (type: 
> string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
> Execution mode: vectorized, llap
> LLAP IO: no inputs
> {code}
> Note the missing "+" / "-" in the Sort Order fields.





[jira] [Commented] (HIVE-15534) Update db/table repl.last.id at the end of REPL LOAD of a batch of events

2017-01-17 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827143#comment-15827143
 ] 

Sushanth Sowmyan commented on HIVE-15534:
-

[~daijy], I created HIVE-15652 to track the update after your review comments.

> Update db/table repl.last.id at the end of REPL LOAD of a batch of events
> -
>
> Key: HIVE-15534
> URL: https://issues.apache.org/jira/browse/HIVE-15534
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 2.2.0
>
> Attachments: HIVE-15534.patch
>
>
> Tracking TODO task in ReplSemanticAnalyzer :
> {noformat}
> // TODO : Over here, we need to track a 
> Map for every db updated
> // and update repl.last.id for each, if this is a wh-level load, and 
> if it is a db-level load,
> // then a single repl.last.id update, and if this is a tbl-lvl load 
> which does not alter the
> // table itself, we'll need to update repl.last.id for that as well.
> {noformat}





[jira] [Updated] (HIVE-15588) Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc to prevent wrong reuse

2017-01-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15588:

Status: Patch Available  (was: In Progress)

> Vectorization: Fix deallocation of scratch columns in VectorUDFCoalesce, etc 
> to prevent wrong reuse
> ---
>
> Key: HIVE-15588
> URL: https://issues.apache.org/jira/browse/HIVE-15588
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15588.01.patch, HIVE-15588.02.patch, 
> HIVE-15588.03.patch, HIVE-15588.04.patch, HIVE-15588.05.patch
>
>
> Make sure we don't deallocate a scratch column too quickly and cause result 
> corruption due to scratch column reuse.





[jira] [Updated] (HIVE-15627) Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those in supportedGenericUDFs

2017-01-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15627:

Status: Patch Available  (was: In Progress)

> Make hive.vectorized.adaptor.usage.mode=all vectorize all UDFs not just those 
> in supportedGenericUDFs
> -
>
> Key: HIVE-15627
> URL: https://issues.apache.org/jira/browse/HIVE-15627
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15627.01.patch, HIVE-15627.02.patch, 
> HIVE-15627.03.patch, HIVE-15627.04.patch, HIVE-15627.05.patch
>
>
> Missed this when doing HIVE-14336.





[jira] [Updated] (HIVE-15623) Use customized version of netty for llap

2017-01-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15623:
-
Status: Patch Available  (was: Open)

> Use customized version of netty for llap
> 
>
> Key: HIVE-15623
> URL: https://issues.apache.org/jira/browse/HIVE-15623
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15623.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client

2017-01-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827088#comment-15827088
 ] 

Thejas M Nair commented on HIVE-15579:
--

Can you retain the exception in the log message, as in the previous patch?
i.e. use: LOG.error("Error while setting delegation token for " + proxyUser, e);

> Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
> ---
>
> Key: HIVE-15579
> URL: https://issues.apache.org/jira/browse/HIVE-15579
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
>Assignee: Nanda kumar
> Attachments: HIVE-15579.000.patch, HIVE-15579.001.patch
>
>
> Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would 
> be useful to have similar feature for hive metastore client.





[jira] [Updated] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP

2017-01-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15621:
-
Status: Patch Available  (was: Open)

> Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
> --
>
> Key: HIVE-15621
> URL: https://issues.apache.org/jira/browse/HIVE-15621
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15621.1.patch, HIVE-15621.2.patch, 
> HIVE-15621.3.patch
>
>






[jira] [Updated] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP

2017-01-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15621:
-
Status: Open  (was: Patch Available)

> Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
> --
>
> Key: HIVE-15621
> URL: https://issues.apache.org/jira/browse/HIVE-15621
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15621.1.patch, HIVE-15621.2.patch
>
>






[jira] [Updated] (HIVE-15621) Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP

2017-01-17 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15621:
-
Attachment: HIVE-15621.3.patch

> Use Hive's own JvmPauseMonitor instead of Hadoop's in LLAP
> --
>
> Key: HIVE-15621
> URL: https://issues.apache.org/jira/browse/HIVE-15621
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15621.1.patch, HIVE-15621.2.patch, 
> HIVE-15621.3.patch
>
>






[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15649:

Status: Patch Available  (was: Open)

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.
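
A defensive way to handle this kind of input is to treat a missing column list
as "read everything". The sketch below is an assumption for illustration only,
not the LLAP IO code or the attached patch: the function name
included_columns and the convention that None means all columns are
hypothetical.

```python
def included_columns(column_ids, column_count):
    """Build a per-column include mask.

    Assumed convention: column_ids is None when the reader was asked for
    all columns, otherwise it is an iterable of column indexes.
    """
    if column_ids is None:
        # All-column read: include every column rather than dereferencing None.
        return [True] * column_count
    wanted = set(column_ids)
    return [index in wanted for index in range(column_count)]
```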





[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15649:

Attachment: HIVE-15649.patch

Added a test too; however, on MiniLlap/Local it doesn't reproduce the issue that
we've seen on some cluster, where analyze table ... for columns resulted in null
columnIds.

[~prasanth_j] can you take a look?

[~mmccline] FYI I think some schema evolution paths may also assume non-null 
column list... e.g. buildConversionFileTypesArray seems to assume in isOk path 
that readerIncludes are not null even though they could be.

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Commented] (HIVE-15550) fix arglist logging in schematool

2017-01-17 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827029#comment-15827029
 ] 

Thejas M Nair commented on HIVE-15550:
--

Thanks [~leftylev] for pointing out the mistake and for the example!
I have now updated errata.txt.


> fix arglist logging in schematool
> -
>
> Key: HIVE-15550
> URL: https://issues.apache.org/jira/browse/HIVE-15550
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.1.1
>Reporter: anishek
>Assignee: anishek
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15550.1.patch
>
>
> In DEBUG mode schemaTool prints the password to log file.
> This is also seen if the user includes --verbose option.
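
The usual remedy for this kind of leak is to mask the value that follows a
secret-bearing flag before the argument list is logged. The sketch below is
hypothetical and is not the HIVE-15550 patch: the function name mask_args and
the "[hidden]" placeholder are made up, and the -passWord flag name is an
assumption about how the password is passed on the command line.

```python
def mask_args(args, secret_flags=("-passWord",)):
    """Return a copy of args that is safe to log.

    The value immediately following any flag in secret_flags is replaced
    with a placeholder; everything else is copied unchanged.
    """
    masked = []
    hide_next = False
    for arg in args:
        if hide_next:
            masked.append("[hidden]")
            hide_next = False
        else:
            masked.append(arg)
            hide_next = arg in secret_flags
    return masked
```

Logging mask_args(args) instead of args keeps the rest of the debug output
intact while leaving the password out of the log file.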





[jira] [Resolved] (HIVE-14946) Optimizations

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-14946.
---
   Resolution: Duplicate
Fix Version/s: 2.2.0

addressed by HIVE-15539

> Optimizations
> -
>
> Key: HIVE-14946
> URL: https://issues.apache.org/jira/browse/HIVE-14946
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning, Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 2.2.0
>
>
> For example, if there is only a WHEN NOT MATCHED clause, the base generating 
> expression can be an INNER JOIN.
> Generally, we should make sure the optimizer is able to work with the plan 
> for the Merge statement.
> Various WHEN clauses can have "extra" predicates. In some cases they may be 
> pushable.
> The "source" can be an arbitrary expression; in particular, it may include 
> joins which, together with the join introduced by Merge itself, may need to be 
> reordered.





[jira] [Assigned] (HIVE-14946) Optimizations

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-14946:
-

Assignee: Eugene Koifman

> Optimizations
> -
>
> Key: HIVE-14946
> URL: https://issues.apache.org/jira/browse/HIVE-14946
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning, Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> For example, if there is only a WHEN NOT MATCHED clause, the base generating 
> expression can be an INNER JOIN.
> Generally, we should make sure the optimizer is able to work with the plan 
> for the Merge statement.
> Various WHEN clauses can have "extra" predicates. In some cases they may be 
> pushable.
> The "source" can be an arbitrary expression; in particular, it may include 
> joins which, together with the join introduced by Merge itself, may need to be 
> reordered.





[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2017-01-17 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: HIVE-13680.7.patch

Attached a new patch with cleaner negotiation and support for versioned 
plug-ins.

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, JDBC
> Reporter: Vaibhav Gumashta
> Assignee: Kevin Liew
> Attachments: HIVE-13680.2.patch, HIVE-13680.3.patch, 
> HIVE-13680.4.patch, HIVE-13680.6.patch, HIVE-13680.7.patch, HIVE-13680.patch, 
> proposal.pdf, SnappyCompDe.zip
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.





[jira] [Updated] (HIVE-15534) Update db/table repl.last.id at the end of REPL LOAD of a batch of events

2017-01-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15534:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks, [~daijy], committed to master.

> Update db/table repl.last.id at the end of REPL LOAD of a batch of events
> -
>
> Key: HIVE-15534
> URL: https://issues.apache.org/jira/browse/HIVE-15534
> Project: Hive
> Issue Type: Sub-task
> Components: repl
> Reporter: Sushanth Sowmyan
> Assignee: Sushanth Sowmyan
> Fix For: 2.2.0
>
> Attachments: HIVE-15534.patch
>
>
> Tracking TODO task in ReplSemanticAnalyzer :
> {noformat}
> // TODO : Over here, we need to track a 
> Map for every db updated
> // and update repl.last.id for each, if this is a wh-level load, and 
> if it is a db-level load,
> // then a single repl.last.id update, and if this is a tbl-lvl load 
> which does not alter the
> // table itself, we'll need to update repl.last.id for that as well.
> {noformat}
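
The bookkeeping that the TODO describes could look roughly like the sketch below: while a batch of events is applied, remember the highest event id seen per database, then issue one repl.last.id update per database at the end. The class and method names, and the choice of a String-to-Long map, are assumptions for illustration only; this is not the code in ReplSemanticAnalyzer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker; names and types are illustrative assumptions.
final class ReplLastIdSketch {

    // Highest event id applied so far, keyed by database name.
    private final Map<String, Long> lastIdByDb = new HashMap<>();

    // Called once per applied event; keeps only the maximum id per database.
    void record(String dbName, long eventId) {
        lastIdByDb.merge(dbName, eventId, Long::max);
    }

    // Called at the end of the batch: one entry per database to update.
    Map<String, Long> pendingUpdates() {
        return new HashMap<>(lastIdByDb);
    }
}
```

Keeping only the maximum per database means the number of repl.last.id writes depends on how many databases were touched, not on how many events were in the batch.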





[jira] [Updated] (HIVE-14926) Keep Schema in consistent state where schemaTool fails or succeeds.

2017-01-17 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14926:

Resolution: Won't Fix
  Assignee: (was: Aihua Xu)
Status: Resolved  (was: Patch Available)

> Keep Schema in consistent state where schemaTool fails or succeeds.  
> -
>
> Key: HIVE-14926
> URL: https://issues.apache.org/jira/browse/HIVE-14926
> Project: Hive
> Issue Type: Sub-task
> Components: Database/Schema
> Reporter: Aihua Xu
> Attachments: HIVE-14926.1.patch, HIVE-14926.2.patch
>
>
> SchemaTool currently uses autocommit when executing the upgrade or init 
> scripts. It seems we should use a database transaction, committing or rolling 
> back as a whole, to keep the schema consistent.





[jira] [Assigned] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized

2017-01-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-15573:
-

Assignee: Eugene Koifman

> Vectorization: ACID shuffle ReduceSink is not specialized 
> --
>
> Key: HIVE-15573
> URL: https://issues.apache.org/jira/browse/HIVE-15573
> Project: Hive
> Issue Type: Improvement
> Components: Transactions, Vectorization
> Affects Versions: 2.2.0
> Reporter: Gopal V
> Assignee: Eugene Koifman
> Attachments: screenshot-1.png
>
>
> The ACID shuffle disabled murmur hash for the shuffle, due to the bucketing 
> requirements demanding the writable hashcode for the shuffles.
> {code}
> boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
> if (!useUniformHash) {
>   return false;
> }
> {code}
> This check protects the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern will make ACID insert much 
> faster.
> {code}
> Reduce Output Operator
>   sort order: 
>   Map-reduce partition columns: _col0 (type: bigint)
>   value expressions:  
> {code}
> !screenshot-1.png!





[jira] [Commented] (HIVE-15643) remove use of default charset in FastHiveDecimal

2017-01-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826882#comment-15826882
 ] 

Prasanth Jayachandran commented on HIVE-15643:
--

[~owen.omalley] changing it to UTF8 would break existing bloom filters because the 
hash codes would differ. We might need a separate stream for UTF8: make the readers 
use UTF8 by default, and use the default-charset stream for old files.

> remove use of default charset in FastHiveDecimal
> 
>
> Key: HIVE-15643
> URL: https://issues.apache.org/jira/browse/HIVE-15643
> Project: Hive
> Issue Type: Bug
> Reporter: Owen O'Malley
> Assignee: Edward Capriolo
>
> HIVE-15335 introduced some new uses of String.getBytes(), which uses the 
> default charset. These need to be replaced with the version that always uses 
> UTF8.
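
To make the difference concrete, the sketch below encodes the same string with two explicit charsets. The no-argument String.getBytes() behaves like one or the other depending on the JVM's default charset, which is why the bytes, and any hash computed from them, can differ between machines. The class and method names are made up for this example and are not from FastHiveDecimal.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only; not code from FastHiveDecimal.
final class CharsetBytesSketch {

    private CharsetBytesSketch() {
    }

    // Always UTF-8, regardless of platform settings.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Stands in for a JVM whose default charset is ISO-8859-1.
    static byte[] latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // True when both encodings yield identical bytes (for example, pure ASCII).
    static boolean sameBytes(String s) {
        return Arrays.equals(utf8Bytes(s), latin1Bytes(s));
    }
}
```

For ASCII-only input such as "12.5" the two encodings agree, so the problem only shows up with non-ASCII characters; passing the charset explicitly removes the dependency either way.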





[jira] [Commented] (HIVE-15650) LLAP: Set perflogger to DEBUG level for llap daemons

2017-01-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826873#comment-15826873
 ] 

Prasanth Jayachandran commented on HIVE-15650:
--

[~gopalv] can you please review this patch?

> LLAP: Set perflogger to DEBUG level for llap daemons
> 
>
> Key: HIVE-15650
> URL: https://issues.apache.org/jira/browse/HIVE-15650
> Project: Hive
> Issue Type: Bug
> Components: llap, Logging
> Affects Versions: 2.2.0
> Reporter: Gopal V
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-15650.1.patch
>
>
> During Hive2 development, the PerfLogger was moved to DEBUG level only, making it 
> impossible to debug timings from LLAP logs without manually editing 
> log4j2.properties and redeploying LLAP.
> Enable PerfLogger by default on LLAP.





[jira] [Updated] (HIVE-15650) LLAP: Set perflogger to DEBUG level for llap daemons

2017-01-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15650:
-
Attachment: HIVE-15650.1.patch

> LLAP: Set perflogger to DEBUG level for llap daemons
> 
>
> Key: HIVE-15650
> URL: https://issues.apache.org/jira/browse/HIVE-15650
> Project: Hive
> Issue Type: Bug
> Components: llap, Logging
> Affects Versions: 2.2.0
> Reporter: Gopal V
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-15650.1.patch
>
>
> During Hive2 development, the PerfLogger was moved to DEBUG level only, making it 
> impossible to debug timings from LLAP logs without manually editing 
> log4j2.properties and redeploying LLAP.
> Enable PerfLogger by default on LLAP.




