[jira] [Updated] (HIVE-13873) Column pruning for nested fields

2016-10-12 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-13873:

Attachment: HIVE-13873.3.patch

> Column pruning for nested fields
> 
>
> Key: HIVE-13873
> URL: https://issues.apache.org/jira/browse/HIVE-13873
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-13873.1.patch, HIVE-13873.2.patch, 
> HIVE-13873.3.patch, HIVE-13873.patch, HIVE-13873.wip.patch
>
>
> Some columnar file formats, such as Parquet, also store the fields of struct 
> types column by column, using the encoding described in the Google Dremel 
> paper. It is very common in big data for data to be stored in structs while 
> queries need only a subset of the fields in those structs. However, Hive 
> currently still reads the whole struct regardless of whether all fields are 
> selected. Pruning unwanted sub-fields of structs or nested fields at file 
> read time would therefore be a big performance boost for such scenarios.
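
The idea in the description above can be illustrated with a small sketch. This is not Hive's or Parquet's implementation; it only shows what pruning a nested schema down to the requested sub-fields means. The schema representation (nested dicts with type names as leaves) and the function name `prune_schema` are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical schema model: a leaf maps to a type name, a struct to a nested dict.
Schema = Dict[str, Any]


def prune_schema(schema: Schema, wanted_paths: Iterable[str]) -> Schema:
    """Return a copy of schema that keeps only the dotted paths in wanted_paths."""
    # Group the remainder of each path under its first component.
    by_head: Dict[str, List[str]] = {}
    for path in wanted_paths:
        head, _, rest = path.partition(".")
        if head not in schema:
            raise KeyError(f"unknown field: {head}")
        by_head.setdefault(head, []).append(rest)

    pruned: Schema = {}
    for head, rests in by_head.items():
        field = schema[head]
        if not isinstance(field, dict) and any(rests):
            raise KeyError(f"{head} is not a struct")
        if isinstance(field, dict) and all(rests):
            # Only some sub-fields were requested: recurse into the struct.
            pruned[head] = prune_schema(field, rests)
        else:
            # A leaf column, or the whole struct was requested.
            pruned[head] = field
    return pruned
```

With a schema such as `{"id": "bigint", "user": {"name": "string", "address": {"city": "string", "zip": "string"}}}`, requesting `["id", "user.address.city"]` keeps only `id` and the `city` leaf, so a columnar reader would need to touch just those two column chunks.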



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570915#comment-15570915
 ] 

Hive QA commented on HIVE-14822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832972/HIVE-14822.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10568 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1519/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1519/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832972 - PreCommit-HIVE-Build

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a separate 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded to HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
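
The fallback rules proposed above can be sketched as follows. This is an illustration only, not the code in the attached patches. The property and variable names come from the description; the function name `job_credential_overrides`, the `engine` argument, and the `NAME=value` format used for the MR2 environment properties are assumptions.

```python
from typing import Dict, Mapping

CREDSTORE_ENV = "HADOOP_CREDSTORE_PASSWORD"
JOB_PASSWORD_ENV = "HIVE_JOB_KEYSTORE_PASSWORD"           # name proposed in the issue
JOB_KEYSTORE_CONF = "hive.server2.job.keystore.location"  # name proposed in the issue
PROVIDER_PATH_CONF = "hadoop.security.credential.provider.path"


def job_credential_overrides(
    hs2_env: Mapping[str, str], hs2_conf: Mapping[str, str], engine: str
) -> Dict[str, str]:
    """Build the job configuration entries needed to decrypt the credential store."""
    # Prefer the job-specific password, else fall back to the standard variable.
    password = hs2_env.get(JOB_PASSWORD_ENV) or hs2_env.get(CREDSTORE_ENV)
    if password is None:
        return {}  # nothing to forward

    overrides: Dict[str, str] = {}
    # Prefer the job-specific keystore, else reuse HiveServer2's provider path.
    keystore = hs2_conf.get(JOB_KEYSTORE_CONF) or hs2_conf.get(PROVIDER_PATH_CONF)
    if keystore:
        overrides[PROVIDER_PATH_CONF] = keystore

    if engine == "mr":
        # Assumed format; a real implementation would append to any existing value.
        setting = f"{CREDSTORE_ENV}={password}"
        overrides["yarn.app.mapreduce.am.admin.user.env"] = setting
        overrides["mapreduce.admin.user.env"] = setting
    elif engine == "spark":
        overrides[f"spark.yarn.appMasterEnv.{CREDSTORE_ENV}"] = password
        overrides[f"spark.executorEnv.{CREDSTORE_ENV}"] = password
    else:
        raise ValueError(f"unsupported engine: {engine}")
    return overrides
```

The point of the sketch is the precedence order: the job-specific keystore and password win when present, and the HiveServer2 values are used only as a fallback.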





[jira] [Resolved] (HIVE-12670) Fix tests failing due to invalid ConnectionDriverName

2016-10-12 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu resolved HIVE-12670.

Resolution: Fixed

I believe this was fixed by HIVE-12685.

> Fix tests failing due to invalid ConnectionDriverName
> -
>
> Key: HIVE-12670
> URL: https://issues.apache.org/jira/browse/HIVE-12670
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> Some unit tests fail when run outside the ptest environment (i.e. when run 
> individually on the local box like mvn test -Dtest=TestSessionHooks) with the 
> following error:
> {code}
> Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the 
> "BONECP" plugin to create a ConnectionPool gave an error : The specified 
> datastore driver ("hive-site.xml") was not found in the CLASSPATH. Please 
> check your CLASSPATH specification, and the name of the driver.
> {code}
> This is because, to support TestHiveConf, we override 
> {{javax.jdo.option.ConnectionDriverName}} in the test hive-site file 
> (common/src/test/resources/hive-site.xml). However, this override gets 
> applied to all tests. The overridden value is invalid, which causes other 
> tests that attempt to initialize CliService to fail.
> Instead, we should use a property exclusively used for testing like 
> {{hive.test.dummystats.aggregator}} so that overriding it does not affect 
> other tests.
> It is not clear why these tests pass in ptest; presumably some earlier test 
> overrides {{javax.jdo.option.ConnectionDriverName}} to a 
> sensible value.
> Tests failing:
> TestSessionHooks
> TestPlainSaslHelper 
> TestSessionGlobalInitFile





[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570817#comment-15570817
 ] 

Hive QA commented on HIVE-11394:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833021/HIVE-11394.092.patch

{color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10530 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1518/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1518/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833021 - PreCommit-HIVE-Build

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter 
> Vectorization.  It also includes all information of SUMMARY.
> EXPRESSION shows vectorization information for expressions, e.g. 
> predicateExpression.  It also includes all information of SUMMARY and 
> OPERATOR.
> DETAIL shows the most detailed vectorization information.
> It also includes all information of SUMMARY, OPERATOR, and EXPRESSION.
> By default, ONLY is not set and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would mean that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   

[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570693#comment-15570693
 ] 

Hive QA commented on HIVE-14925:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832969/HIVE-14925.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10560 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_batchsize]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repair]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1517/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1517/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1517/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832969 - PreCommit-HIVE-Build

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Ratheesh Kamoor
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled 
> (default). I think this is because of a major design flaw in how the thread 
> pool is implemented in the HiveMetaStoreChecker class / checkPartitionDirs 
> method. This method has a thread pool that registers a Callable, but the 
> Callable makes a recursive call to the checkPartitionDirs method again. This 
> code will hang when the number of directories is more than the thread pool 
> size. 
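
The hang described above is the classic bounded-pool problem: every worker blocks waiting on child tasks that can never be scheduled because all threads are busy. One way to avoid it, sketched below, is to walk the directory tree level by level so that only the calling thread waits and workers never block on other tasks. This is an illustrative sketch, not the approach taken in HIVE-14925.patch; the in-memory `tree` mapping and the function names are assumptions standing in for real file-system listing calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def list_children(tree: Dict[str, List[str]], path: str) -> List[str]:
    """Stand-in for a (slow) file-system listing call."""
    return tree.get(path, [])


def find_leaf_dirs(tree: Dict[str, List[str]], root: str, pool_size: int = 2) -> List[str]:
    """Collect leaf directories without submitting tasks from inside tasks."""
    leaves: List[str] = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        level = [root]
        while level:
            # Only the calling thread waits here; workers just list one directory.
            listings = list(pool.map(lambda p: list_children(tree, p), level))
            next_level: List[str] = []
            for path, children in zip(level, listings):
                if children:
                    next_level.extend(children)
                else:
                    leaves.append(path)
            level = next_level
    return sorted(leaves)
```

Because no task ever waits for another task, the traversal completes for any number of directories, even with a pool of size one.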





[jira] [Commented] (HIVE-3236) allow column names to be prefixed by table alias in select all queries

2016-10-12 Thread Roger Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570672#comment-15570672
 ] 

Roger Shi commented on HIVE-3236:
-

Any update on this JIRA?

> allow column names to be prefixed by table alias in select all queries
> --
>
> Key: HIVE-3236
> URL: https://issues.apache.org/jira/browse/HIVE-3236
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.9.1, 0.10.0
>Reporter: Keegan Mosley
>Priority: Minor
> Attachments: HIVE-3236.1.patch.txt
>
>
> When using "CREATE TABLE x AS SELECT ..." where the select joins tables with 
> hundreds of columns it is not a simple task to resolve duplicate column name 
> exceptions (particularly with self-joins). The user must either manually 
> specify aliases for all duplicate columns (potentially hundreds) or write a 
> script to generate the data set in a separate select query, then create the 
> table and load the data in.
> There should be some conf flag that would allow queries like
> "create table joined as select one.\*, two.\* from mytable one join mytable 
> two on (one.duplicate_field = two.duplicate_field1);"
> to create a table with columns one_duplicate_field and two_duplicate_field.
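
The manual workaround mentioned above (writing a script to generate the aliases) can be sketched as a small helper. It is only an illustration of the requested naming scheme, `<alias>_<column>`; the function name and the input shape (a mapping from table alias to column names) are assumptions, and a real script would read the column lists from the metastore.

```python
from typing import Dict, List


def prefixed_select_list(tables: Dict[str, List[str]]) -> str:
    """Build a select list in which every column is renamed to <alias>_<column>."""
    items = []
    for alias, columns in tables.items():
        for col in columns:
            items.append(f"{alias}.{col} AS {alias}_{col}")
    return ", ".join(items)
```

For a self-join with aliases `one` and `two`, the generated list can be pasted into the `CREATE TABLE ... AS SELECT` statement in place of `one.*, two.*`.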





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570618#comment-15570618
 ] 

Rajesh Balamohan commented on HIVE-14925:
-

This was tried with partitions in S3. One of the main reasons to make it 
multi-threaded is to improve the runtime for systems like S3 and Azure.






[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset

2016-10-12 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570613#comment-15570613
 ] 

Rajesh Balamohan commented on HIVE-14803:
-

Thanks [~sseth], [~pxiong]. I will fix the patch and post it.

> S3: Stats gathering for insert queries can be expensive for partitioned 
> dataset
> ---
>
> Key: HIVE-14803
> URL: https://issues.apache.org/jira/browse/HIVE-14803
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14803.1.patch
>
>
> StatsTask's aggregateStats populates stats details for all partitions by 
> checking the file sizes, which turns out to be expensive when a large number 
> of partitions are inserted. 





[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11394:

Status: Patch Available  (was: In Progress)

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter 
> Vectorization.  It also includes all information of SUMMARY.
> EXPRESSION shows vectorization information for expressions, e.g. 
> predicateExpression.  It also includes all information of SUMMARY and 
> OPERATOR.
> DETAIL shows the most detailed vectorization information.
> It also includes all information of SUMMARY, OPERATOR, and EXPRESSION.
> By default, ONLY is not set and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would mean that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: vectorized, llap
> 

[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11394:

Status: In Progress  (was: Patch Available)


[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11394:

Attachment: HIVE-11394.092.patch


[jira] [Commented] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570576#comment-15570576
 ] 

Hive QA commented on HIVE-14822:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832972/HIVE-14822.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10568 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1516/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1516/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1516/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832972 - PreCommit-HIVE-Build

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a separate 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also a best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
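
The description above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the attached patches: the class and method names (JobCredentialEnv, resolvePassword, appendEnv, jobOverrides) are hypothetical, configuration and environment are modelled as plain maps instead of Hadoop or Hive classes, and the two MR2 properties are assumed to hold a comma-separated list of NAME=value entries. The property and variable names themselves come from the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of Hive or Hadoop.
class JobCredentialEnv {
    static final String CREDSTORE_ENV = "HADOOP_CREDSTORE_PASSWORD";
    // Name proposed in the description ("or better"), so treat it as provisional.
    static final String JOB_KEYSTORE_ENV = "HIVE_JOB_KEYSTORE_PASSWORD";

    /** Prefers the job-specific password; falls back to the standard variable. */
    static String resolvePassword(Map<String, String> env) {
        String jobPassword = env.get(JOB_KEYSTORE_ENV);
        return (jobPassword != null && !jobPassword.isEmpty())
                ? jobPassword
                : env.get(CREDSTORE_ENV);
    }

    /** Appends NAME=value to a comma-separated list, keeping existing entries. */
    static String appendEnv(String existing, String name, String value) {
        String entry = name + "=" + value;
        return (existing == null || existing.isEmpty()) ? entry : existing + "," + entry;
    }

    /** Returns the job configuration overrides needed to pass the password along. */
    static Map<String, String> jobOverrides(Map<String, String> jobConf,
                                            Map<String, String> env) {
        Map<String, String> overrides = new LinkedHashMap<>();
        String password = resolvePassword(env);
        if (password == null) {
            return overrides; // nothing to forward
        }
        // MR2: extend the admin env lists instead of replacing them.
        String[] mrKeys = {"yarn.app.mapreduce.am.admin.user.env", "mapreduce.admin.user.env"};
        for (String key : mrKeys) {
            overrides.put(key, appendEnv(jobConf.get(key), CREDSTORE_ENV, password));
        }
        // Spark: one property per environment variable.
        overrides.put("spark.yarn.appMasterEnv." + CREDSTORE_ENV, password);
        overrides.put("spark.executorEnv." + CREDSTORE_ENV, password);
        return overrides;
    }
}
```

A password containing a comma would not survive the comma-separated format assumed here, so a real implementation would need to handle or reject that case.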





[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570469#comment-15570469
 ] 

Hive QA commented on HIVE-14373:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832952/HIVE-14373.06.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10560 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1515/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1515/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832952 - PreCommit-HIVE-Build

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, 
> HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure
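
One way to keep such tests out of automated runs is to gate them on the presence of the credentials file. The sketch below is an assumption about how that gate could look, not what the attached patches do; S3TestGate and shouldRun are hypothetical names, and the file path is whatever the committer chooses to keep outside version control.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical gate; a test runner would skip the S3 suite when this returns false.
class S3TestGate {
    /** True only when the local credentials file exists and can be read. */
    static boolean shouldRun(Path credentialsFile) {
        return Files.isRegularFile(credentialsFile) && Files.isReadable(credentialsFile);
    }
}
```

Because the file is never committed, a QA machine without it would skip the suite, while a committer with the file in place would run it manually.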





[jira] [Commented] (HIVE-14884) Test result cleanup before 2.1.1 release

2016-10-12 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570460#comment-15570460
 ] 

Matt McCline commented on HIVE-14884:
-

Test report is unavailable.  What branch is it?

> Test result cleanup before 2.1.1 release
> 
>
> Key: HIVE-14884
> URL: https://issues.apache.org/jira/browse/HIVE-14884
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14884-branch-2.1.patch, 
> HIVE-14884.2-branch-2.1.patch
>
>
> Multiple tests are failing on the 2.1 branch.
> Before releasing 2.1.1 it would be good to clean up this list.





[jira] [Commented] (HIVE-14933) include argparse with LLAP scripts to support antique Python versions

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570361#comment-15570361
 ] 

Hive QA commented on HIVE-14933:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832979/HIVE-14933.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10560 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1514/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1514/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832979 - PreCommit-HIVE-Build

> include argparse with LLAP scripts to support antique Python versions
> -
>
> Key: HIVE-14933
> URL: https://issues.apache.org/jira/browse/HIVE-14933
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14933.01.patch, HIVE-14933.patch
>
>
> The module is a standalone file, and it's under the Python license, which is 
> compatible with the Apache license. In the long term we should probably just move 
> the LlapServiceDriver code entirely to Java, as right now it's a combination of 
> part-py, part-java.





[jira] [Updated] (HIVE-14928) Analyze table no scan mess up schema

2016-10-12 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-14928:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Integrated to master branch. Thanks Prasanth for the review.

> Analyze table no scan mess up schema
> 
>
> Key: HIVE-14928
> URL: https://issues.apache.org/jira/browse/HIVE-14928
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 2.2.0
>
> Attachments: HIVE-14928.1.patch, HIVE-14928.2.patch
>
>
> StatsNoJobTask uses the static variables partUpdates and table to track stats 
> changes. If multiple analyze no scan tasks run at the same time, the 
> table/partition schema could get messed up.
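
The hazard described above is the usual one with static mutable state: every concurrently running task sees the same copy. A minimal sketch of the per-instance alternative follows. It is not the committed fix; NoScanStatsTask and its methods are hypothetical, and only the field names partUpdates and table are taken from the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical task class used only to illustrate instance-scoped state.
class NoScanStatsTask {
    // Instance fields: each task owns its table name and its pending updates.
    // Declaring these static would make all concurrent tasks share one copy.
    private final String table;
    private final List<String> partUpdates =
            Collections.synchronizedList(new ArrayList<String>());

    NoScanStatsTask(String table) {
        this.table = table;
    }

    /** Records a partition whose stats changed; safe to call from worker threads. */
    void recordPartition(String partition) {
        partUpdates.add(table + "/" + partition);
    }

    /** Returns a snapshot of the updates recorded by this task only. */
    List<String> pendingUpdates() {
        synchronized (partUpdates) {
            return new ArrayList<>(partUpdates);
        }
    }
}
```

With this shape, two analyze tasks on different tables can run side by side without one task's updates appearing in the other's list.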





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Ratheesh Kamoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570328#comment-15570328
 ] 

Ratheesh Kamoor commented on HIVE-14925:


Are you trying with partitions in HDFS? You may not run into issues if the threads 
are fast enough to finish execution before the recursive call happens. File systems 
like S3 will clearly show the error due to network latency.

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Ratheesh Kamoor
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think this is caused by a major design flaw in how the thread pool 
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool that registers a Callable, but the Callable makes a 
> recursive call to checkPartitionDirs. This code will hang when the 
> number of directories exceeds the thread pool size.
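
If each Callable waits for tasks that it submits to the same bounded pool, every pool thread can end up blocked on children that are still queued behind it, which matches the hang described above. One way to avoid this, sketched below, is to walk the tree level by level so that only the calling thread waits on futures. This is an illustration, not the attached patch: LevelWalker and walk are hypothetical names, and the directory tree is an in-memory map standing in for file system listings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical walker; the map stands in for real directory listings.
class LevelWalker {
    /**
     * Visits every directory reachable from root, breadth-first. Pool threads
     * only list one directory and return; they never wait on other tasks, so
     * the walk completes even when directories outnumber threads.
     */
    static List<String> walk(final Map<String, List<String>> tree, String root, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> visited = new ArrayList<>();
            List<String> level = new ArrayList<>();
            level.add(root);
            while (!level.isEmpty()) {
                List<Future<List<String>>> pending = new ArrayList<>();
                for (final String dir : level) {
                    visited.add(dir);
                    Callable<List<String>> listing = () -> {
                        List<String> children = tree.get(dir);
                        return children != null ? children : Collections.<String>emptyList();
                    };
                    pending.add(pool.submit(listing));
                }
                // Only the caller blocks here; workers are free to drain the queue.
                List<String> next = new ArrayList<>();
                for (Future<List<String>> f : pending) {
                    next.addAll(f.get());
                }
                level = next;
            }
            return visited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while listing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("listing failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off is that each level must finish before the next one starts, so a single slow listing delays its whole level; the benefit is that the pool size no longer limits how deep or wide the tree can be.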





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570317#comment-15570317
 ] 

Rajesh Balamohan commented on HIVE-14925:
-

It would be helpful to have a repro for this. We tried with 10K 
partitions and with 10 and 15 threads in MSCK, which worked fine.

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Ratheesh Kamoor
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think this is caused by a major design flaw in how the thread pool 
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool that registers a Callable, but the Callable makes a 
> recursive call to checkPartitionDirs. This code will hang when the 
> number of directories exceeds the thread pool size.





[jira] [Commented] (HIVE-14941) Renable stats_filemetadata.q test case

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570312#comment-15570312
 ] 

Prasanth Jayachandran commented on HIVE-14941:
--

cc [~sershe]

> Renable stats_filemetadata.q test case
> --
>
> Key: HIVE-14941
> URL: https://issues.apache.org/jira/browse/HIVE-14941
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore
>Reporter: Prasanth Jayachandran
>
> stats_filemetadata.q is disabled in HIVE-14940 because of the slow initialization 
> time of the HBase metastore. We might have to add a new cli driver with 
> hbase as metastore and re-enable this test.





[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Status: Patch Available  (was: Open)

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch
>
>






[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
Status: Patch Available  (was: Open)

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 mins. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as default and, if required, 
> we can have a dedicated set of tests that run against the hbase metastore.





[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
Attachment: HIVE-14940.1.patch

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 mins. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as default and, if required, 
> we can have a dedicated set of tests that run against the hbase metastore.





[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Attachment: HIVE-12765.03.patch

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch
>
>






[jira] [Commented] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570298#comment-15570298
 ] 

Prasanth Jayachandran commented on HIVE-14938:
--

lgtm, +1

> Add deployed ptest properties file to repo, update to remove isolated tests
> ---
>
> Key: HIVE-14938
> URL: https://issues.apache.org/jira/browse/HIVE-14938
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14938.part1.patch, HIVE-14938.part2.patch
>
>
> The intent is to check in the original file, and then modify it to remove 
> isolated tests (and move relevant ones to the skipBatching list), which 
> normally lead to stragglers and sub-optimal resource utilization.





[jira] [Assigned] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-14940:


Assignee: Prasanth Jayachandran

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 mins. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as default and, if required, 
> we can have a dedicated set of tests that run against the hbase metastore.





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570292#comment-15570292
 ] 

Prasanth Jayachandran commented on HIVE-14940:
--

[~sseth] can you please take a look?

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 mins. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as default and, if required, 
> we can have a dedicated set of tests that run against the hbase metastore.





[jira] [Updated] (HIVE-12765) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12765:
---
Status: Open  (was: Patch Available)

rebase the patch

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all)
> ---
>
> Key: HIVE-12765
> URL: https://issues.apache.org/jira/browse/HIVE-12765
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12765.01.patch, HIVE-12765.02.patch, 
> HIVE-12765.03.patch
>
>






[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14925:
---
Assignee: Ratheesh Kamoor  (was: Rajesh Balamohan)

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Ratheesh Kamoor
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think this is caused by a major design flaw in how the thread pool 
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool that registers a Callable, but the Callable makes a 
> recursive call to checkPartitionDirs. This code will hang when the 
> number of directories exceeds the thread pool size.





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Ratheesh Kamoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570225#comment-15570225
 ] 

Ratheesh Kamoor commented on HIVE-14925:


Done. This is the first time I am using the RB tool; please let me know if I need to 
provide more info. Thx

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Rajesh Balamohan
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think this is caused by a major design flaw in how the thread pool 
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool that registers a Callable, but the Callable makes a 
> recursive call to checkPartitionDirs. This code will hang when the 
> number of directories exceeds the thread pool size.





[jira] [Commented] (HIVE-14876) make the number of rows to fetch from various HS2 clients/servers configurable

2016-10-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570219#comment-15570219
 ] 

Thejas M Nair commented on HIVE-14876:
--

bq. it's easy to decrease if one gets an OOM.
Figuring out the reason for an OOM is not easy, especially if you have many queries 
running against HS2.
Do you have any numbers on the performance difference between 1k and 10k fetch size?

cc [~gopalv] [~ziyangz]


> make the number of rows to fetch from various HS2 clients/servers configurable
> --
>
> Key: HIVE-14876
> URL: https://issues.apache.org/jira/browse/HIVE-14876
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14876.01.patch, HIVE-14876.patch
>
>
> Right now, it's hardcoded to a variety of values





[jira] [Commented] (HIVE-14928) Analyze table no scan mess up schema

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570212#comment-15570212
 ] 

Hive QA commented on HIVE-14928:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832900/HIVE-14928.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10558 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1513/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1513/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1513/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832900 - PreCommit-HIVE-Build

> Analyze table no scan mess up schema
> 
>
> Key: HIVE-14928
> URL: https://issues.apache.org/jira/browse/HIVE-14928
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: HIVE-14928.1.patch, HIVE-14928.2.patch
>
>
> StatsNoJobTask uses the static variables partUpdates and table to track stats 
> changes. If multiple analyze no scan tasks run at the same time, the 
> table/partition schema could get messed up.





[jira] [Commented] (HIVE-14876) make the number of rows to fetch from various HS2 clients/servers configurable

2016-10-12 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570138#comment-15570138
 ] 

Sergey Shelukhin commented on HIVE-14876:
-

The general one.

> make the number of rows to fetch from various HS2 clients/servers configurable
> --
>
> Key: HIVE-14876
> URL: https://issues.apache.org/jira/browse/HIVE-14876
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14876.01.patch, HIVE-14876.patch
>
>
> Right now, it's hardcoded to a variety of values





[jira] [Commented] (HIVE-14876) make the number of rows to fetch from various HS2 clients/servers configurable

2016-10-12 Thread Ziyang Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570127#comment-15570127
 ] 

Ziyang Zhao commented on HIVE-14876:


Sorry, is the 
"HIVE_SERVER2_RESULTSET_DEFAULT_FETCH_SIZE("hive.server2.resultset.default.fetch.size"...)"
in patch HIVE-14876.01.patch the same as 
"hive.server2.thrift.resultset.default.fetch.size" mentioned here? I mean, is 
this a config only for ThriftJDBCBinarySerDe or a general one?

> make the number of rows to fetch from various HS2 clients/servers configurable
> --
>
> Key: HIVE-14876
> URL: https://issues.apache.org/jira/browse/HIVE-14876
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14876.01.patch, HIVE-14876.patch
>
>
> Right now, it's hardcoded to a variety of values





[jira] [Assigned] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2016-10-12 Thread Ziyang Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ziyang Zhao reassigned HIVE-14901:
--

Assignee: Ziyang Zhao

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Ziyang Zhao
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 
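
The rule proposed above amounts to clamping the client's request to the configured maximum. A minimal sketch follows; FetchSize and effectiveRows are hypothetical names, and falling back to a default when the client sends a non-positive value is an assumption that the description does not cover.

```java
// Hypothetical helper; not part of HiveServer2.
class FetchSize {
    /**
     * Number of rows to serialize per blob: the client's requested size,
     * capped by the configured maximum
     * (hive.server2.thrift.resultset.max.fetch.size in the description).
     */
    static int effectiveRows(long requested, int defaultSize, int maxSize) {
        if (requested <= 0) {
            // Assumption: treat a missing or invalid request as "use the default".
            return Math.min(defaultSize, maxSize);
        }
        return (int) Math.min(requested, (long) maxSize);
    }
}
```

For example, with a maximum of 10000, a request for 500 rows yields 500 and a request for 50000 rows yields 10000.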





[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570075#comment-15570075
 ] 

Hive QA commented on HIVE-11394:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832970/HIVE-11394.091.patch

{color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 77 failed/errored test(s), 10530 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_adaptor_usage_mode]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count_distinct]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_partition_diff_num_cols]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_binary_join_groupby]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_2]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_distinct_2]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_grouping_sets]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_include_no_sel]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partition_diff_num_cols]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]

[jira] [Commented] (HIVE-14884) Test result cleanup before 2.1.1 release

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570057#comment-15570057
 ] 

Prasanth Jayachandran commented on HIVE-14884:
--

All schema evolution tests are failing. [~mmccline] should know more about
that.

> Test result cleanup before 2.1.1 release
> 
>
> Key: HIVE-14884
> URL: https://issues.apache.org/jira/browse/HIVE-14884
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14884-branch-2.1.patch, 
> HIVE-14884.2-branch-2.1.patch
>
>
> Multiple tests are failing on the 2.1 branch.
> Before releasing 2.1.1, it would be good to clean up this list.





[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14925:
---
Assignee: Rajesh Balamohan  (was: Pengcheng Xiong)

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Rajesh Balamohan
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled (the
> default). I think this is caused by a major design flaw in how the thread
> pool is used in the HiveMetaStoreChecker class's checkPartitionDirs method.
> The method submits Callables to a thread pool, but each Callable makes a
> recursive call to checkPartitionDirs. The code hangs when the number of
> directories exceeds the thread pool size.
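
The hang described above is the usual bounded-pool problem: if every worker thread is blocked waiting on a child task that is still queued behind it, no thread is left to run the child. The following is a minimal sketch of one way to avoid that, assuming a level-by-level walk in which only the calling thread waits on futures. It is not the attached patch; the class name, the walk method, and the in-memory map standing in for a directory listing are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not Hive's HiveMetaStoreChecker.
class LevelOrderDirWalk {

    // tree maps a directory to its subdirectories (stand-in for a file system listing).
    static List<String> walk(Map<String, List<String>> tree, String root, int poolSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<String> visited = new ArrayList<>();
        try {
            List<String> level = Collections.singletonList(root);
            while (!level.isEmpty()) {
                visited.addAll(level);
                // One task per directory in this level; a task never submits
                // or waits for another task, so pool threads cannot block each other.
                List<Callable<List<String>>> tasks = new ArrayList<>();
                for (String dir : level) {
                    tasks.add(() -> tree.getOrDefault(dir, Collections.emptyList()));
                }
                List<String> next = new ArrayList<>();
                // Only the calling thread waits here, outside the pool.
                for (Future<List<String>> f : pool.invokeAll(tasks)) {
                    next.addAll(f.get());
                }
                level = next;
            }
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("/t", Arrays.asList("/t/a", "/t/b", "/t/c"));
        tree.put("/t/a", Arrays.asList("/t/a/x", "/t/a/y"));
        // More directories than pool threads, yet the walk completes.
        System.out.println(walk(tree, "/t", 2));
    }
}
```

The design choice here is that recursion is replaced by iteration over levels, so the number of directories can exceed the pool size without any worker waiting on queued work.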





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570022#comment-15570022
 ] 

Pengcheng Xiong commented on HIVE-14925:


That was fast... I was planning to do this today. Could you create an RB for
it? Thanks.

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Pengcheng Xiong
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled (the
> default). I think this is caused by a major design flaw in how the thread
> pool is used in the HiveMetaStoreChecker class's checkPartitionDirs method.
> The method submits Callables to a thread pool, but each Callable makes a
> recursive call to checkPartitionDirs. The code hangs when the number of
> directories exceeds the thread pool size.





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569940#comment-15569940
 ] 

Prasanth Jayachandran commented on HIVE-14940:
--

cc. [~sseth]

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The
> actual runtime of the queries is typically much lower. To avoid the high
> overhead, we should switch back to the SQL metastore as the default and, if
> required, have a dedicated set of tests that run against the HBase metastore.





[jira] [Updated] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-12 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14799:
---
Attachment: HIVE-14799.6.patch

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, 
> HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, 
> HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.6.patch, HIVE-14799.patch
>
>
> When a query is cancelled either via Beeline (Ctrl-C) or the API call
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a
> different thread from the one running the query to close/destroy its
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe,
> which can sometimes result in runtime exceptions such as NPEs. Errors from
> the running query are also not handled properly, so some resources (files,
> locks, etc.) are probably not cleaned up after the query terminates.
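
To illustrate the race described above, here is a minimal sketch, assuming a single lock guards the operation state so that the query thread and the cancelling thread cannot both tear down the same resources. It is not the attached patch and does not use Hive's SQLOperation or Driver classes; every name in it is hypothetical.

```java
// Hypothetical sketch of a cancel/finish state guard; not Hive's SQLOperation.
class CancellableOperationSketch {
    enum State { RUNNING, FINISHED, CANCELED }

    private State state = State.RUNNING;
    private int cleanupCount = 0;

    // Called by the thread that runs the query when it completes.
    synchronized boolean finish() {
        if (state != State.RUNNING) {
            return false; // already cancelled elsewhere; nothing left to release
        }
        state = State.FINISHED;
        releaseResources();
        return true;
    }

    // Called by a different thread, for example a cancel request handler.
    synchronized boolean cancel() {
        if (state != State.RUNNING) {
            return false; // already finished or cancelled
        }
        state = State.CANCELED;
        releaseResources();
        return true;
    }

    // Stand-in for closing files, releasing locks, and destroying the driver.
    private void releaseResources() {
        cleanupCount++;
    }

    synchronized State state() { return state; }

    synchronized int cleanupCount() { return cleanupCount; }

    public static void main(String[] args) {
        CancellableOperationSketch op = new CancellableOperationSketch();
        System.out.println("cancel accepted: " + op.cancel());
        System.out.println("finish accepted: " + op.finish());
        System.out.println("cleanups run: " + op.cleanupCount());
    }
}
```

Whichever thread reaches the guard first performs the cleanup, and the other call becomes a no-op, so resources are released exactly once.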





[jira] [Updated] (HIVE-14926) Keep Schema in consistent state where schemaTool fails or succeeds.

2016-10-12 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14926:

Attachment: (was: HIVE-14926.1.patch)

> Keep Schema in consistent state where schemaTool fails or succeeds.  
> -
>
> Key: HIVE-14926
> URL: https://issues.apache.org/jira/browse/HIVE-14926
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> SchemaTool currently uses autocommit when executing the upgrade or init
> scripts. It seems we should use a database transaction and commit or roll
> back so that the schema stays consistent.





[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications

2016-10-12 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569884#comment-15569884
 ] 

Mohit Sabharwal commented on HIVE-13966:


Thanks [~alangates]! 

[~rahul9269] is not available to work on this patch, so one of us can take it
over. Happy to take it over if you'd like.

A couple of quick comments:

1) It looks like the changes to AlterHandler (and HiveAlterHandler) are not
really needed? The listeners are already invoked in
HMSHandler.alter_table_core (after the alterHandler.alterTable call), so the
invocations in HiveAlterHandler seem to be duplicates.

2) Some cleanup items, such as the many extra imports in HiveMetaStore.java
and the location of the Apache license in DummyTransactionalListener.

> DbNotificationListener: can loose DDL operation notifications
> -
>
> Key: HIVE-13966
> URL: https://issues.apache.org/jira/browse/HIVE-13966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Nachiket Vaidya
>Assignee: Rahul Sharma
>Priority: Critical
> Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, 
> HIVE-13966.3.patch, HIVE-13966.pdf
>
>
> The code for each API in HiveMetaStore.java is like this:
> 1. openTransaction()
> 2. -- operation--
> 3. commit() or rollback() based on result of the operation.
> 4. add entry to notification log (unconditionally)
> If the operation fails (in step 2), we still add an entry to the
> notification log. Found this issue in testing.
> This is still acceptable, as it is a false positive.
> If the operation succeeds but adding to the notification log fails, the
> user gets a MetaException. The operation is not rolled back, as it is
> already committed. We need to handle this case so that we do not have false
> negatives.
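
The four steps above can be contrasted with a variant in which the notification entry is written inside the same transaction as the operation, so that both commit or both roll back. The following is a minimal sketch under that assumption. It uses a hypothetical in-memory store rather than Hive's metastore API, and it is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an in-memory store, not Hive's RawStore or HMSHandler.
class TxnNotificationSketch {

    static class Store {
        final List<String> tables = new ArrayList<>();
        final List<String> notificationLog = new ArrayList<>();
        private List<String> pendingTables;
        private List<String> pendingLog;

        void openTransaction() {
            // Work on copies so that a rollback can simply discard them.
            pendingTables = new ArrayList<>(tables);
            pendingLog = new ArrayList<>(notificationLog);
        }

        void commit() {
            tables.clear();
            tables.addAll(pendingTables);
            notificationLog.clear();
            notificationLog.addAll(pendingLog);
            pendingTables = null;
            pendingLog = null;
        }

        void rollback() {
            pendingTables = null;
            pendingLog = null;
        }

        void addTable(String name) {
            if (pendingTables.contains(name)) {
                throw new IllegalStateException("table exists: " + name);
            }
            pendingTables.add(name);
        }

        void addNotification(String event) {
            pendingLog.add(event);
        }
    }

    // The operation and its notification share one transaction.
    static boolean createTable(Store store, String name) {
        store.openTransaction();
        try {
            store.addTable(name);
            store.addNotification("CREATE_TABLE " + name);
            store.commit();
            return true;
        } catch (RuntimeException e) {
            store.rollback();
            return false;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        System.out.println(createTable(store, "t1"));
        System.out.println(createTable(store, "t1"));
        System.out.println(store.notificationLog);
    }
}
```

With this ordering a failed operation leaves no notification entry (no false positive), and a failed notification write rolls the operation back (no false negative).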





[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-12 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14921:
-
Attachment: HIVE-14921.2.patch

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch
>
>
> Continuation to HIVE-14877





[jira] [Updated] (HIVE-14933) include argparse with LLAP scripts to support antique Python versions

2016-10-12 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14933:

Attachment: HIVE-14933.01.patch

Added the license. Not sure if it's the right way to add it... do we have 
someone who can comment on that?

> include argparse with LLAP scripts to support antique Python versions
> -
>
> Key: HIVE-14933
> URL: https://issues.apache.org/jira/browse/HIVE-14933
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14933.01.patch, HIVE-14933.patch
>
>
> The module is a standalone file, and it's under Python license that is 
> compatible with Apache. In the long term we should probably just move 
> LlapServiceDriver code entirely to Java, as right now it's a combination of 
> part-py, part-java.





[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.10

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569901#comment-15569901
 ] 

Hive QA commented on HIVE-13316:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832847/HIVE-13316.05.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1511/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1511/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1511/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/llap-tez/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-llap-tez/2.2.0-SNAPSHOT/hive-llap-tez-2.2.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Spark Remote Client 2.2.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-client ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client (includes = 
[datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
spark-client ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-client 
---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ spark-client ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client 
---
[INFO] Compiling 28 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java:
 Some input files use or override a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Some input files use unchecked or unsafe operations.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
 [copy] Copying 15 files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] 

[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569892#comment-15569892
 ] 

Hive QA commented on HIVE-14887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832844/HIVE-14887.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10631 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1510/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1510/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1510/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832844 - PreCommit-HIVE-Build

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch
>
>
> The clusters that we spin up end up requiring 16GB at times. Also the maven 
> arguments seem a little heavy weight.
> Reducing this will allow for additional ptest drones per box, which should 
> bring down the runtime.





[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications

2016-10-12 Thread Sravya Tirukkovalur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569879#comment-15569879
 ] 

Sravya Tirukkovalur commented on HIVE-13966:


New approach seems good to me - not introducing a new interface + change in the 
for loop.


> DbNotificationListener: can loose DDL operation notifications
> -
>
> Key: HIVE-13966
> URL: https://issues.apache.org/jira/browse/HIVE-13966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Nachiket Vaidya
>Assignee: Rahul Sharma
>Priority: Critical
> Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, 
> HIVE-13966.3.patch, HIVE-13966.pdf
>
>
> The code for each API in HiveMetaStore.java is like this:
> 1. openTransaction()
> 2. -- operation--
> 3. commit() or rollback() based on result of the operation.
> 4. add entry to notification log (unconditionally)
> If the operation fails (in step 2), we still add an entry to the
> notification log. Found this issue in testing.
> This is still acceptable, as it is a false positive.
> If the operation succeeds but adding to the notification log fails, the
> user gets a MetaException. The operation is not rolled back, as it is
> already committed. We need to handle this case so that we do not have false
> negatives.





[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14872:
---
Fix Version/s: 2.2.0

> Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
> -
>
> Key: HIVE-14872
> URL: https://issues.apache.org/jira/browse/HIVE-14872
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch
>
>
> The main purpose of the configuration
> HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a
> lot of reserved keywords have been used as identifiers in previous
> releases. We have already had several releases with this configuration. Now,
> when I tried to add new set operators to the parser, ANTLR always
> complains "code too large". I think it is time to remove this
> configuration. (1) It will simplify the parser logic and greatly reduce the
> size of the generated parser code; (2) it leaves room for new features,
> especially those that require parser changes.





[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14872:
---
Affects Version/s: 2.1.0

> Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
> -
>
> Key: HIVE-14872
> URL: https://issues.apache.org/jira/browse/HIVE-14872
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch
>
>
> The main purpose of the configuration
> HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a
> lot of reserved keywords have been used as identifiers in previous
> releases. We have already had several releases with this configuration. Now,
> when I tried to add new set operators to the parser, ANTLR always
> complains "code too large". I think it is time to remove this
> configuration. (1) It will simplify the parser logic and greatly reduce the
> size of the generated parser code; (2) it leaves room for new features,
> especially those that require parser changes.





[jira] [Updated] (HIVE-14926) Keep Schema in consistent state where schemaTool fails or succeeds.

2016-10-12 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14926:

Attachment: HIVE-14926.1.patch

> Keep Schema in consistent state where schemaTool fails or succeeds.  
> -
>
> Key: HIVE-14926
> URL: https://issues.apache.org/jira/browse/HIVE-14926
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-14926.1.patch
>
>
> SchemaTool currently uses autocommit when executing the upgrade or init
> scripts. It seems we should use a database transaction and commit or roll
> back so that the schema stays consistent.





[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14872:
---
Component/s: Parser

> Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
> -
>
> Key: HIVE-14872
> URL: https://issues.apache.org/jira/browse/HIVE-14872
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch
>
>
> The main purpose of the configuration
> HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a
> lot of reserved keywords have been used as identifiers in previous
> releases. We have already had several releases with this configuration. Now,
> when I tried to add new set operators to the parser, ANTLR always
> complains "code too large". I think it is time to remove this
> configuration. (1) It will simplify the parser logic and greatly reduce the
> size of the generated parser code; (2) it leaves room for new features,
> especially those that require parser changes.





[jira] [Updated] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS

2016-10-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14872:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
> -
>
> Key: HIVE-14872
> URL: https://issues.apache.org/jira/browse/HIVE-14872
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch
>
>
> The main purpose of the configuration
> HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a
> lot of reserved keywords have been used as identifiers in previous
> releases. We have already had several releases with this configuration. Now,
> when I tried to add new set operators to the parser, ANTLR always
> complains "code too large". I think it is time to remove this
> configuration. (1) It will simplify the parser logic and greatly reduce the
> size of the generated parser code; (2) it leaves room for new features,
> especially those that require parser changes.





[jira] [Commented] (HIVE-14872) Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS

2016-10-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569871#comment-15569871
 ] 

Pengcheng Xiong commented on HIVE-14872:


Updated the golden file, double-checked that it passed, and pushed to master.
Thanks [~ashutoshc] for the review.

> Remove the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
> -
>
> Key: HIVE-14872
> URL: https://issues.apache.org/jira/browse/HIVE-14872
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14872.01.patch, HIVE-14872.02.patch
>
>
> The main purpose of the configuration
> HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS is backward compatibility, because a
> lot of reserved keywords have been used as identifiers in previous
> releases. We have already had several releases with this configuration. Now,
> when I tried to add new set operators to the parser, ANTLR always
> complains "code too large". I think it is time to remove this
> configuration. (1) It will simplify the parser logic and greatly reduce the
> size of the generated parser code; (2) it leaves room for new features,
> especially those that require parser changes.





[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-12 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14822:
---
Attachment: HIVE-14822.06.patch

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to have a separate
> encrypted password file from the one used by the job. HiveServer2 may have
> secrets that the job should not have, such as the metastore database password
> or the password to decrypt its private SSL certificate. It is also best
> practice to keep separate passwords in separate files. To facilitate this,
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This
> should already be uploaded to HDFS. (hive.server2.job.keystore.location or a
> better name) If this is not specified, then HS2 will simply use the value of
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2
> will simply use the standard environment variable for decrypting the Hadoop
> Credential Provider.
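
The property names and the fallback rule in the description above can be sketched as follows. This is a minimal illustration and not the attached patch: the class and method names are hypothetical, plain maps stand in for the process environment and the job configuration, and the KEY=value form used for the two MR2 properties is an assumption about how those environment lists are written. A real implementation would presumably also need to merge with any value already set on those properties.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; plain maps stand in for the environment and job conf.
class JobCredentialEnvSketch {
    static final String HADOOP_VAR = "HADOOP_CREDSTORE_PASSWORD";
    // Name proposed in the issue description ("or better").
    static final String JOB_VAR = "HIVE_JOB_KEYSTORE_PASSWORD";

    // Prefer the job-specific password; fall back to the standard variable.
    static String resolvePassword(Map<String, String> env) {
        String jobPassword = env.get(JOB_VAR);
        return jobPassword != null ? jobPassword : env.get(HADOOP_VAR);
    }

    // Returns the settings to add to the job configuration.
    static Map<String, String> jobSettings(Map<String, String> env, boolean spark) {
        Map<String, String> conf = new LinkedHashMap<>();
        String password = resolvePassword(env);
        if (password == null) {
            return conf; // nothing to forward
        }
        if (spark) {
            conf.put("spark.yarn.appMasterEnv." + HADOOP_VAR, password);
            conf.put("spark.executorEnv." + HADOOP_VAR, password);
        } else {
            // Assumption: these properties take KEY=value environment entries.
            String entry = HADOOP_VAR + "=" + password;
            conf.put("yarn.app.mapreduce.am.admin.user.env", entry);
            conf.put("mapreduce.admin.user.env", entry);
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put(HADOOP_VAR, "hs2-secret");
        env.put(JOB_VAR, "job-secret");
        System.out.println(jobSettings(env, true).keySet());
        System.out.println(jobSettings(env, false).keySet());
    }
}
```

In both branches the job sees the password under the standard variable name, so job-side code that reads the Hadoop Credential Provider would not need to know about the HiveServer2-specific variable.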





[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11394:

Attachment: HIVE-11394.091.patch

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (whether vectorization 
> is enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter 
> Vectorization.  It also includes all SUMMARY information.
> EXPRESSION shows vectorization information for expressions, e.g. 
> predicateExpression.  It also includes all SUMMARY and OPERATOR 
> information.
> DETAIL shows the most detailed vectorization information.
> It also includes all SUMMARY, OPERATOR, and EXPRESSION information.
> By default, ONLY is off and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: vectorized, llap
> Reduce Vectorization:
>
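
As a rough illustration of the EXPLAIN VECTORIZATION syntax described in the issue above, the sketch below builds the statement text and models the rule that each detail level includes the levels before it. The class, enum, and method names are hypothetical and are not part of Hive, and the query in main is a made-up example, not the query behind the plan quoted above.

```java
// Hypothetical helper for composing EXPLAIN VECTORIZATION statements; not Hive code.
class ExplainVectorization {
    // Detail levels in increasing order; per the description, each level
    // also includes the information of every level before it.
    enum Level {
        SUMMARY, OPERATOR, EXPRESSION, DETAIL;

        boolean includes(Level other) {
            return ordinal() >= other.ordinal();
        }
    }

    // Builds "EXPLAIN VECTORIZATION [ONLY] <level> <query>".
    // A null level falls back to SUMMARY, the documented default.
    static String statement(boolean only, Level level, String query) {
        Level effective = (level == null) ? Level.SUMMARY : level;
        StringBuilder sb = new StringBuilder("EXPLAIN VECTORIZATION");
        if (only) {
            sb.append(" ONLY");
        }
        sb.append(' ').append(effective.name());
        return sb.append(' ').append(query).toString();
    }

    public static void main(String[] args) {
        System.out.println(statement(true, Level.DETAIL, "SELECT COUNT(*) FROM some_table"));
        System.out.println(Level.EXPRESSION.includes(Level.OPERATOR));
    }
}
```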

[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11394:

Attachment: (was: HIVE-11394.091.patch)

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (whether vectorization 
> is enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter 
> Vectorization.  It also includes all SUMMARY information.
> EXPRESSION shows vectorization information for expressions, e.g. 
> predicateExpression.  It also includes all SUMMARY and OPERATOR 
> information.
> DETAIL shows the most detailed vectorization information.
> It also includes all SUMMARY, OPERATOR, and EXPRESSION information.
> By default, ONLY is off and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false", which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode, i.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution mode: vectorized, llap
> Reduce Vectorization:
> enabled: 

[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.10

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569767#comment-15569767
 ] 

Hive QA commented on HIVE-13316:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832847/HIVE-13316.05.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1509/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1509/

Messages:
{noformat}
 This message was trimmed, see log for full details 

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client 
---
[INFO] Compiling 28 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java:
 Some input files use or override a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Some input files use unchecked or unsafe operations.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
 [copy] Copying 15 files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.2.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/spark-client/2.2.0-SNAPSHOT/spark-client-2.2.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 2.2.0-SNAPSHOT
[INFO] 
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.pom
 (16 KB at 76.7 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/calcite/calcite/1.10.0/calcite-1.10.0.pom
 (36 

[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569754#comment-15569754
 ] 

Hive QA commented on HIVE-14921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832840/HIVE-14921.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10601 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_table_invalidate_column_stats]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[newline]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_merge10]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1508/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1508/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832840 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch
>
>
> Continuation to HIVE-14877





[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Ratheesh Kamoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569746#comment-15569746
 ] 

Ratheesh Kamoor commented on HIVE-14925:


[~pxiong] I moved the logic from the inline Callable to an external class so that 
the code can be reused in both the multi-threaded and single-threaded scenarios. 
It also fixes the thread-lock issue. Could you please review? Tested against 
our tables with a very large number of partitions (5K+), and it worked fine. 

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Pengcheng Xiong
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think it is because of a major design flaw in how the thread 
> pool is used in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool which registers a Callable, but the Callable makes a 
> recursive call to the checkPartitionDirs method again. This code will hang when 
> the number of directories is more than the thread pool size. 
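
One way to avoid the hang described above is to make sure no pooled task ever waits on another task in the same pool. The sketch below is not the attached patch; it is a minimal illustration under stated assumptions: an in-memory Map stands in for the file system, and the class and method names are hypothetical. The tree is walked one level at a time, each task lists a single directory and returns, and only the calling thread waits on futures, so the walk completes even with a one-thread pool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; "children" maps a directory to its sub-directories.
class LevelWiseDirWalker {
    static List<String> leafDirs(Map<String, List<String>> children, String root, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> leaves = new ArrayList<>();
            List<String> level = Collections.singletonList(root);
            while (!level.isEmpty()) {
                List<Future<List<String>>> futures = new ArrayList<>();
                for (String dir : level) {
                    // Each task lists one directory and returns; it never
                    // submits further work or blocks on another task.
                    Callable<List<String>> listTask =
                            () -> children.getOrDefault(dir, Collections.<String>emptyList());
                    futures.add(pool.submit(listTask));
                }
                // Only the calling thread waits, so pool workers are never starved.
                List<String> next = new ArrayList<>();
                for (int i = 0; i < futures.size(); i++) {
                    List<String> subDirs = futures.get(i).get();
                    if (subDirs.isEmpty()) {
                        leaves.add(level.get(i));
                    } else {
                        next.addAll(subDirs);
                    }
                }
                level = next;
            }
            return leaves;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("tbl", Arrays.asList("tbl/p=1", "tbl/p=2"));
        tree.put("tbl/p=1", Arrays.asList("tbl/p=1/q=a", "tbl/p=1/q=b"));
        // A single worker thread is enough, even though there are more directories than threads.
        System.out.println(leafDirs(tree, "tbl", 1));
    }
}
```

A recursive design can also be made safe with a work-stealing pool such as ForkJoinPool; the level-by-level form is used here only because it makes the "no task waits on another task" property easy to see.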





[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Ratheesh Kamoor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratheesh Kamoor updated HIVE-14925:
---
Attachment: HIVE-14925.patch

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Pengcheng Xiong
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think it is because of a major design flaw in how the thread 
> pool is used in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool which registers a Callable, but the Callable makes a 
> recursive call to the checkPartitionDirs method again. This code will hang when 
> the number of directories is more than the thread pool size. 





[jira] [Updated] (HIVE-14925) MSCK repair table hang while running with multi threading enabled

2016-10-12 Thread Ratheesh Kamoor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratheesh Kamoor updated HIVE-14925:
---
Fix Version/s: 2.2.0
 Release Note: 
Issue: MSCK is failing in multithreaded execution

Solution:
  - Moved the path processor logic to an external class, which avoids code 
duplication; the class is used in both multi-threaded and single-threaded 
execution. 
   Status: Patch Available  (was: Open)

> MSCK repair table hang while running with multi threading enabled
> -
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Pengcheng Xiong
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs when run with multi-threading enabled 
> (the default). I think it is because of a major design flaw in how the thread 
> pool is used in the HiveMetaStoreChecker class / checkPartitionDirs method. This 
> method has a thread pool which registers a Callable, but the Callable makes a 
> recursive call to the checkPartitionDirs method again. This code will hang when 
> the number of directories is more than the thread pool size. 





[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569628#comment-15569628
 ] 

Hive QA commented on HIVE-11394:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832883/HIVE-11394.091.patch

{color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10601 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_adaptor_usage_mode]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_between_in]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_binary_join_groupby]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_cast_constant]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_char_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count_distinct]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_distinct_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_empty_where]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_include_no_sel]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_orderby_5]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_partition_diff_num_cols]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_reduce_groupby_decimal]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_string_concat]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_0]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_13]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_aggregate_9]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_binary_join_groupby]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_2]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_aggregate]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_precision]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_distinct_2]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby4]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby6]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_3]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_grouping_sets]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_include_no_sel]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_orderby_5]

[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-12 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14822:
---
Attachment: HIVE-14822.05.patch

Updating the patch with the changes suggested.

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a separate 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
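
The job-configuration changes listed in the description could look roughly like the sketch below. It is an illustration under stated assumptions, not the patch: a plain Map stands in for the job configuration, the class and method names are hypothetical, and the two MR2 properties are assumed to hold comma-separated NAME=VALUE lists, so the new entry is appended rather than overwriting what is already there.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; only the property names come from the issue description.
class JobCredentialEnv {
    static final String PASSWORD_ENV = "HADOOP_CREDSTORE_PASSWORD";

    // MR2: add the variable to both admin env lists named in the description.
    static void addForMapReduce(Map<String, String> jobConf, String password) {
        appendEnv(jobConf, "yarn.app.mapreduce.am.admin.user.env", password);
        appendEnv(jobConf, "mapreduce.admin.user.env", password);
    }

    // Spark: one property per process type, with the variable name as the key suffix.
    static void addForSpark(Map<String, String> jobConf, String password) {
        jobConf.put("spark.yarn.appMasterEnv." + PASSWORD_ENV, password);
        jobConf.put("spark.executorEnv." + PASSWORD_ENV, password);
    }

    // Appends NAME=VALUE to a comma-separated list, keeping any existing entries.
    private static void appendEnv(Map<String, String> jobConf, String key, String password) {
        String entry = PASSWORD_ENV + "=" + password;
        String existing = jobConf.get(key);
        boolean empty = existing == null || existing.isEmpty();
        jobConf.put(key, empty ? entry : existing + "," + entry);
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("mapreduce.admin.user.env", "EXISTING_VAR=value"); // made-up entry
        addForMapReduce(jobConf, "example-password");
        addForSpark(jobConf, "example-password");
        // Print only the keys, so the password itself is not echoed.
        System.out.println(jobConf.keySet());
    }
}
```

A password containing a comma would not survive the list format assumed here; a real implementation would need to reject or escape such values, and should also avoid logging these settings.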





[jira] [Updated] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications

2016-10-12 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-13966:
--
Assignee: Rahul Sharma  (was: Alan Gates)

> DbNotificationListener: can loose DDL operation notifications
> -
>
> Key: HIVE-13966
> URL: https://issues.apache.org/jira/browse/HIVE-13966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Nachiket Vaidya
>Assignee: Rahul Sharma
>Priority: Critical
> Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, 
> HIVE-13966.3.patch, HIVE-13966.pdf
>
>
> The code for each API in HiveMetaStore.java is like this:
> 1. openTransaction()
> 2. -- operation--
> 3. commit() or rollback() based on result of the operation.
> 4. add entry to notification log (unconditionally)
> If the operation fails (in step 2), we still add an entry to the notification 
> log. Found this issue in testing.
> This is still OK, as it is only a false positive.
> If the operation succeeds but adding to the notification log fails, the 
> user will get a MetaException. The operation will not be rolled back, as it is 
> already committed. We need to handle this case so that we do not have false 
> negatives.
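
One possible way to rule out both the false positive and the false negative is to write the notification entry inside the same transaction as the operation, so the two commit or roll back together. The sketch below only illustrates that idea and is not necessarily what the attached patches do; the toy in-memory Store and all names in it are hypothetical stand-ins for the metastore's transaction handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: operation and notification share one transaction.
class NotifyInsideTransaction {
    // Toy store with staged writes that become visible only on commit.
    static class Store {
        final List<String> data = new ArrayList<>();
        final List<String> notificationLog = new ArrayList<>();
        private final List<String> pendingData = new ArrayList<>();
        private final List<String> pendingNotifications = new ArrayList<>();

        void openTransaction() {
            pendingData.clear();
            pendingNotifications.clear();
        }

        void write(String row) {
            pendingData.add(row);
        }

        void addNotification(String event) {
            pendingNotifications.add(event);
        }

        void commit() {
            data.addAll(pendingData);
            notificationLog.addAll(pendingNotifications);
            openTransaction();
        }

        void rollback() {
            openTransaction();
        }
    }

    interface Step {
        void run(Store store) throws Exception;
    }

    // Runs the operation and its notification atomically: if either step fails,
    // neither the data change nor the log entry becomes visible.
    static boolean execute(Store store, Step operation, Step notification) {
        store.openTransaction();
        try {
            operation.run(store);
            notification.run(store);
            store.commit();
            return true;
        } catch (Exception e) {
            store.rollback();
            return false;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        boolean ok = execute(store, s -> s.write("table t1"), s -> s.addNotification("CREATE_TABLE t1"));
        System.out.println(ok + " " + store.data + " " + store.notificationLog);
    }
}
```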





[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications

2016-10-12 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569597#comment-15569597
 ] 

Alan Gates commented on HIVE-13966:
---

Assigned back to Rahul as I didn't intend to take over the JIRA, I just had to 
assign it to myself to upload a patch.

> DbNotificationListener: can loose DDL operation notifications
> -
>
> Key: HIVE-13966
> URL: https://issues.apache.org/jira/browse/HIVE-13966
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Nachiket Vaidya
>Assignee: Rahul Sharma
>Priority: Critical
> Attachments: HIVE-13966.1.patch, HIVE-13966.2.patch, 
> HIVE-13966.3.patch, HIVE-13966.pdf
>
>
> The code for each API in HiveMetaStore.java is like this:
> 1. openTransaction()
> 2. -- operation--
> 3. commit() or rollback() based on result of the operation.
> 4. add entry to notification log (unconditionally)
> If the operation fails (in step 2), we still add an entry to the notification 
> log. Found this issue in testing.
> This is still OK, as it is only a false positive.
> If the operation succeeds but adding to the notification log fails, the 
> user will get a MetaException. The operation will not be rolled back, as it is 
> already committed. We need to handle this case so that we do not have false 
> negatives.





[jira] [Commented] (HIVE-14906) HMS should support an API to get consistent atomic snapshot associated with a Notification ID.

2016-10-12 Thread Sravya Tirukkovalur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569504#comment-15569504
 ] 

Sravya Tirukkovalur commented on HIVE-14906:


Seems like if we do the following, we should be able to support an atomic 
getSnapshot() API:
- Set transaction level to "repeatable-read", so that all reads within a 
transaction would be from a single generation point. In other words, concurrent 
writes would not affect the state of the read.
- Make all the reads of the snapshot-building function part of the same transaction.
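
The two points above can be sketched with plain JDBC as follows. This is an assumption-laden illustration, not metastore code: the metastore's own persistence layer is not shown, the class, interface, and method names are hypothetical, and the connection is assumed to be idle (no transaction in progress) when the method is called. Whether repeatable-read gives a true single-point-in-time view depends on the backing database.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical sketch of "all snapshot reads in one repeatable-read transaction".
class SnapshotReads {
    // All queries that build the snapshot go through one Reader call.
    interface Reader<T> {
        T read(Connection conn) throws SQLException;
    }

    static <T> T readConsistently(Connection conn, Reader<T> reader) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        int oldIsolation = conn.getTransactionIsolation();
        // Set the isolation level before the transaction starts.
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
        conn.setAutoCommit(false);
        try {
            // Every read issued by the reader runs inside this single transaction.
            T snapshot = reader.read(conn);
            conn.commit();
            return snapshot;
        } catch (SQLException | RuntimeException e) {
            conn.rollback();
            throw e;
        } finally {
            // Restore the connection's previous settings for later callers.
            conn.setAutoCommit(oldAutoCommit);
            conn.setTransactionIsolation(oldIsolation);
        }
    }
}
```

A caller would pass a function that runs all of the snapshot queries on the supplied connection and also reads the current notification ID there, so that the ID and the data come from the same transaction.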

> HMS should support an API to get consistent atomic snapshot associated with a 
> Notification ID.
> --
>
> Key: HIVE-14906
> URL: https://issues.apache.org/jira/browse/HIVE-14906
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sravya Tirukkovalur
>






[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569478#comment-15569478
 ] 

Deepak Jaiswal commented on HIVE-14929:
---

Thanks Vaibhav.

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569473#comment-15569473
 ] 

Vaibhav Gumashta commented on HIVE-14929:
-

Patch looks good. I just saw the latest test report and it doesn't add any 
overhead. +1 from my side.

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-12 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Patch Available  (was: Open)

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569450#comment-15569450
 ] 

Thomas Poepping commented on HIVE-14373:


Have two +1s on RB. Awaiting precommit tests; then the patch should be good to go.

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, 
> HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure
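
A rough sketch of the "credentials in an uncommitted xml file" approach described above, assuming a Hadoop-style configuration file. The helper names and the idea of skipping when the file is absent are assumptions for illustration; fs.s3a.access.key and fs.s3a.secret.key are the standard Hadoop S3A property names, but the actual patch may use a different file layout or keys.

```python
import os
import xml.etree.ElementTree as ET


def load_s3_credentials(path):
    """Return {name: value} from a Hadoop-style XML file, or None if it is absent."""
    if not os.path.exists(path):
        return None  # no credentials supplied: the caller should skip the S3 tests
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name] = value
    return props


def should_run_s3_tests(path, required=("fs.s3a.access.key", "fs.s3a.secret.key")):
    # Run only when the file exists and every required key has a non-empty value.
    creds = load_s3_credentials(path)
    return creds is not None and all(creds.get(k) for k in required)
```

Because the file is never committed, an automated run such as HiveQA would see no credentials and skip the suite, while a committer with a local file would run it.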





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-12 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Open  (was: Patch Available)

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569441#comment-15569441
 ] 

Deepak Jaiswal commented on HIVE-14929:
---

Sure. I will refresh my code and try that.

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569439#comment-15569439
 ] 

Vaibhav Gumashta commented on HIVE-14929:
-

[~djaiswal] Never mind, it looks like the patch just had a fresh QA run. Please 
ignore my comment about rerunning. I'll take a look at the patch shortly.

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569432#comment-15569432
 ] 

Vaibhav Gumashta commented on HIVE-14929:
-

[~djaiswal] Can you submit again for a QA run? Some changes went into 
{{TestJdbcDriver2}} yesterday that brought the running time down to 
~60-70s. I want to be sure the new tests don't affect that in a major way. I'll 
also take a look at the patch shortly.

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Commented] (HIVE-14929) Adding JDBC test for query cancellation scenario

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569426#comment-15569426
 ] 

Hive QA commented on HIVE-14929:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832746/HIVE-14929.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 10640 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1506/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1506/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832746 - PreCommit-HIVE-Build

> Adding JDBC test for query cancellation scenario
> 
>
> Key: HIVE-14929
> URL: https://issues.apache.org/jira/browse/HIVE-14929
> Project: Hive
>  Issue Type: Test
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-14929.1.patch, HIVE-14929.2.patch
>
>
> There is some functional testing for query cancellation using JDBC which is 
> missing in unit tests.





[jira] [Commented] (HIVE-14835) Improve ptest2 build time

2016-10-12 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569418#comment-15569418
 ] 

Prasanth Jayachandran commented on HIVE-14835:
--

No. This patch is breaking ptest. Will apply it again when the queue is close 
to empty and will debug it further. 

> Improve ptest2 build time
> -
>
> Key: HIVE-14835
> URL: https://issues.apache.org/jira/browse/HIVE-14835
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14835.1.patch
>
>
> NO PRECOMMIT TESTS
> 2 things can be improved
> 1) ptest2 always downloads jars to compile its own directory, which takes 
> about 1m30s but should take only 5s with cached jars. The reason is that 
> maven.repo.local points to a path under WORKSPACE, which Jenkins cleans 
> on every run.
> 2) For the Hive build we can use a parallel build and quiet the build 
> output, which should shave off another 15-30s.
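
The two improvements above could be combined along these lines. This is a sketch under assumptions, not what the attached patch does: the build_maven_command helper and the cache directory are hypothetical, while -q (quiet), -T (parallel threads) and -Dmaven.repo.local are standard Maven options.

```python
import os


def build_maven_command(goals, cache_dir, quiet=True, threads="1C"):
    """Assemble a Maven argv that reuses a persistent local repository."""
    cmd = ["mvn"]
    if quiet:
        cmd.append("-q")        # print errors only
    if threads:
        cmd += ["-T", threads]  # parallel build, e.g. "1C" = one thread per core
    # Keep downloaded jars outside the per-run workspace so they survive cleanup.
    cmd.append("-Dmaven.repo.local=" + os.path.abspath(cache_dir))
    return cmd + list(goals)
```

For example, build_maven_command(["clean", "install", "-DskipTests"], "/var/cache/m2") yields an argv whose local repository lives outside the Jenkins WORKSPACE; the path shown is purely illustrative.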





[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Thomas Poepping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-14373:
---
Status: Patch Available  (was: Open)

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, 
> HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure





[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Thomas Poepping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-14373:
---
Attachment: HIVE-14373.06.patch

Attached a new patch that addresses the comments from RB.

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, 
> HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure





[jira] [Updated] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Thomas Poepping (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-14373:
---
Status: Open  (was: Patch Available)

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.06.patch, 
> HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure





[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-10-12 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569399#comment-15569399
 ] 

Thomas Poepping commented on HIVE-14373:


[~spena] I responded to your comments on RB. I would like to open a separate 
JIRA after the submission of this one that will change the qtests to run on Tez 
by default, rather than running on MR. What do you think?

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Thomas Poepping
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.05.patch, HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure





[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source

2016-10-12 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12458:

Fix Version/s: 2.2.0

> remove identity_udf.jar from source
> ---
>
> Key: HIVE-12458
> URL: https://issues.apache.org/jira/browse/HIVE-12458
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Thejas M Nair
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-12458.1.patch
>
>
> We should not be checking jars into the source repo.
> We could use the hive-contrib jar as it is used in 
> ./ql/src/test/queries/clientpositive/add_jar_pfile.q 
> add jar 
> pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar;





[jira] [Commented] (HIVE-14803) S3: Stats gathering for insert queries can be expensive for partitioned dataset

2016-10-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569355#comment-15569355
 ] 

Pengcheng Xiong commented on HIVE-14803:


Thanks [~sseth] for digging this out. [~rajesh.balamohan], it seems that this 
patch really does have a problem: the stats appear to be missing. In the 
explain plan, a row count of 29 rather than 500 for the src table usually 
means stats are missing. Could you take another look and upload a new patch? 
There is also a problem with the thread pool: people may set 
mv.files.thread=0, in which case the thread pool will be null. Thanks.
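
The thread pool concern can be illustrated with a small sketch. It is not Hive code: make_pool and run_all are hypothetical helpers, and falling back to running tasks in the caller's thread is just one assumed way of handling a thread count of 0.

```python
from concurrent.futures import ThreadPoolExecutor


def make_pool(thread_count):
    """Return an executor, or None when threading is disabled (count <= 0)."""
    return ThreadPoolExecutor(max_workers=thread_count) if thread_count > 0 else None


def run_all(tasks, thread_count):
    pool = make_pool(thread_count)
    if pool is None:
        # No pool configured: run each task in the caller's thread instead
        # of dereferencing a null pool.
        return [task() for task in tasks]
    with pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result() for f in futures]
```

Any code path that submits work has to go through the same null check, which is why a single helper like run_all is easier to keep correct than scattered checks.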

> S3: Stats gathering for insert queries can be expensive for partitioned 
> dataset
> ---
>
> Key: HIVE-14803
> URL: https://issues.apache.org/jira/browse/HIVE-14803
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14803.1.patch
>
>
> StatsTask's aggregateStats populates stats details for all partitions by 
> checking the file sizes, which turns out to be expensive when a large number of 
> partitions are inserted. 





[jira] [Updated] (HIVE-14933) include argparse with LLAP scripts to support antique Python versions

2016-10-12 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14933:

Status: Patch Available  (was: Open)

> include argparse with LLAP scripts to support antique Python versions
> -
>
> Key: HIVE-14933
> URL: https://issues.apache.org/jira/browse/HIVE-14933
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14933.patch
>
>
> The module is a standalone file, and it's under the Python license, which is 
> compatible with the Apache license. In the long term we should probably just 
> move the LlapServiceDriver code entirely to Java, as right now it's a 
> combination of part-py, part-java.





[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-12 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569322#comment-15569322
 ] 

Vineet Garg commented on HIVE-14913:


RB Link: https://reviews.apache.org/r/52708/

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source

2016-10-12 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569312#comment-15569312
 ] 

Thejas M Nair commented on HIVE-12458:
--

+1

> remove identity_udf.jar from source
> ---
>
> Key: HIVE-12458
> URL: https://issues.apache.org/jira/browse/HIVE-12458
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Thejas M Nair
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12458.1.patch
>
>
> We should not be checking jars into the source repo.
> We could use the hive-contrib jar as it is used in 
> ./ql/src/test/queries/clientpositive/add_jar_pfile.q 
> add jar 
> pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar;





[jira] [Commented] (HIVE-12458) remove identity_udf.jar from source

2016-10-12 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569295#comment-15569295
 ] 

Vaibhav Gumashta commented on HIVE-12458:
-

[~thejas] I've removed the code that used this jar (in tests) as part of the 
work on improving test cases. Can you review this?

> remove identity_udf.jar from source
> ---
>
> Key: HIVE-12458
> URL: https://issues.apache.org/jira/browse/HIVE-12458
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Thejas M Nair
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12458.1.patch
>
>
> We should not be checking jars into the source repo.
> We could use the hive-contrib jar as it is used in 
> ./ql/src/test/queries/clientpositive/add_jar_pfile.q 
> add jar 
> pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar;





[jira] [Updated] (HIVE-12458) remove identity_udf.jar from source

2016-10-12 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-12458:

Attachment: HIVE-12458.1.patch

> remove identity_udf.jar from source
> ---
>
> Key: HIVE-12458
> URL: https://issues.apache.org/jira/browse/HIVE-12458
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Thejas M Nair
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-12458.1.patch
>
>
> We should not be checking jars into the source repo.
> We could use the hive-contrib jar as it is used in 
> ./ql/src/test/queries/clientpositive/add_jar_pfile.q 
> add jar 
> pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar;





[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2

2016-10-12 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14761:

Affects Version/s: 2.1.0

> Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
> --
>
> Key: HIVE-14761
> URL: https://issues.apache.org/jira/browse/HIVE-14761
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-14761.1.patch
>
>
> Currently 2 min 30 sec





[jira] [Updated] (HIVE-14761) Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2

2016-10-12 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14761:


Committed to master. Thanks [~sseth].

> Remove TestJdbcWithMiniMr after merging tests with TestJdbcWithMiniHS2
> --
>
> Key: HIVE-14761
> URL: https://issues.apache.org/jira/browse/HIVE-14761
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-14761.1.patch
>
>
> Currently 2 min 30 sec





[jira] [Commented] (HIVE-14721) Fix TestJdbcWithMiniHS2 runtime

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569263#comment-15569263
 ] 

Hive QA commented on HIVE-14721:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832823/HIVE-14721.7.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1505/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1505/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-12 17:07:09.998
+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1505/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-12 17:07:10.000
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion 
steps (Ashutosh Chauhan via Pengcheng Xiong)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 04b303b HIVE-14922 : Add perf logging for post job completion 
steps (Ashutosh Chauhan via Pengcheng Xiong)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-12 17:07:11.083
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
error: 
a/itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/JdbcWithMiniKdcSQLAuthTest.java:
 No such file or directory
error: 
a/itests/hive-unit/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java: No 
such file or directory
error: 
a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java: 
No such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832823 - PreCommit-HIVE-Build

> Fix TestJdbcWithMiniHS2 runtime
> ---
>
> Key: HIVE-14721
> URL: https://issues.apache.org/jira/browse/HIVE-14721
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
> Attachments: HIVE-14721.1.patch, HIVE-14721.2.patch, 
> HIVE-14721.3.patch, HIVE-14721.3.patch, HIVE-14721.3.patch, 
> HIVE-14721.4.patch, HIVE-14721.4.patch, HIVE-14721.5.patch, 
> HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.6.patch, HIVE-14721.7.patch
>
>
> Currently 450s





[jira] [Commented] (HIVE-14799) Query operation are not thread safe during its cancellation

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569254#comment-15569254
 ] 

Hive QA commented on HIVE-14799:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832818/HIVE-14799.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10636 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testTaskStatus
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1504/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1504/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832818 - PreCommit-HIVE-Build

> Query operation are not thread safe during its cancellation
> ---
>
> Key: HIVE-14799
> URL: https://issues.apache.org/jira/browse/HIVE-14799
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-14799.1.patch, HIVE-14799.2.patch, 
> HIVE-14799.3.patch, HIVE-14799.4.patch, HIVE-14799.5.patch, 
> HIVE-14799.5.patch, HIVE-14799.6.patch, HIVE-14799.patch
>
>
> When a query is cancelled, either via Beeline (Ctrl-C) or the API call 
> TCLIService.Client.CancelOperation, SQLOperation.cancel is invoked in a 
> different thread from the one running the query to close/destroy its 
> encapsulated Driver object. Neither SQLOperation nor Driver is thread-safe, 
> which can sometimes result in runtime exceptions such as NPEs. Errors from 
> the running query are also not handled properly, so some resources (files, 
> locks, etc.) are probably not cleaned up after the query terminates.
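
One common way to make run and cancel safe to call from different threads is to guard the state transition with a lock so that cleanup happens exactly once. The sketch below shows that general technique only; the Operation class, its state names, and the cleanups counter are assumptions for illustration and do not reflect how SQLOperation or Driver are actually fixed in the patch.

```python
import threading


class Operation:
    """Toy operation whose run() and cancel() may be called from different threads."""

    TERMINAL = ("FINISHED", "CANCELED", "ERROR")

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "PENDING"
        self.cleanups = 0

    def _finish(self, new_state):
        # Only the first caller to leave a non-terminal state performs cleanup.
        with self._lock:
            if self._state in self.TERMINAL:
                return False
            self._state = new_state
            self.cleanups += 1  # stand-in for releasing files, locks, etc.
            return True

    def run(self, work):
        with self._lock:
            if self._state != "PENDING":
                return self._state  # already cancelled before it started
            self._state = "RUNNING"
        try:
            work()
            self._finish("FINISHED")
        except Exception:
            self._finish("ERROR")
        with self._lock:
            return self._state

    def cancel(self):
        return self._finish("CANCELED")
```

Whichever thread reaches _finish first wins; the other sees a terminal state and does nothing, so resources are neither leaked nor released twice.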





[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests

2016-10-12 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14938:
--
Attachment: HIVE-14938.part2.patch

Revision on top of the first patch with changes to remove isolation, add 
batching for spark tests and encryptedhdfs tests, skipBatching for others. This 
includes changes made by [~prasanth_j] and me for internal runs, to improve the 
runtimes.

[~prasanth_j], [~spena] - could you please take a look for sanity before I 
commit these changes and update the deployed ptest instance?

> Add deployed ptest properties file to repo, update to remove isolated tests
> ---
>
> Key: HIVE-14938
> URL: https://issues.apache.org/jira/browse/HIVE-14938
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14938.part1.patch, HIVE-14938.part2.patch
>
>
> The intent is to check in the original file, and then modify it to remove 
> isolated tests (and move relevant ones to the skipBatching list), which 
> normally lead to stragglers and sub-optimal resource utilization.





[jira] [Updated] (HIVE-14938) Add deployed ptest properties file to repo, update to remove isolated tests

2016-10-12 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14938:
--
Attachment: HIVE-14938.part1.patch

Initial config file - the existing one being used.

> Add deployed ptest properties file to repo, update to remove isolated tests
> ---
>
> Key: HIVE-14938
> URL: https://issues.apache.org/jira/browse/HIVE-14938
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14938.part1.patch
>
>
> The intent is to check in the original file, and then modify it to remove 
> isolated tests (and move relevant ones to the skipBatching list), which 
> normally lead to stragglers and sub-optimal resource utilization.





[jira] [Assigned] (HIVE-14827) Micro benchmark for Parquet vectorized reader

2016-10-12 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-14827:
---

Assignee: Sahil Takiar

> Micro benchmark for Parquet vectorized reader
> -
>
> Key: HIVE-14827
> URL: https://issues.apache.org/jira/browse/HIVE-14827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Sahil Takiar
>
> We need a microbenchmark to evaluate the throughput and execution time for 
> Parquet vectorized reader.





[jira] [Resolved] (HIVE-14539) Run additional tests from the module directory

2016-10-12 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-14539.
---
Resolution: Done

As part of HIVE-14540

> Run additional tests from the module directory
> --
>
> Key: HIVE-14539
> URL: https://issues.apache.org/jira/browse/HIVE-14539
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> There are still close to 400 tests that run from the wrong directory (and end 
> up checking for file changes on more modules than required).





[jira] [Commented] (HIVE-14835) Improve ptest2 build time

2016-10-12 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569154#comment-15569154
 ] 

Siddharth Seth commented on HIVE-14835:
---

[~prasanth_j] - did this go in again?

> Improve ptest2 build time
> -
>
> Key: HIVE-14835
> URL: https://issues.apache.org/jira/browse/HIVE-14835
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14835.1.patch
>
>
> NO PRECOMMIT TESTS
> Two things can be improved:
> 1) ptest2 always downloads jars for compiling its own directory, which takes 
> about 1m30s but should take only 5s with cached jars. The reason is that 
> maven.repo.local points to a path under WORKSPACE, which is cleaned 
> by Jenkins on every run.
> 2) For the Hive build we can use a parallel build and quiet the build 
> output, which should shave off another 15-30s.
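
The two improvements described above can be sketched as a small command builder. This is only an illustration, not the change made in HIVE-14835.1.patch: the helper name maven_command and the cache directory are hypothetical. The Maven options it emits (-q for quiet output, -T for a parallel build, and the maven.repo.local property) are standard Maven command-line options.

```python
import os


def maven_command(goals, local_repo, threads="1C", quiet=True):
    """Build an mvn argument list that reuses a persistent local repository.

    Hypothetical helper for illustration; not part of ptest2.
    """
    cmd = ["mvn"]
    if quiet:
        cmd.append("-q")            # quiet mode: suppress non-error build output
    if threads:
        cmd += ["-T", threads]      # parallel build, "1C" means one thread per core
    # Keep the local repository outside the Jenkins WORKSPACE so that the
    # downloaded jars survive the per-run workspace cleanup.
    cmd.append("-Dmaven.repo.local=" + local_repo)
    cmd += list(goals)
    return cmd


# Assumed cache location; any directory that Jenkins does not wipe would do.
cache_dir = os.path.join(os.path.expanduser("~"), ".m2", "ptest-repository")
print(" ".join(maven_command(["clean", "install", "-DskipTests"], cache_dir)))
```

Building the argument list separately from running it keeps the sketch easy to check; the list could be handed to subprocess.run as-is.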





[jira] [Comment Edited] (HIVE-14835) Improve ptest2 build time

2016-10-12 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569154#comment-15569154
 ] 

Siddharth Seth edited comment on HIVE-14835 at 10/12/16 4:20 PM:
-

[~prasanth_j] - did this go in again? Can the jira be closed?


was (Author: sseth):
[~prasanth_j] - did this go in again?

> Improve ptest2 build time
> -
>
> Key: HIVE-14835
> URL: https://issues.apache.org/jira/browse/HIVE-14835
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14835.1.patch
>
>
> NO PRECOMMIT TESTS
> Two things can be improved:
> 1) ptest2 always downloads jars for compiling its own directory, which takes 
> about 1m30s but should take only 5s with cached jars. The reason is that 
> maven.repo.local points to a path under WORKSPACE, which is cleaned 
> by Jenkins on every run.
> 2) For the Hive build we can use a parallel build and quiet the build 
> output, which should shave off another 15-30s.





[jira] [Commented] (HIVE-11957) SHOW TRANSACTIONS should show queryID/agent id of the creator

2016-10-12 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569142#comment-15569142
 ] 

Wei Zheng commented on HIVE-11957:
--

[~ekoifman] Can you take a look?
SHOW TRANSACTIONS now produces output like this:
{code}
hive> show transactions;
OK
Transaction ID  Transaction State  Started Time                  Last Heartbeat Time           User    Hostname
16              OPEN               Mon Oct 10 11:26:14 PDT 2016  Mon Oct 10 11:26:14 PDT 2016  wzheng  weimac.local
Time taken: 0.028 seconds, Fetched: 2 row(s)
{code}

> SHOW TRANSACTIONS should show queryID/agent id of the creator
> -
>
> Key: HIVE-11957
> URL: https://issues.apache.org/jira/browse/HIVE-11957
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Attachments: HIVE-11957.1.patch, HIVE-11957.2.patch, 
> HIVE-11957.3.patch, HIVE-11957.4.patch, HIVE-11957.5.patch
>
>
> this would be very useful for debugging
> should also include heartbeat/create timestamps
> would be nice to support some filtering/sorting options, like sort by create 
> time, agent id. filter by table, database, etc





[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-12 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569135#comment-15569135
 ] 

Ashutosh Chauhan commented on HIVE-14913:
-

[~vgarg] Can you add RB link for this?

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Commented] (HIVE-14916) Reduce the memory requirements for Spark tests

2016-10-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569113#comment-15569113
 ] 

Hive QA commented on HIVE-14916:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832814/HIVE-14916.003.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 10636 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1503/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1503/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832814 - PreCommit-HIVE-Build

> Reduce the memory requirements for Spark tests
> --
>
> Key: HIVE-14916
> URL: https://issues.apache.org/jira/browse/HIVE-14916
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Dapeng Sun
> Attachments: HIVE-14916.001.patch, HIVE-14916.002.patch, 
> HIVE-14916.003.patch
>
>
> As with HIVE-14887, we need to reduce the memory requirements for Spark tests.




