[jira] [Commented] (HIVE-14688) Hive drop call fails in presence of TDE

2017-07-05 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075976#comment-16075976
 ] 

Wei Zheng commented on HIVE-14688:
--

[~thejas] Looks like HIVE-11418, which solves the same problem, was recently 
committed to master, so this ticket becomes a duplicate of HIVE-11418. It 
won't work for older 2.x hadoop versions, as was also discussed in HIVE-11418.

> Hive drop call fails in presence of TDE
> ---
>
> Key: HIVE-14688
> URL: https://issues.apache.org/jira/browse/HIVE-14688
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Deepesh Khandelwal
>Assignee: Wei Zheng
> Attachments: HIVE-14688.1.patch, HIVE-14688.2.patch, 
> HIVE-14688.3.patch, HIVE-14688.4.patch
>
>
> This should be committed when Hive moves to Hadoop 2.8.
> In Hadoop 2.8.0, TDE trash collection was fixed through HDFS-8831. This 
> enables drop table calls for Hive managed tables whose Hive metastore 
> warehouse directory is in an encryption zone. However, even with the 
> feature in HDFS, Hive drop table currently fails:
> {noformat}
> $ hdfs crypto -listZones
> /apps/hive/warehouse  key2 
> $ hdfs dfs -ls /apps/hive/warehouse
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2016-09-01 02:54 
> /apps/hive/warehouse/.Trash
> hive> create table abc(a string, b int);
> OK
> Time taken: 5.538 seconds
> hive> dfs -ls /apps/hive/warehouse;
> Found 2 items
> drwxrwxrwt   - hdfs   hdfs  0 2016-09-01 02:54 
> /apps/hive/warehouse/.Trash
> drwxrwxrwx   - deepesh hdfs  0 2016-09-01 17:15 
> /apps/hive/warehouse/abc
> hive> drop table if exists abc;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to drop 
> default.abc because it is in an encryption zone and trash is enabled.  Use 
> PURGE option to skip trash.)
> {noformat}
> The problem lies here:
> {code:title=metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java}
> private void checkTrashPurgeCombination(Path pathToData, String objectName,
>     boolean ifPurge)
> ...
>   if (trashEnabled) {
>     try {
>       HadoopShims.HdfsEncryptionShim shim =
>           ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf);
>       if (shim.isPathEncrypted(pathToData)) {
>         throw new MetaException("Unable to drop " + objectName + " because it is in an encryption zone" +
>             " and trash is enabled.  Use PURGE option to skip trash.");
>       }
>     } catch (IOException ex) {
>       MetaException e = new MetaException(ex.getMessage());
>       e.initCause(ex);
>       throw e;
>     }
>   }
> {code}
> As we can see, the code assumes that a delete in an encryption zone would 
> not succeed. We need to modify this logic.
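> A hedged sketch of how this check could be relaxed once HDFS-8831 semantics 
> are available ({{trashMoveWouldSucceed}} is a hypothetical helper, not part 
> of the actual patch):
> {code}
> if (trashEnabled) {
>   try {
>     HadoopShims.HdfsEncryptionShim shim =
>         ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf);
>     // Refuse the drop only when the trash move itself cannot succeed,
>     // rather than for every path inside an encryption zone.
>     if (shim.isPathEncrypted(pathToData) && !trashMoveWouldSucceed(pathToData, hiveConf)) {
>       throw new MetaException("Unable to drop " + objectName
>           + " because no usable trash location exists inside its encryption zone."
>           + "  Use PURGE option to skip trash.");
>     }
>   } catch (IOException ex) {
>     MetaException e = new MetaException(ex.getMessage());
>     e.initCause(ex);
>     throw e;
>   }
> }
> {code}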





[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075963#comment-16075963
 ] 

Hive QA commented on HIVE-16993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875836/HIVE-17008.8.patch

{color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importAll (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneDb (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneFunc 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneRole 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTableNonPartitioned
 (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTablePartitioned
 (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importSecurity 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importTablesWithConstraints
 (batchId=208)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5903/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5903/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5903/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875836 - PreCommit-HIVE-Build

> ThriftHiveMetastore.create_database can fail if the locationUri is not set
> --
>
> Key: HIVE-16993
> URL: https://issues.apache.org/jira/browse/HIVE-16993
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, 
> HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, 
> HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch, HIVE-17008.8.patch
>
>
> Calling 
> [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078]
>  with a database whose {{locationUri}} field is unset, through the C++ 
> implementation, fails with:
> {code}
> MetaException(message=java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
> {code}
> The 
> [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270]
>  Thrift field is 'default requiredness (implicit)', and Thrift [does not 
> specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] 
> whether unset default requiredness fields are encoded.  Empirically, the Java 
> generated code [does not write the 
> {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942]
>  when the field is unset, while the C++ generated code 
> [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890].
> The MetaStore treats the field as optional, and [fills in a default 
> value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871]
>  if the field is unset.
> The end result is that when the C++ implementation sends a {{Database}} 
> without the field set, it actually writes an empty string; the MetaStore 
> treats it as a set (non-null) field and then calls a {{Path}} API that 
> rejects the empty string.  The fix is simple: make the {{locationUri}} 
> field optional.
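> As a hedged illustration of a server-side guard ({{wh}} is the metastore's 
> {{Warehouse}} helper; this is a sketch, not necessarily the committed fix), 
> an empty {{locationUri}} could be normalized like a missing one:
> {code}
> // Treat an empty locationUri from a client the same as an unset one,
> // instead of handing "" to new Path("").
> if (db.getLocationUri() == null || db.getLocationUri().trim().isEmpty()) {
>   db.setLocationUri(wh.getDefaultDatabasePath(db.getName()).toString());
> }
> {code}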

[jira] [Commented] (HIVE-10495) Hive index creation code throws NPE if index table is null

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075910#comment-16075910
 ] 

Hive QA commented on HIVE-10495:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12755107/HIVE-10495.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5902/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5902/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12755107 - PreCommit-HIVE-Build

> Hive index creation code throws NPE if index table is null
> --
>
> Key: HIVE-10495
> URL: https://issues.apache.org/jira/browse/HIVE-10495
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch
>
>
> The stack trace would be:
> {noformat}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
> at java.lang.reflect.Method.invoke(Method.java:611)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
> at $Proxy9.add_index(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962)
> {noformat}
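> Since the NPE surfaces inside {{HMSHandler.add_index}}, a hedged sketch of 
> the kind of guard that would turn it into a proper error (the variable 
> names are assumptions, not the actual patch):
> {code}
> // Hypothetical validation before the index table is dereferenced:
> if (indexTable == null) {
>   throw new InvalidObjectException(
>       "Index table is not set for index " + index.getIndexName());
> }
> {code}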





[jira] [Commented] (HIVE-17018) Small table is converted to map join even the total size of small tables exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)

2017-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075871#comment-16075871
 ] 

Chao Sun commented on HIVE-17018:
-

[~kellyzly] I'm not sure I understand what you described. Can you come up 
with a small example query that demonstrates the problem? Thanks.

> Small table is converted to map join even the total size of small tables 
> exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)
> -
>
> Key: HIVE-17018
> URL: https://issues.apache.org/jira/browse/HIVE-17018
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
>
> We use "hive.auto.convert.join.noconditionaltask.size" as the threshold: if 
> the sum of the sizes of n-1 of the tables/partitions in an n-way join is 
> smaller than it, the join is converted to a map join. For example, take A 
> join B join C join D join E, where the big table is A (100M) and the small 
> tables are B (10M), C (10M), D (10M), and E (10M). If we set 
> hive.auto.convert.join.noconditionaltask.size=20M, the current code converts 
> E, D, and B to map joins, but not C. In my understanding, because 
> hive.auto.convert.join.noconditionaltask.size can only accommodate E and D, 
> neither C nor B should be converted to a map join.
> Let's look more closely at why B can be converted to a map join.
> In the current code, 
> [SparkMapJoinOptimizer#getConnectedMapJoinSize|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L364]
>  calculates all the map joins in the parent and child paths. The search 
> stops when encountering a [UnionOperator or 
> ReduceSinkOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L381].
>  C is not converted to a map join because {{(connectedMapJoinSize + 
> totalSize) > maxSize}} [see 
> code|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L330], 
> so the RS before C's join remains. When calculating whether B will be 
> converted to a map join, {{getConnectedMapJoinSize}} returns 0 when it 
> encounters that 
> [RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L409]
> , and so {{(connectedMapJoinSize + totalSize) < maxSize}} holds.
> [~xuefuz] or [~jxiang]: can you help determine whether this is a bug? You 
> are more familiar with SparkMapJoinOptimizer.





[jira] [Updated] (HIVE-17049) hive doesn't support chinese comments for columns

2017-07-05 Thread liugaopeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liugaopeng updated HIVE-17049:
--
Description: 
1.  alter table stg.test_chinese change chinesetitle chinesetitle tinyint 
comment '中文';
2.  desc stg.test_chinese;
Result: chinese comment "中文" becase "??"

also, if i modify the comment via hive view, it will still display the messy 
code "??".

I did some testing, but cannot fix it, such as:
1. change the hive.COLUMNS_V2 to UTF-8 chartset.
2. append the characterEncoding=UTF-8 to hive_to_mysqlmetadata url 

i found some ideas that need to apply some patch to fix it, but seems they all 
effects in 0.x version, i use the 1.2.1 version.

Please give some guidance.

  was:
1.  alter table stg.test_chinese change chinesetitle chinesetitle tinyint 
comment '中文';
2.  desc stg.test_chinese;
Result: chinese comment "中文" becase "??"

also, if i modify the comment via hive view, it will still display the messy 
code "??".

I did some testing, but cannot fix it, such as:
1. change the hive.COLUMNS_V2 to UTF-8 chartset.
2. append the characterEncoding=UTF-8 to hive_to_mysqlmetadata url 

i found some ideas that need to apply some patch to fix it, but seems they all 
effects in 0.x version, i use the 1.2.1 version.

Please give some guidence.


> hive doesn't support chinese comments for columns
> -
>
> Key: HIVE-17049
> URL: https://issues.apache.org/jira/browse/HIVE-17049
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 1.2.1
> Environment: hive 1.2.1 in HDP 
>Reporter: liugaopeng
>
> 1.  alter table stg.test_chinese change chinesetitle chinesetitle tinyint 
> comment '中文';
> 2.  desc stg.test_chinese;
> Result: the Chinese comment "中文" became "??"
> Also, if I modify the comment via a Hive view, it still displays the garbled 
> text "??".
> I did some testing but could not fix it, for example:
> 1. changing hive.COLUMNS_V2 to the UTF-8 charset.
> 2. appending characterEncoding=UTF-8 to the Hive-to-MySQL metastore URL.
> I found some suggestions that applying certain patches might fix it, but 
> they all seem to target 0.x versions; I use version 1.2.1.
> Please give some guidance.
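> As a hedged illustration of the two workarounds tried above (object and 
> column names depend on your metastore schema and are assumptions here), the 
> usual shape of the fix is:
> {code}
> // 1) Convert the metastore comment column to UTF-8 (run against MySQL):
> //    ALTER TABLE COLUMNS_V2 MODIFY COLUMN `COMMENT` varchar(256)
> //      CHARACTER SET utf8;
> // 2) Request UTF-8 on the metastore JDBC connection URL:
> String url = "jdbc:mysql://metastore-host:3306/hive"
>     + "?useUnicode=true&characterEncoding=UTF-8";
> {code}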





[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075863#comment-16075863
 ] 

Hive QA commented on HIVE-16832:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875831/HIVE-16832.18.patch

{color:green}SUCCESS:{color} +1 due to 12 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10847 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5901/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5901/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5901/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875831 - PreCommit-HIVE-Build

> duplicate ROW__ID possible in multi insert into transactional table
> ---
>
> Key: HIVE-16832
> URL: https://issues.apache.org/jira/browse/HIVE-16832
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, 
> HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, 
> HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, 
> HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, 
> HIVE-16832.16.patch, HIVE-16832.17.patch, HIVE-16832.18.patch
>
>






[jira] [Comment Edited] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075852#comment-16075852
 ] 

Chao Sun edited comment on HIVE-17010 at 7/6/17 3:27 AM:
-

Ah I see. Sometimes the stats estimation could generate negative values, in 
which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size. 
One case I observed previously:
{code}
not ((P1 or P2) or P3)
{code}
When no column stats are available, Hive simply divides the # of input rows 
by 2 for each predicate evaluation. Suppose the total input rows is 10; then 
{{P1}}, {{P2}} and {{P3}} will each yield 5. The {{or}} operator adds the 
values from both sides, so the expression {{((P1 or P2) or P3)}} generates 30 
rows. The {{not}} operator, on the other hand, subtracts the value of its 
associated expression from the total input rows. Therefore in the end you get 
{{10 - 30 = -20}}.
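To make the arithmetic concrete, here is a tiny model of those rules (an 
illustration only, not Hive's actual stats code; the exact way {{or}} 
combines estimates differs in detail):
{code}
// Simplified model: each predicate without column stats halves the input,
// "or" adds the estimates of its operands, "not" subtracts from the input.
long input = 10;
long p1 = input / 2, p2 = input / 2, p3 = input / 2;  // 5 each
long orExpr = p1 + p2 + p3;    // the nested "or" over-counts the input rows
long notExpr = input - orExpr; // negative estimate
System.out.println(notExpr);   // negative; Hive later clamps such values to Long.MAX_VALUE
{code}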

For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but 
either way should be fine.


was (Author: csun):
Ah I see. Sometimes the stats estimation could generate negative values, in 
which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size 
could be. One case I observed previously:
{code}
not ((P1 or P2) or P3)
{code}
When no column stats are available, Hive will simply divide the # of input rows 
by 2 for each predicate evaluation. Suppose the total input rows is 10, then 
{{P1}}, {{P2}} and {{P3}} will yield 5 respectively. Operator {{or}} adds value 
from both sides so the expression {{((P1 or P2) or P3)}} generates 30 rows. The 
operator {{not}}, on the other hand, will subtract the value of its associated 
expression from the total input rows. Therefore in the end you will get {{10 - 
30 = -20}}.

For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but 
either way should be fine.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfByteshttps://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of sibling of specified RS. We use Long type 
> and it happens overflow when the data is too big. After happening this 
> situation, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true, sparkMemoryAndCores.getSecond 
> is a dymamic value which is decided by spark runtime. For example, the value 
> of sparkMemoryAndCores.getSecond is 5 or 15 randomly. There is possibility 
> that the value may be 1. The may problem here is the overflow of addition of 
> Long type.  You can reproduce the overflow problem by following code
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Commented] (HIVE-14688) Hive drop call fails in presence of TDE

2017-07-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075862#comment-16075862
 ] 

Thejas M Nair commented on HIVE-14688:
--

[~wzheng]
HIVE-16402 has the changes to update the hadoop dependency to 2.8.0.
What would happen if Hive with this change is used against older 2.x hadoop 
versions? Hive is still supposed to work against older 2.x versions as well. 
If it results in another error from hadoop for the user, I think the change 
is fine.

cc [~spena]

> Hive drop call fails in presence of TDE
> ---
>
> Key: HIVE-14688
> URL: https://issues.apache.org/jira/browse/HIVE-14688
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Deepesh Khandelwal
>Assignee: Wei Zheng
> Attachments: HIVE-14688.1.patch, HIVE-14688.2.patch, 
> HIVE-14688.3.patch, HIVE-14688.4.patch
>
>
> This should be committed when Hive moves to Hadoop 2.8.
> In Hadoop 2.8.0, TDE trash collection was fixed through HDFS-8831. This 
> enables drop table calls for Hive managed tables whose Hive metastore 
> warehouse directory is in an encryption zone. However, even with the 
> feature in HDFS, Hive drop table currently fails:
> {noformat}
> $ hdfs crypto -listZones
> /apps/hive/warehouse  key2 
> $ hdfs dfs -ls /apps/hive/warehouse
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2016-09-01 02:54 
> /apps/hive/warehouse/.Trash
> hive> create table abc(a string, b int);
> OK
> Time taken: 5.538 seconds
> hive> dfs -ls /apps/hive/warehouse;
> Found 2 items
> drwxrwxrwt   - hdfs   hdfs  0 2016-09-01 02:54 
> /apps/hive/warehouse/.Trash
> drwxrwxrwx   - deepesh hdfs  0 2016-09-01 17:15 
> /apps/hive/warehouse/abc
> hive> drop table if exists abc;
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to drop 
> default.abc because it is in an encryption zone and trash is enabled.  Use 
> PURGE option to skip trash.)
> {noformat}
> The problem lies here:
> {code:title=metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java}
> private void checkTrashPurgeCombination(Path pathToData, String objectName,
>     boolean ifPurge)
> ...
>   if (trashEnabled) {
>     try {
>       HadoopShims.HdfsEncryptionShim shim =
>           ShimLoader.getHadoopShims().createHdfsEncryptionShim(FileSystem.get(hiveConf), hiveConf);
>       if (shim.isPathEncrypted(pathToData)) {
>         throw new MetaException("Unable to drop " + objectName + " because it is in an encryption zone" +
>             " and trash is enabled.  Use PURGE option to skip trash.");
>       }
>     } catch (IOException ex) {
>       MetaException e = new MetaException(ex.getMessage());
>       e.initCause(ex);
>       throw e;
>     }
>   }
> {code}
> As we can see, the code assumes that a delete in an encryption zone would 
> not succeed. We need to modify this logic.





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075852#comment-16075852
 ] 

Chao Sun commented on HIVE-17010:
-

Ah I see. Sometimes the stats estimation could generate negative values, in 
which case Hive will use {{Long.MAX_VALUE}} for both # of rows and data size. 
One case I observed previously:
{code}
not ((P1 or P2) or P3)
{code}
When no column stats are available, Hive simply divides the # of input rows 
by 2 for each predicate evaluation. Suppose the total input rows is 10; then 
{{P1}}, {{P2}} and {{P3}} will each yield 5. The {{or}} operator adds the 
values from both sides, so the expression {{((P1 or P2) or P3)}} generates 30 
rows. The {{not}} operator, on the other hand, subtracts the value of its 
associated expression from the total input rows. Therefore in the end you get 
{{10 - 30 = -20}}.

For the solution you proposed, I'm inclined to use {{StatsUtils.safeAdd}}, but 
either way should be fine.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075839#comment-16075839
 ] 

liyunzhang_intel commented on HIVE-17010:
-

[~csun]: 
The explain plan of query17 without HIVE-17010.patch is in this 
[link|https://issues.apache.org/jira/secure/attachment/12875204/query17_explain.log].
 Reducer 3's data size is 9223372036854775807:
{code}
   Reducer 3 
Reduce Operator Tree:
  Join Operator
condition map:
 Inner Join 0 to 1
keys:
  0 _col28 (type: bigint), _col27 (type: bigint)
  1 cs_bill_customer_sk (type: bigint), cs_item_sk (type: 
bigint)
outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, 
_col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
Statistics: Num rows: 9223372036854775807 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
Reduce Output Operator
  key expressions: _col22 (type: bigint)
  sort order: +
  Map-reduce partition columns: _col22 (type: bigint)
  Statistics: Num rows: 9223372036854775807 Data size: 
9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
  value expressions: _col1 (type: bigint), _col2 (type: 
bigint), _col6 (type: bigint), _col8 (type: bigint), _col9 (type: int), _col27 
(type: bigint), _col28 (type: bigint), _col34 (type: bigint), _col35 (type: 
int), _col45 (type: bigint), _col51 (type: bigint), _col63 (type: bigint), 
_col66 (type: int), _col82 
{code}

Map 9's data size is 1022672:
{code}
Map 9 
Map Operator Tree:
TableScan
  alias: d1
  filterExpr: (d_date_sk is not null and (d_quarter_name = 
'2000Q1')) (type: boolean)
  Statistics: Num rows: 73049 Data size: 2045372 Basic stats: 
COMPLETE Column stats: NONE
  Filter Operator
predicate: (d_date_sk is not null and (d_quarter_name = 
'2000Q1')) (type: boolean)
Statistics: Num rows: 36524 Data size: 1022672 Basic stats: 
COMPLETE Column stats: NONE
Reduce Output Operator
  key expressions: d_date_sk (type: bigint)
  sort order: +
  Map-reduce partition columns: d_date_sk (type: bigint)
  Statistics: Num rows: 36524 Data size: 1022672 Basic 
stats: COMPLETE Column stats: NONE
{code}
There is a join of Map 9 and Reducer 3:
{code}
Reducer 4 <- Map 9 (PARTITION-LEVEL SORT, 1), Reducer 3 
(PARTITION-LEVEL SORT, 1)
{code}
The addition 9223372036854775807 + 1022672 causes the problem.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Commented] (HIVE-16100) Dynamic Sorted Partition optimizer loses sibling operators

2017-07-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075834#comment-16075834
 ] 

Ashutosh Chauhan commented on HIVE-16100:
-

[~gopalv] You may include a testcase from HIVE-17020 in this patch.

> Dynamic Sorted Partition optimizer loses sibling operators
> --
>
> Key: HIVE-16100
> URL: https://issues.apache.org/jira/browse/HIVE-16100
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.2.1, 2.1.1, 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-16100.1.patch, HIVE-16100.2.patch, 
> HIVE-16100.2.patch, HIVE-16100.3.patch
>
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java#L173
> {code}
>   // unlink connection between FS and its parent
>   fsParent = fsOp.getParentOperators().get(0);
>   fsParent.getChildOperators().clear();
> {code}
> The optimizer discards any cases where the fsParent has another SEL child 
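> A hedged sketch of a safer unlink ({{Operator.removeChild}} is an existing 
> API, but whether the fix uses it is an assumption):
> {code}
> // Detach only the FS branch instead of clearing every child, so a sibling
> // SEL under fsParent survives the rewrite.
> fsParent = fsOp.getParentOperators().get(0);
> fsParent.removeChild(fsOp);
> {code}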





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075826#comment-16075826
 ] 

Chao Sun commented on HIVE-17010:
-

With long you need ~9000 PB of {{numberOfBytes}} for the overflow to happen. 
It's interesting that this can occur with 3TB of input data. I'm just 
wondering if there's a bug in the code that causes this.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075824#comment-16075824
 ] 

liyunzhang_intel commented on HIVE-17010:
-

[~lirui], [~csun], [~ferd]: HIVE-17010.patch uses double to replace the long 
type to solve the problem.
A similar bug was found in HIVE-8689, which used 
[StatsUtils.safeAdd|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1626]
 to solve the problem. So which solution is better: 1. use double instead of 
long, or 2. use StatsUtils.safeAdd? Please give me your suggestion.
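For reference, a minimal sketch of the saturating-add behavior that 
{{StatsUtils.safeAdd}} provides (an illustration, not the Hive 
implementation):
{code}
// Returns a + b, saturating at Long.MAX_VALUE instead of wrapping around.
static long safeAdd(long a, long b) {
  try {
    return Math.addExact(a, b);  // throws ArithmeticException on overflow
  } catch (ArithmeticException e) {
    return Long.MAX_VALUE;
  }
}
{code}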

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-16922:
---
Attachment: (was: HIVE-16922.1.patch)

> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)





[jira] [Commented] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075817#comment-16075817
 ] 

Hive QA commented on HIVE-17047:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875826/HIVE-17047.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5900/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5900/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5900/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-07-06 02:32:03.136
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5900/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-07-06 02:32:03.140
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2a718a1 HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with 
just precision specified (Thomas Friedrich, reviewed by Gunther Hagleitner)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2a718a1 HIVE-10616: TypeInfoUtils doesn't handle DECIMAL with 
just precision specified (Thomas Friedrich, reviewed by Gunther Hagleitner)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-07-06 02:32:08.056
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java:236
error: ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java: patch 
does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875826 - PreCommit-HIVE-Build

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17047.1.patch
>
>
> To make FixedLengthInputFormat work in Hive, we need a table-specific value 
> for the configuration "fixedlengthinputformat.record.length". Right now the 
> best place would be a table property. Unfortunately, table properties are 
> not always populated to InputFormat configurations because of this in 
> HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}
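> A hedged sketch of the kind of change being asked for here 
> ({{fallbackTableDesc}} is hypothetical; 
> {{Utilities.copyTableJobPropertiesToConf}} is an existing helper):
> {code}
> // Fall back to a table-level TableDesc when no PartitionDesc matches the
> // split path, so table properties such as
> // fixedlengthinputformat.record.length still reach the InputFormat.
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> TableDesc table = (part != null) ? part.getTableDesc() : fallbackTableDesc;
> if (table != null) {
>   Utilities.copyTableJobPropertiesToConf(table, job);
> }
> {code}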





[jira] [Commented] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075815#comment-16075815
 ] 

Hive QA commented on HIVE-16974:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875812/HIVE-16974.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5899/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5899/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5899/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875812 - PreCommit-HIVE-Build

> Change the sort key for the schema tool validator to be 
> 
>
> Key: HIVE-16974
> URL: https://issues.apache.org/jira/browse/HIVE-16974
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-16974.patch, HIVE-16974.patch
>
>
> In HIVE-16729, we introduced ordering of the results/failures returned by 
> schematool's validators. This allows fault-injection testing to expect 
> results that can be verified. However, they were sorted on NAME values, 
> which in the HMS schema can be NULL. So if the introduced fault has a 
> NULL/blank name column value, the result could differ depending on the 
> backend database (whether it sorts NULLs first or last).
> So I think it is better to sort on a non-null column value.





[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-16922:
---
Status: Open  (was: Patch Available)

> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
> Attachments: HIVE-16922.1.patch
>
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)





[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-16922:
---
Status: Patch Available  (was: In Progress)

> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
> Attachments: HIVE-16922.1.patch
>
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)





[jira] [Updated] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li updated HIVE-16922:
---
Attachment: HIVE-16922.1.patch

The patch is based on the master branch.

> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
> Attachments: HIVE-16922.1.patch
>
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)





[jira] [Work started] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16922 started by Bing Li.
--
> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075808#comment-16075808
 ] 

Rui Li commented on HIVE-17010:
---

[~csun], we use a long to compute the sum of multiple longs. I guess that's in 
general a dangerous operation.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}





[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2017-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075801#comment-16075801
 ] 

Rui Li commented on HIVE-17020:
---

[~ashutoshc], the following query can reproduce the issue:
{code}
explain from (select key from src cluster by key) a
  insert overwrite table d1 select a.key
  insert overwrite table d2 select a.key cluster by a.key;
{code}
The insert to table d1 will be lost.

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
> /\
> SEL[3]   SEL[4]
>   | |
> RS[5] FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.
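> A hedged sketch of the guard this implies ({{opsBetween}} is hypothetical, 
> not the actual fix):
> {code}
> // Abort the aggressive dedup if any operator between the two RS operators
> // has more than one child; removing it would drop a branch like FS[6].
> for (Operator<?> op : opsBetween(rs1, rs5)) {
>   if (op.getChildOperators() != null && op.getChildOperators().size() > 1) {
>     return false;  // not safe to merge the RS pair
>   }
> }
> {code}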





[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist

2017-07-05 Thread Dan Burkert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075790#comment-16075790
 ] 

Dan Burkert commented on HIVE-17008:


I'm observing these events through the [notification log 
API|https://github.com/danburkert/hive/blob/master/metastore/if/hive_metastore.thrift#L1546-L1549].
  Is it expected that {{ThriftHiveMetastore.get_next_notification}} returns 
events for failed DDL operations?  There isn't any way to discern whether or 
not the event failed just from the {{NotificationEvent}} struct.

> Where exactly is the NPE - I assume somewhere down the notifyEvent stack ?

The NPE is thrown inside the notification event listener, because {{db}} can be 
null on [this 
line|https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java?utf8=%E2%9C%93#L1139].

> HiveMetastore.drop_database can return NPE if database does not exist
> -
>
> Key: HIVE-17008
> URL: https://issues.apache.org/jira/browse/HIVE-17008
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17008.0.patch
>
>
> When dropping a non-existent database, the HMS will still fire registered 
> {{DROP_DATABASE}} event listeners.  This results in an NPE when the listeners 
> attempt to deref the {{null}} database parameter.
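> A hedged sketch of the guard that avoids notifying listeners with a null 
> {{Database}} (the surrounding structure is an assumption, not the committed 
> patch):
> {code}
> // Resolve the database before firing DROP_DATABASE listeners; a missing
> // database becomes a clean NoSuchObjectException instead of an NPE later.
> Database db = get_database_core(name);  // throws NoSuchObjectException if absent
> // ... proceed with the drop, then notify listeners with the non-null db
> {code}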





[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075778#comment-16075778
 ] 

liyunzhang_intel commented on HIVE-17010:
-

[~csun]: I found the problem on 3TB of data. The biggest TPC-DS table, 
"store_sales", does not by itself exceed the maximum value of the Long type 
(2^63-1). But 
[TPC-DS/query17|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query17.sql]
 is a query with many joins. We use 
[numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
 to collect the numberOfBytes of the siblings of the specified RS. Here a 
sibling of the specified RS may be the result of a join of big tables, and 
that result exceeds the maximum value of the Long type.

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> the Long type, and it overflows when the data is too big. When that 
> happens, the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 
> or 15 at random, and it may even be 1. The main problem here is the 
> overflow of Long addition. You can reproduce the overflow with the 
> following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-17010:

Description: 
We use 
[numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
 to collect the numberOfBytes of the siblings of the specified RS. We use Long 
type and it overflows when the data is too big. When that happens, the 
parallelism is decided by 
[sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
 if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond is 
a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 
randomly, and there is a possibility that it is 1. The main problem here is 
the overflow of the addition of Long type. You can reproduce the overflow 
problem with the following code
{code}
public static void main(String[] args) {
  long a1= 9223372036854775807L;
  long a2=1022672;

  long res = a1+a2;
  System.out.println(res);  //-9223372036853753137

  BigInteger b1= BigInteger.valueOf(a1);
  BigInteger b2 = BigInteger.valueOf(a2);

  BigInteger bigRes = b1.add(b2);

  System.out.println(bigRes); //9223372036855798479

}
{code}

  was:
[link title|http://example.com] We use 
[numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
 to collect the numberOfBytes of the siblings of the specified RS. We use Long 
type and it overflows when the data is too big. When that happens, the 
parallelism is decided by 
[sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
 if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond is 
a dynamic value decided by the Spark runtime; for example, it may be 5 or 15 
randomly, and there is a possibility that it is 1. The main problem here is 
the overflow of the addition of Long type. You can reproduce the overflow 
problem with the following code
{code}
public static void main(String[] args) {
  long a1= 9223372036854775807L;
  long a2=1022672;

  long res = a1+a2;
  System.out.println(res);  //-9223372036853753137

  BigInteger b1= BigInteger.valueOf(a1);
  BigInteger b2 = BigInteger.valueOf(a2);

  BigInteger bigRes = b1.add(b2);

  System.out.println(bigRes); //9223372036855798479

}
{code}


> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use 
> Long type and it overflows when the data is too big. When that happens, the 
> parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value decided by the Spark runtime; for example, it may be 5 or 
> 15 randomly, and there is a possibility that it is 1. The main problem here 
> is the overflow of the addition of Long type. You can reproduce the overflow 
> problem with the following code
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075769#comment-16075769
 ] 

Mohit Sabharwal commented on HIVE-17048:


Good to add the operation type to the TestHs2Hooks and TestHs2HooksWithMiniKdc 
unit tests as well (see HIVE-8338). LGTM otherwise.

> Pass HiveOperation info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext
> ---
>
> Key: HIVE-17048
> URL: https://issues.apache.org/jira/browse/HIVE-17048
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17048.1.patch
>
>
> Currently hive passes the following info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext (see 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
>  But the operation type (HiveOperation) is also needed in some cases, e.g., 
> when integrating with Sentry. 
> {noformat}
> hookCtx.setConf(conf);
> hookCtx.setUserName(userName);
> hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
> hookCtx.setCommand(command);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075763#comment-16075763
 ] 

Mohit Sabharwal commented on HIVE-17008:


Thanks, [~dan_impala_9180]. There are two flavors of listeners in each DDL 
operation: one that runs in the same transaction as the DDL event (notifies 
only upon success) and one that runs outside the transaction (can notify about 
failed DDL operations as well). Where exactly is the NPE? I assume somewhere 
down the notifyEvent stack.

+ [~spena], who has worked on this recently.

> HiveMetastore.drop_database can return NPE if database does not exist
> -
>
> Key: HIVE-17008
> URL: https://issues.apache.org/jira/browse/HIVE-17008
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17008.0.patch
>
>
> When dropping a non-existent database, the HMS will still fire registered 
> {{DROP_DATABASE}} event listeners.  This results in an NPE when the listeners 
> attempt to deref the {{null}} database parameter.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17045) Add HyperLogLog as an UDAF

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075760#comment-16075760
 ] 

Hive QA commented on HIVE-17045:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875790/HIVE-17045.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10832 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] 
(batchId=69)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5898/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5898/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5898/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875790 - PreCommit-HIVE-Build

> Add HyperLogLog as an UDAF
> --
>
> Key: HIVE-17045
> URL: https://issues.apache.org/jira/browse/HIVE-17045
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17045.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2017-07-05 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10616:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Failures do not look related. I've committed to master.

> TypeInfoUtils doesn't handle DECIMAL with just precision specified
> --
>
> Key: HIVE-10616
> URL: https://issues.apache.org/jira/browse/HIVE-10616
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch
>
>
> The parseType method in TypeInfoUtils doesn't handle decimal types with just 
> precision specified although that's a valid type definition. 
> As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
> decimal(10,0) for any decimal() string. 
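
As an illustration of the expected behavior, a small sketch of precision-only decimal handling; parseDecimalParams is a hypothetical helper, not the actual TypeInfoUtils code:

{code:java}
// Hypothetical helper: "decimal(12)" should mean precision 12, scale 0,
// rather than falling back to the default decimal(10,0).
static int[] parseDecimalParams(String type) {
  java.util.regex.Matcher m = java.util.regex.Pattern
      .compile("decimal\\((\\d+)(?:,\\s*(\\d+))?\\)").matcher(type);
  if (!m.matches()) {
    return new int[] {10, 0};                       // bare "decimal"
  }
  int precision = Integer.parseInt(m.group(1));
  int scale = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
  return new int[] {precision, scale};
}
// parseDecimalParams("decimal(12)")   -> [12, 0]
// parseDecimalParams("decimal(12,4)") -> [12, 4]
{code}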



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2017-07-05 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-10616:
-

Assignee: Thomas Friedrich  (was: Jason Dere)

> TypeInfoUtils doesn't handle DECIMAL with just precision specified
> --
>
> Key: HIVE-10616
> URL: https://issues.apache.org/jira/browse/HIVE-10616
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
> Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch
>
>
> The parseType method in TypeInfoUtils doesn't handle decimal types with just 
> precision specified although that's a valid type definition. 
> As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
> decimal(10,0) for any decimal() string. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075702#comment-16075702
 ] 

Aihua Xu edited comment on HIVE-17048 at 7/6/17 12:41 AM:
--

patch-1: simple fix to pass HiveOperation through the context.


was (Author: aihuaxu):
patch-1: simple fix to pass HiveOperation to the context.

> Pass HiveOperation info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext
> ---
>
> Key: HIVE-17048
> URL: https://issues.apache.org/jira/browse/HIVE-17048
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17048.1.patch
>
>
> Currently hive passes the following info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext (see 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
>  But the operation type (HiveOperation) is also needed in some cases, e.g., 
> when integrating with Sentry. 
> {noformat}
> hookCtx.setConf(conf);
> hookCtx.setUserName(userName);
> hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
> hookCtx.setCommand(command);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17048:

Status: Patch Available  (was: Open)

patch-1: simple fix to pass HiveOperation to the context.
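
For illustration, a sketch of what such a change could look like; the setter/getter names are assumptions, not necessarily what the patch uses:

{code:java}
// Hypothetical additions to HiveSemanticAnalyzerHookContextImpl:
private HiveOperation hiveOperation;

public HiveOperation getHiveOperation() { return hiveOperation; }
public void setHiveOperation(HiveOperation operation) {
  this.hiveOperation = operation;
}

// And in Driver, next to the existing setters:
hookCtx.setConf(conf);
hookCtx.setUserName(userName);
hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
hookCtx.setCommand(command);
hookCtx.setHiveOperation(queryState.getHiveOperation()); // hypothetical
{code}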

> Pass HiveOperation info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext
> ---
>
> Key: HIVE-17048
> URL: https://issues.apache.org/jira/browse/HIVE-17048
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17048.1.patch
>
>
> Currently hive passes the following info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext (see 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
>  But the operation type (HiveOperation) is also needed in some cases, e.g., 
> when integrating with Sentry. 
> {noformat}
> hookCtx.setConf(conf);
> hookCtx.setUserName(userName);
> hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
> hookCtx.setCommand(command);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-17048:

Attachment: HIVE-17048.1.patch

> Pass HiveOperation info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext
> ---
>
> Key: HIVE-17048
> URL: https://issues.apache.org/jira/browse/HIVE-17048
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-17048.1.patch
>
>
> Currently hive passes the following info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext (see 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
>  But the operation type (HiveOperation) is also needed in some cases, e.g., 
> when integrating with Sentry. 
> {noformat}
> hookCtx.setConf(conf);
> hookCtx.setUserName(userName);
> hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
> hookCtx.setCommand(command);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-10616) TypeInfoUtils doesn't handle DECIMAL with just precision specified

2017-07-05 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075688#comment-16075688
 ] 

Gunther Hagleitner commented on HIVE-10616:
---

Failures do not look related - [~jdere]?

> TypeInfoUtils doesn't handle DECIMAL with just precision specified
> --
>
> Key: HIVE-10616
> URL: https://issues.apache.org/jira/browse/HIVE-10616
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.0.0
>Reporter: Thomas Friedrich
>Assignee: Jason Dere
>Priority: Minor
> Attachments: HIVE-10616.1.patch, HIVE-10616.2.patch
>
>
> The parseType method in TypeInfoUtils doesn't handle decimal types with just 
> precision specified although that's a valid type definition. 
> As a result, TypeInfoUtils.getTypeInfoFromTypeString will always return 
> decimal(10,0) for any decimal() string. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16935) Hive should strip comments from input before choosing which CommandProcessor to run.

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075687#comment-16075687
 ] 

Hive QA commented on HIVE-16935:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875796/HIVE-16935.4.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10819 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=101)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5897/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5897/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875796 - PreCommit-HIVE-Build

> Hive should strip comments from input before choosing which CommandProcessor 
> to run.
> 
>
> Key: HIVE-16935
> URL: https://issues.apache.org/jira/browse/HIVE-16935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-16935.1.patch, HIVE-16935.2.patch, 
> HIVE-16935.3.patch, HIVE-16935.4.patch
>
>
> While using Beeswax, Hue fails to execute a statement with the following error:
> Error while compiling statement: FAILED: ParseException line 3:4 missing 
> KW_ROLE at 'a' near 'a' line 3:5 missing EOF at '=' near 'a'
> {quote}
> -- comment
> SET a=1;
> SELECT 1;
> {quote}
> The same code works in Beeline and in Impala.
> The same code fails in CliDriver 
>  
> h2. Background
> Hive deals with sql comments (“-- to end of line”) in different places.
> Some clients attempt to strip comments. For example BeeLine was recently 
> enhanced in https://issues.apache.org/jira/browse/HIVE-13864 to strip 
> comments from multi-line commands before they are executed.
> Other clients such as Hue or Jdbc do not strip comments before sending text.
> Some tests such as TestCliDriver strip comments before running tests.
> When Hive gets a command the CommandProcessorFactory looks at the text to 
> determine which CommandProcessor should handle the command. In the bug case 
> the correct CommandProcessor is SetProcessor, but the comments confuse the 
> CommandProcessorFactory and so the command is treated as sql. Hive’s sql 
> parser understands and ignores comments, but it does not understand the set 
> commands usually handled by SetProcessor and so we get the ParseException 
> shown above.
>  
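
For illustration, a minimal sketch of stripping full-line "--" comments before the command is dispatched; this is an assumption about the approach, not necessarily what the attached patches do:

{code:java}
// Hypothetical pre-processing step before CommandProcessorFactory.get():
// drop lines that are entirely "-- ..." comments so that "SET a=1" is
// still recognized as a SetProcessor command.
static String stripFullLineComments(String command) {
  StringBuilder sb = new StringBuilder();
  for (String line : command.split("\n")) {
    if (!line.trim().startsWith("--")) {
      sb.append(line).append('\n');
    }
  }
  return sb.toString().trim();
}
// stripFullLineComments("-- comment\nSET a=1") -> "SET a=1"
{code}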



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Dan Burkert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Burkert updated HIVE-16993:
---
Attachment: HIVE-17008.8.patch

> ThriftHiveMetastore.create_database can fail if the locationUri is not set
> --
>
> Key: HIVE-16993
> URL: https://issues.apache.org/jira/browse/HIVE-16993
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, 
> HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, 
> HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch, HIVE-17008.8.patch
>
>
> Calling 
> [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078]
>  with a database with an unset {{locationUri}} field through the C++ 
> implementation fails with:
> {code}
> MetaException(message=java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
> {code}
> The 
> [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270]
>  Thrift field is 'default requiredness (implicit)', and Thrift [does not 
> specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] 
> whether unset default requiredness fields are encoded.  Empirically, the Java 
> generated code [does not write the 
> {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942]
>  when the field is unset, while the C++ generated code 
> [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890].
> The MetaStore treats the field as optional, and [fills in a default 
> value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871]
>  if the field is unset.
> The end result is that when the C++ implementation sends a {{Database}} 
> without the field set, it actually writes an empty string, and the MetaStore 
> treats it as a set field (non-null), and then calls a {{Path}} API which 
> rejects the empty string.  The fix is simple: make the {{locationUri}} field 
> optional in metastore.thrift.
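
Alongside the Thrift change, a defensive server-side check is one way to illustrate the intended handling; a sketch under the assumption that an empty string should be treated like an unset field:

{code:java}
// Hypothetical guard in HiveMetaStore's create_database path: treat an
// unset or empty locationUri the same way, by substituting the default
// warehouse location instead of passing "" to new Path().
if (!db.isSetLocationUri() || db.getLocationUri().trim().isEmpty()) {
  db.setLocationUri(wh.getDefaultDatabasePath(db.getName()).toString());
} else {
  db.setLocationUri(wh.getDnsPath(new Path(db.getLocationUri())).toString());
}
{code}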



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-17048:
---


> Pass HiveOperation info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext
> ---
>
> Key: HIVE-17048
> URL: https://issues.apache.org/jira/browse/HIVE-17048
> Project: Hive
>  Issue Type: Improvement
>  Components: Hooks
>Affects Versions: 2.1.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> Currently hive passes the following info to HiveSemanticAnalyzerHook through 
> HiveSemanticAnalyzerHookContext (see 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
>  But the operation type (HiveOperation) is also needed in some cases, e.g., 
> when integrating with Sentry. 
> {noformat}
> hookCtx.setConf(conf);
> hookCtx.setUserName(userName);
> hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
> hookCtx.setCommand(command);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-10495) Hive index creation code throws NPE if index table is null

2017-07-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075657#comment-16075657
 ] 

Ashutosh Chauhan commented on HIVE-10495:
-

This can still result in an NPE in startFunction, if indexTable is null. Also, 
there is similar logic for endFunction(). We should refactor this null check 
so that it's useful for both functions. [~libing] Would you like to update 
your patch with that change?
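
For illustration, a sketch of the kind of shared null check being suggested; the helper name is hypothetical:

{code:java}
// Hypothetical helper shared by startFunction()/endFunction() so neither
// dereferences a null index table name.
private static String indexTableName(Index index) {
  return index.getIndexTableName() == null ? "" : index.getIndexTableName();
}

// Usage in add_index:
startFunction("add_index", ": " + index.getIndexName()
    + " on table " + index.getOrigTableName()
    + (indexTableName(index).isEmpty() ? "" : " as " + indexTableName(index)));
{code}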

> Hive index creation code throws NPE if index table is null
> --
>
> Key: HIVE-10495
> URL: https://issues.apache.org/jira/browse/HIVE-10495
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
> Attachments: HIVE-10495.1.patch, HIVE-10495.2.patch
>
>
> The stack trace would be:
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_index(HiveMetaStore.java:2870)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
> at java.lang.reflect.Method.invoke(Method.java:611)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
> at $Proxy9.add_index(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createIndex(HiveMetaStoreClient.java:962)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17036) Lineage: Minor CPU/Mem optimization for lineage transform

2017-07-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075644#comment-16075644
 ] 

Ashutosh Chauhan commented on HIVE-17036:
-

+1

> Lineage: Minor CPU/Mem optimization for lineage transform
> -
>
> Key: HIVE-17036
> URL: https://issues.apache.org/jira/browse/HIVE-17036
> Project: Hive
>  Issue Type: Improvement
>  Components: lineage
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17036.1.patch, prof_1.png, prof_2.png
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table

2017-07-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16832:
--
Attachment: HIVE-16832.18.patch

> duplicate ROW__ID possible in multi insert into transactional table
> ---
>
> Key: HIVE-16832
> URL: https://issues.apache.org/jira/browse/HIVE-16832
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, 
> HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, 
> HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, 
> HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, 
> HIVE-16832.16.patch, HIVE-16832.17.patch, HIVE-16832.18.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075626#comment-16075626
 ] 

Hive QA commented on HIVE-16832:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875809/HIVE-16832.17.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5896/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5896/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5896/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-07-05 23:32:57.513
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5896/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-07-05 23:32:57.516
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at c39b879 HIVE-16893: move replication dump related work in 
semantic analysis phase to execution phase using a task (Anishek Agarwal, 
reviewed by Sankar Hariappan, Daniel Dai)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at c39b879 HIVE-16893: move replication dump related work in 
semantic analysis phase to execution phase using a task (Anishek Agarwal, 
reviewed by Sankar Hariappan, Daniel Dai)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-07-05 23:33:02.202
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: patch -p0
patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
patching file 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java
patching file 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java
patching file 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorImpl.java
patching file 
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java
patching file 
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/TestMutations.java
patching file 
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java
patching file 
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestMutatorImpl.java
patching file ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/RecordIdentifier.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java
patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
patching file 
ql/src/java/org/apache/hadoop/hive/ql/parse/UpdateDeleteSemanticAnalyzer.java
patching file ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java
patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java
patching file ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
patching file 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2WithSplitUpdate.java
patching file 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2WithSplitUpdateAndVectorization.java
patching file 

[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075621#comment-16075621
 ] 

Hive QA commented on HIVE-16993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875800/HIVE-17008.7.patch

{color:green}SUCCESS:{color} +1 due to 10 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10817 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge9]
 (batchId=167)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=101)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importAll (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneDb (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneFunc 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneRole 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTableNonPartitioned
 (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importOneTablePartitioned
 (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importSecurity 
(batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.importTablesWithConstraints
 (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.parallel (batchId=208)
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.parallelOdd (batchId=208)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=224)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5895/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5895/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5895/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875800 - PreCommit-HIVE-Build

> ThriftHiveMetastore.create_database can fail if the locationUri is not set
> --
>
> Key: HIVE-16993
> URL: https://issues.apache.org/jira/browse/HIVE-16993
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, 
> HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, 
> HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch
>
>
> Calling 
> [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078]
>  with a database with an unset {{locationUri}} field through the C++ 
> implementation fails with:
> {code}
> MetaException(message=java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
> {code}
> The 
> [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270]
>  Thrift field is 'default requiredness (implicit)', and Thrift [does not 
> specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] 
> whether unset default requiredness fields are encoded.  Empirically, the Java 
> generated code [does not write the 
> {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942]
>  when the field is unset, while the C++ generated code 
> [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890].
> The MetaStore treats the field as optional, and [fills in a default 
> value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871]
>  if the field is 

[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-05 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075593#comment-16075593
 ] 

Mithun Radhakrishnan commented on HIVE-16908:
-

Yes, this is what I was afraid of. The intention of 
{{testTableSchemaPropagation()}} was to simulate table-propagation across 
different clusters/HCat instances, as Apache Falcon (or similar projects) do. I 
wonder if this change dilutes that intention. :/ I do recognize that the static 
state in {{ObjectStore}} makes this problematic. I'm trying to figure out an 
alternative.

Question: If the target metastore instance were accessed through a different 
classloader, the two instances' states would be isolated, right? Would that be 
an acceptable solution?

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch
>
>
> Some of the tests in TestHCatClient.java, for ex:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second metastore thread with a different conf object, which results in the 
> PersistenceManagerFactory being closed and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16844) Fix Connection leak in ObjectStore when new Conf object is used

2017-07-05 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075583#comment-16075583
 ] 

Mithun Radhakrishnan commented on HIVE-16844:
-

Sorry to resurrect this discussion. I was pondering over the solution on 
HIVE-16908, and wondered whether the solution here is complete. Here's the code 
to {{ObjectStore::setConf()}}:

{code:java|title=ObjectStore.java}
  @Override
  @SuppressWarnings("nls")
  public void setConf(Configuration conf) {
// Although an instance of ObjectStore is accessed by one thread, there may
// be many threads with ObjectStore instances. So the static variables
// pmf and prop need to be protected with locks.
pmfPropLock.lock();
try {
  isInitialized = false;
  hiveConf = conf;
  configureSSL(conf);
  Properties propsFromConf = getDataSourceProps(conf);
  boolean propsChanged = !propsFromConf.equals(prop);

  if (propsChanged) {
if (pmf != null){
  clearOutPmfClassLoaderCache(pmf);
  // close the underlying connection pool to avoid leaks
  pmf.close();
}
pmf = null;
prop = null;
  }
...
  }
{code}

Note that {{pmfPropLock}} is locked before {{pmf.close()}} is called. But this 
is also the only place where {{pmfPropLock}} is used. So, if another thread is 
in the middle of accessing {{pmf}}, it is possible that the instance is messed 
up for that thread.
Before this code change, resetting {{pmf}} would not affect any threads with an 
outstanding reference.
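
To make the concern concrete, a sketch of one way readers could take the same lock; this is purely illustrative, not a proposed patch:

{code:java}
// Hypothetical accessor: if every code path that touches the shared pmf
// took pmfPropLock as well, setConf() could not close the pool mid-use.
private PersistenceManager getPersistenceManager() {
  pmfPropLock.lock();
  try {
    return pmf.getPersistenceManager();
  } finally {
    pmfPropLock.unlock();
  }
}
{code}

Note this only protects the acquisition; a PersistenceManager handed out before setConf() runs can still see its factory closed underneath it, which is exactly the outstanding-reference problem described above.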

> Fix Connection leak in ObjectStore when new Conf object is used
> ---
>
> Key: HIVE-16844
> URL: https://issues.apache.org/jira/browse/HIVE-16844
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Fix For: 3.0.0
>
> Attachments: HIVE-16844.1.patch
>
>
> The code path in ObjectStore.java currently leaks BoneCP (or Hikari) 
> connection pools when a new configuration object is passed in. The code needs 
> to ensure that the persistence-factory is closed before it is nullified.
> The relevant code is 
> [here|https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L290].
>  Note that pmf is set to null, but the underlying connection pool is not 
> closed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075566#comment-16075566
 ] 

Mohit Sabharwal commented on HIVE-17022:


Test failures are unrelated.


> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.1.patch, HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.
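
For illustration, a sketch of the kind of log statement the change implies; the accessors shown are assumptions, not necessarily the patch:

{code:java}
// Illustrative only: log the lock mode (what the lock grants) instead of
// the lock type (how it was requested).
HiveLockMode mode = lock.getHiveLockMode();  // SHARED / SEMI_SHARED / EXCLUSIVE
LOG.debug("Acquired lock " + lock.getHiveLockObject().getName()
    + " (mode=" + mode + ")");
{code}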



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17047:

Fix Version/s: (was: 1.2.1)

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17047.1.patch
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not alway 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17047:

Attachment: HIVE-17047.1.patch

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17047.1.patch
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not always 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17047:

Status: Patch Available  (was: Open)

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17047.1.patch
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not always 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-17047:

Target Version/s: 1.2.1

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-17047.1.patch
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not always 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075556#comment-16075556
 ] 

Zhiyuan Yang commented on HIVE-17047:
-

It turns out HIVE-15147 accidentally fixed this. HIVE-15147 went into Hive 
2.2.0 for LLAP but not into earlier versions. Uploading a partial patch from 
HIVE-15147 for earlier versions.
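
For reference, a sketch of the shape of that fix; the fallback variable is hypothetical and this is an assumption about the HIVE-15147 change, not the exact diff:

{code:java}
// Hypothetical fallback in HiveInputFormat: when the partition lookup
// misses, still push the table-level properties (such as
// fixedlengthinputformat.record.length) into the job conf.
PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
TableDesc table = (part != null) ? part.getTableDesc() : fallbackTableDesc;
if (table != null) {
  Utilities.copyTableJobPropertiesToConf(table, jobConf);
}
{code}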

> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 1.2.1
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not always 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17047) Allow table property to be populated to jobConf to make FixedLengthInputFormat work

2017-07-05 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang reassigned HIVE-17047:
---


> Allow table property to be populated to jobConf to make 
> FixedLengthInputFormat work
> ---
>
> Key: HIVE-17047
> URL: https://issues.apache.org/jira/browse/HIVE-17047
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Fix For: 1.2.1
>
>
> To make FixedLengthInputFormat work in Hive, we need table specific value for 
> the configuration "fixedlengthinputformat.record.length". Right now the best 
> place would be table property. Unfortunately, table property is not alway 
> populated to InputFormat configurations because of this in HiveInputFormat:
> {code}
> PartitionDesc part = pathToPartitionInfo.get(hsplit.getPath().toString());
> if ((part != null) && (part.getTableDesc() != null)) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075551#comment-16075551
 ] 

Hive QA commented on HIVE-17022:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875801/HIVE-17022.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10831 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5894/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5894/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5894/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12875801 - PreCommit-HIVE-Build

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.1.patch, HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075516#comment-16075516
 ] 

Ashutosh Chauhan commented on HIVE-16996:
-

Yeah, we should do it in 2 steps: first the HLL UDAF, and the metastore 
changes later. 
Also, instead of adding a new UDAF you can overload the existing 
compute_stats() UDAF so that we can reuse the logic of the other stats in that 
UDAF.

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16966.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17001) Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

2017-07-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075421#comment-16075421
 ] 

Naveen Gangam commented on HIVE-17001:
--

[~zsombor.klara] Quick question on the issue. I am a bit confused between the 
jira summary and the reproducer. The summary says "insert overwrite" but the 
reproducer does not use "insert overwrite", so I am wondering if the 
reproducer is intended as written.

I am not sure if this is a bug. Say, you execute the following
INSERT INTO test PARTITION(ds='p1') values ('a');
INSERT INTO test PARTITION(ds='p1') values ('a');

The resultant partition directory should contain 2 data files and a select * on 
the table should return 2 rows. This is by design.
The testcase in this jira is semantically similar to the case above, where you 
have some existing data in a partition and you are inserting additional data. 
Would you agree?

Normally, step 4 of the reproducer should have deleted the data for the 
partition, had it existed. But I think it is legal to manage some or all of the 
partition data externally, as well. 

Am I making sense? Thanks

> Insert overwrite table doesn't clean partition directory on HDFS if partition 
> is missing from HMS
> -
>
> Key: HIVE-17001
> URL: https://issues.apache.org/jira/browse/HIVE-17001
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17001.01.patch
>
>
> Insert overwrite table should clear existing data before creating the new 
> data files.
> For a partitioned table we will clean any folder of existing partitions on 
> HDFS, however if the partition folder exists only on HDFS and the partition 
> definition is missing in HMS, the folder is not cleared.
> Reproduction steps:
> 1. CREATE TABLE test( col1 string) PARTITIONED BY (ds string);
> 2. INSERT INTO test PARTITION(ds='p1') values ('a');
> 3. Copy the data to a different folder with different name.
> 4. ALTER TABLE test DROP PARTITION (ds='p1');
> 5. Recreate the partition directory, copy and rename the data file back
> 6. INSERT INTO test PARTITION(ds='p1') values ('b');
> 7. SELECT * from test;
> will result in 2 records being returned instead of 1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17001) Insert overwrite table doesn't clean partition directory on HDFS if partition is missing from HMS

2017-07-05 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075406#comment-16075406
 ] 

Sergio Peña commented on HIVE-17001:


[~zsombor.klara] I didn't understand the test case.

{noformat}
# One partition dt='p1' with row ("a",1) is added
insert into test_part partition(dt = 'p1') values ("a", 1);

# Partition metadata is removed only (no data because it is an external table)
alter table test_part drop partition (dt='p1');

# Data is moved
dfs -mv ${system:test.tmp.dir}/test/dt=p1/00_0 
${system:test.tmp.dir}/test/dt=p1/00_1;

> # Partition is re-created with dt='p1' with row ("b",2)
insert overwrite table test_part partition(dt = 'p1') values ("b", 2);

# This is correct, only one row is seen because the row ("a",1) was moved to 
another location manually.
# Where is the issue here?
select * from test_part;
{noformat}



> Insert overwrite table doesn't clean partition directory on HDFS if partition 
> is missing from HMS
> -
>
> Key: HIVE-17001
> URL: https://issues.apache.org/jira/browse/HIVE-17001
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-17001.01.patch
>
>
> Insert overwrite table should clear existing data before creating the new 
> data files.
> For a partitioned table we will clean any folder of existing partitions on 
> HDFS, however if the partition folder exists only on HDFS and the partition 
> definition is missing in HMS, the folder is not cleared.
> Reproduction steps:
> 1. CREATE TABLE test( col1 string) PARTITIONED BY (ds string);
> 2. INSERT INTO test PARTITION(ds='p1') values ('a');
> 3. Copy the data to a different folder with different name.
> 4. ALTER TABLE test DROP PARTITION (ds='p1');
> 5. Recreate the partition directory, copy and rename the data file back
> 6. INSERT INTO test PARTITION(ds='p1') values ('b');
> 7. SELECT * from test;
> will result in 2 records being returned instead of 1.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-07-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16974:
-
Status: Patch Available  (was: Open)

> Change the sort key for the schema tool validator to be 
> 
>
> Key: HIVE-16974
> URL: https://issues.apache.org/jira/browse/HIVE-16974
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-16974.patch, HIVE-16974.patch
>
>
> In HIVE-16729, we introduced ordering of results/failures returned by 
> schematool's validators. This allows fault injection testing to expect 
> results that can be verified. However, they were sorted on NAME values which 
> in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK 
> name column value, the result could be different depending on the backend 
> database(if they sort NULLs first or last).
> So I think it is better to sort on a non-null column value.
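
For illustration, a minimal Java sketch (an assumption about the shape of the 
validation results, not the actual schematool code) of why sorting on a 
guaranteed non-null key is deterministic while a NAME-based sort depends on the 
backend's NULL ordering:

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortKeyDemo {
  // Hypothetical stand-in for one validation result row.
  static class Result {
    final long id;        // non-null by construction
    final String name;    // may be NULL in the HMS schema
    Result(long id, String name) { this.id = id; this.name = name; }
    @Override public String toString() { return id + ":" + name; }
  }

  public static void main(String[] args) {
    List<Result> results = new ArrayList<>();
    results.add(new Result(2, null));
    results.add(new Result(1, "a"));
    // A NAME-based sort needs an explicit NULL policy; backend databases
    // disagree on whether NULLs sort first or last, so the order varies:
    //   results.sort(Comparator.comparing(r -> r.name,
    //       Comparator.nullsFirst(Comparator.naturalOrder())));
    // Sorting on the non-null id is deterministic everywhere:
    results.sort(Comparator.comparingLong(r -> r.id));
    System.out.println(results);  // [1:a, 2:null]
  }
}
{code}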



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-07-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16974:
-
Attachment: HIVE-16974.patch

> Change the sort key for the schema tool validator to be 
> 
>
> Key: HIVE-16974
> URL: https://issues.apache.org/jira/browse/HIVE-16974
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-16974.patch, HIVE-16974.patch
>
>
> In HIVE-16729, we introduced ordering of results/failures returned by 
> schematool's validators. This allows fault injection testing to expect 
> results that can be verified. However, they were sorted on NAME values which 
> in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK 
> name column value, the result could be different depending on the backend 
> database(if they sort NULLs first or last).
> So I think it is better to sort on a non-null column value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-07-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16974:
-
Status: Open  (was: Patch Available)

The pre-commit tests haven't been kicked off for some reason. I will re-attach the 
patch.

> Change the sort key for the schema tool validator to be 
> 
>
> Key: HIVE-16974
> URL: https://issues.apache.org/jira/browse/HIVE-16974
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-16974.patch, HIVE-16974.patch
>
>
> In HIVE-16729, we introduced ordering of results/failures returned by 
> schematool's validators. This allows fault injection testing to expect 
> results that can be verified. However, they were sorted on NAME values which 
> in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK 
> name column value, the result could be different depending on the backend 
> database(if they sort NULLs first or last).
> So I think it is better to sort on a non-null column value.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table

2017-07-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16832:
--
Attachment: HIVE-16832.17.patch

> duplicate ROW__ID possible in multi insert into transactional table
> ---
>
> Key: HIVE-16832
> URL: https://issues.apache.org/jira/browse/HIVE-16832
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, 
> HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, 
> HIVE-16832.08.patch, HIVE-16832.09.patch, HIVE-16832.10.patch, 
> HIVE-16832.11.patch, HIVE-16832.14.patch, HIVE-16832.15.patch, 
> HIVE-16832.16.patch, HIVE-16832.17.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17010) Fix the overflow problem of Long type in SetSparkReducerParallelism

2017-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075268#comment-16075268
 ] 

Chao Sun commented on HIVE-17010:
-

bq. We use the Long type, and overflow happens when the data is too big.
I don't understand. How could it overflow with the long type? How large is the 
dataset you used for testing?

> Fix the overflow problem of Long type in SetSparkReducerParallelism
> ---
>
> Key: HIVE-17010
> URL: https://issues.apache.org/jira/browse/HIVE-17010
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Attachments: HIVE-17010.1.patch
>
>
> We use 
> [numberOfBytes|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L129]
>  to collect the numberOfBytes of the siblings of the specified RS. We use the Long type, 
> and overflow happens when the data is too big. Once this happens, 
> the parallelism is decided by 
> [sparkMemoryAndCores.getSecond()|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L184]
>  if spark.dynamic.allocation.enabled is true. sparkMemoryAndCores.getSecond 
> is a dynamic value which is decided by the Spark runtime. For example, the value 
> of sparkMemoryAndCores.getSecond may be 5 or 15 at random, and there is a possibility 
> that the value may be 1. The main problem here is the overflow of the addition of 
> the Long type. You can reproduce the overflow problem with the following code:
> {code}
> public static void main(String[] args) {
>   long a1= 9223372036854775807L;
>   long a2=1022672;
>   long res = a1+a2;
>   System.out.println(res);  //-9223372036853753137
>   BigInteger b1= BigInteger.valueOf(a1);
>   BigInteger b2 = BigInteger.valueOf(a2);
>   BigInteger bigRes = b1.add(b2);
>   System.out.println(bigRes); //9223372036855798479
> }
> {code}
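
For reference, a minimal sketch of one possible guard (an assumption, not 
necessarily what the attached patch does): clamp the accumulated byte count at 
Long.MAX_VALUE instead of letting the addition wrap around:

{code}
public class SaturatingAddDemo {
  // Clamp at Long.MAX_VALUE instead of wrapping to a negative value.
  static long saturatingAdd(long a, long b) {
    try {
      return Math.addExact(a, b);  // throws ArithmeticException on overflow
    } catch (ArithmeticException e) {
      return Long.MAX_VALUE;
    }
  }

  public static void main(String[] args) {
    System.out.println(saturatingAdd(9223372036854775807L, 1022672L));
    // prints 9223372036854775807 rather than -9223372036853753137
  }
}
{code}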



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-05 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075241#comment-16075241
 ] 

Mithun Radhakrishnan edited comment on HIVE-16908 at 7/5/17 6:53 PM:
-

I'll need a little time to review. On the face of it, this change is 
disconcerting, since it looks like this changes the intention of the tests 
added in HIVE-7341. :/ Let me take a closer look.


was (Author: mithun):
I'll need a little time to review. On the face of it, this change is 
disconcerting, since it looks like changes the intention of the tests added in 
HIVE-7341. :/ Let me take a closer look.

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch
>
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which results 
> in the PersistenceManagerFactory being closed, and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844

2017-07-05 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075241#comment-16075241
 ] 

Mithun Radhakrishnan commented on HIVE-16908:
-

I'll need a little time to review. On the face of it, this change is 
disconcerting, since it looks like it changes the intention of the tests added in 
HIVE-7341. :/ Let me take a closer look.

> Failures in TestHcatClient due to HIVE-16844
> 
>
> Key: HIVE-16908
> URL: https://issues.apache.org/jira/browse/HIVE-16908
> Project: Hive
>  Issue Type: Bug
>Reporter: Sunitha Beeram
>Assignee: Sunitha Beeram
> Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch
>
>
> Some of the tests in TestHCatClient.java, for example:
> {noformat}
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
>  (batchId=177)
> org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
> (batchId=177)
> {noformat}
> are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new 
> configuration object is set on the ObjectStore. TestHCatClient fires up a 
> second instance of the metastore thread with a different conf object, which results 
> in the PersistenceManagerFactory being closed, and hence the tests fail. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075209#comment-16075209
 ] 

Naveen Gangam commented on HIVE-17022:
--

I only looked at the {{public}} access for lock() but did not realize it 
wasn't being called from outside. It makes sense to make it private in this case.
Thanks for the changes. The patch looks good to me. +1 pending tests

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.1.patch, HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2017-07-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075206#comment-16075206
 ] 

Ashutosh Chauhan commented on HIVE-17020:
-

[~lirui] If you have a test case for it, can you please share it? It would be good 
to add it as part of the HIVE-16100 fix.

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>
> Suppose we have an OP tree like this:
> {noformat}
>          ...
>           |
>         RS[1]
>           |
>         SEL[2]
>         /    \
>    SEL[3]    SEL[4]
>      |          |
>    RS[5]      FS[6]
>      |
>     ...
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-16998) Add config to enable HoS DPP only for map-joins

2017-07-05 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-16998:
---

Assignee: Janaki Lahorani  (was: Sahil Takiar)

> Add config to enable HoS DPP only for map-joins
> ---
>
> Key: HIVE-16998
> URL: https://issues.apache.org/jira/browse/HIVE-16998
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Janaki Lahorani
>
> HoS DPP will split a given operator tree in two under the following 
> conditions: it has detected that the query can benefit from DPP, and the 
> filter is not a map-join (see SplitOpTreeForDPP).
> This can hurt performance if the non-partitioned side of the join 
> involves a complex operator tree - e.g. the query {{select count(*) from 
> srcpart where srcpart.ds in (select max(srcpart.ds) from srcpart union all 
> select min(srcpart.ds) from srcpart)}} will require running the subquery 
> twice, once in each Spark job.
> Queries with map-joins don't get split into two operator trees and thus don't 
> suffer from this drawback. Thus, it would be nice to have a config key that 
> just enables DPP on HoS for map-joins.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-17022:
---
Attachment: HIVE-17022.1.patch

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.1.patch, HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075199#comment-16075199
 ] 

Mohit Sabharwal commented on HIVE-17022:


Thanks. 

Note that the Logger is from slf4j, not log4j, so there is no extra cost of string 
formatting. 
But it's probably better to add the conditional for readability. The other lock() 
is really just a helper, so I am making it private. Printing the sorted locks is a 
great idea. Updating the patch.

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.1.patch, HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Dan Burkert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Burkert updated HIVE-16993:
---
Attachment: HIVE-17008.7.patch

> ThriftHiveMetastore.create_database can fail if the locationUri is not set
> --
>
> Key: HIVE-16993
> URL: https://issues.apache.org/jira/browse/HIVE-16993
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, 
> HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, 
> HIVE-16993.5.patch, HIVE-17008.6.patch, HIVE-17008.7.patch
>
>
> Calling 
> [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078]
>  with a database with an unset {{locationUri}} field through the C++ 
> implementation fails with:
> {code}
> MetaException(message=java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
> {code}
> The 
> [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270]
>  Thrift field is 'default requiredness (implicit)', and Thrift [does not 
> specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] 
> whether unset default requiredness fields are encoded.  Empirically, the Java 
> generated code [does not write the 
> {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942]
>  when the field is unset, while the C++ generated code 
> [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890].
> The MetaStore treats the field as optional, and [fills in a default 
> value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871]
>  if the field is unset.
> The end result is that when the C++ implementation sends a {{Database}} 
> without the field set, it actually writes an empty string, and the MetaStore 
> treats it as a set field (non-null), and then calls a {{Path}} API which 
> rejects the empty string.  The fix is simple: make the {{locationUri}} field 
> optional in metastore.thrift.
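
For illustration, a server-side sketch of the defensive alternative (an 
assumption, not what the attached patches do, since they take the Thrift-level 
route described above; {{Warehouse#getDefaultDatabasePath}} is assumed to be the 
relevant helper):

{code}
import org.apache.hadoop.hive.metastore.Warehouse;
import org.apache.hadoop.hive.metastore.api.Database;
import org.apache.hadoop.hive.metastore.api.MetaException;

public class LocationUriGuard {
  // Treat a blank locationUri the same as an unset one, before any
  // Path is ever constructed from it.
  static void normalizeLocation(Database db, Warehouse wh) throws MetaException {
    String locUri = db.getLocationUri();
    if (locUri == null || locUri.trim().isEmpty()) {
      db.setLocationUri(wh.getDefaultDatabasePath(db.getName()).toString());
    }
  }
}
{code}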



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16935) Hive should strip comments from input before choosing which CommandProcessor to run.

2017-07-05 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-16935:
--
Attachment: HIVE-16935.4.patch

> Hive should strip comments from input before choosing which CommandProcessor 
> to run.
> 
>
> Key: HIVE-16935
> URL: https://issues.apache.org/jira/browse/HIVE-16935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-16935.1.patch, HIVE-16935.2.patch, 
> HIVE-16935.3.patch, HIVE-16935.4.patch
>
>
> While using Beeswax, Hue fails to execute a statement with the following error:
> Error while compiling statement: FAILED: ParseException line 3:4 missing 
> KW_ROLE at 'a' near 'a' line 3:5 missing EOF at '=' near 'a'
> {quote}
> -- comment
> SET a=1;
> SELECT 1;
> {quote}
> The same code works in Beeline and in Impala.
> The same code fails in CliDriver 
>  
> h2. Background
> Hive deals with sql comments (“-- to end of line”) in different places.
> Some clients attempt to strip comments. For example BeeLine was recently 
> enhanced in https://issues.apache.org/jira/browse/HIVE-13864 to strip 
> comments from multi-line commands before they are executed.
> Other clients such as Hue or Jdbc do not strip comments before sending text.
> Some tests such as TestCliDriver strip comments before running tests.
> When Hive gets a command the CommandProcessorFactory looks at the text to 
> determine which CommandProcessor should handle the command. In the bug case 
> the correct CommandProcessor is SetProcessor, but the comments confuse the 
> CommandProcessorFactory and so the command is treated as sql. Hive’s sql 
> parser understands and ignores comments, but it does not understand the set 
> commands usually handled by SetProcessor and so we get the ParseException 
> shown above.
>  
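
For illustration, a minimal sketch of full-line comment stripping (an 
assumption; it deliberately ignores the harder case of "--" inside string 
literals, which a real fix has to handle):

{code}
public class CommentStripper {
  // Drop lines that are entirely "--" comments, so the
  // CommandProcessorFactory sees "SET a=1;" instead of "-- comment\nSET a=1;".
  static String stripLineComments(String command) {
    StringBuilder sb = new StringBuilder();
    for (String line : command.split("\n")) {
      if (!line.trim().startsWith("--")) {
        sb.append(line).append('\n');
      }
    }
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(stripLineComments("-- comment\nSET a=1;"));  // SET a=1;
  }
}
{code}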



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075167#comment-16075167
 ] 

Naveen Gangam commented on HIVE-17022:
--

Thanks for the explanation. The fix in the patch looks good to me. Just a 
couple of nits.
1) I think we should make the code above conditional, running only when DEBUG is 
enabled. So perhaps something like this:
{code}
if (LOG.isDebugEnabled()) {
  for (HiveLockObj obj : objs) {
 LOG.debug("Acquiring lock for {} with mode {}", obj.getObj().getName(),
  obj.getMode());
  }
}
{code}

2) Is there a reason the above code is not in the {{public List 
lock(List objs, int numRetriesForLock, long sleepTime)}} method 
but at a higher level? Would it be better if we logged these after the 
{{sortLocks}} call so we print the sorted list? 
Thanks

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16893) move replication dump related work in semantic analysis phase to execution phase using a task

2017-07-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-16893:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

HIVE-16893.4.patch pushed to master. Thanks Anishek, Sankar!

> move replication dump related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16893
> URL: https://issues.apache.org/jira/browse/HIVE-16893
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-16893.2.patch, HIVE-16893.3.patch, 
> HIVE-16893.4.patch
>
>
> Since we run into the possibility of creating a large number of tasks during 
> the replication bootstrap dump:
> * we may not be able to hold all of them in memory for really large 
> databases, which might not hold true once we complete HIVE-16892
> * Also, a compile-time lock is taken such that only one query runs in this 
> phase, which in the replication bootstrap scenario is going to be a very long 
> running task; hence moving it to the execution phase will limit how long the 
> lock is held during compilation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17007) NPE introduced by HIVE-16871

2017-07-05 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-17007:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch pushed to master. Thanks Sushanth for review!

> NPE introduced by HIVE-16871
> 
>
> Key: HIVE-17007
> URL: https://issues.apache.org/jira/browse/HIVE-17007
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 3.0.0
>
> Attachments: HIVE-17007.1.patch
>
>
> Stack:
> {code}
> 2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - 
> MetaException(message:java.lang.NullPointerException)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944)
> at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at 
> com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325)
> at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
> at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
> Source)
> at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306)
> at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
> Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at 
> 

[jira] [Commented] (HIVE-17007) NPE introduced by HIVE-16871

2017-07-05 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075136#comment-16075136
 ] 

Sushanth Sowmyan commented on HIVE-17007:
-

+1

> NPE introduced by HIVE-16871
> 
>
> Key: HIVE-17007
> URL: https://issues.apache.org/jira/browse/HIVE-17007
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17007.1.patch
>
>
> Stack:
> {code}
> 2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - 
> MetaException(message:java.lang.NullPointerException)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944)
> at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at 
> com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397)
> at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325)
> at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
> at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
> Source)
> at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306)
> at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
> Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.cache.SharedCache.getCachedTableColStats(SharedCache.java:140)
> at 
> 

[jira] [Commented] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075137#comment-16075137
 ] 

Hive QA commented on HIVE-16993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875788/HIVE-17008.6.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5893/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5893/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5893/

Messages:
{noformat}
 This message was trimmed, see log for full details 
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database)
 is not applicable
  (actual and formal argument lists differ in length)
[ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseStoreIntegration.java:[1311,21]
 no suitable constructor found for 
Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map)
constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not 
applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database)
 is not applicable
  (actual and formal argument lists differ in length)
[ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[277,9]
 no suitable constructor found for 
Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map)
constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not 
applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database)
 is not applicable
  (actual and formal argument lists differ in length)
[ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[324,9]
 no suitable constructor found for 
Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map)
constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not 
applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database)
 is not applicable
  (actual and formal argument lists differ in length)
[ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[365,9]
 no suitable constructor found for 
Database(java.lang.String,java.lang.String,java.lang.String,java.util.Map)
constructor org.apache.hadoop.hive.metastore.api.Database.Database() is not 
applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(java.lang.String,java.lang.String,java.util.Map)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.hive.metastore.api.Database.Database(org.apache.hadoop.hive.metastore.api.Database)
 is not applicable
  (actual and formal argument lists differ in length)
[ERROR] 
/data/hiveptest/working/apache-github-source-source/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/hbase/TestHBaseImport.java:[433,9]
 no suitable constructor found for 

[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats

2017-07-05 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075128#comment-16075128
 ] 

Pengcheng Xiong commented on HIVE-16996:


[~ashutoshc] and [~hagleitn], it seems that if we shift from FM to HLL, we will 
already have lots of plan changes, let alone the aggregation of partition 
stats. Do you want to try it in a single step (i.e., replace FM with HLL and do 
the aggregation) or split it into 2 steps? Thanks.

> Add HLL as an alternative to FM sketch to compute stats
> ---
>
> Key: HIVE-16996
> URL: https://issues.apache.org/jira/browse/HIVE-16996
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-16966.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17045) Add HyperLogLog as an UDAF

2017-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17045:
---
Attachment: HIVE-17045.01.patch

> Add HyperLogLog as an UDAF
> --
>
> Key: HIVE-17045
> URL: https://issues.apache.org/jira/browse/HIVE-17045
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17045.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17045) Add HyperLogLog as an UDAF

2017-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-17045:
---
Status: Patch Available  (was: Open)

> Add HyperLogLog as an UDAF
> --
>
> Key: HIVE-17045
> URL: https://issues.apache.org/jira/browse/HIVE-17045
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-17045.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17045) Add HyperLogLog as an UDAF

2017-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-17045:
--


> Add HyperLogLog as an UDAF
> --
>
> Key: HIVE-17045
> URL: https://issues.apache.org/jira/browse/HIVE-17045
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16961) Hive on Spark leaks spark application in case user cancels query and closes session

2017-07-05 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16961:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks to Rui for the review.

> Hive on Spark leaks spark application in case user cancels query and closes 
> session
> ---
>
> Key: HIVE-16961
> URL: https://issues.apache.org/jira/browse/HIVE-16961
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 3.0.0
>
> Attachments: HIVE-16961.patch, HIVE-16961.patch
>
>
> It's been found that a Spark application is leaked when the user cancels a query and 
> closes the session while Hive is waiting for the remote driver to connect back. 
> This was found for asynchronous query execution, but it seems equally 
> applicable to synchronous submission when the session is abruptly closed. The 
> leaked Spark application that runs the Spark driver connects back to Hive 
> successfully and runs forever (until HS2 restarts), but receives no job 
> submissions because the session is already closed. Ideally, Hive should 
> reject the connection from the driver so the driver will exit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16962) Better error msg for Hive on Spark in case user cancels query and closes session

2017-07-05 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074240#comment-16074240
 ] 

Xuefu Zhang edited comment on HIVE-16962 at 7/5/17 5:23 PM:


I updated the fix version to 3.0.0. There are too many pending releases; it's 
very confusing.


was (Author: xuefuz):
@lefty, thank for pointing it out. I lost track of the releases. I committed it 
to master and have no plan to commit to other branches. What's the right fix 
version then?

> Better error msg for Hive on Spark in case user cancels query and closes 
> session
> 
>
> Key: HIVE-16962
> URL: https://issues.apache.org/jira/browse/HIVE-16962
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 3.0.0
>
> Attachments: HIVE-16962.2.patch, HIVE-16962.patch, HIVE-16962.patch
>
>
> In case user cancels a query and closes the session, Hive marks the query as 
> failed. However, the error message is a little confusing. It still says:
> {quote}
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark 
> client. This is likely because the queue you assigned to does not have free 
> resource at the moment to start the job. Please check your queue usage and 
> try the query again later.
> {quote}
> followed by some InterruptedException.
> Ideally, the error should clearly indicate the fact that the user cancelled the 
> execution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16962) Better error msg for Hive on Spark in case user cancels query and closes session

2017-07-05 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-16962:
---
Fix Version/s: (was: 2.2.0)
   3.0.0

> Better error msg for Hive on Spark in case user cancels query and closes 
> session
> 
>
> Key: HIVE-16962
> URL: https://issues.apache.org/jira/browse/HIVE-16962
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Fix For: 3.0.0
>
> Attachments: HIVE-16962.2.patch, HIVE-16962.patch, HIVE-16962.patch
>
>
> In case user cancels a query and closes the session, Hive marks the query as 
> failed. However, the error message is a little confusing. It still says:
> {quote}
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create spark 
> client. This is likely because the queue you assigned to does not have free 
> resource at the moment to start the job. Please check your queue usage and 
> try the query again later.
> {quote}
> followed by some InterruptedException.
> Ideally, the error should clearly indicate the fact that the user cancelled the 
> execution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17018) Small table is converted to map join even the total size of small tables exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)

2017-07-05 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075106#comment-16075106
 ] 

Xuefu Zhang commented on HIVE-17018:


Ping [~csun] for comments.

> Small table is converted to map join even the total size of small tables 
> exceeds the threshold(hive.auto.convert.join.noconditionaltask.size)
> -
>
> Key: HIVE-17018
> URL: https://issues.apache.org/jira/browse/HIVE-17018
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
>
>  We use "hive.auto.convert.join.noconditionaltask.size" as the threshold: if 
> the sum of the sizes of n-1 of the tables/partitions in an n-way join is 
> smaller than it, the join will be converted to a map join. For example, take A join B 
> join C join D join E. The big table is A(100M), and the small tables are 
> B(10M), C(10M), D(10M), E(10M). Suppose we set 
> hive.auto.convert.join.noconditionaltask.size=20M. In the current code, E, D and B 
> will be converted to map joins, but C will not be. In my 
> understanding, because hive.auto.convert.join.noconditionaltask.size can only 
> contain E and D, C and B should not be converted to map joins.
> Let's explain in more detail why E can be converted to a map join.
> In the current code, 
> [SparkMapJoinOptimizer#getConnectedMapJoinSize|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L364]
>  calculates the size of all the map joins in the parent path and child path. The search 
> stops when encountering a [UnionOperator or 
> ReduceOperator|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L381].
>  C is not converted to a map join because {{(connectedMapJoinSize + 
> totalSize) > maxSize}} [see 
> code|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#L330]. The
>  RS before the join of C therefore remains. When calculating whether B will be 
> converted to a map join, {{getConnectedMapJoinSize}} returns 0 on encountering that 
> [RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java#409]
>  and so {{(connectedMapJoinSize + totalSize) < maxSize}} matches.
> [~xuefuz] or [~jxiang]: can you help see whether this is a bug or not, as you 
> are more familiar with SparkMapJoinOptimizer?
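
For illustration, a toy walk-through of the threshold arithmetic in the 
A/B/C/D/E example above (an assumption about the intended semantics, not the 
actual optimizer code):

{code}
public class MapJoinThresholdDemo {
  public static void main(String[] args) {
    long maxSize = 20L << 20;                    // noconditionaltask.size = 20M
    long[] smallTables = {10L << 20, 10L << 20,  // E, D
                          10L << 20, 10L << 20}; // C, B
    long connected = 0;
    for (long size : smallTables) {
      boolean convert = connected + size <= maxSize;
      System.out.println("size=" + (size >> 20) + "M convert=" + convert);
      if (convert) {
        connected += size;  // only E and D fit under the 20M threshold
      }
    }
  }
}
{code}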



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16993) ThriftHiveMetastore.create_database can fail if the locationUri is not set

2017-07-05 Thread Dan Burkert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Burkert updated HIVE-16993:
---
Attachment: HIVE-17008.6.patch

> ThriftHiveMetastore.create_database can fail if the locationUri is not set
> --
>
> Key: HIVE-16993
> URL: https://issues.apache.org/jira/browse/HIVE-16993
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-16993.0-master.patch, HIVE-16993.1-master.patch, 
> HIVE-16993.2.patch, HIVE-16993.3.patch, HIVE-16993.4.patch, 
> HIVE-16993.5.patch, HIVE-17008.6.patch
>
>
> Calling 
> [{{ThriftHiveMetastore.create_database}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L1078]
>  with a database with an unset {{locationUri}} field through the C++ 
> implementation fails with:
> {code}
> MetaException(message=java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
> {code}
> The 
> [{{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/if/hive_metastore.thrift#L270]
>  Thrift field is 'default requiredness (implicit)', and Thrift [does not 
> specify|https://thrift.apache.org/docs/idl#default-requiredness-implicit] 
> whether unset default requiredness fields are encoded.  Empirically, the Java 
> generated code [does not write the 
> {{locationUri}}|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java#L938-L942]
>  when the field is unset, while the C++ generated code 
> [does|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp#L3888-L3890].
> The MetaStore treats the field as optional, and [fills in a default 
> value|https://github.com/apache/hive/blob/3fa48346d509813977cd3c7622d581c0ccd51e99/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L867-L871]
>  if the field is unset.
> The end result is that when the C++ implementation sends a {{Database}} 
> without the field set, it actually writes an empty string, and the MetaStore 
> treats it as a set field (non-null), and then calls a {{Path}} API which 
> rejects the empty string.  The fix is simple: make the {{locationUri}} field 
> optional in metastore.thrift.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist

2017-07-05 Thread Dan Burkert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075068#comment-16075068
 ] 

Dan Burkert commented on HIVE-17008:


Hi [~mohitsabharwal], here is a better diff of the patch: 
https://github.com/danburkert/hive/commit/ec115a584c4b2b715f339458afc2bcbf353d1e47?w=1.
  Since filing this bug / uploading the patch I've found that the HMS can fire 
event listeners on almost any type of failed DDL operation: drop database, 
create table, partitions, functions, indices, etc.  The patch only fixes the 
drop database case, but the fix is pretty much the same.  It's not clear to me 
what the designed behavior is, though.  Are these just copy/pasted bugs, or is 
it by design that the HMS notifies listeners for failed DDL operations?

> HiveMetastore.drop_database can return NPE if database does not exist
> -
>
> Key: HIVE-17008
> URL: https://issues.apache.org/jira/browse/HIVE-17008
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17008.0.patch
>
>
> When dropping a non-existent database, the HMS will still fire registered 
> {{DROP_DATABASE}} event listeners.  This results in an NPE when the listeners 
> attempt to deref the {{null}} database parameter.
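
For illustration, a minimal sketch of the guard (an assumption about the 
listener wiring and the {{DropDatabaseEvent}} constructor on this code path, 
not necessarily what the attached patch does):

{code}
import java.util.List;
import org.apache.hadoop.hive.metastore.MetaStoreEventListener;
import org.apache.hadoop.hive.metastore.api.Database;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.events.DropDatabaseEvent;

public class DropDatabaseGuard {
  // Skip firing DROP_DATABASE listeners when the lookup found no database,
  // so listeners never dereference a null Database.
  static void notifyDropDatabase(Database db, boolean success,
      List<MetaStoreEventListener> listeners) throws MetaException {
    if (db == null) {
      return;  // nothing to notify about; avoids the NPE described above
    }
    for (MetaStoreEventListener listener : listeners) {
      listener.onDropDatabase(new DropDatabaseEvent(db, success, null));
    }
  }
}
{code}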



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16355) Service: embedded mode should only be available if service is loaded onto the classpath

2017-07-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075036#comment-16075036
 ] 

Zoltan Haindrich commented on HIVE-16355:
-

[~thejas], [~vgumashta], [~hagleitn] could you please take a look?

> Service: embedded mode should only be available if service is loaded onto the 
> classpath
> ---
>
> Key: HIVE-16355
> URL: https://issues.apache.org/jira/browse/HIVE-16355
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Server Infrastructure
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16355.1.patch, HIVE-16355.2.patch, 
> HIVE-16355.2.patch, HIVE-16355.3.patch, HIVE-16355.4.patch
>
>
> I would like to relax the hard reference to 
> {{EmbeddedThriftBinaryCLIService}} so that it is only used in case the {{service}} 
> module is loaded onto the classpath.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075030#comment-16075030
 ] 

Mohit Sabharwal commented on HIVE-17022:


For a table with a large number of partitions, it will indeed print a lot of log 
statements. But they are only emitted under debug mode, when you are debugging and 
could use this info.
For a complex query involving many write entities, it is otherwise hard to tell 
which lock is being taken for which entity.

Earlier, we had these statements at INFO, which we changed to DEBUG in 
HIVE-12966 to avoid noise.

We already have the debug statement for ZooKeeperHiveLockManager, but it did not
print the actual lock mode, so I added that in this patch.

For EmbeddedLockManager, which is useful when debugging locally, we had no 
debug statements whatsoever, so I added them in this patch.

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075017#comment-16075017
 ] 

Naveen Gangam commented on HIVE-17022:
--

[~mohitsabharwal] I am a bit worried that the following code generates a lot of 
noise in the logs, given the frequency at which the lock() method is called and the 
number of locks that can be held at any particular point in time. Have you seen 
otherwise?
{code}
for (HiveLockObj obj : objs) {
  LOG.debug("Acquiring lock for {} with mode {}", obj.getObj().getName(),
  obj.getMode());
}
{code}
Thanks

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17008) HiveMetastore.drop_database can return NPE if database does not exist

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074983#comment-16074983
 ] 

Mohit Sabharwal commented on HIVE-17008:


[~dan_impala_9180], the changes aren't clear to me; they look like changes in 
indentation. Could you add a review board link?

Also, could you add the NPE stack trace you are seeing to the description?

You also need a unit test here, perhaps in TestDbNotificationListener.
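
For context, the shape of the fix would presumably be an existence check before 
firing listeners, along these lines (editor's sketch; the surrounding helper 
names are assumptions based on HMS conventions, see the actual patch for the 
real change):

{code:java}
// Resolve the database first; throwing NoSuchObjectException here means the
// DROP_DATABASE listeners are never invoked with a null Database.
Database db = get_database_core(name); // assumed helper; throws if absent
// ... drop contained tables/functions and the database itself ...
for (MetaStoreEventListener listener : listeners) {
  // db is guaranteed non-null at this point
  listener.onDropDatabase(new DropDatabaseEvent(db, true, this));
}
{code}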

> HiveMetastore.drop_database can return NPE if database does not exist
> -
>
> Key: HIVE-17008
> URL: https://issues.apache.org/jira/browse/HIVE-17008
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Dan Burkert
>Assignee: Dan Burkert
> Attachments: HIVE-17008.0.patch
>
>
> When dropping a non-existent database, the HMS will still fire registered 
> {{DROP_DATABASE}} event listeners.  This results in an NPE when the listeners 
> attempt to deref the {{null}} database parameter.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-16883) HBaseStorageHandler Ignores Case for HBase Table Name

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li reassigned HIVE-16883:
--

Assignee: Bing Li

> HBaseStorageHandler Ignores Case for HBase Table Name
> -
>
> Key: HIVE-16883
> URL: https://issues.apache.org/jira/browse/HIVE-16883
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.2.1
> Environment: Hortonworks HDP 2.6.0.3, CentOS 7.0, VMWare ESXI
>Reporter: Shawn Weeks
>Assignee: Bing Li
>Priority: Minor
>
> Currently the HBaseStorageHandler is lower-casing the HBase table name. This 
> prevents use of the storage handler with existing HBase tables whose names are 
> not all lower case. Looking at the source, this was done intentionally, but I 
> haven't found any documentation on the wiki explaining why. To avoid changing 
> the default behavior, I'd suggest adding an additional property to the serde. 
> {code}
> create 'TestTable', 'd'
> create external table `TestTable` (
> id bigint,
> hash String,
> location String,
> name String
> )
> stored by "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
> with serdeproperties (
> "hbase.columns.mapping" = ":key,d:hash,d:location,d:name",
> "hbase.table.name" = "TestTable"
> );
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"

2017-07-05 Thread Bing Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bing Li reassigned HIVE-16922:
--

Assignee: Bing Li

> Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
> ---
>
> Key: HIVE-16922
> URL: https://issues.apache.org/jira/browse/HIVE-16922
> Project: Hive
>  Issue Type: Bug
>  Components: Thrift API
>Reporter: Dudu Markovitz
>Assignee: Bing Li
>
> https://github.com/apache/hive/blob/master/serde/if/serde.thrift
> Typo in serde.thrift: 
> COLLECTION_DELIM = "colelction.delim"
> (*colelction* instead of *collection*)
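
Worth noting for whoever fixes this: the misspelling is baked into the value of 
the generated constant, so existing metadata and any code matching on the literal 
string depend on the typo. A small editor's illustration (the constant and its 
value come from the generated serdeConstants class):

{code:java}
import org.apache.hadoop.hive.serde.serdeConstants;

public class CollectionDelimDemo {
  public static void main(String[] args) {
    // The constant *name* is spelled correctly, but its *value* carries the
    // typo, so table properties written so far use the key "colelction.delim".
    System.out.println(serdeConstants.COLLECTION_DELIM); // prints "colelction.delim"
  }
}
{code}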



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-1938) Cost Based Query optimization for Joins in Hive

2017-07-05 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-1938.
---
Resolution: Duplicate

> Cost Based Query optimization for Joins in Hive
> ---
>
> Key: HIVE-1938
> URL: https://issues.apache.org/jira/browse/HIVE-1938
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor, Statistics
> Environment: *nix,java
>Reporter: bharath v
>Assignee: bharath v
>
> Current optimization in Hive is just rule-based and involves applying a set 
> of rules on the plan tree. This depends on hints given by the user (which may 
> or may not be correct) and might result in execution of costlier plans. So 
> this jira aims at building a cost model which can give a good estimate of 
> various plans beforehand (using some metadata already collected) so that we 
> can choose the plan which incurs the least cost.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-33) [Hive]: Add optimizer statistics in Hive

2017-07-05 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-33.
-
Resolution: Duplicate

> [Hive]: Add optimizer statistics in Hive
> 
>
> Key: HIVE-33
> URL: https://issues.apache.org/jira/browse/HIVE-33
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Statistics
>Reporter: Ashish Thusoo
>  Labels: statistics
>
> Add commands to collect partition and column level statistics in hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17022) Add mode in lock debug statements

2017-07-05 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074828#comment-16074828
 ] 

Mohit Sabharwal commented on HIVE-17022:


Test failures are unrelated.

> Add mode in lock debug statements
> -
>
> Key: HIVE-17022
> URL: https://issues.apache.org/jira/browse/HIVE-17022
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>Priority: Trivial
> Attachments: HIVE-17022.patch
>
>
> Currently, lock debug statements print IMPLICIT/EXPLICIT as lock mode,
> whereas SHARED/EXCLUSIVE/SEMI_SHARED are more useful
> when debugging.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-787) Hive Freeway - support near-realtime data processing

2017-07-05 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-787.
--
Resolution: Won't Fix

The rise of countless stream-processing solutions makes it unlikely anyone 
would attempt to implement this directly. It looks like people are using 
Calcite in real-time platforms like Samza, so in effect I would say this was 
done in another way. Reopen if you feel differently.

> Hive Freeway - support near-realtime data processing
> 
>
> Key: HIVE-787
> URL: https://issues.apache.org/jira/browse/HIVE-787
> Project: Hive
>  Issue Type: New Feature
>Reporter: Zheng Shao
>
> Most people are using Hive for daily (or at most hourly) data processing.
> We want to explore what the obstacles are to using Hive at 15-minute, 
> 5-minute or even 1-minute data processing intervals, and remove those 
> obstacles.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17026) HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master

2017-07-05 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074762#comment-16074762
 ] 

Fei Hui commented on HIVE-17026:


Hi, [~cartershanklin] 
Looking at the source code, Hive {{Version = 3.0.0}} only provides the hive2 
driver, which is why you get 'java.sql.SQLException: No suitable driver found 
for jdbc:hive://'.
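
For reference, a minimal connection sketch using the HiveServer2 driver that 
does work (editor's illustration; host and port are placeholders):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;

public class Hive2Connect {
  public static void main(String[] args) throws Exception {
    // The HiveServer2 driver registers for jdbc:hive2://, not jdbc:hive://.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn =
        DriverManager.getConnection("jdbc:hive2://localhost:10000/default")) {
      System.out.println("connected: " + !conn.isClosed());
    }
  }
}
{code}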

> HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master
> ---
>
> Key: HIVE-17026
> URL: https://issues.apache.org/jira/browse/HIVE-17026
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Fei Hui
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> The docs at http://www.hplsql.org/configuration#hplsqlconnhiveconn state:
> {noformat}
>   hplsql.conn.hiveconn
>   org.apache.hadoop.hive.jdbc.HiveDriver;jdbc:hive://
> {noformat}
> If you use that on current master you get:
> {noformat}
> java.sql.SQLException: No suitable driver found for jdbc:hive://
> {noformat}
> If you use hive2 it works fine. It's not clear to me whether that's a change 
> from prior versions or not.
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17026) HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master

2017-07-05 Thread Fei Hui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui reassigned HIVE-17026:
--

Assignee: Fei Hui

> HPL/SQL Example for hplsql.conn.hiveconn doesn't work on master
> ---
>
> Key: HIVE-17026
> URL: https://issues.apache.org/jira/browse/HIVE-17026
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Carter Shanklin
>Assignee: Fei Hui
>
> This bug is part of a series of issues and surprising behavior I encountered 
> writing a reporting script that would aggregate values and give rows 
> different classifications based on the aggregate. Addressing some or all 
> of these issues would make HPL/SQL more accessible to newcomers.
> The docs at http://www.hplsql.org/configuration#hplsqlconnhiveconn state:
> {noformat}
>   hplsql.conn.hiveconn
>   org.apache.hadoop.hive.jdbc.HiveDriver;jdbc:hive://
> {noformat}
> If you use that on current master you get:
> {noformat}
> java.sql.SQLException: No suitable driver found for jdbc:hive://
> {noformat}
> If you use hive2 it works fine. It's not clear to me whether that's a change 
> from prior versions or not.
> Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17013) Delete request with a subquery based on select over a view

2017-07-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074743#comment-16074743
 ] 

Frédéric ESCANDELL commented on HIVE-17013:
---

I would like to complete the information given in the description of the ticket:

I'm using Hive 1.2.1000.2.6.0.3-8.

I think this bug could come from the patch for 
https://issues.apache.org/jira/browse/HIVE-15970, and more particularly the 
snippet of code below:

{code:java}
throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
    "' to be in sub-query or set operation.");
{code}

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java]

{code:java}
/**
 * The suffix is always relative to a given ASTNode
 */
public DestClausePrefix getDestNamePrefix(ASTNode curNode) {
  assert curNode != null : "must supply curNode";
  if(curNode.getType() != HiveParser.TOK_INSERT_INTO) {
    //select statement
    assert curNode.getType() == HiveParser.TOK_DESTINATION;
    if(operation == Operation.OTHER) {
      //not an 'interesting' op
      return DestClausePrefix.INSERT;
    }
    //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table
    //it doesn't require a special Acid code path - the reset of the code here is to ensure
    //the tree structure is what we expect
    boolean thisIsInASubquery = false;
    parentLoop: while(curNode.getParent() != null) {
      curNode = (ASTNode) curNode.getParent();
      switch (curNode.getType()) {
        case HiveParser.TOK_SUBQUERY_EXPR:
          //this is a real subquery (foo IN (select ...))
        case HiveParser.TOK_SUBQUERY:
          //this is a Derived Table Select * from (select a from ...))
          //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant
        case HiveParser.TOK_UNIONALL:
        case HiveParser.TOK_UNIONDISTINCT:
        case HiveParser.TOK_EXCEPTALL:
        case HiveParser.TOK_EXCEPTDISTINCT:
        case HiveParser.TOK_INTERSECTALL:
        case HiveParser.TOK_INTERSECTDISTINCT:
          thisIsInASubquery = true;
          break parentLoop;
      }
    }
    if(!thisIsInASubquery) {
      throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
          "' to be in sub-query or set operation.");
    }
    return DestClausePrefix.INSERT;
  }
{code}


> Delete request with a subquery based on select over a view
> --
>
> Key: HIVE-17013
> URL: https://issues.apache.org/jira/browse/HIVE-17013
> Project: Hive
>  Issue Type: Bug
>Reporter: Frédéric ESCANDELL
>Priority: Blocker
>
> Hi, 
> I based my DDL on this exemple 
> https://fr.hortonworks.com/tutorial/using-hive-acid-transactions-to-insert-update-and-delete-data/.
> In a delete request, the use of a view in a subquery throws an exception:
> {code}
> FAILED: IllegalStateException Expected 'insert into table default.mydim 
> select ROW__ID from default.mydim sort by ROW__ID' to be in sub-query or set 
> operation.
> {code}
> {code:sql}
> drop table if exists mydim;
> create table mydim (key int, name string, zip string, is_current boolean)
> clustered by(key) into 3 buckets
> stored as orc tblproperties ('transactional'='true');
> insert into mydim values
>   (1, 'bob',   '95136', true),
>   (2, 'joe',   '70068', true),
>   (3, 'steve', '22150', true);
> drop table if exists updates_staging_table;
> create table updates_staging_table (key int, newzip string);
> insert into updates_staging_table values (1, 87102), (3, 45220);
> drop view if exists updates_staging_view;
> create view updates_staging_view (key, newzip) as select key, newzip from 
> updates_staging_table;
> delete from mydim
> where mydim.key in (select key from updates_staging_view);
> FAILED: IllegalStateException Expected 'insert into table default.mydim 
> select ROW__ID from default.mydim sort by ROW__ID' to be in sub-query or set 
> operation.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17013) Delete request with a subquery based on select over a view

2017-07-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074743#comment-16074743
 ] 

Frédéric ESCANDELL edited comment on HIVE-17013 at 7/5/17 1:24 PM:
---

I would like to complete the information given in the description of the ticket:

I'm using Hive 1.2.1000.2.6.0.3-8.

I think this bug could come from the patch for 
https://issues.apache.org/jira/browse/HIVE-15970, and more particularly the 
snippet of code below:

{code:java}
throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
    "' to be in sub-query or set operation.");
{code}

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java]

{code:java}
/**
 * The suffix is always relative to a given ASTNode
 */
public DestClausePrefix getDestNamePrefix(ASTNode curNode) {
  assert curNode != null : "must supply curNode";
  if(curNode.getType() != HiveParser.TOK_INSERT_INTO) {
    //select statement
    assert curNode.getType() == HiveParser.TOK_DESTINATION;
    if(operation == Operation.OTHER) {
      //not an 'interesting' op
      return DestClausePrefix.INSERT;
    }
    //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table
    //it doesn't require a special Acid code path - the reset of the code here is to ensure
    //the tree structure is what we expect
    boolean thisIsInASubquery = false;
    parentLoop: while(curNode.getParent() != null) {
      curNode = (ASTNode) curNode.getParent();
      switch (curNode.getType()) {
        case HiveParser.TOK_SUBQUERY_EXPR:
          //this is a real subquery (foo IN (select ...))
        case HiveParser.TOK_SUBQUERY:
          //this is a Derived Table Select * from (select a from ...))
          //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant
        case HiveParser.TOK_UNIONALL:
        case HiveParser.TOK_UNIONDISTINCT:
        case HiveParser.TOK_EXCEPTALL:
        case HiveParser.TOK_EXCEPTDISTINCT:
        case HiveParser.TOK_INTERSECTALL:
        case HiveParser.TOK_INTERSECTDISTINCT:
          thisIsInASubquery = true;
          break parentLoop;
      }
    }
    if(!thisIsInASubquery) {
      throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
          "' to be in sub-query or set operation.");
    }
    return DestClausePrefix.INSERT;
  }
{code}



was (Author: fescandell):
I would like to complete information given in the description of the ticket : 

I'm using Hive 1.2.1000.2.6.0.3-8.

I think this bug could come from the patch of this ticket 
https://issues.apache.org/jira/browse/HIVE-15970 and more particularly the 
snippet of code below: 

 throw new IllegalStateException("Expected '" + getMatchedText(curNode) + "' to 
be in sub-query or set operation.");

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Context.java]

{code:java}
/**
 * The suffix is always relative to a given ASTNode
 */
public DestClausePrefix getDestNamePrefix(ASTNode curNode) {
  assert curNode != null : "must supply curNode";
  if(curNode.getType() != HiveParser.TOK_INSERT_INTO) {
    //select statement
    assert curNode.getType() == HiveParser.TOK_DESTINATION;
    if(operation == Operation.OTHER) {
      //not an 'interesting' op
      return DestClausePrefix.INSERT;
    }
    //if it is an 'interesting' op but it's a select it must be a sub-query or a derived table
    //it doesn't require a special Acid code path - the reset of the code here is to ensure
    //the tree structure is what we expect
    boolean thisIsInASubquery = false;
    parentLoop: while(curNode.getParent() != null) {
      curNode = (ASTNode) curNode.getParent();
      switch (curNode.getType()) {
        case HiveParser.TOK_SUBQUERY_EXPR:
          //this is a real subquery (foo IN (select ...))
        case HiveParser.TOK_SUBQUERY:
          //this is a Derived Table Select * from (select a from ...))
          //strictly speaking SetOps should have a TOK_SUBQUERY parent so next 6 items are redundant
        case HiveParser.TOK_UNIONALL:
        case HiveParser.TOK_UNIONDISTINCT:
        case HiveParser.TOK_EXCEPTALL:
        case HiveParser.TOK_EXCEPTDISTINCT:
        case HiveParser.TOK_INTERSECTALL:
        case HiveParser.TOK_INTERSECTDISTINCT:
          thisIsInASubquery = true;
          break parentLoop;
      }
    }
    if(!thisIsInASubquery) {
      throw new IllegalStateException("Expected '" + getMatchedText(curNode) +
          "' to be in sub-query or set operation.");
    }
    return DestClausePrefix.INSERT;
  }
{code}


> Delete request with a subquery based on select over a view
> --
>
> Key: 

[jira] [Updated] (HIVE-17038) invalid result when CAST-ing to DATE

2017-07-05 Thread Jim Hopper (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Hopper updated HIVE-17038:
--
Description: 
When casting invalid date literals to the DATE data type, Hive returns wrong 
values instead of NULL.

{code}

SELECT CAST('2017-02-31' AS DATE);
SELECT CAST('2017-04-31' AS DATE);

{code}


  was:
When casting invalid date literals to the DATE data type, Hive returns wrong 
values instead of NULL.

{code}

SELECT CAST('2017-02-31' AS DATE);
SELECT CAST('2017-05-31' AS DATE);

{code}
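
The likely mechanism (editor's note: this assumes the 1.x date parser relies on 
a lenient java.text.SimpleDateFormat, which rolls invalid day-of-month values 
forward instead of rejecting them) can be reproduced in plain Java:

{code:java}
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class LenientDateDemo {
  public static void main(String[] args) throws ParseException {
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
    // Lenient parsing (the default) rolls 2017-02-31 forward to 2017-03-03.
    System.out.println(fmt.parse("2017-02-31"));
    // Strict parsing rejects the invalid literal instead.
    fmt.setLenient(false);
    System.out.println(fmt.parse("2017-02-31")); // throws ParseException
  }
}
{code}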



> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>
> When casting invalid date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17037) Extend join algorithm selection to avoid unnecessary input data shuffle

2017-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16074723#comment-16074723
 ] 

Hive QA commented on HIVE-17037:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12875736/HIVE-17037.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 10833 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=237)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partialsmbjoin] 
(batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_smb_mapjoin_14]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_gby] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_gby] 
(batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_exists]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_subq_not_in]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cross_product_check_1]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[jdbc_handler]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mrr] 
(batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_exists]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_nested_subquery]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_null_agg]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_select]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[table_access_keys_stats]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_1] 
(batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_grouping_sets4]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=99)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=98)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=177)
{noformat}

Test results: 
