[jira] [Commented] (HIVE-11852) numRows and rawDataSize table properties are not replicated

2015-09-22 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903322#comment-14903322
 ] 

Sushanth Sowmyan commented on HIVE-11852:
-

[~alangates], can I bug you for a review? (Most of the patch file size is the 
.q and the .out, I promise this time it's not a huge patch dump. :D )

> numRows and rawDataSize table properties are not replicated
> ---
>
> Key: HIVE-11852
> URL: https://issues.apache.org/jira/browse/HIVE-11852
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
>Reporter: Paul Isaychuk
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11852.patch
>
>
> numRows and rawDataSize table properties are not replicated when a table is 
> exported for replication and re-imported.
> {code}
> Table drdbnonreplicatabletable.vanillatable has different TblProps from 
> drdbnonreplicatabletable.vanillatable expected [{numFiles=1, numRows=2, 
> totalSize=560, rawDataSize=440}] but found [{numFiles=1, totalSize=560}]
> java.lang.AssertionError: Table drdbnonreplicatabletable.vanillatable has 
> different TblProps from drdbnonreplicatabletable.vanillatable expected 
> [{numFiles=1, numRows=2, totalSize=560, rawDataSize=440}] but found 
> [{numFiles=1, totalSize=560}]
> {code}
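
For context, the repro shape is an export-for-replication followed by an import; the stats properties should survive the round trip. A minimal sketch (table name and path are hypothetical; FOR REPLICATION is Hive's replication-aware export):

{code}
-- Hypothetical names and paths, for illustration only.
EXPORT TABLE vanillatable TO '/tmp/repl/vanillatable' FOR REPLICATION('1');
IMPORT TABLE vanillatable_copy FROM '/tmp/repl/vanillatable';
-- Expected: numRows and rawDataSize appear here after the import.
SHOW TBLPROPERTIES vanillatable_copy;
{code}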



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11897) JDO rollback can throw pointless exceptions

2015-09-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903147#comment-14903147
 ] 

Hive QA commented on HIVE-11897:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761534/HIVE-11897.01.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9576 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5374/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5374/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5374/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761534 - PreCommit-HIVE-TRUNK-Build

> JDO rollback can throw pointless exceptions
> ---
>
> Key: HIVE-11897
> URL: https://issues.apache.org/jira/browse/HIVE-11897
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11897.01.patch, HIVE-11897.patch
>
>
> DataNucleus does a bunch of work before the actual rollback, with each 
> successive step in a finally block; that way, even if the earlier steps fail, 
> the rollback should still happen. However, an exception from some questionable 
> pre-rollback logic, such as manipulating the result set after a failure, can 
> affect the DirectSQL fallback.
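
A hedged sketch of the pattern the description points at (illustrative names only, not the actual DataNucleus or Hive code): chaining finally blocks so a failure in any pre-rollback step cannot prevent the rollback itself.

{code}
// Illustrative sketch only. Each earlier step may throw, but the
// finally chain guarantees the later steps and the rollback still run.
try {
  closeResultSets();        // questionable pre-rollback work; may throw
} finally {
  try {
    releaseStatements();    // still runs if the step above threw
  } finally {
    transaction.rollback(); // the rollback itself always runs
  }
}
{code}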



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903327#comment-14903327
 ] 

Thejas M Nair commented on HIVE-11915:
--

[~taksaito] Can you please include the details of the exception in this jira?

[~sershe]
Looking at the BoneCP code, it does not look like 
config.setMaxConnectionAgeInSeconds(0) will have an impact in our case. The 
default is 0, and it wasn't being modified.
It looks like markPossiblyBroken is called on exception anyway. Should we just 
catch SQLException and retry instead of using reflection?


> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903332#comment-14903332
 ] 

Sergey Shelukhin commented on HIVE-11915:
-

There's no SQLException, there's simply a closed connection being returned. The 
exception happens later when Hive code tries to use it.

> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-11915:
-
Reporter: Takahiko Saito  (was: Sergey Shelukhin)

> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11898) support default partition in metastoredirectsql

2015-09-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903344#comment-14903344
 ] 

Hive QA commented on HIVE-11898:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761547/HIVE-11898.01.patch

{color:red}ERROR:{color} -1 due to 35 failed/errored test(s), 9576 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_update_status
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_part_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_filter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_partitions_filter3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial_ndv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_rc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_compact_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_only_queries_with_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_split_elimination
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partcols1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_ppr_all
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries_with_filters
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5375/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5375/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5375/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 35 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761547 - PreCommit-HIVE-TRUNK-Build

> support default partition in metastoredirectsql
> ---
>
> Key: HIVE-11898
> URL: https://issues.apache.org/jira/browse/HIVE-11898
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11898.01.patch, HIVE-11898.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11765) SMB Join fails in Hive 1.2

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902063#comment-14902063
 ] 

Prasanth Jayachandran commented on HIVE-11765:
--

Also, let me know if you are using tez or mr.

> SMB Join fails in Hive 1.2
> --
>
> Key: HIVE-11765
> URL: https://issues.apache.org/jira/browse/HIVE-11765
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Na Yang
>Assignee: Prasanth Jayachandran
>
> SMB join on Hive 1.2 fails with the following stack trace :
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
> at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:173)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
> ... 11 more
> Caused by: java.lang.IndexOutOfBoundsException: toIndex = 5
> at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
> at java.util.ArrayList.subList(ArrayList.java:996)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.getSchemaOnRead(RecordReaderFactory.java:161)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:66)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:202)
> at
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:230)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:163)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1104)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
> {code}
> This error happens after adding the patch of HIVE-10591. Reverting HIVE-10591 
> fixes this exception. 
> Steps to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> create Table table1 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) INTO 384 buckets 
> stored as ORC;
> create Table table2 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) into 384 buckets 
> stored as ORC;
> create Table table_tmp (empID int, name varchar(64), email varchar(64), 
> company varchar(64), age int);
> load data local inpath '/tmp/employee.csv' into table table_tmp;
> INSERT OVERWRITE table  table1 select * from table_tmp;
> INSERT OVERWRITE table  table2 select * from table_tmp;
> SELECT table1.age, table2.age from table1 inner join table2 on 
> table1.age=table2.age;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902048#comment-14902048
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11915:
--

[~sershe] +1 for the smaller patch which I believe should not cause performance 
degradation in any case.

Thanks
Hari

> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11912) Make snappy compression default for parquet tables

2015-09-22 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902005#comment-14902005
 ] 

Szehon Ho commented on HIVE-11912:
--

Thanks, guys, for the review! It should not happen (the same serdeProps are only 
set for the 'alter table' code path), but I will add a check anyway and post a 
new patch.

> Make snappy compression default for parquet tables
> --
>
> Key: HIVE-11912
> URL: https://issues.apache.org/jira/browse/HIVE-11912
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-11912.patch
>
>
> Snappy is a popular compression codec for Parquet, and is the default in many 
> Parquet applications, increasing the performance.  
> This change would make it the default for new Hive Parquet tables.
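
For reference, the same effect is already available per table via the Parquet compression table property (a sketch, assuming the standard 'parquet.compression' property; table and columns are illustrative):

{code}
-- Illustration: opting into Snappy explicitly for one table.
CREATE TABLE events (id INT, payload STRING)
STORED AS PARQUET
TBLPROPERTIES ('parquet.compression'='SNAPPY');
{code}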



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11699) Support special characters in quoted table names

2015-09-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11699:
---
Attachment: HIVE-11699.05.patch

Addresses comments from [~jpullokkaran] and [~leftylev].

> Support special characters in quoted table names
> 
>
> Key: HIVE-11699
> URL: https://issues.apache.org/jira/browse/HIVE-11699
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, 
> HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch
>
>
> Right now table names can only match "[a-zA-Z_0-9]+". This patch investigates 
> how much change is needed if we would like to support special characters, 
> e.g., "/" in table names.
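
For example (a hypothetical sketch, assuming the usual backtick quoting):

{code}
-- Hypothetical: a special character in a quoted table name.
CREATE TABLE `sales/2015` (id INT, amount DOUBLE);
SELECT COUNT(*) FROM `sales/2015`;
{code}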



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11765) SMB Join fails in Hive 1.2

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902067#comment-14902067
 ] 

Prasanth Jayachandran commented on HIVE-11765:
--

Both mr and tez seem to be working for me.

> SMB Join fails in Hive 1.2
> --
>
> Key: HIVE-11765
> URL: https://issues.apache.org/jira/browse/HIVE-11765
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Na Yang
>Assignee: Prasanth Jayachandran
>
> SMB join on Hive 1.2 fails with the following stack trace :
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
> at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:173)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
> ... 11 more
> Caused by: java.lang.IndexOutOfBoundsException: toIndex = 5
> at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
> at java.util.ArrayList.subList(ArrayList.java:996)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.getSchemaOnRead(RecordReaderFactory.java:161)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:66)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:202)
> at
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:230)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:163)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1104)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
> {code}
> This error happens after adding the patch of HIVE-10591. Reverting HIVE-10591 
> fixes this exception. 
> Steps to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> create Table table1 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) INTO 384 buckets 
> stored as ORC;
> create Table table2 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) into 384 buckets 
> stored as ORC;
> create Table table_tmp (empID int, name varchar(64), email varchar(64), 
> company varchar(64), age int);
> load data local inpath '/tmp/employee.csv' into table table_tmp;
> INSERT OVERWRITE table  table1 select * from table_tmp;
> INSERT OVERWRITE table  table2 select * from table_tmp;
> SELECT table1.age, table2.age from table1 inner join table2 on 
> table1.age=table2.age;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11912) Make snappy compression default for parquet tables

2015-09-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902917#comment-14902917
 ] 

Hive QA commented on HIVE-11912:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761533/HIVE-11912.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9576 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_like
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5373/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5373/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5373/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761533 - PreCommit-HIVE-TRUNK-Build

> Make snappy compression default for parquet tables
> --
>
> Key: HIVE-11912
> URL: https://issues.apache.org/jira/browse/HIVE-11912
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-11912.patch
>
>
> Snappy is a popular compression codec for Parquet, and is the default in many 
> Parquet applications, increasing the performance.  
> This change would make it the default for new Hive Parquet tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11902) Abort txn cleanup thread throws SyntaxErrorException

2015-09-22 Thread Deepesh Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903017#comment-14903017
 ] 

Deepesh Khandelwal commented on HIVE-11902:
---

The failing hcat test 
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation also 
seems unrelated.

> Abort txn cleanup thread throws SyntaxErrorException
> 
>
> Key: HIVE-11902
> URL: https://issues.apache.org/jira/browse/HIVE-11902
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-11902.patch
>
>
> When cleaning left over transactions we see the DeadTxnReaper code threw the 
> following exception:
> {noformat}
> 2015-09-21 05:23:38,148 WARN  [DeadTxnReaper-0]: txn.TxnHandler 
> (TxnHandler.java:performTimeOuts(1876)) - Aborting timedout transactions 
> failed due to You have an error in your SQL syntax; check the manual that 
> corresponds to your MySQL server version for the right syntax to use near ')' 
> at line 1(SQLState=42000,ErrorCode=1064)
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error 
> in your SQL syntax; check the manual that corresponds to your MySQL server 
> version for the right syntax to use near ')' at line 1
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
> at com.mysql.jdbc.Util.getInstance(Util.java:360)
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:978)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3887)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3823)
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2435)
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2582)
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2526)
> at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1618)
> at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1549)
> at 
> com.jolbox.bonecp.StatementHandle.executeUpdate(StatementHandle.java:497)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.abortTxns(TxnHandler.java:1275)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.performTimeOuts(TxnHandler.java:1866)
> at 
> org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService$TimedoutTxnReaper.run(AcidHouseKeeperService.java:87)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The problem here is that the method {{abortTxns(Connection dbConn, 
> List<Long> txnids)}} in 
> metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 
> creates the following bad query when the txnids list is empty:
> {code}
> delete from HIVE_LOCKS where hl_txnid in ();
> {code}
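
A hedged sketch of the kind of guard that avoids the malformed statement (illustrative code, not the actual TxnHandler patch; txnids is from the method above, stmt is an assumed JDBC Statement): return early when the id list is empty so "IN ()" is never emitted.

{code}
// Illustrative sketch only: MySQL rejects "... in ()", so skip the
// statement entirely when there is nothing to abort.
if (txnids == null || txnids.isEmpty()) {
  return;
}
StringBuilder sql = new StringBuilder("delete from HIVE_LOCKS where hl_txnid in (");
for (int i = 0; i < txnids.size(); i++) {
  if (i > 0) sql.append(',');
  sql.append(txnids.get(i));
}
sql.append(')');
stmt.executeUpdate(sql.toString());
{code}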



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902910#comment-14902910
 ] 

Ashutosh Chauhan commented on HIVE-11820:
-

There is no affected version for this, since this bug is not present in any 
release. It's present only on master.

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Fix For: 2.0.0
>
> Attachments: HIVE-11820.2.patch, HIVE-11820.2.patch, HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}
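
In other words, a sketch of the proposed fix (assuming, as the stack trace suggests, that DistCpOptions validation only allows skip-CRC once sync-folder is enabled):

{code}
// Proposed order: enable sync folder first so setSkipCRC passes validation.
options.setSyncFolder(true);
options.setSkipCRC(true);
{code}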



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902928#comment-14902928
 ] 

Xuefu Zhang commented on HIVE-11820:


Thanks. In this case, I think the "affected" version should be "2.0.0". 
Otherwise, it's hard to tell whether the field is just missing. It's normal to 
have the same "fix version" as the "affected version".

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Fix For: 2.0.0
>
> Attachments: HIVE-11820.2.patch, HIVE-11820.2.patch, HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore

2015-09-22 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903104#comment-14903104
 ] 

Aihua Xu commented on HIVE-11826:
-

Hi [~leftylev], the doc change seems unnecessary, since 
'hadoop.proxyuser.hive.groups' should work as it is supposed to; we just had 
this bug, and the patch corrects the incorrect implementation.

Let me know if anyone has a different opinion, and I can provide more info if 
needed.

> 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized 
> user to access metastore
> --
>
> Key: HIVE-11826
> URL: https://issues.apache.org/jira/browse/HIVE-11826
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11826.2.patch, HIVE-11826.patch
>
>
> With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain 
> groups, a job currently run by a user not belonging to those groups does not 
> fail to access the metastore. With the old Hive 0.13, it fails properly. 
> It seems HadoopThriftAuthBridge20S.java correctly calls ProxyUsers.authorize() 
> while HadoopThriftAuthBridge23 doesn't.
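
For reference, the configuration in question is the standard Hadoop proxy-user setting in core-site.xml (group and host values below are illustrative):

{code}
<!-- Only members of these groups may be proxied by the 'hive' user. -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>hadoop,hive</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
{code}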



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11820:

Affects Version/s: 2.0.0

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Fix For: 2.0.0
>
> Attachments: HIVE-11820.2.patch, HIVE-11820.2.patch, HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11820) export tables with size of >32MB throws "java.lang.IllegalArgumentException: Skip CRC is valid only with update options"

2015-09-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902928#comment-14902928
 ] 

Xuefu Zhang edited comment on HIVE-11820 at 9/22/15 4:36 PM:
-

Thanks. In this case, I think the "affected" version should be "2.0.0". 
Otherwise, it's hard to tell whether the field is just missing. It's normal to 
have the same "fix version" as the "affected version".


was (Author: xuefuz):
Thanks. In this case, I think the "affected" version should be "2.0.0" in this 
case. Otherwise, it's hard to tell whether the field is just missing. It's 
normal to have the same "fix version" as the "affected version".

> export tables with size of >32MB throws "java.lang.IllegalArgumentException: 
> Skip CRC is valid only with update options"
> 
>
> Key: HIVE-11820
> URL: https://issues.apache.org/jira/browse/HIVE-11820
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Takahiko Saito
>Assignee: Takahiko Saito
> Fix For: 2.0.0
>
> Attachments: HIVE-11820.2.patch, HIVE-11820.2.patch, HIVE-11820.patch
>
>
> Tested a patch of HIVE-11607 and saw the following exception:
> {noformat}
> 2015-09-14 21:44:16,817 ERROR [main]: exec.Task 
> (SessionState.java:printError(960)) - Failed with exception Skip CRC is valid 
> only with update options
> java.lang.IllegalArgumentException: Skip CRC is valid only with update options
> at 
> org.apache.hadoop.tools.DistCpOptions.validate(DistCpOptions.java:556)
> at 
> org.apache.hadoop.tools.DistCpOptions.setSkipCRC(DistCpOptions.java:311)
> at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.runDistCp(Hadoop23Shims.java:1147)
> at org.apache.hadoop.hive.common.FileUtils.copy(FileUtils.java:553)
> at org.apache.hadoop.hive.ql.exec.CopyTask.execute(CopyTask.java:82)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> A possible resolution is to reverse the order of the following two lines from 
> a patch of HIVE-11607:
> {noformat}
> +options.setSkipCRC(true);
> +options.setSyncFolder(true);
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11920) ADD JAR failing with URL schemes other than file/ivy/hdfs

2015-09-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11920:
--
Attachment: HIVE-11920.1.patch

Attaching fix plus qfile test. The qfile test passes on branch-0.14 code, so 
this is a regression.

> ADD JAR failing with URL schemes other than file/ivy/hdfs
> -
>
> Key: HIVE-11920
> URL: https://issues.apache.org/jira/browse/HIVE-11920
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11920.1.patch
>
>
> Example stack trace below. It looks like this was introduced by HIVE-9664.
> {noformat}
> 015-09-16 19:53:16,502 ERROR [main]: SessionState 
> (SessionState.java:printError(960)) - invalid url: 
> wasb:///tmp/hive-udfs-0.1.jar, expecting ( file | hdfs | ivy)  as url scheme.
> java.lang.RuntimeException: invalid url: wasb:///tmp/hive-udfs-0.1.jar, 
> expecting ( file | hdfs | ivy)  as url scheme.
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getURLType(SessionState.java:1230)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1237)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149)
> at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:301)
> at 
> org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:453)
> at 
> org.apache.hadoop.hive.ql.exec.Registry.registerPermanentFunction(Registry.java:200)
> at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerPermanentFunction(FunctionRegistry.java:1495)
> at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.createPermanentFunction(FunctionTask.java:136)
> at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:75)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
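
A hedged sketch of one scheme-agnostic alternative (illustrative only, not necessarily what the attached patch does; 'conf' and 'localDir' are assumed to be in scope): let the Hadoop FileSystem registry decide which schemes are resolvable instead of a hard-coded (file|hdfs|ivy) whitelist.

{code}
// Illustrative sketch: any URI whose scheme has a registered FileSystem
// (wasb, s3a, hdfs, ...) can be downloaded.
URI uri = new URI("wasb:///tmp/hive-udfs-0.1.jar");
FileSystem fs = FileSystem.get(uri, conf);  // throws if the scheme is unknown
fs.copyToLocalFile(new Path(uri), new Path(localDir, "hive-udfs-0.1.jar"));
{code}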



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11850) On Humboldt, creating udf function using wasb fail throwing java.lang.RuntimeException: invalid url: wasb:///... expecting ( file | hdfs | ivy) as url scheme.

2015-09-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-11850.
---
Resolution: Duplicate

> On Humboldt, creating udf function using wasb fail throwing 
> java.lang.RuntimeException: invalid url: wasb:///...  expecting ( file | hdfs 
> | ivy)  as url scheme.
> 
>
> Key: HIVE-11850
> URL: https://issues.apache.org/jira/browse/HIVE-11850
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
> Environment: Humboldt
>Reporter: Takahiko Saito
> Fix For: 1.2.1
>
>
> {noformat}
> hive> drop function if exists gencounter;
> OK
> Time taken: 2.614 seconds
> On Humboldt, creating UDF function fail as follows:
> hive> create function gencounter as 
> 'org.apache.hive.udf.generic.GenericUDFGenCounter' using jar 
> 'wasb:///tmp/hive-udfs-0.1.jar';
> invalid url: wasb:///tmp/hive-udfs-0.1.jar, expecting ( file | hdfs | ivy)  
> as url scheme.
> Failed to register default.gencounter using class 
> org.apache.hive.udf.generic.GenericUDFGenCounter
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.FunctionTask
> {noformat}
> The jar exists in wasb dir:
> {noformat}
> hrt_qa@headnode0:~$ hadoop fs -ls wasb:///tmp/
> Found 2 items
> -rw-r--r--   1 hrt_qa supergroup   4472 2015-09-16 11:50 
> wasb:///tmp/hive-udfs-0.1.jar
> drwxrwxrwx   - hdfs   supergroup  0 2015-09-16 12:00 
> wasb:///tmp/阿䶵aa阿䶵
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Updated] (HIVE-11922) Better error message when ORC split generation fails

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11922:
-
Description: When ORC split generation fails, it just prints out "serious 
problem" message on the console which does not tell anything about the cause of 
the exception.   (was: When ORC split generation fails, it just prints out 
"serious error" message on the console which does not tell anything about the 
cause of the exception. )

> Better error message when ORC split generation fails
> 
>
> Key: HIVE-11922
> URL: https://issues.apache.org/jira/browse/HIVE-11922
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Trivial
>
> When ORC split generation fails, it just prints out "serious problem" message 
> on the console which does not tell anything about the cause of the exception. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903433#comment-14903433
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11915:
--

I am not entirely sure, but the following is what I understood from a quick 
glance.

It looks like what Sergey is trying to do in the bigger patch is to work around 
the BoneCP dangling-connection bug on the Hive side, i.e. flag the connection 
as broken in getDbConn() if it detects an already closed connection. This way 
the retry mechanism kicks in (based on getConnAttemptCount) and will hopefully 
get a fresh connection in the next iteration. Without this invocation of 
dbConn.markPossiblyBroken(), the already broken connection may not be destroyed, 
and BoneCP might wrongly return the same broken connection when 
connPool.getConnection() is called the second time. [~sershe], does my 
understanding make sense?

Thanks
Hari
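
A hedged sketch of the retry shape described above (illustrative names only, not the actual patch; connPool is the pool mentioned in the comment, and markPossiblyBroken is BoneCP-internal, which is why reflection came up earlier in this thread):

{code}
// Illustrative sketch only: retry when the pool hands back an
// already-closed connection.
Connection getDbConn(int maxRetries) throws SQLException {
  for (int attempt = 0; attempt < maxRetries; attempt++) {
    Connection dbConn = connPool.getConnection();
    if (!dbConn.isClosed()) {
      return dbConn;  // healthy connection
    }
    // Here the bigger patch would flag the dead handle as broken so the
    // pool destroys it instead of returning it again on the next call.
  }
  throw new SQLException("No usable connection after " + maxRetries + " attempts");
}
{code}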

> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4525) Support timestamps earlier than 1970 and later than 2038

2015-09-22 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903469#comment-14903469
 ] 

Phabricator commented on HIVE-4525:
---

sdong resigned from this revision.
sdong removed a reviewer: sdong.

REVISION DETAIL
  https://reviews.facebook.net/D10755

EMAIL PREFERENCES
  https://reviews.facebook.net/settings/panel/emailpreferences/

To: mbautin, JIRA, ashutoshc, cwsteinbach, omalley


> Support timestamps earlier than 1970 and later than 2038
> 
>
> Key: HIVE-4525
> URL: https://issues.apache.org/jira/browse/HIVE-4525
> Project: Hive
>  Issue Type: Bug
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Fix For: 0.12.0
>
> Attachments: D10755.1.patch, D10755.2.patch
>
>
> TimestampWritable currently serializes timestamps using the lower 31 bits of 
> an int. This does not allow storing timestamps earlier than 1970 or later 
> than a certain point in 2038.
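
For a sense of the limit (plain arithmetic, not taken from the patch): 31 bits of seconds past the Unix epoch run out early in 2038.

{code}
// 2^31 - 1 seconds after 1970-01-01T00:00:00Z.
long maxSeconds = (1L << 31) - 1;  // 2147483647
System.out.println(new java.util.Date(maxSeconds * 1000L));
// Prints a moment on 2038-01-19 (03:14:07 UTC).
{code}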



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903471#comment-14903471
 ] 

Xuefu Zhang commented on HIVE-11710:


[~aihuaxu], your new patch looks good. One thing I'm not 100% sure about is 
what happens when deleting a file without closing it first.

Secondly, we could leak either the new PrintStream or the new FileOutputStream 
(or both) if there is any exception or if we don't close ss.out. This seems 
minor, but reliable code is ideal.

Thus, I suggest that we make sure these objects are closed properly whether or 
not there is an exception.

As you can see, session state is completely thread-unsafe (ref. HIVE-11402). 
However, this has nothing to do with the problem you're addressing.
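
A hedged illustration of the closing discipline being suggested (hypothetical variable names, not the actual patch): try-with-resources closes both streams whether or not an exception is thrown.

{code}
// Illustrative only: both streams are closed on every exit path.
try (FileOutputStream fos = new FileOutputStream(sessionLogFile);
     PrintStream ps = new PrintStream(fos, true, "UTF-8")) {
  // ... use ps as the session output stream ...
}
{code}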


> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.2.patch, HIVE-11710.patch
>
>
> Connect to beeline embedded mode {{beeline -u jdbc:hive2://}}. Then set 
> anything in the session like {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will only output 
> result but no query progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11925) Hive file format checking breaks load from named pipes

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11925:

Attachment: (was: HIVE-11925.patch)

> Hive file format checking breaks load from named pipes
> --
>
> Key: HIVE-11925
> URL: https://issues.apache.org/jira/browse/HIVE-11925
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Opening the file and mucking with it when hive.fileformat.check is true (the 
> default) breaks the LOAD command from a named pipe. Right now, it's done for 
> all the text files blindly to see if they might be in some other format. 
> Files.getAttribute can be used to figure out if the input is a named pipe (or 
> a socket) and skip the format check.
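
A minimal sketch of that check (assuming the load source is reachable as a java.nio.file.Path; not the actual patch):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

final class FormatCheckGuard {
  // isOther() is true for anything that is not a regular file, directory,
  // or symlink -- e.g. named pipes and sockets -- so the format sniffing
  // can be skipped for such inputs.
  static boolean isPipeOrSocket(Path p) throws IOException {
    return Files.readAttributes(p, BasicFileAttributes.class).isOther();
  }
}
{code}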



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11925) Hive file format checking breaks load from named pipes

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11925:

Attachment: HIVE-11925.patch

> Hive file format checking breaks load from named pipes
> --
>
> Key: HIVE-11925
> URL: https://issues.apache.org/jira/browse/HIVE-11925
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11925.patch
>
>
> Opening the file and mucking with it when hive.fileformat.check is true (the 
> default) breaks the LOAD command from a named pipe. Right now, it's done for 
> all the text files blindly to see if they might be in some other format. 
> Files.getAttribute can be used to figure out if the input is a named pipe (or 
> a socket) and skip the format check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903759#comment-14903759
 ] 

Sergey Shelukhin commented on HIVE-11777:
-

No :)

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11777.patch
>
>
> In the case of metastore footer PPD, we don't want to make a PPD call, with 
> all the attendant SARG, MS, and HBase overhead, for each directory. If we 
> wait for some time (10ms? some fraction of the inputs?), we can do one call 
> without losing overall perf. 
> For now, make it time-based.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-22 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reopened HIVE-11473:
---

Changed the JIRA as Xuefu suggested.
[~jxiang], please let me know if you still want to work on this.

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> In Spark 1.5, the SparkListener interface changed, so HoS may fail to create 
> the Spark client if an unimplemented event callback method is invoked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11928) ORC footer section can also exceed protobuf message limit

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11928:
-
Reporter: Jagruti Varia  (was: Prasanth Jayachandran)

> ORC footer section can also exceed protobuf message limit
> -
>
> Key: HIVE-11928
> URL: https://issues.apache.org/jira/browse/HIVE-11928
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jagruti Varia
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11928.1.patch
>
>
> Similar to HIVE-11592 but for orc footer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903785#comment-14903785
 ] 

Prasanth Jayachandran commented on HIVE-11928:
--

[~sershe] Can you take a look at this issue? This is the same as HIVE-11592 but 
for the footer. [~jvaria] created a test case with 1024 columns and 2 rows, 
which blew up the footer section.

> ORC footer section can also exceed protobuf message limit
> -
>
> Key: HIVE-11928
> URL: https://issues.apache.org/jira/browse/HIVE-11928
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jagruti Varia
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11928.1.patch
>
>
> Similar to HIVE-11592 but for orc footer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11553:

Attachment: HIVE-11553.03.patch

Rebased the patch to master (some stuff got removed as it was already committed 
to master). Will look at addressing feedback next.

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.03.patch, HIVE-11553.patch
>
>
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11553:

Description: This is the first step; uses the simple footer-getting API, 
without PPD.  (was: NO PRECOMMIT TESTS)

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.03.patch, HIVE-11553.patch
>
>
> This is the first step; uses the simple footer-getting API, without PPD.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11762) TestHCatLoaderEncryption failures when using Hadoop 2.7

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903797#comment-14903797
 ] 

Prasanth Jayachandran commented on HIVE-11762:
--

This broke the branch-1 build. DFSClient is not imported. Is there a JIRA already?

> TestHCatLoaderEncryption failures when using Hadoop 2.7
> ---
>
> Key: HIVE-11762
> URL: https://issues.apache.org/jira/browse/HIVE-11762
> Project: Hive
>  Issue Type: Bug
>  Components: Shims, Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11762.1.patch, HIVE-11762.2.patch, 
> HIVE-11762.3.patch, HIVE-11762.4.patch
>
>
> When running TestHCatLoaderEncryption with -Dhadoop23.version=2.7.0, we get 
> the following error during setup():
> {noformat}
> testReadDataFromEncryptedHiveTableByPig[5](org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption)
>   Time elapsed: 3.648 sec  <<< ERROR!
> java.lang.NoSuchMethodError: 
> org.apache.hadoop.hdfs.DFSClient.setKeyProvider(Lorg/apache/hadoop/crypto/key/KeyProviderCryptoExtension;)V
>   at 
> org.apache.hadoop.hive.shims.Hadoop23Shims.getMiniDfs(Hadoop23Shims.java:534)
>   at 
> org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.initEncryptionShim(TestHCatLoaderEncryption.java:252)
>   at 
> org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:200)
> {noformat}
> It looks like between Hadoop 2.6 and Hadoop 2.7, the argument to 
> DFSClient.setKeyProvider() changed:
> {noformat}
>@VisibleForTesting
> -  public void setKeyProvider(KeyProviderCryptoExtension provider) {
> -this.provider = provider;
> +  public void setKeyProvider(KeyProvider provider) {
> {noformat}
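One way a shim could tolerate both signatures is to resolve the method 
reflectively. A hedged sketch (illustrative only, not necessarily what the 
committed patch does; dfsClient and keyProvider are assumed to be in scope):

{code}
java.lang.reflect.Method setter = null;
for (java.lang.reflect.Method m : dfsClient.getClass().getMethods()) {
  if ("setKeyProvider".equals(m.getName()) && m.getParameterTypes().length == 1) {
    setter = m;  // matches either the 2.6 or the 2.7 signature
    break;
  }
}
// KeyProviderCryptoExtension extends KeyProvider, so one instance satisfies
// whichever parameter type the resolved method declares
setter.invoke(dfsClient, keyProvider);
{code}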



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11929) Fix branch-1 build broke

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-11929:


Assignee: Prasanth Jayachandran

> Fix branch-1 build broke
> 
>
> Key: HIVE-11929
> URL: https://issues.apache.org/jira/browse/HIVE-11929
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> HIVE-11762 commit broke branch-1 build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-09-22 Thread Chen Xin Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Xin Yu updated HIVE-9600:
--
Attachment: (was: HIVE-9600.1.patch)

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>
> hive-jdbc-standalone.jar does not include the Hadoop Configuration class, and 
> possibly other hadoop-common classes, required to open a JDBC connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9600) add missing classes to hive-jdbc-standalone.jar

2015-09-22 Thread Chen Xin Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Xin Yu updated HIVE-9600:
--
Attachment: HIVE-9600.1.patch

> add missing classes to hive-jdbc-standalone.jar
> ---
>
> Key: HIVE-9600
> URL: https://issues.apache.org/jira/browse/HIVE-9600
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9600.1.patch
>
>
> hive-jdbc-standalone.jar does not include the Hadoop Configuration class, and 
> possibly other hadoop-common classes, required to open a JDBC connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11929) Fix branch-1 build broke

2015-09-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903831#comment-14903831
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11929:
--

+1 pending unit test run.

> Fix branch-1 build broke
> 
>
> Key: HIVE-11929
> URL: https://issues.apache.org/jira/browse/HIVE-11929
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11929.patch
>
>
> HIVE-11762 commit broke branch-1 build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11929) Fix branch-1 build broke

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11929.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

Thanks [~hsubramaniyan]! Since we don't run unit tests on branch-1 and this 
change is branch-1 only, I committed the patch to branch-1.

> Fix branch-1 build broke
> 
>
> Key: HIVE-11929
> URL: https://issues.apache.org/jira/browse/HIVE-11929
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0
>
> Attachments: HIVE-11929.patch
>
>
> HIVE-11762 commit broke branch-1 build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-22 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903844#comment-14903844
 ] 

Yongzhi Chen commented on HIVE-11217:
-

Thanks [~prasanth_j] for reviewing the patch.

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch, HIVE-11217.4.patch, HIVE-11217.5.patch
>
>
> If you try to use create-table-as-select (CTAS) statement and create a ORC 
> File format based table, then you can't use NULL as a column value in select 
> clause 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292)
>   at 

[jira] [Resolved] (HIVE-11921) LLAP: merge master into branch

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-11921.
-
Resolution: Fixed

> LLAP: merge master into branch
> --
>
> Key: HIVE-11921
> URL: https://issues.apache.org/jira/browse/HIVE-11921
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>
> Again because of hbase-metastore changes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11925) Hive file format checking breaks load from named pipes

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903732#comment-14903732
 ] 

Sergey Shelukhin commented on HIVE-11925:
-

Note this doesn't remove the check for other file formats; I am not sure 
whether allowing ORC/RC/etc. loads from named pipes without checks is a good 
idea. What it does is disable automatic format detection via blind reads for 
pipes.
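A minimal sketch of that check, assuming a hypothetical input path (java.nio 
reports pipes and sockets as "other" files):

{code}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

Path p = Paths.get("/tmp/mypipe");  // hypothetical input path
BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
// isOther() is true for anything that is not a regular file, directory,
// or symlink (named pipes, sockets, devices), so the format sniffing that
// would otherwise open and read the file can be skipped
boolean skipFormatCheck = attrs.isOther();
{code}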

> Hive file format checking breaks load from named pipes
> --
>
> Key: HIVE-11925
> URL: https://issues.apache.org/jira/browse/HIVE-11925
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11925.patch
>
>
> Opening the file and mucking with it when hive.fileformat.check is true (the 
> default) breaks the LOAD command from a named pipe. Right now, it's done for 
> all the text files blindly to see if they might be in some other format. 
> Files.getAttribute can be used to figure out if the input is a named pipe (or 
> a socket) and skip the format check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903760#comment-14903760
 ] 

Sergey Shelukhin edited comment on HIVE-11777 at 9/23/15 1:16 AM:
--

[~gopalv] this is the method to avoid making a metastore call per split. 
Thoughts? Another alternative I considered was to make the metastore call from 
one thread, letting directory listings accumulate until it completes and then 
using them for the next call (rough sketch below).
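A rough sketch of the time-based variant; hasMoreDirs, nextDir, and 
callMetastoreOnce are placeholders, not Hive code:

{code}
// collect directories for up to 10ms after the first one arrives, then issue
// a single metastore PPD call for the whole batch instead of one per directory
List<String> batch = new ArrayList<>();
long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(10);
while (hasMoreDirs() && System.nanoTime() < deadline) {
  batch.add(nextDir());
}
callMetastoreOnce(batch, sarg);  // one call carries the SARG for all directories
{code}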


was (Author: sershe):
[~gopalv] this is the method to not make metastore call per split. Thoughts? 

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11777.patch
>
>
> In the case of metastore footer PPD, we don't want to make a PPD call, with 
> all the attendant SARG, MS, and HBase overhead, for each directory. If we 
> wait for some time (10ms? some fraction of the inputs?), we can do one call 
> without losing overall perf. 
> For now, make it time-based.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Fix Version/s: (was: hbase-metastore-branch)

> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Description: 
See HIVE-11705. This just contains the hbase-metastore-specific methods from 
that patch



  was:
See HIVE-11705. This just contains the hbase-metastore-specific methods from 
that patch

NO PRECOMMIT TESTS



> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11823:

Attachment: HIVE-11823.01.patch

The same patch, attached again to run HiveQA now that the branch is merged.

> create a self-contained translation for SARG to be used by metastore
> 
>
> Key: HIVE-11823
> URL: https://issues.apache.org/jira/browse/HIVE-11823
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11823.01.patch, HIVE-11823.patch
>
>
> See HIVE-11705. This just contains the hbase-metastore-specific methods from 
> that patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-22 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-11473:
--
Summary: Upgrade Spark dependency to 1.5 [Spark Branch]  (was: Failed to 
create spark client with Spark 1.5)

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> In Spark 1.5, the SparkListener interface changed, so HoS may fail to create 
> the Spark client if an unimplemented event callback method is invoked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11699) Support special characters in quoted table names

2015-09-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903770#comment-14903770
 ] 

Hive QA commented on HIVE-11699:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761578/HIVE-11699.05.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9580 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_3
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5378/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5378/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5378/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761578 - PreCommit-HIVE-TRUNK-Build

> Support special characters in quoted table names
> 
>
> Key: HIVE-11699
> URL: https://issues.apache.org/jira/browse/HIVE-11699
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, 
> HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch
>
>
> Right now table names can only match "[a-zA-Z_0-9]+". This patch investigates 
> how much change would be needed to support special characters, e.g., "/", in 
> table names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11928) ORC footer section can also exceed protobuf message limit

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11928:
-
Attachment: HIVE-11928.1.patch

> ORC footer section can also exceed protobuf message limit
> -
>
> Key: HIVE-11928
> URL: https://issues.apache.org/jira/browse/HIVE-11928
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11928.1.patch
>
>
> Similar to HIVE-11592 but for orc footer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903794#comment-14903794
 ] 

Sergey Shelukhin commented on HIVE-11928:
-

+1

> ORC footer section can also exceed protobuf message limit
> -
>
> Key: HIVE-11928
> URL: https://issues.apache.org/jira/browse/HIVE-11928
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jagruti Varia
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11928.1.patch
>
>
> Similar to HIVE-11592 but for orc footer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11929) Fix branch-1 build broke

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11929:
-
Attachment: HIVE-11929.patch

> Fix branch-1 build broke
> 
>
> Key: HIVE-11929
> URL: https://issues.apache.org/jira/browse/HIVE-11929
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11929.patch
>
>
> HIVE-11762 commit broke branch-1 build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11840) when multi insert the inputformat becomes OneNullRowInputFormat

2015-09-22 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903820#comment-14903820
 ] 

Feng Yuan commented on HIVE-11840:
--

Is there anyone who can help with this?

> when multi insert the inputformat becomes OneNullRowInputFormat
> ---
>
> Key: HIVE-11840
> URL: https://issues.apache.org/jira/browse/HIVE-11840
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Blocker
> Fix For: 0.14.1
>
> Attachments: multi insert, single__insert
>
>
> example:
> from portrait.rec_feature_feedback a 
> insert overwrite table portrait.test1 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('949722CF_12F7_523A_EE21_E3D591B7E755') 
> insert overwrite table portrait.test2 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('test') 
> insert overwrite table portrait.test3 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('F7734668_CC49_8C4F_24C5_EA8B6728E394')
> With a single insert it works, but with the multi insert, when I select * 
> from test1 I get:
> NULL NULL NULL NULL NULL NULL.
> In "explain extended" I see:
> Path -> Alias:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} [a]
> -mr-10007portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Czgc_pc, bid=949722CF_12F7_523A_EE21_E3D591B7E755} [a]
>   Path -> Partition:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} 
>   Partition
> base file name: bid=F7734668_CC49_8C4F_24C5_EA8B6728E394
> input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid F7734668_CC49_8C4F_24C5_EA8B6728E394
>   cid Cyiyaowang
>   l_date 2015-09-09
> but with a single insert:
> Path -> Alias:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  [a]
>   Path -> Partition:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  
>   Partition
> base file name: bid=949722CF_12F7_523A_EE21_E3D591B7E755
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid 949722CF_12F7_523A_EE21_E3D591B7E755
>   cid Czgc_pc
>   l_date 2015-09-09



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11840) when multi insert the inputformat becomes OneNullRowInputFormat

2015-09-22 Thread Feng Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Yuan updated HIVE-11840:
-
Priority: Blocker  (was: Critical)

> when multi insert the inputformat becomes OneNullRowInputFormat
> ---
>
> Key: HIVE-11840
> URL: https://issues.apache.org/jira/browse/HIVE-11840
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Blocker
> Fix For: 0.14.1
>
> Attachments: multi insert, single__insert
>
>
> example:
> from portrait.rec_feature_feedback a 
> insert overwrite table portrait.test1 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('949722CF_12F7_523A_EE21_E3D591B7E755') 
> insert overwrite table portrait.test2 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('test') 
> insert overwrite table portrait.test3 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('F7734668_CC49_8C4F_24C5_EA8B6728E394')
> With a single insert it works, but with the multi insert, when I select * 
> from test1 I get:
> NULL NULL NULL NULL NULL NULL.
> In "explain extended" I see:
> Path -> Alias:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} [a]
> -mr-10007portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Czgc_pc, bid=949722CF_12F7_523A_EE21_E3D591B7E755} [a]
>   Path -> Partition:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} 
>   Partition
> base file name: bid=F7734668_CC49_8C4F_24C5_EA8B6728E394
> input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid F7734668_CC49_8C4F_24C5_EA8B6728E394
>   cid Cyiyaowang
>   l_date 2015-09-09
> but with a single insert:
> Path -> Alias:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  [a]
>   Path -> Partition:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  
>   Partition
> base file name: bid=949722CF_12F7_523A_EE21_E3D591B7E755
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid 949722CF_12F7_523A_EE21_E3D591B7E755
>   cid Czgc_pc
>   l_date 2015-09-09



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11929) Fix branch-1 build broke

2015-09-22 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903834#comment-14903834
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11929:
--

+1 after realizing that this is a branch-1 only change.

Thanks
Hari

> Fix branch-1 build broke
> 
>
> Key: HIVE-11929
> URL: https://issues.apache.org/jira/browse/HIVE-11929
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11929.patch
>
>
> HIVE-11762 commit broke branch-1 build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11925) Hive file format checking breaks load from named pipes

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11925:

Attachment: HIVE-11925.patch

[~ashutoshc] do you want to take a look?

> Hive file format checking breaks load from named pipes
> --
>
> Key: HIVE-11925
> URL: https://issues.apache.org/jira/browse/HIVE-11925
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11925.patch
>
>
> Opening the file and mucking with it when hive.fileformat.check is true (the 
> default) breaks the LOAD command from a named pipe. Right now, it's done for 
> all the text files blindly to see if they might be in some other format. 
> Files.getAttribute can be used to figure out if the input is a named pipe (or 
> a socket) and skip the format check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-09-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-11927:
--

Assignee: Pengcheng Xiong

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-09-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11927:
---
Attachment: HIVE-11927.01.patch

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903736#comment-14903736
 ] 

Sergey Shelukhin commented on HIVE-11915:
-

That makes sense; however, there's no plan to commit the WIP patch right now. 
As for the main patch, I am setting that value to 0 based on scattered reports 
of it fixing the issue for some people :) as per the JIRA description. It 
definitely wouldn't hurt.

> BoneCP returns closed connections from the pool
> ---
>
> Key: HIVE-11915
> URL: https://issues.apache.org/jira/browse/HIVE-11915
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11915.WIP.patch, HIVE-11915.patch
>
>
> It's a very old bug in BoneCP and it will never be fixed... There are 
> multiple workarounds on the internet but according to responses they are all 
> unreliable. We should upgrade to HikariCP (which in turn is only supported by 
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a 
> relatively weak drum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11592) ORC metadata section can sometimes exceed protobuf message size limit

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903776#comment-14903776
 ] 

Prasanth Jayachandran commented on HIVE-11592:
--

[~owen.omalley] If the field ends at the buffer boundary, then parseFrom() 
should still succeed, right? This patch expands the size limit only on failure 
(from parseFrom) until it reaches the 1GB max limit.
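A hedged sketch of that expand-on-failure loop, assuming the metadata bytes are 
already in buffer/offset/length (illustrative, not the actual patch):

{code}
int limit = 64 << 20;               // protobuf's 64MB default message limit
final int MAX = 1 << 30;            // the 1GB ceiling mentioned above
OrcProto.Metadata metadata = null;
while (metadata == null) {
  try {
    CodedInputStream in = CodedInputStream.newInstance(buffer, offset, length);
    in.setSizeLimit(limit);
    metadata = OrcProto.Metadata.parseFrom(in);
  } catch (InvalidProtocolBufferException e) {
    if (limit >= MAX) {
      throw e;                      // give up once the ceiling is reached
    }
    limit = Math.min(limit * 2, MAX);
  }
}
{code}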

> ORC metadata section can sometimes exceed protobuf message size limit
> -
>
> Key: HIVE-11592
> URL: https://issues.apache.org/jira/browse/HIVE-11592
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11592.1.patch, HIVE-11592.2.patch, 
> HIVE-11592.3.patch
>
>
> If there are too many small stripes and many columns, the overhead of storing 
> metadata (column stats) can exceed the default protobuf message size of 64MB. 
> Reading such files will throw the following exception:
> {code}
> Exception in thread "main" 
> com.google.protobuf.InvalidProtocolBufferException: Protocol message was too 
> large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase 
> the size limit.
> at 
> com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at 
> com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at 
> com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
> at 
> com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.(OrcProto.java:1331)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.(OrcProto.java:1281)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1374)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1369)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.(OrcProto.java:4887)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.(OrcProto.java:4803)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4990)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:4985)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.(OrcProto.java:12925)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics.(OrcProto.java:12872)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12961)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics$1.parsePartialFrom(OrcProto.java:12956)
> at 
> com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.(OrcProto.java:13599)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.(OrcProto.java:13546)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13635)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata$1.parsePartialFrom(OrcProto.java:13630)
> at 
> com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
> at 
> com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Metadata.parseFrom(OrcProto.java:13746)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl$MetaInfoObjExtractor.(ReaderImpl.java:468)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:314)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:228)
> at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:67)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at 

[jira] [Commented] (HIVE-11923) allow qtests to run via a single client session for tez and llap

2015-09-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903790#comment-14903790
 ] 

Siddharth Seth commented on HIVE-11923:
---

[~sershe] - not much point submitting this to jenkins yet. There are a lot of 
failures. Some are potential issues like HIVE-11924; others are caused by the 
fact that the TransactionManager can only be set once per session.
I will upload another patch which resets the SessionState but uses the same AM.

> allow qtests to run via a single client session for tez and llap
> 
>
> Key: HIVE-11923
> URL: https://issues.apache.org/jira/browse/HIVE-11923
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-11923.1.txt, HIVE-11923.branch-1.txt
>
>
> Launching a new session (AM and containers) for each test adds unnecessary 
> overhead. Running via a single session should reduce the run time 
> significantly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903788#comment-14903788
 ] 

Sergey Shelukhin commented on HIVE-11819:
-

Test failure is unrelated

> HiveServer2 catches OOMs on request threads
> ---
>
> Key: HIVE-11819
> URL: https://issues.apache.org/jira/browse/HIVE-11819
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11819.01.patch, HIVE-11819.02.patch, 
> HIVE-11819.patch
>
>
> ThriftCLIService methods such as ExecuteStatement are apparently capable of 
> catching OOMs, because the errors get wrapped in an RTE by HiveSessionProxy. 
> This shouldn't happen.
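A minimal sketch of the desired behavior, as an InvocationHandler that rethrows 
Errors instead of wrapping them (illustrative; target is an assumed delegate, 
and this is not HiveSessionProxy's actual code):

{code}
public Object invoke(Object proxy, java.lang.reflect.Method method, Object[] args)
    throws Throwable {
  try {
    return method.invoke(target, args);
  } catch (java.lang.reflect.InvocationTargetException e) {
    Throwable cause = e.getCause();
    if (cause instanceof Error) {
      throw (Error) cause;  // let OutOfMemoryError and friends propagate
    }
    throw cause;            // rethrow the real cause instead of wrapping in RTE
  }
}
{code}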



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11928) ORC footer section can also exceed protobuf message limit

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11928:
-
Attachment: HIVE-11928-branch-1.patch

Attaching branch-1 patch.

> ORC footer section can also exceed protobuf message limit
> -
>
> Key: HIVE-11928
> URL: https://issues.apache.org/jira/browse/HIVE-11928
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jagruti Varia
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11928-branch-1.patch, HIVE-11928.1.patch
>
>
> Similar to HIVE-11592 but for orc footer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-22 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903424#comment-14903424
 ] 

Yongzhi Chen commented on HIVE-11217:
-

[~prasanth_j], the place where I put the code to throw the exception is within 
if (field_schemas != null), which means a CTAS table will be created.
If field_schemas is null, the void-to-string conversion code will still be 
reached (this covers cases such as "select NULL from tt", where Hive internally 
uses void for a temp table...), so we have to keep it.

I will make more changes to save some CPU time. Thanks


> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch, HIVE-11217.4.patch, HIVE-11271.5.patch
>
>
> If you try to use create-table-as-select (CTAS) statement and create a ORC 
> File format based table, then you can't use NULL as a column value in select 
> clause 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.(OrcStruct.java:195)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
>   at 
> 

[jira] [Commented] (HIVE-11911) The stats table limits are too large for innodb

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903435#comment-14903435
 ] 

Sergey Shelukhin commented on HIVE-11911:
-

Test failures are unrelated: they are known bad tests, and Tez failed due to 
some DFS cluster issue.


> The stats table limits are too large for innodb
> ---
>
> Key: HIVE-11911
> URL: https://issues.apache.org/jira/browse/HIVE-11911
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11911.patch
>
>
> The limits were increased to a reasonable value some time ago, apparently 
> these values are too large for innodb due to some index limit nonsense. We 
> need to decrease them a little bit.
> There's no need to decrease them in an upgrade script; if they were already 
> created successfully it's ok to have them as is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11923) allow qtests to run via a single client session for tez and llap

2015-09-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-11923:
--
Attachment: HIVE-11923.branch-1.txt
HIVE-11923.1.txt

Patches for branch-1 and the llap branch.

This causes some test failures. Creating a separate jira for those.

> allow qtests to run via a single client session for tez and llap
> 
>
> Key: HIVE-11923
> URL: https://issues.apache.org/jira/browse/HIVE-11923
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-11923.1.txt, HIVE-11923.branch-1.txt
>
>
> Launching a new session (AM and containers) for each test adds unnecessary 
> overhead. Running via a single session should reduce the run time 
> significantly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-09-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903372#comment-14903372
 ] 

Prasanth Jayachandran commented on HIVE-11217:
--

[~ychena] Thanks for the explanation. Makes sense now. 
Some minor comments: a few lines below your change there is a comment about 
assigning STRING type if the column type is VOID. Since we throw 
SemanticException earlier, the following code will be dead code, right? Can you 
please remove it?

{code}
// Replace VOID type with string when the output is a temp table or
// local files.
// A VOID type can be generated under the query:
//
// select NULL from tt;
// or
// insert overwrite local directory "abc" select NULL from tt;
//
// where there is no column type to which the NULL value should be
// converted.
//
String tName = colInfo.getType().getTypeName();
if (tName.equals(serdeConstants.VOID_TYPE_NAME)) {
  colTypes = colTypes.concat(serdeConstants.STRING_TYPE_NAME);
} else {
  colTypes = colTypes.concat(tName);
}
{code}

nit: the colType check that you have added can be moved further up, immediately 
after the for-loop iteration, to save some allocations/CPU cycles.

> CTAS statements throws error, when the table is stored as ORC File format and 
> select clause has NULL/VOID type column 
> --
>
> Key: HIVE-11217
> URL: https://issues.apache.org/jira/browse/HIVE-11217
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
>Reporter: Gaurav Kohli
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch, 
> HIVE-11217.3.patch, HIVE-11217.4.patch, HIVE-11271.5.patch
>
>
> If you try to use create-table-as-select (CTAS) statement and create a ORC 
> File format based table, then you can't use NULL as a column value in select 
> clause 
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> Error: 
> {quote}
> 347084 [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.IllegalArgumentException: Unknown primitive type VOID
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
>   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
>   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
>   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
>   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
>   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at 

[jira] [Updated] (HIVE-11919) Hive Union Type Mismatch

2015-09-22 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-11919:
--
Attachment: HIVE-11919.1.patch

> Hive Union Type Mismatch
> 
>
> Key: HIVE-11919
> URL: https://issues.apache.org/jira/browse/HIVE-11919
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11919.1.patch
>
>
> In Hive, for unions, the right-most type wins out for most primitive types 
> during plan gen. However, when the union op gets initialized, the type gets 
> switched. This could result in bad data & type exceptions.
> This happens only in non-CBO mode.
> In CBO mode, Hive would add explicit type casts that would prevent such type 
> issues.
> Sample Query: 
> select cd/sum(cd) over() from(select cd from u1 union all select cd from u2 
> union all select cd from u3)u4;
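
To make the "right-most type wins" behavior concrete, here is a toy resolver 
(purely illustrative, not Hive code) contrasting it with resolution to a 
common widest type; with branches (BIGINT, INT), right-most-wins picks INT and 
can silently narrow data:

{code:java}
// Toy illustration only: "right-most branch wins" vs. widening to a common
// type for a UNION column. Not Hive's actual type-resolution code.
import java.util.Arrays;
import java.util.List;

public class UnionTypeSketch {
  enum T { INT, BIGINT, DOUBLE }  // simplified widening order

  // What the description says non-CBO plan gen effectively does.
  static T rightMostWins(List<T> branchTypes) {
    return branchTypes.get(branchTypes.size() - 1);
  }

  // A safe alternative: widen to the widest branch type.
  static T commonType(List<T> branchTypes) {
    T widest = branchTypes.get(0);
    for (T t : branchTypes) {
      if (t.ordinal() > widest.ordinal()) {
        widest = t;
      }
    }
    return widest;
  }

  public static void main(String[] args) {
    List<T> branches = Arrays.asList(T.BIGINT, T.INT);
    System.out.println(rightMostWins(branches)); // INT    -> can truncate data
    System.out.println(commonType(branches));    // BIGINT -> safe
  }
}
{code}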



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11911) The stats table limits are too large for innodb

2015-09-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903427#comment-14903427
 ] 

Sergey Shelukhin commented on HIVE-11911:
-

[~prasanth_j] see the exception text I added; you can increase the prefix size 
by configuring MySQL.

[~ashutoshc] maybe, but that's a separate discussion. For now we should fix it. 

> The stats table limits are too large for innodb
> ---
>
> Key: HIVE-11911
> URL: https://issues.apache.org/jira/browse/HIVE-11911
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11911.patch
>
>
> The limits were increased to a reasonable value some time ago, apparently 
> these values are too large for innodb due to some index limit nonsense. We 
> need to decrease them a little bit.
> There's no need to decrease them in an upgrade script; if they were already 
> created successfully it's ok to have them as is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10785) Support aggregate push down through joins

2015-09-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10785:

Attachment: HIVE-10785.4.patch

> Support aggregate push down through joins
> -
>
> Key: HIVE-10785
> URL: https://issues.apache.org/jira/browse/HIVE-10785
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10785.2.patch, HIVE-10785.3.patch, 
> HIVE-10785.4.patch, HIVE-10785.patch
>
>
> Enable {{AggregateJoinTransposeRule}} in CBO that pushes Aggregate through 
> Join operators. The rule has been extended in Calcite 1.4 to cover complex 
> cases e.g. Aggregate operators comprising UDAF. The decision on whether to 
> push the Aggregate through Join or not should be cost-driven.
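
As a rough illustration of the Calcite side, a sketch of registering the 
extended rule with a heuristic planner (this is not the actual Hive wiring, 
and the input RelNode is assumed to come from an existing plan):

{code:java}
// Sketch: applying Calcite's extended Aggregate/Join transpose rule.
// EXTENDED also handles aggregate calls (e.g. UDAFs), per Calcite 1.4.
import org.apache.calcite.plan.hep.HepPlanner;
import org.apache.calcite.plan.hep.HepProgram;
import org.apache.calcite.plan.hep.HepProgramBuilder;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rules.AggregateJoinTransposeRule;

public class AggPushDownSketch {
  static RelNode pushAggregateThroughJoin(RelNode root) {
    HepProgram program = new HepProgramBuilder()
        .addRuleInstance(AggregateJoinTransposeRule.EXTENDED)
        .build();
    HepPlanner planner = new HepPlanner(program);
    planner.setRoot(root);
    // HepPlanner fires the rule wherever it matches; deciding whether the
    // transposed plan is actually cheaper should be left to a cost model.
    return planner.findBestExp();
  }
}
{code}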



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: (was: HIVE-11642.06.patch)

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, 
> HIVE-11642.08.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: (was: HIVE-11642.07.patch)

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, 
> HIVE-11642.08.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top

2015-09-22 Thread Kiran Kumar Kolli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903486#comment-14903486
 ] 

Kiran Kumar Kolli commented on HIVE-11724:
--

Makes sense, will update it

> WebHcat get jobs to order jobs on time order with latest at top
> ---
>
> Key: HIVE-11724
> URL: https://issues.apache.org/jira/browse/HIVE-11724
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Attachments: HIVE-11724.1.patch, HIVE-11724.2.patch, 
> HIVE-11724.3.patch, HIVE-11724.4.patch
>
>
> HIVE-5519 added pagination feature support to WebHcat. This implementation 
> returns the jobs lexicographically resulting in older jobs showing at the 
> top. 
> Improvement is to order them on time with latest at top. Typically latest 
> jobs (or running) ones are more relevant to the user. Time based ordering 
> with pagination makes more sense. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11918) Implement/Enable constant related optimization rules in Calcite

2015-09-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11918:
---
Description: Right now, Hive optimizer (Calcite) is short of the constant 
related optimization rules. For example, constant folding, constant propagation 
and constant transitive rules. Although Hive later provides those rules in the 
logical optimizer, we would like to implement those inside Calcite. This will 
benefit the current optimization as well as the optimization based on return 
path that we are planning to use in the future. This JIRA is the umbrella JIRA 
to implement/enable those rules.  (was: Right now, Hive optimizer (Calcite) is 
short of the constant related optimization rules. For example, constant 
folding, constant propagation and constant transitive rules. This JIRA is the 
umbrella JIRA to implement/enable those rules.)

> Implement/Enable constant related optimization rules in Calcite
> ---
>
> Key: HIVE-11918
> URL: https://issues.apache.org/jira/browse/HIVE-11918
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now, Hive optimizer (Calcite) is short of the constant related 
> optimization rules. For example, constant folding, constant propagation and 
> constant transitive rules. Although Hive later provides those rules in the 
> logical optimizer, we would like to implement those inside Calcite. This will 
> benefit the current optimization as well as the optimization based on return 
> path that we are planning to use in the future. This JIRA is the umbrella 
> JIRA to implement/enable those rules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11714) Turn off hybrid grace hash join for cross product join

2015-09-22 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-11714:
-
Attachment: HIVE-11714.1.patch

> Turn off hybrid grace hash join for cross product join
> --
>
> Key: HIVE-11714
> URL: https://issues.apache.org/jira/browse/HIVE-11714
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-11714.1.patch
>
>
> Current partitioning calculation is solely based on hash value of the key. 
> For cross product join where keys are empty, all the rows will be put into 
> partition 0. This falls back to the regular mapjoin behavior where we only 
> have one hashtable.
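
A schematic of why the empty key defeats the partitioning (hypothetical code, 
assuming the partition is derived from the key hash modulo the partition 
count):

{code:java}
// Illustrative only: with an empty key (cross product), every row hashes to
// the same value, so all rows land in one partition and hybrid grace hash
// join degenerates to a single hashtable.
import java.util.Arrays;

public class GracePartitionSketch {
  static int partitionFor(byte[] keyBytes, int numPartitions) {
    return (Arrays.hashCode(keyBytes) & Integer.MAX_VALUE) % numPartitions;
  }

  public static void main(String[] args) {
    byte[] emptyKey = new byte[0];  // cross product join: no join key
    // Every row gets the same partition, so splitting the hashtable across
    // partitions buys nothing -- hence disabling the feature here.
    System.out.println(partitionFor(emptyKey, 16));
    System.out.println(partitionFor(emptyKey, 16)); // always the same value
  }
}
{code}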



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11922) Better error message when ORC split generation fails

2015-09-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11922:
-
Priority: Trivial  (was: Major)

> Better error message when ORC split generation fails
> 
>
> Key: HIVE-11922
> URL: https://issues.apache.org/jira/browse/HIVE-11922
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Trivial
>
> When ORC split generation fails, it just prints out a "serious error" message 
> on the console, which says nothing about the cause of the exception. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: HIVE-11642.08.patch

Updated the patch - some conflicts with metastore branch merge

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, 
> HIVE-11642.06.patch, HIVE-11642.07.patch, HIVE-11642.08.patch, 
> HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results

2015-09-22 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11517:

Attachment: HIVE-11517.01.patch

> Vectorized auto_smb_mapjoin_14.q produces different results
> ---
>
> Key: HIVE-11517
> URL: https://issues.apache.org/jira/browse/HIVE-11517
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11517.01.patch
>
>
> Converted Q file to use ORC and turned on vectorization.
> The query:
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> produces 10 instead of 22.
> The query:
> {code}
> select src1.key, src1.cnt1, src2.cnt1 from
> (
>   select key, count(*) as cnt1 from 
>   (
> select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
> tbl2 b on a.key = b.key
>   ) subq1 group by key
> ) src1
> join
> (
>   select key, count(*) as cnt1 from 
>   (
> select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
> tbl2 b on a.key = b.key
>   ) subq2 group by key
> ) src2
> {code}
> produces:
> {code}
> 0 3   3
> 2 1   1
> 4 1   1
> 5 3   3
> 8 1   1
> 9 1   1
> {code}
> instead of:
> {code}
> 0 9   9
> 2 1   1
> 4 1   1
> 5 9   9
> 8 1   1
> 9 1   1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11921) LLAP: merge master into branch

2015-09-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11921:

Description: 
Again because of hbase-metastore changes


> LLAP: merge master into branch
> --
>
> Key: HIVE-11921
> URL: https://issues.apache.org/jira/browse/HIVE-11921
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: llap
>
>
> Again because of hbase-metastore changes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top

2015-09-22 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903457#comment-14903457
 ] 

Eugene Koifman commented on HIVE-11724:
---

[~kiran.kolli] could the values of this property be more meaningful?  
For example lexicographicalAsc, lexicographicalDesc, etc., so that we can 
later add others like byTimeAsc.

> WebHcat get jobs to order jobs on time order with latest at top
> ---
>
> Key: HIVE-11724
> URL: https://issues.apache.org/jira/browse/HIVE-11724
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Attachments: HIVE-11724.1.patch, HIVE-11724.2.patch, 
> HIVE-11724.3.patch, HIVE-11724.4.patch
>
>
> HIVE-5519 added pagination feature support to WebHcat. This implementation 
> returns the jobs lexicographically resulting in older jobs showing at the 
> top. 
> Improvement is to order them on time with latest at top. Typically latest 
> jobs (or running) ones are more relevant to the user. Time based ordering 
> with pagination makes more sense. 
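
For illustration, a sketch of newest-first ordering before pagination (the Job 
type and its start-time field below are hypothetical stand-ins, not WebHCat's 
actual classes):

{code:java}
// Hypothetical sketch: sort jobs by start time, latest first, then paginate.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class JobOrderSketch {
  // Stand-in for whatever WebHCat returns per job; not the real type.
  static class Job {
    final String id;
    final long startTimeMillis;
    Job(String id, long startTimeMillis) {
      this.id = id;
      this.startTimeMillis = startTimeMillis;
    }
  }

  static List<Job> newestFirst(List<Job> jobs) {
    List<Job> sorted = new ArrayList<>(jobs);
    sorted.sort(Comparator.comparingLong((Job j) -> j.startTimeMillis).reversed());
    return sorted;
  }

  public static void main(String[] args) {
    List<Job> jobs = Arrays.asList(new Job("job_0001", 100L),
                                   new Job("job_0002", 200L));
    // job_0002 (the latest) now comes first, instead of the
    // lexicographically-first job_0001.
    System.out.println(newestFirst(jobs).get(0).id);
  }
}
{code}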



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11926) Stats annotation might not extract stats for varchar/decimal columns

2015-09-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11926:
---
Description: 
It is because StatsUtils uses String.startsWith to compare VARCHAR/DECIMAL 
column type names with serdeConstants, which are in lowercase. But these type 
names in stats might not be in lowercase. We ran into a case where the type 
name from TAB_COL_STATS/PART_COL_STATS was actually in uppercase (e.g. VARCHAR, 
DECIMAL) because these column stats were populated from other HMS clients like 
Impala.
We need to change these type name comparisons to be case-insensitive.

  was:
If column stats is calculated and populated to HMS from its client like Impala 
etc, the column type name stored in TAB_COL_STATS/PART_COL_STATS could be in 
uppercase (e.g. VARCHAR, DECIMAL). When Hive collects stats for these columns 
during optimization (with hive.stats.fetch.column.stats set to true), it will 
throw out NPE. See error message like below:
{code}
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: NullPointerException null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:103)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:379)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:366)
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:636)
at 
org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:623)
at 
org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:180)
at 
org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
at 
org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124)
truncated
{code}

Summary: Stats annotation might not extract stats for varchar/decimal 
columns  (was: NPE could occur in collectStatistics when column type is varchar)

Changed the JIRA title and description since NPE won't happen in this version. 

> Stats annotation might not extract stats for varchar/decimal columns
> 
>
> Key: HIVE-11926
> URL: https://issues.apache.org/jira/browse/HIVE-11926
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> It is because StatsUtils uses String.startsWith to compare VARCHAR/DECIMAL 
> column type names with serdeConstants, which are in lowercase. But these type 
> names in stats might not be in lowercase. We ran into a case where the type 
> name from TAB_COL_STATS/PART_COL_STATS was actually in uppercase (e.g. 
> VARCHAR, DECIMAL) because these column stats were populated from other HMS 
> clients like Impala.
> We need to change these type name comparisons to be case-insensitive.
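
A minimal sketch of the kind of change being described (the constant mirrors 
serdeConstants, but the method is illustrative, not the actual StatsUtils 
code):

{code:java}
// Illustrative: prefix-match the column type name case-insensitively, so a
// "VARCHAR(10)" written by another HMS client matches the lowercase constant.
public class TypeNameSketch {
  static final String VARCHAR_TYPE_NAME = "varchar"; // as in serdeConstants

  static boolean isVarchar(String colTypeName) {
    // String.startsWith is case-sensitive; normalize before comparing.
    return colTypeName.toLowerCase().startsWith(VARCHAR_TYPE_NAME);
  }

  public static void main(String[] args) {
    System.out.println(isVarchar("varchar(10)")); // true
    System.out.println(isVarchar("VARCHAR(10)")); // true only with the fix
  }
}
{code}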



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11783) Extending HPL/SQL parser

2015-09-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903905#comment-14903905
 ] 

Lefty Leverenz commented on HIVE-11783:
---

Does this need documentation?

* [Hive HPL/SQL | 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156]
oops, the wiki just has a stub so far (I'll add a link to the HPL/SQL Reference)
* [HPL/SQL Reference | http://www.plhql.org/doc]

> Extending HPL/SQL parser
> 
>
> Key: HIVE-11783
> URL: https://issues.apache.org/jira/browse/HIVE-11783
> Project: Hive
>  Issue Type: Improvement
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Fix For: 2.0.0
>
> Attachments: HIVE-11783.1.patch
>
>
> Need to extend procedural SQL parser and synchronize code base by adding 
> PART_COUNT, PART_COUNT_BY functions as well as CMP ROW_COUNT, CMP SUM and 
> COPY TO HDFS statements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true

2015-09-22 Thread WangMeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903977#comment-14903977
 ] 

WangMeng commented on HIVE-11880:
-

[~hiveqa]

> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>    For Hive UNION ALL, when a union column is constant (column a such as 
> '0L')  and it has incompatible type with the corresponding column A(INT 
> Type). 
>   Query with filter condition on type incompatible column a on this UNION ALL 
>  will cause IndexOutOfBoundsException.
>  Such as TPC-H table "orders", we  create View  by : 
> {code}
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey` ,
>  `oo`.`o_custkey` 
>   FROM   (   
>   SELECT `o_orderkey` , 0L AS `o_custkey`   
>   FROM   `rcfileorders`   
>   UNION ALL   
>   SELECT `o_orderkey` ,`o_custkey`   
>   FROM  `textfileorders`) `oo`.
> {code}
>   In the VIEW view_orders, the type of 'o_custkey' is normally INT, while the 
> type of the corresponding column constant "0" is BIGINT (0L AS `o_custkey`).
> When 
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}
>  the following query (with filter " incompatible column 'o_custkey' ")  will 
> fail  with  java.lang.IndexOutOfBoundsException:
> {code}   
>  select count(1) from view_orders  where o_custkey<10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11106) HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database

2015-09-22 Thread Chen Xin Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903988#comment-14903988
 ] 

Chen Xin Yu commented on HIVE-11106:


It works for me with hive-jdbc-standalone.jar, and the version is 1.2.1.
JDBC connection: "jdbc:hive2://xxx:1/testdb". I selected data from table 
testdb_t1 in database testdb, and it works well. 


> HiveServer2 JDBC (greater than v0.13.1) cannot connect to non-default database
> --
>
> Key: HIVE-11106
> URL: https://issues.apache.org/jira/browse/HIVE-11106
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.14.0
>Reporter: Tom Coleman
>
> Using HiveServer 0.14.0 or greater, I cannot connect to a non-default 
> database.
> For example, when connecting to HiveServer via the following URL, the 
> session uses the 'default' database instead of the intended database.
> jdbc://localhost:1/customDb
> This exact issue was fixed in 0.13.1 of HiveServer by 
> https://issues.apache.org/jira/browse/HIVE-5904 but for some reason this fix 
> was not ported to v0.14.0 or greater. From looking at the source, it looks as 
> if this fix was overridden by another change to the HiveConnection class; was 
> this intentional, or a defect reintroduced by another defect fix?
> This means that we need to use 0.13.1 in order to connect to a non-default 
> database via JDBC and we cannot upgrade Hive versions. We don't want to place 
> a JDBC interceptor that injects "use customDb" each time a connection is 
> borrowed from the pool in production code. One should be able to connect 
> straight to the non-default database via the JDBC URL.
> Now it could perhaps be a simple oversight on my behalf, in that the syntax 
> to connect to a non-default database has changed from 0.14.0 onwards, but I'd 
> be grateful if this could be confirmed.
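
For reference, connecting straight to a non-default database over HiveServer2 
JDBC is expected to look like the sketch below (host, port, and database name 
are placeholders):

{code:java}
// Sketch: the path component of the JDBC URL selects the database.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Hs2ConnectSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/customDb");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT current_database()")) {
      while (rs.next()) {
        System.out.println(rs.getString(1)); // expected: customDb
      }
    }
  }
}
{code}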



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-09-22 Thread Ratandeep Ratti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903990#comment-14903990
 ] 

Ratandeep Ratti commented on HIVE-11878:


RB for approach 3: https://reviews.apache.org/r/38663/

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console. Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classlaoder is the 
> current ThreadContextClassLoader. Once the URLClassloader is created Hive 
> sets that as the current ThreadContextClassloader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code} . The 
> currentThreadContext class-loader is *u2*, and it has the path to the class 
> *c1*, but note that Class-loaders work by delegating to parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1*, which references the class *c2*, *c2* will not 
> be found since the class-loader used to search for *c2* will be *u1* (Since 
> the caller's class-loader is used to load a class)
> I've added a qtest to explain the problem. Please see the attached patch
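
The delegation behavior described above can be reproduced outside Hive with 
plain URLClassLoaders (a sketch; j1.jar/j2.jar and the class name are 
hypothetical):

{code:java}
// Standalone sketch of the parent-first delegation problem described above.
// Assumes j1.jar contains c1 (which references c2) and j2.jar contains c2.
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class DelegationSketch {
  public static void main(String[] args) throws Exception {
    URL j1 = Paths.get("j1.jar").toUri().toURL(); // hypothetical jars
    URL j2 = Paths.get("j2.jar").toUri().toURL();

    URLClassLoader u1 = new URLClassLoader(new URL[] {j1});
    // u2 "sees" both jars but delegates to its parent u1 first.
    URLClassLoader u2 = new URLClassLoader(new URL[] {j1, j2}, u1);

    Class<?> c1 = Class.forName("c1", true, u2);
    // Parent-first delegation: u1 found c1 in j1, so u1 defined it.
    System.out.println(c1.getClassLoader() == u1); // true
    // Any reference from c1 to c2 resolves via u1, which cannot see j2,
    // so first use fails with NoClassDefFoundError/ClassNotFoundException.
    c1.getDeclaredConstructor().newInstance();
  }
}
{code}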



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9752) Documentation for HBase metastore

2015-09-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904001#comment-14904001
 ] 

Lefty Leverenz commented on HIVE-9752:
--

HIVE-11711 merged the hbase-metastore branch to master for the upcoming 2.0.0 
release, so I've added a TODOC2.0 label to this jira.

> Documentation for HBase metastore
> -
>
> Key: HIVE-9752
> URL: https://issues.apache.org/jira/browse/HIVE-9752
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore, Metastore
>Affects Versions: hbase-metastore-branch
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: TODOC2.0
>
> All of the documentation we will need to write for the HBase metastore



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when execute query with filter condition on type incompatible column on data(generated by UNION ALL with an union column is constant and it h

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Attachment: HIVE-11880.01.patch

>IndexOutOfBoundsException when execute query with filter condition on type 
> incompatible column on data(generated by UNION ALL with an union column is 
> constant and it has incompatible type with corresponding column) 
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>For Hive UNION ALL , when a union column is constant(column a such as 
> '0L')  and it has incompatible type with the corresponding column A. The 
> query with filter condition on type incompatible column a on this UNION-ALL 
> results  will cause IndexOutOfBoundsException
>   Such as TPC-H table "orders", we  CREATE VIEW `view_orders` AS select 
> `oo`.`o_orderkey` , `oo`.`o_custkey`  from (  select  `orders`.`o_orderkey` , 
> `rcfileorders`.`o_custkey` from `tpch270g`.`rcfileorders`   union all  select 
> `orcfileorders`.`o_orderkey` , 0L as `o_custkey`   from  
> `tpch270g`.`textfileorders`) `oo`.
>Type of 'o_custkey' is INT normally, while  the type of corresponding 
> column constant "0" is BIGINT.
>   Then the following query (with a filter on the incompatible column 
> o_custkey) will 
> fail  with  java.lang.IndexOutOfBoundsException:
> 'select count(1) from view_orders  where o_custkey<10 '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Summary: IndexOutOfBoundsException when query with filter condition on type 
incompatible column of UNION ALL  (was:IndexOutOfBoundsException when 
execute query with filter condition on type incompatible column on 
data(generated by UNION ALL with an union column is constant and it has 
incompatible type with corresponding column) )

> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL
> ---
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>For Hive UNION ALL , when a union column is constant(column a such as 
> '0L')  and it has incompatible type with the corresponding column A. The 
> query with filter condition on type incompatible column a on this UNION-ALL 
> results  will cause IndexOutOfBoundsException
>   Such as TPC-H table "orders", we  CREATE VIEW `view_orders` AS select 
> `oo`.`o_orderkey` , `oo`.`o_custkey`  from (  select  `orders`.`o_orderkey` , 
> `rcfileorders`.`o_custkey` from `tpch270g`.`rcfileorders`   union all  select 
> `orcfileorders`.`o_orderkey` , 0L as `o_custkey`   from  
> `tpch270g`.`textfileorders`) `oo`.
>Type of 'o_custkey' is INT normally, while  the type of corresponding 
> column constant "0" is BIGINT.
>   Then the following query (with a filter on the incompatible column 
> o_custkey) will 
> fail  with  java.lang.IndexOutOfBoundsException:
> 'select count(1) from view_orders  where o_custkey<10 '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11926) Stats annotation might not extract stats for varchar/decimal columns

2015-09-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11926:
---
Attachment: HIVE-11926.patch

> Stats annotation might not extract stats for varchar/decimal columns
> 
>
> Key: HIVE-11926
> URL: https://issues.apache.org/jira/browse/HIVE-11926
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11926.patch
>
>
> It is because StatsUtils uses String.startsWith to compare VARCHAR/DECIMAL 
> column type names with serdeConstants, which are in lowercase. But these type 
> names in stats might not be in lowercase. We ran into a case where the type 
> name from TAB_COL_STATS/PART_COL_STATS was actually in uppercase (e.g. 
> VARCHAR, DECIMAL) because these column stats were populated from other HMS 
> clients like Impala.
> We need to change these type name comparisons to be case-insensitive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?

2015-09-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11930:
---
Fix Version/s: (was: 0.14.1)

> how to prevent ppd the topN(a) udf predication in where clause?
> ---
>
> Key: HIVE-11930
> URL: https://issues.apache.org/jira/browse/HIVE-11930
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Blocker
>
> select 
> a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id)
>   from
> (  select 
> t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id
>   from 
>   ( select t11.state_date,
>t11.customer,
>t11.taskid,
>t11.step_id,
>t11.exit_title,
>t11.pv,
>concat(t11.customer,t11.taskid,t11.step_id) as 
> only_id
>from
>   (  select 
> state_date,customer,taskid,step_id,exit_title,count(*) as pv
>  from bdi_fact2.mid_url_step
>  where exit_url!='-1'
>  and exit_title !='-1'
>  and l_date='2015-08-31'
>  group by 
> state_date,customer,taskid,step_id,exit_title
> )t11
>)t1
>order by t1.only_id,t1.pv desc
>  )a
>   where  a.customer='Cdianyingwang'
>   and a.taskid='33'
>   and a.step_id='0' 
>   and top1000(a.only_id)<=10;
> In the above example, the outer top1000(a.only_id)<=10 will be pushed down 
> (PPD) to stage 1:
> ( select t11.state_date,
>t11.customer,
>t11.taskid,
>t11.step_id,
>t11.exit_title,
>t11.pv,
>concat(t11.customer,t11.taskid,t11.step_id) as 
> only_id
>from
>   (  select 
> state_date,customer,taskid,step_id,exit_title,count(*) as pv
>  from bdi_fact2.mid_url_step
>  where exit_url!='-1'
>  and exit_title !='-1'
>  and l_date='2015-08-31'
>  group by 
> state_date,customer,taskid,step_id,exit_title
> )t11
>)t1
> This stage has 2 reducers, so it will output 20 records; passed up to the 
> outer stage, the final result is exactly these 20 records.
> So I want to know: is there any way to hint that this topN UDF predicate 
> should not be pushed down?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10238) Loop optimization for SIMD in IfExprColumnColumn.txt

2015-09-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903934#comment-14903934
 ] 

Hive QA commented on HIVE-10238:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761592/HIVE-10238.3.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9561 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5379/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5379/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5379/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761592 - PreCommit-HIVE-TRUNK-Build

> Loop optimization for SIMD in IfExprColumnColumn.txt
> 
>
> Key: HIVE-10238
> URL: https://issues.apache.org/jira/browse/HIVE-10238
> Project: Hive
>  Issue Type: Sub-task
>  Components: Vectorization
>Affects Versions: 1.1.0
>Reporter: Chengxiang Li
>Assignee: Teddy Choi
>Priority: Minor
> Attachments: HIVE-10238.2.patch, HIVE-10238.3.patch, HIVE-10238.patch
>
>
> The ?: operator as following could not be vectorized in loop, we may transfer 
> it into mathematical expression.
> {code:java}
> for(int j = 0; j != n; j++) {
>   int i = sel[j];
>   outputVector[i] = (vector1[i] == 1 ? vector2[i] : vector3[i]);
>   outputIsNull[i] = (vector1[i] == 1 ?
>   arg2ColVector.isNull[i] : arg3ColVector.isNull[i]);
> }
> {code} 
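
One way to express that transfer into arithmetic, as a sketch (it assumes the 
flag vector holds strictly 0/1 longs, as Hive's boolean long vectors do; this 
is not necessarily the committed patch):

{code:java}
// Sketch: the ternary selecting between two long vectors rewritten as
// branch-free arithmetic, which the JIT can auto-vectorize with SIMD.
public class IfExprSketch {
  static void select(int n, int[] sel, long[] flags,
                     long[] v2, long[] v3, long[] out) {
    for (int j = 0; j != n; j++) {
      int i = sel[j];
      // out[i] = (flags[i] == 1 ? v2[i] : v3[i]) without a branch:
      out[i] = flags[i] * v2[i] + (1 - flags[i]) * v3[i];
    }
    // The isNull selection from the snippet above can be handled the same
    // way with boolean logic; it is omitted here for brevity.
  }
}
{code}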



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-09-22 Thread Ratandeep Ratti (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903992#comment-14903992
 ] 

Ratandeep Ratti commented on HIVE-11878:


[~jdere], [~ashutoshc], I'd love to hear your thoughts on this.

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console. Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classlaoder is the 
> current ThreadContextClassLoader. Once the URLClassloader is created Hive 
> sets that as the current ThreadContextClassloader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code} . The 
> currentThreadContext class-loader is *u2*, and it has the path to the class 
> *c1*, but note that Class-loaders work by delegating to parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1*, which references the class *c2*, *c2* will not 
> be found since the class-loader used to search for *c2* will be *u1* (Since 
> the caller's class-loader is used to load a class)
> I've added a qtest to explain the problem. Please see the attached patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?

2015-09-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903911#comment-14903911
 ] 

Gopal V commented on HIVE-11930:


That is hard to understand, you'll have to post an explain plan to get some 
attention to this.

> how to prevent ppd the topN(a) udf predication in where clause?
> ---
>
> Key: HIVE-11930
> URL: https://issues.apache.org/jira/browse/HIVE-11930
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Blocker
> Fix For: 0.14.1
>
>
> select 
> a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id)
>   from
> (  select 
> t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id
>   from 
>   ( select t11.state_date,
>t11.customer,
>t11.taskid,
>t11.step_id,
>t11.exit_title,
>t11.pv,
>concat(t11.customer,t11.taskid,t11.step_id) as 
> only_id
>from
>   (  select 
> state_date,customer,taskid,step_id,exit_title,count(*) as pv
>  from bdi_fact2.mid_url_step
>  where exit_url!='-1'
>  and exit_title !='-1'
>  and l_date='2015-08-31'
>  group by 
> state_date,customer,taskid,step_id,exit_title
> )t11
>)t1
>order by t1.only_id,t1.pv desc
>  )a
>   where  a.customer='Cdianyingwang'
>   and a.taskid='33'
>   and a.step_id='0' 
>   and top1000(a.only_id)<=10;
> In the above example, the outer top1000(a.only_id)<=10 will be pushed down 
> (PPD) to stage 1:
> ( select t11.state_date,
>t11.customer,
>t11.taskid,
>t11.step_id,
>t11.exit_title,
>t11.pv,
>concat(t11.customer,t11.taskid,t11.step_id) as 
> only_id
>from
>   (  select 
> state_date,customer,taskid,step_id,exit_title,count(*) as pv
>  from bdi_fact2.mid_url_step
>  where exit_url!='-1'
>  and exit_title !='-1'
>  and l_date='2015-08-31'
>  group by 
> state_date,customer,taskid,step_id,exit_title
> )t11
>)t1
> This stage has 2 reducers, so it will output 20 records; passed up to the 
> outer stage, the final result is exactly these 20 records.
> So I want to know: is there any way to hint that this topN UDF predicate 
> should not be pushed down?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Description: 
   For Hive UNION ALL, when a union column is constant (column a such as '0L') 
 and it has incompatible type with the corresponding column A(INT Type). 
  Query with filter condition on type incompatible column a on this UNION ALL  
will cause IndexOutOfBoundsException.

 Such as TPC-H table "orders", we  create View  by : 

  CREATE VIEW `view_orders` AS
  SELECT `oo`.`o_orderkey` ,
 `oo`.`o_custkey` 
  FROM   (   
  SELECT   `orders`.`o_orderkey` ,  
   `rcfileorders`.`o_custkey`   
  FROM  `rcfileorders`   
  UNION ALL   
  SELECT   `orcfileorders`.`o_orderkey` , 
0L AS `o_custkey`   
  FROM  `textfileorders`) `oo`.

  In view_orders , type of 'o_custkey' is INT normally, while  the type of 
corresponding column constant "0" is BIGINT.

  When hive.ppd.remove.duplicatefilters=true, the following query (with filter " 
incompatible column 'o_custkey' ")  will fail  with  
java.lang.IndexOutOfBoundsException:
'select count(1) from view_orders  where o_custkey<10 '.

  was:
   For Hive UNION ALL, when a union column is constant (column a such as '0L') 
 and it has incompatible type with the corresponding column A. 
  Query with filter condition on type incompatible column a on this UNION ALL  
will cause IndexOutOfBoundsException.

 Such as TPC-H table "orders", we  create View  by : 

  CREATE VIEW `view_orders` AS
  SELECT `oo`.`o_orderkey` ,
 `oo`.`o_custkey` 
  FROM   (   
  SELECT   `orders`.`o_orderkey` ,  
   `rcfileorders`.`o_custkey`   
  FROM  `rcfileorders`   
  UNION ALL   
  SELECT   `orcfileorders`.`o_orderkey` , 
0L AS `o_custkey`   
  FROM  `textfileorders`) `oo`.

  In view_orders , type of 'o_custkey' is INT normally, while  the type of 
corresponding column constant "0" is BIGINT.

  Then the following query (with filter " incompatible column 'o_custkey' ")  
will fail  with  java.lang.IndexOutOfBoundsException:
'select count(1) from view_orders  where o_custkey<10 '.


> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>    For Hive UNION ALL, when a union column is constant (column a such as 
> '0L')  and it has incompatible type with the corresponding column A(INT 
> Type). 
>   Query with filter condition on type incompatible column a on this UNION ALL 
>  will cause IndexOutOfBoundsException.
>  Such as TPC-H table "orders", we  create View  by : 
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey` ,
>  `oo`.`o_custkey` 
>   FROM   (   
>   SELECT   `orders`.`o_orderkey` ,  
>`rcfileorders`.`o_custkey`   
>   FROM  `rcfileorders`   
>   UNION ALL   
>   SELECT   `orcfileorders`.`o_orderkey` , 
> 0L AS `o_custkey`   
>   FROM  `textfileorders`) `oo`.
>   In view_orders , type of 'o_custkey' is INT normally, while  the type of 
> corresponding column constant "0" is BIGINT.
>   When hive.ppd.remove.duplicatefilters=true, the following query (with filter 
> " incompatible column 'o_custkey' ")  will fail  with  
> java.lang.IndexOutOfBoundsException:
> 'select count(1) from view_orders  where o_custkey<10 '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11926) Stats annotation might not extract stats for varchar/decimal columns

2015-09-22 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903919#comment-14903919
 ] 

Chaoyu Tang commented on HIVE-11926:


Patch has been posted on RB: https://reviews.apache.org/r/38659/ . Thanks in 
advance for the review.

> Stats annotation might not extract stats for varchar/decimal columns
> 
>
> Key: HIVE-11926
> URL: https://issues.apache.org/jira/browse/HIVE-11926
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11926.patch
>
>
> It is because StatsUtils uses String.startsWith to compare VARCHAR/DECIMAL 
> column type names with serdeConstants, which are in lowercase. But these type 
> names in stats might not be in lowercase. We ran into a case where the type 
> name from TAB_COL_STATS/PART_COL_STATS was actually in uppercase (e.g. 
> VARCHAR, DECIMAL) because these column stats were populated from other HMS 
> clients like Impala.
> We need to change these type name comparisons to be case-insensitive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Summary: IndexOutOfBoundsException when query with filter condition on type 
incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true  
(was: IndexOutOfBoundsException when query with filter condition on type 
incompatible column of UNION ALL)

> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>    For Hive UNION ALL, when a union column is constant (column a such as 
> '0L')  and it has incompatible type with the corresponding column A. 
>   Query with filter condition on type incompatible column a on this UNION ALL 
>  will cause IndexOutOfBoundsException.
>  Such as TPC-H table "orders", we  create View  by : 
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey` ,
>  `oo`.`o_custkey` 
>   FROM   (   
>   SELECT   `orders`.`o_orderkey` ,  
>`rcfileorders`.`o_custkey`   
>   FROM  `rcfileorders`   
>   UNION ALL   
>   SELECT   `orcfileorders`.`o_orderkey` , 
> 0L AS `o_custkey`   
>   FROM  `textfileorders`) `oo`.
>   In view_orders , type of 'o_custkey' is INT normally, while  the type of 
> corresponding column constant "0" is BIGINT.
>   Then the following query (with filter " incompatible column 'o_custkey' ")  
> will fail  with  java.lang.IndexOutOfBoundsException:
> 'select count(1) from view_orders  where o_custkey<10 '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Description: 
   For Hive UNION ALL, when a union column is constant (column a such as '0L') 
 and it has incompatible type with the corresponding column A. 
  Query with filter condition on type incompatible column a on this UNION ALL  
will cause IndexOutOfBoundsException.

 Such as TPC-H table "orders", we  create View  by : 

  CREATE VIEW `view_orders` AS
  SELECT `oo`.`o_orderkey` ,
 `oo`.`o_custkey` 
  FROM   (   
  SELECT   `orders`.`o_orderkey` ,  
   `rcfileorders`.`o_custkey`   
  FROM  `rcfileorders`   
  UNION ALL   
  SELECT   `orcfileorders`.`o_orderkey` , 
0L AS `o_custkey`   
  FROM  `textfileorders`) `oo`.

  In view_orders , type of 'o_custkey' is INT normally, while  the type of 
corresponding column constant "0" is BIGINT.

  Then the following query (with filter " incompatible column 'o_custkey' ")  
will fail  with  java.lang.IndexOutOfBoundsException:
'select count(1) from view_orders  where o_custkey<10 '.

  was:
   For Hive UNION ALL , when a union column is constant(column a such as '0L')  
and it has incompatible type with the corresponding column A. The query with 
filter condition on type incompatible column a on this UNION-ALL results  will 
cause IndexOutOfBoundsException

  Such as TPC-H table "orders", we  CREATE VIEW `view_orders` AS select 
`oo`.`o_orderkey` , `oo`.`o_custkey`  from (  select  `orders`.`o_orderkey` , 
`rcfileorders`.`o_custkey` from `tpch270g`.`rcfileorders`   union all  select 
`orcfileorders`.`o_orderkey` , 0L as `o_custkey`   from  
`tpch270g`.`textfileorders`) `oo`.

   Type of 'o_custkey' is INT normally, while  the type of corresponding column 
constant "0" is BIGINT.

   Then the following query (with a filter on the incompatible column 
o_custkey) will 
fail  with  java.lang.IndexOutOfBoundsException:
'select count(1) from view_orders  where o_custkey<10 '.


> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL
> ---
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
>    For Hive UNION ALL, when a union column is constant (column a such as 
> '0L')  and it has incompatible type with the corresponding column A. 
>   Query with filter condition on type incompatible column a on this UNION ALL 
>  will cause IndexOutOfBoundsException.
>  Such as TPC-H table "orders", we  create View  by : 
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey` ,
>  `oo`.`o_custkey` 
>   FROM   (   
>   SELECT   `orders`.`o_orderkey` ,  
>`rcfileorders`.`o_custkey`   
>   FROM  `rcfileorders`   
>   UNION ALL   
>   SELECT   `orcfileorders`.`o_orderkey` , 
> 0L AS `o_custkey`   
>   FROM  `textfileorders`) `oo`.
>   In view_orders , type of 'o_custkey' is INT normally, while  the type of 
> corresponding column constant "0" is BIGINT.
>   Then the following query (with filter " incompatible column 'o_custkey' ")  
> will fail  with  java.lang.IndexOutOfBoundsException:
> 'select count(1) from view_orders  where o_custkey<10 '.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11826) 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized user to access metastore

2015-09-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903924#comment-14903924
 ] 

Lefty Leverenz commented on HIVE-11826:
---

Okay, thanks Aihua Xu.

> 'hadoop.proxyuser.hive.groups' configuration doesn't prevent unauthorized 
> user to access metastore
> --
>
> Key: HIVE-11826
> URL: https://issues.apache.org/jira/browse/HIVE-11826
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11826.2.patch, HIVE-11826.patch
>
>
> With 'hadoop.proxyuser.hive.groups' configured in core-site.xml to certain 
> groups, currently if you run the job with a user not belonging to those 
> groups, it won't fail to access metastore. With old version hive 0.13, 
> actually it fails properly. 
> Seems HadoopThriftAuthBridge20S.java correctly call ProxyUsers.authorize() 
> while HadoopThriftAuthBridge23 doesn't. 
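
For context, the Hadoop-side check being referred to looks roughly like the 
sketch below (ProxyUsers API usage only; the surrounding Thrift server 
plumbing is omitted):

{code:java}
// Sketch: the proxy-user authorization check the metastore's Thrift bridge
// is expected to perform for an impersonated (doAs) connection.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.security.authorize.ProxyUsers;

public class ProxyAuthSketch {
  static void checkProxyAllowed(UserGroupInformation proxyUgi,
                                String remoteAddress,
                                Configuration conf) throws AuthorizationException {
    // Loads hadoop.proxyuser.<user>.groups / .hosts from core-site.xml.
    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
    // Throws AuthorizationException if the real user may not impersonate
    // proxyUgi's user from this address.
    ProxyUsers.authorize(proxyUgi, remoteAddress);
  }
}
{code}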



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?

2015-09-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903928#comment-14903928
 ] 

Gopal V commented on HIVE-11930:


Also, I don't expect your query to generate the expected result:

{code}
...
order by t1.only_id,t1.pv desc
)a
where a.customer='Cdianyingwang'
{code}

https://en.wikipedia.org/wiki/Order_by

{{Although some database systems allow the specification of an ORDER BY clause 
in subselects or view definitions, the presence there has no effect.}}

> how to prevent ppd the topN(a) udf predication in where clause?
> ---
>
> Key: HIVE-11930
> URL: https://issues.apache.org/jira/browse/HIVE-11930
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0
>Reporter: Feng Yuan
>Priority: Blocker
>
> select 
> a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id)
>   from
> (  select 
> t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id
>   from 
>   ( select t11.state_date,
>t11.customer,
>t11.taskid,
>t11.step_id,
>t11.exit_title,
>t11.pv,
>concat(t11.customer,t11.taskid,t11.step_id) as 
> only_id
>from
>   (  select 
> state_date,customer,taskid,step_id,exit_title,count(*) as pv
>  from bdi_fact2.mid_url_step
>  where exit_url!='-1'
>  and exit_title !='-1'
>  and l_date='2015-08-31'
>  group by 
> state_date,customer,taskid,step_id,exit_title
> )t11
>)t1
>order by t1.only_id,t1.pv desc
>  )a
>   where  a.customer='Cdianyingwang'
>   and a.taskid='33'
>   and a.step_id='0' 
>   and top1000(a.only_id)<=10;
> In the above example, the outer top1000(a.only_id)<=10 will be pushed down 
> (PPD) to stage 1:
> (  select t11.state_date,
>           t11.customer,
>           t11.taskid,
>           t11.step_id,
>           t11.exit_title,
>           t11.pv,
>           concat(t11.customer, t11.taskid, t11.step_id) as only_id
>    from
>    (  select state_date, customer, taskid, step_id, exit_title,
>              count(*) as pv
>       from bdi_fact2.mid_url_step
>       where exit_url != '-1'
>       and exit_title != '-1'
>       and l_date = '2015-08-31'
>       group by state_date, customer, taskid, step_id, exit_title
>    ) t11
> ) t1
> This stage has 2 reducers, so it outputs 20 records (up to 10 per reducer), 
> and the final result of the outer stage is exactly those 20 records.
> So I want to know: is there any way to hint that this topN UDF predicate 
> should not be pushed down?
> Thanks
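One general remedy, not proposed in this thread: Hive's predicate pushdown 
does not move predicates containing a UDF that is declared non-deterministic 
or stateful, so annotating the UDF accordingly keeps the predicate where it 
is written. A minimal sketch of such a ranking UDF (class name and logic are 
illustrative, not the reporter's actual top1000):

{code}
// Illustrative only. The @UDFType annotation is the real Hive mechanism
// that stops the optimizer from relocating predicates containing this UDF.
package example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;

@UDFType(deterministic = false, stateful = true)
public class Top1000 extends UDF {
  private long counter = 0L;
  private String previousKey = null;

  // Emits 1, 2, 3, ... for consecutive rows sharing the same key and resets
  // on a key change; assumes rows reach it sorted by that key.
  public long evaluate(final String key) {
    if (previousKey == null || !previousKey.equals(key)) {
      counter = 0L;
      previousKey = key;
    }
    return ++counter;
  }
}
{code}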



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true

2015-09-22 Thread WangMeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903975#comment-14903975
 ] 

WangMeng commented on HIVE-11880:
-

[~xuefuz]
I uploaded a new patch for this issue.
Please check it. Thanks.

> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
> For Hive UNION ALL, when a union column is a constant (such as column a = 
> '0L') whose type is incompatible with the corresponding column A (INT 
> type), a query with a filter condition on that type-incompatible column of 
> the UNION ALL will cause an IndexOutOfBoundsException.
> For example, with the TPC-H table "orders", we create a view:
> {code}
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey`,
>          `oo`.`o_custkey`
>   FROM   (SELECT `o_orderkey`, 0L AS `o_custkey`
>           FROM   `rcfileorders`
>           UNION ALL
>           SELECT `o_orderkey`, `o_custkey`
>           FROM   `textfileorders`) `oo`;
> {code}
>   In the view view_orders, the type of 'o_custkey' is normally INT, while 
> the type of the corresponding constant column "0" is BIGINT (0L AS 
> `o_custkey`).
> When
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}
> the following query (with a filter on the incompatible column 'o_custkey') 
> will fail with java.lang.IndexOutOfBoundsException:
> {code}
> select count(1) from view_orders where o_custkey<10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11880) IndexOutOfBoundsException when query with filter condition on type incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true

2015-09-22 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Description: 
   For Hive UNION ALL, when a union column is a constant (such as column a = 
'0L') whose type is incompatible with the corresponding column A (INT type), 
a query with a filter condition on that type-incompatible column of the 
UNION ALL will cause an IndexOutOfBoundsException.

 For example, with the TPC-H table "orders", we create a view:
{code}
  CREATE VIEW `view_orders` AS
  SELECT `oo`.`o_orderkey`,
         `oo`.`o_custkey`
  FROM   (SELECT `o_orderkey`, 0L AS `o_custkey`
          FROM   `rcfileorders`
          UNION ALL
          SELECT `o_orderkey`, `o_custkey`
          FROM   `textfileorders`) `oo`;
{code}
  In the view view_orders, the type of 'o_custkey' is normally INT, while the 
type of the corresponding constant column "0" is BIGINT (0L AS `o_custkey`).

When
{code}
set hive.ppd.remove.duplicatefilters=true
{code}
the following query (with a filter on the incompatible column 'o_custkey') 
will fail with java.lang.IndexOutOfBoundsException:
{code}
select count(1) from view_orders where o_custkey<10
{code}

  was:
   For Hive UNION ALL, when a union column is a constant (such as column a = 
'0L') whose type is incompatible with the corresponding column A (INT type), 
a query with a filter condition on that type-incompatible column of the 
UNION ALL will cause an IndexOutOfBoundsException.

 For example, with the TPC-H table "orders", we create a view:

  CREATE VIEW `view_orders` AS
  SELECT `oo`.`o_orderkey`,
         `oo`.`o_custkey`
  FROM   (SELECT `orders`.`o_orderkey`,
                 `rcfileorders`.`o_custkey`
          FROM   `rcfileorders`
          UNION ALL
          SELECT `orcfileorders`.`o_orderkey`,
                 0L AS `o_custkey`
          FROM   `textfileorders`) `oo`;

  In view_orders, the type of 'o_custkey' is normally INT, while the type of 
the corresponding constant column "0" is BIGINT.

  When hive.ppd.remove.duplicatefilters=true, the following query (with a 
filter on the incompatible column 'o_custkey') will fail with 
java.lang.IndexOutOfBoundsException:
'select count(1) from view_orders where o_custkey<10'.


> IndexOutOfBoundsException when query with filter condition on type 
> incompatible column of UNION ALL when hive.ppd.remove.duplicatefilters=true
> --
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch
>
>
> For Hive UNION ALL, when a union column is a constant (such as column a = 
> '0L') whose type is incompatible with the corresponding column A (INT 
> type), a query with a filter condition on that type-incompatible column of 
> the UNION ALL will cause an IndexOutOfBoundsException.
> For example, with the TPC-H table "orders", we create a view:
> {code}
>   CREATE VIEW `view_orders` AS
>   SELECT `oo`.`o_orderkey`,
>          `oo`.`o_custkey`
>   FROM   (SELECT `o_orderkey`, 0L AS `o_custkey`
>           FROM   `rcfileorders`
>           UNION ALL
>           SELECT `o_orderkey`, `o_custkey`
>           FROM   `textfileorders`) `oo`;
> {code}
>   In the view view_orders, the type of 'o_custkey' is normally INT, while 
> the type of the corresponding constant column "0" is BIGINT (0L AS 
> `o_custkey`).
> When
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}
> the following query (with a filter on the incompatible column 'o_custkey') 
> will fail with java.lang.IndexOutOfBoundsException:
> {code}
> select count(1) from view_orders where o_custkey<10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive

2015-09-22 Thread Ratandeep Ratti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ratandeep Ratti updated HIVE-11878:
---
Attachment: HIVE-11878_approach3.patch

Implemented approach 3, as outlined in the ticket.

> ClassNotFoundException can possibly  occur if multiple jars are registered 
> one at a time in Hive
> 
>
> Key: HIVE-11878
> URL: https://issues.apache.org/jira/browse/HIVE-11878
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Ratandeep Ratti
>Assignee: Ratandeep Ratti
>  Labels: URLClassLoader
> Attachments: HIVE-11878.patch, HIVE-11878_approach3.patch, 
> HIVE-11878_qtest.patch
>
>
> When we register a jar on the Hive console, Hive creates a fresh URL 
> classloader which includes the path of the current jar to be registered and 
> all the jar paths of the parent classloader. The parent classloader is the 
> current ThreadContextClassLoader. Once the URLClassLoader is created, Hive 
> sets it as the current ThreadContextClassLoader.
> So if we register multiple jars in Hive, there will be multiple 
> URLClassLoaders created, each classloader including the jars from its parent 
> and the one extra jar to be registered. The last URLClassLoader created will 
> end up as the current ThreadContextClassLoader. (See details: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath)
> Now here's an example in which the above strategy can lead to a CNF exception.
> We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class 
> *c1* and internally relies on class *c2* in jar *j2*. We register *j1* first, 
> the URLClassLoader *u1* is created and also set as the 
> ThreadContextClassLoader. We register *j2* next, the new URLClassLoader 
> created will be *u2* with *u1* as parent and *u2* becomes the new 
> ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* 
> whereas *u1* only has paths to *j1* (For details see: 
> org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath).
> Now when we register class *c1* under a temporary function in Hive, we load 
> the class using {code} Class.forName("c1", true, 
> Thread.currentThread().getContextClassLoader()) {code}. The current 
> thread-context class-loader is *u2*, and it has the path to the class *c1*, 
> but note that class-loaders work by delegating to the parent class-loader 
> first. In this case class *c1* will be found and *defined* by class-loader 
> *u1*.
> Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say 
> initialize) is called in *c1* which references the class *c2*, *c2* will not 
> be found, since the class-loader used to search for *c2* is *u1* (a class's 
> defining class-loader is used to resolve the classes it references).
> I've added a qtest to explain the problem. Please see the attached patch.
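To make the delegation point concrete, a self-contained sketch (the jar 
paths and the lowercase class names c1/c2 are hypothetical, mirroring the 
description above):

{code}
// Demonstrates parent-first delegation: c1 ends up *defined* by u1, so a
// class c1 references (like c2, only on u2's path) cannot be resolved.
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {
  public static void main(String[] args) throws Exception {
    // u1 sees only j1 (which contains c1); its parent is the system loader.
    URLClassLoader u1 = new URLClassLoader(
        new URL[] { new URL("file:/tmp/j1.jar") });

    // u2 sees j1 and j2 (which contains c2), with u1 as its parent --
    // mirroring what Utilities#addToClassPath builds after two "add jar"s.
    URLClassLoader u2 = new URLClassLoader(
        new URL[] { new URL("file:/tmp/j1.jar"), new URL("file:/tmp/j2.jar") },
        u1);

    // Parent-first delegation: u2 asks u1 first, u1 finds c1 in j1, so the
    // defining loader of c1 is u1, not u2.
    Class<?> c1 = Class.forName("c1", true, u2);
    System.out.println(c1.getClassLoader()); // prints u1

    // Invoking a c1 method that references c2 would resolve c2 through u1,
    // which has no path to j2 -> ClassNotFoundException.
  }
}
{code}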



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

