[jira] [Commented] (HIVE-11611) A bad performance regression issue with Parquet happens if Hive does not select any columns

2015-08-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709468#comment-14709468
 ] 

Sergio Peña commented on HIVE-11611:


Thanks [~rdblue]

[~Ferd] As you told me offline, we should close this ticket as 'Won't Fix'. 
We will wait until parquet releases a new version, and then change to that new 
one.

 A bad performance regression issue with Parquet happens if Hive does not 
 select any columns
 ---

 Key: HIVE-11611
 URL: https://issues.apache.org/jira/browse/HIVE-11611
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Sergio Peña
Assignee: Ferdinand Xu
 Attachments: HIVE-11611.patch


 A performance issue may happen with the code below when using a query like 
 {{SELECT count(1) FROM parquetTable}}.
 {code}
 if (!ColumnProjectionUtils.isReadAllColumns(configuration) &&
     !indexColumnsWanted.isEmpty()) {
   MessageType requestedSchemaByUser =
       getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
   return new ReadContext(requestedSchemaByUser, contextMetadata);
 } else {
   return new ReadContext(tableSchema, contextMetadata);
 }
 {code}
 If neither columns nor indexes are selected, the above code reads the full 
 schema from Parquet even though Hive does nothing with those values.
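With the dropped `&&` restored, the problematic path is the `else` branch: an empty projection list falls through to the full table schema. A schematic sketch of adding a third branch that returns a minimal schema instead (plain Java with string stand-ins for the Parquet `MessageType` values; an illustration, not the actual patch):

```java
import java.util.List;

public class ReadContextSketch {
    // Hypothetical stand-ins for the Hive/Parquet types, for illustration only.
    static String chooseSchema(boolean readAllColumns, List<Integer> indexColumnsWanted,
                               String tableSchema, String minimalSchema) {
        if (readAllColumns) {
            return tableSchema;                              // caller wants everything
        } else if (!indexColumnsWanted.isEmpty()) {
            return "projected(" + indexColumnsWanted + ")";  // build schema from requested indexes
        } else {
            // No columns requested (e.g. SELECT count(1)): avoid reading the full schema.
            return minimalSchema;
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseSchema(false, List.of(), "full", "minimal"));
    }
}
```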



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11504) Predicate pushing down doesn't work for float type for Parquet

2015-08-24 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709482#comment-14709482
 ] 

Owen O'Malley commented on HIVE-11504:
--

Ok, I wrote the patch that addresses Parquet's problem without needlessly 
complicating the SARG API by splitting out the integer or float types. Please 
see HIVE-11618.

 Predicate pushing down doesn't work for float type for Parquet
 --

 Key: HIVE-11504
 URL: https://issues.apache.org/jira/browse/HIVE-11504
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-11504.1.patch, HIVE-11504.2.patch, 
 HIVE-11504.2.patch, HIVE-11504.3.patch, HIVE-11504.patch


 Predicate builder should use PrimitiveTypeName type in parquet side to 
 construct predicate leaf instead of the type provided by PredicateLeaf.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709525#comment-14709525
 ] 

Ashutosh Chauhan commented on HIVE-11573:
-

I agree there is no good reason for storing original predicate in FilterDesc 
any more.

 PointLookupOptimizer can be pessimistic at a low nDV
 

 Key: HIVE-11573
 URL: https://issues.apache.org/jira/browse/HIVE-11573
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: TODOC2.0
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch, 
 HIVE-11573.3.patch, HIVE-11573.4.patch, HIVE-11573.5.patch


 The PointLookupOptimizer can turn off some of the optimizations due to its 
 use of tuple IN() clauses.
 Limit the application of the optimizer for very low nDV cases and extract the 
 sub-clause as a pre-condition during runtime, to trigger the simple column 
 predicate index lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11622) Creating an Avro table with a complex map-typed column leads to incorrect column type.

2015-08-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-11622:
--

Assignee: Jimmy Xiang

 Creating an Avro table with a complex map-typed column leads to incorrect 
 column type.
 --

 Key: HIVE-11622
 URL: https://issues.apache.org/jira/browse/HIVE-11622
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 1.1.0
Reporter: Alexander Behm
Assignee: Jimmy Xiang
  Labels: AvroSerde

 In the following CREATE TABLE, the map-typed column ends up with the wrong 
 type. I suspect a problem with inferring the Avro schema from the column 
 definitions, but I am not sure.
 Reproduction:
 {code}
 hive> create table t (c map<string,array<int>>) stored as avro;
 OK
 Time taken: 0.101 seconds
 hive> desc t;
 OK
 c	array<map<string,int>>	from deserializer
 Time taken: 0.135 seconds, Fetched: 1 row(s)
 {code}
 Note how the type shown in DESCRIBE is not the type originally passed in the 
 CREATE TABLE.
 However, *sometimes* the DESCRIBE shows the correct output. You may also try 
 these steps which produce a similar problem to increase the chance of hitting 
 this issue:
 {code}
 hive> create table t (c array<map<string,int>>) stored as avro;
 OK
 Time taken: 0.063 seconds
 hive> desc t;
 OK
 c	map<string,array<int>>	from deserializer
 Time taken: 0.152 seconds, Fetched: 1 row(s)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11620) Fix several qtest output order

2015-08-24 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709583#comment-14709583
 ] 

Chao Sun commented on HIVE-11620:
-

+1

 Fix several qtest output order
 --

 Key: HIVE-11620
 URL: https://issues.apache.org/jira/browse/HIVE-11620
 Project: Hive
  Issue Type: Test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11620.1.patch


 selectDistinctStar.q
 unionall_unbalancedppd.q
 vector_cast_constant.q



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for ReduceSink operator

2015-08-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11623:
---
Summary: CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix 
the tableAlias for ReduceSink operator  (was: CBO: Calcite Operator To Hive 
Operator (Calcite Return Path): fix the tableAlias for PTF operator)

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the 
 tableAlias for ReduceSink operator
 

 Key: HIVE-11623
 URL: https://issues.apache.org/jira/browse/HIVE-11623
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11623.01.patch, HIVE-11623.02.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709681#comment-14709681
 ] 

Sergey Shelukhin commented on HIVE-11605:
-

Nit: can this be replaced with a boolean expression:
{noformat}
if (strict) {
+  if (colCount == listBucketCols.size()) {
+    return true;
+  } else {
+    return false;
+  }
+} else {
+  return true;
+}
{noformat}
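The if/else above indeed collapses to a single boolean expression; a minimal sketch (with `listBucketColsSize` standing in for `listBucketCols.size()`):

```java
public class StrictCheck {
    // Equivalent single-expression form of the if/else in the snippet:
    // in strict mode the sizes must match; otherwise the check always passes.
    static boolean matches(boolean strict, int colCount, int listBucketColsSize) {
        return !strict || colCount == listBucketColsSize;
    }

    public static void main(String[] args) {
        System.out.println(matches(true, 2, 3));  // false: strict and sizes differ
        System.out.println(matches(false, 2, 3)); // true: non-strict always passes
    }
}
```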

 Incorrect results with bucket map join in tez.
 --

 Key: HIVE-11605
 URL: https://issues.apache.org/jira/browse/HIVE-11605
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.0.0, 1.2.0, 1.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
 Attachments: HIVE-11605.1.patch


 In some cases, we aggressively try to convert to a bucket map join and this 
 ends up producing incorrect results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11544) LazyInteger should avoid throwing NumberFormatException

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709691#comment-14709691
 ] 

Sergey Shelukhin commented on HIVE-11544:
-

In our test data there were also strings such as null and NULL... Not sure 
how frequent that is in real data.

 LazyInteger should avoid throwing NumberFormatException
 ---

 Key: HIVE-11544
 URL: https://issues.apache.org/jira/browse/HIVE-11544
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0
Reporter: William Slacum
Assignee: Gopal V
Priority: Minor
  Labels: Performance
 Attachments: HIVE-11544.1.patch


 {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these 
 conditions:
 # bytes are null
 # radix is invalid
 # length is 0
 # the string is '+' or '-'
 # {{LazyInteger#parse}} throws a {{NumberFormatException}}
 Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the 
 exception is caught, swallowed, and {{isNull}} is set to {{true}}.
 This is generally a bad workflow, as exception creation is a performance 
 bottleneck, and potentially repeating for many rows in a query can have a 
 drastic performance consequence.
 It would be better if this method returned an {{OptionalInteger}}, which 
 would provide similar functionality with a higher throughput rate.
 I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so 
 I've marked those as affected. Any version in between would also suffer from 
 this.
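There is no `OptionalInteger` in the JDK; the standard type is `java.util.OptionalInt`. A minimal exception-free decimal parser in the spirit the description suggests (an illustration, not the actual HIVE-11544 patch):

```java
import java.util.OptionalInt;

public class SafeParse {
    // Exception-free decimal int parse over a byte range, in the spirit of
    // LazyInteger: returns empty instead of throwing NumberFormatException
    // for null bytes, zero length, a lone sign, non-digits, or overflow.
    static OptionalInt tryParseInt(byte[] bytes, int start, int length) {
        if (bytes == null || length == 0) return OptionalInt.empty();
        int i = start;
        boolean negative = false;
        if (bytes[i] == '+' || bytes[i] == '-') {
            negative = bytes[i] == '-';
            i++;
            if (i == start + length) return OptionalInt.empty(); // lone '+' or '-'
        }
        long value = 0; // accumulate in long to detect int overflow cheaply
        for (; i < start + length; i++) {
            int d = bytes[i] - '0';
            if (d < 0 || d > 9) return OptionalInt.empty();
            value = value * 10 + d;
            if (value > (long) Integer.MAX_VALUE + 1) return OptionalInt.empty();
        }
        long signed = negative ? -value : value;
        if (signed < Integer.MIN_VALUE || signed > Integer.MAX_VALUE) return OptionalInt.empty();
        return OptionalInt.of((int) signed);
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("-123".getBytes(), 0, 4)); // present
        System.out.println(tryParseInt("12a".getBytes(), 0, 3));  // empty
    }
}
```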



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11526) LLAP: implement LLAP UI as a separate service

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709726#comment-14709726
 ] 

Sergey Shelukhin commented on HIVE-11526:
-

I think the app name is the argument to the $for method. So, the monitor can 
use a different name.
The standard jmx, stack and conf pages do not require any resources in that 
directory

 LLAP: implement LLAP UI as a separate service
 -

 Key: HIVE-11526
 URL: https://issues.apache.org/jira/browse/HIVE-11526
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Kai Sasaki
 Attachments: llap_monitor_design.pdf


 The specifics are vague at this point. 
 Hadoop metrics can be output, as well as metrics we collect and output in 
 jmx, as well as those we collect per fragment and log right now. 
 This service can do LLAP-specific views, and per-query aggregation.
 [~gopalv] may have some information on how to reuse existing solutions for 
 part of the work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11544) LazyInteger should avoid throwing NumberFormatException

2015-08-24 Thread William Slacum (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709699#comment-14709699
 ] 

William Slacum commented on HIVE-11544:
---

The common cases still hit a similar code path-- it's just that the checks 
would throw an NFE. I'd envision that this patch should keep the same best case 
scenario, but reduce the worst case scenario.

 LazyInteger should avoid throwing NumberFormatException
 ---

 Key: HIVE-11544
 URL: https://issues.apache.org/jira/browse/HIVE-11544
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0
Reporter: William Slacum
Assignee: Gopal V
Priority: Minor
  Labels: Performance
 Attachments: HIVE-11544.1.patch


 {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these 
 conditions:
 # bytes are null
 # radix is invalid
 # length is 0
 # the string is '+' or '-'
 # {{LazyInteger#parse}} throws a {{NumberFormatException}}
 Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the 
 exception is caught, swallowed, and {{isNull}} is set to {{true}}.
 This is generally a bad workflow, as exception creation is a performance 
 bottleneck, and potentially repeating for many rows in a query can have a 
 drastic performance consequence.
 It would be better if this method returned an {{OptionalInteger}}, which 
 would provide similar functionality with a higher throughput rate.
 I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so 
 I've marked those as affected. Any version in between would also suffer from 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11544) LazyInteger should avoid throwing NumberFormatException

2015-08-24 Thread William Slacum (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709700#comment-14709700
 ] 

William Slacum commented on HIVE-11544:
---

Would that apply to numeric/integer columns?

 LazyInteger should avoid throwing NumberFormatException
 ---

 Key: HIVE-11544
 URL: https://issues.apache.org/jira/browse/HIVE-11544
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0
Reporter: William Slacum
Assignee: Gopal V
Priority: Minor
  Labels: Performance
 Attachments: HIVE-11544.1.patch


 {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these 
 conditions:
 # bytes are null
 # radix is invalid
 # length is 0
 # the string is '+' or '-'
 # {{LazyInteger#parse}} throws a {{NumberFormatException}}
 Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the 
 exception is caught, swallowed, and {{isNull}} is set to {{true}}.
 This is generally a bad workflow, as exception creation is a performance 
 bottleneck, and potentially repeating for many rows in a query can have a 
 drastic performance consequence.
 It would be better if this method returned an {{OptionalInteger}}, which 
 would provide similar functionality with a higher throughput rate.
 I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so 
 I've marked those as affected. Any version in between would also suffer from 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11469) Update doc for InstanceCache to clearly define the contract on the SeedObject

2015-08-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11469:

Summary: Update doc for InstanceCache to clearly define the contract on the 
SeedObject  (was: InstanceCache does not have proper implementation of equals 
or hashcode)

 Update doc for InstanceCache to clearly define the contract on the SeedObject
 -

 Key: HIVE-11469
 URL: https://issues.apache.org/jira/browse/HIVE-11469
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-11469.1.patch.txt


 With HIVE-11288, we started using InstanceCache as a key. However, it doesn't 
 seem like the class actually implements the equals or hashCode methods, which 
 can potentially lead to inaccurate results. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709678#comment-14709678
 ] 

Sergey Shelukhin commented on HIVE-11606:
-

+1

 Bucket map joins fail at hash table construction time
 -

 Key: HIVE-11606
 URL: https://issues.apache.org/jira/browse/HIVE-11606
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 1.0.1, 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-11606.1.patch


 {code}
 info=[Error: Failure while running task:java.lang.RuntimeException: 
 java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
 power of two
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
 at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
 must be a power of two
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
  
 {code}
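The `Capacity must be a power of two` assertion suggests the hash table sizes itself so that `hash & (capacity - 1)` can replace a modulo. A generic sketch of rounding a requested size up to a power of two (an illustration, not the actual Hive fix):

```java
public class Pow2 {
    // Round n (n >= 1) up to the next power of two. A power-of-two capacity
    // lets a hash table compute a bucket as (hash & (capacity - 1)).
    static int nextPowerOfTwo(int n) {
        int highest = Integer.highestOneBit(n); // largest power of two <= n
        return highest == n ? n : highest << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(3));    // 4
        System.out.println(nextPowerOfTwo(8));    // 8
        System.out.println(nextPowerOfTwo(1000)); // 1024
    }
}
```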



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709751#comment-14709751
 ] 

Sergey Shelukhin commented on HIVE-11595:
-

1-2: done, although it is misleading because the buffer contains the entire 
footer structure, incl. metadata and PS, not just OrcProto.Footer.
3-4: sure.
5: there's a comment in the class. It could be changed to expand FileMetaInfo, 
but FileMetaInfo is serialized in splits, so it would be confusing because the 
newly added fields would be missing on the other side (they are only used 
during split generation).

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11552) implement basic methods for getting/putting file metadata

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709803#comment-14709803
 ] 

Sergey Shelukhin commented on HIVE-11552:
-

The way the code is currently implemented, the method already bails fast:
{noformat}
  List<Long> fileIds = req.getFileIds();
  ByteBuffer[] metadatas = getMS().getFileMetadata(fileIds);
  GetFileMetadataResult result = new GetFileMetadataResult();
  result.setIsSupported(metadatas != null);
  if (metadatas != null) {
[snip]
  }
  return result;
{noformat}

Otherwise it would be the same path with an extra call.
For put and clear, the methods in ObjectStore are just a no-op.

 implement basic methods for getting/putting file metadata
 -

 Key: HIVE-11552
 URL: https://issues.apache.org/jira/browse/HIVE-11552
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: hbase-metastore-branch

 Attachments: HIVE-11552.01.patch, HIVE-11552.nogen.patch, 
 HIVE-11552.nogen.patch, HIVE-11552.patch


 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11422) Join a ACID table with non-ACID table fail with MR

2015-08-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11422:
--
Attachment: HIVE-11422.1.patch

Retested with the latest tip of trunk/branch-1; it seems to be fixed. I didn't 
chase down what actually fixed it.

The only thing left is to add the test case from HIVE-11438 to make sure this 
will not be broken going forward.

 Join a ACID table with non-ACID table fail with MR
 --

 Key: HIVE-11422
 URL: https://issues.apache.org/jira/browse/HIVE-11422
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Transactions
Affects Versions: 1.3.0
Reporter: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11422.1.patch


 The following script fails in MR mode:
 {code}
 CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) 
 CLUSTERED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC TBLPROPERTIES("transactional"="true"); 
 INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I');
 CREATE TABLE orc_table (k1 INT, f1 STRING) 
 CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC; 
 INSERT OVERWRITE TABLE orc_table VALUES (1, 'x');
 SET hive.execution.engine=mr; 
 SET hive.auto.convert.join=false; 
 SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
 SELECT t1.*, t2.* FROM orc_table t1 
 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
 {code}
 Stack:
 {code}
 Error: java.io.IOException: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:169)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas(AcidUtils.java:368)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1211)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1129)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
   ... 9 more
 {code}
 The script passes in the 1.2.0 release, however.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709836#comment-14709836
 ] 

Hive QA commented on HIVE-11595:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752057/HIVE-11595.02.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5053/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5053/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5053/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5053/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   92bd50e..f4aac7e  branch-1   -> origin/branch-1
   9d9dd72..5e16d53  hbase-metastore -> origin/hbase-metastore
   a16bbd4..dd2bdfc  master     -> origin/master
+ git reset --hard HEAD
HEAD is now at a16bbd4 HIVE-11176 : Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
[Ljava.lang.Object; (Navis via Ashutosh Chauhan)
+ git clean -f -d
Removing ql/src/test/queries/clientpositive/pointlookup.q
Removing ql/src/test/queries/clientpositive/pointlookup2.q
Removing ql/src/test/results/clientpositive/pointlookup.q.out
Removing ql/src/test/results/clientpositive/pointlookup2.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 4 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at dd2bdfc HIVE-11469 : Update doc for InstanceCache to clearly 
define the contract on the SeedObject (Swarnim Kulkarni via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752057 - PreCommit-HIVE-TRUNK-Build

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
 HIVE-11595.02.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10296) Cast exception observed when hive runs a multi join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before appl

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709783#comment-14709783
 ] 

Sergey Shelukhin commented on HIVE-10296:
-

This case was incomplete and was recently improved to include other join 
filters (see dbHasJoinCastBug there). However, we didn't realize Postgres 
also has this bug, so it needs to be added to the list.
With the full case filter it should not try to cast non-numerics unless they 
are actually stored in the numeric column. Can you check where the 
__DEFAULT_BINSRC__ value is coming from?

 Cast exception observed when hive runs a multi join query on metastore 
 (postgres), since postgres pushes the filter into the join, and ignores the 
 condition before applying cast
 -

 Key: HIVE-10296
 URL: https://issues.apache.org/jira/browse/HIVE-10296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Yash Datta

 Try to drop a partition from hive:
 ALTER TABLE f___edr_bin_source___900_sub_id DROP IF EXISTS PARTITION ( 
 exporttimestamp=1427824800, timestamp=1427824800)
 This triggers a query on the metastore like this :
  select PARTITIONS.PART_ID from PARTITIONS inner join TBLS on 
 PARTITIONS.TBL_ID = TBLS.TBL_ID and TBLS.TBL_NAME = ? inner join 
 DBS on TBLS.DB_ID = DBS.DB_ID and DBS.NAME = ? inner join 
 PARTITION_KEY_VALS FILTER0 on FILTER0.PART_ID = 
 PARTITIONS.PART_ID and FILTER0.INTEGER_IDX = 0 inner join 
 PARTITION_KEY_VALS FILTER1 on FILTER1.PART_ID = 
 PARTITIONS.PART_ID and FILTER1.INTEGER_IDX = 1 where ( (((case when 
 TBLS.TBL_NAME = ? and DBS.NAME = ? then cast(FILTER0.PART_KEY_VAL 
 as decimal(21,0)) else null end) = ?) and ((case when TBLS.TBL_NAME = ? 
 and DBS.NAME = ? then cast(FILTER1.PART_KEY_VAL as decimal(21,0)) 
 else null end) = ?)) )
 In some cases, when the internal tables in postgres (metastore) have some 
 amount of data, the query plan pushes the condition down into the join.
 Now, because of DERBY-6358, a case-when clause is used before the cast; but 
 here the cast is evaluated before the condition, so if we have different 
 tables partitioned on string and integer columns, a cast exception is 
 observed!
 15/04/06 08:41:20 ERROR metastore.ObjectStore: Direct SQL failed, falling 
 back to ORM 
 javax.jdo.JDODataStoreException: Error executing SQL query select 
 PARTITIONS.PART_ID from PARTITIONS inner join TBLS on 
 PARTITIONS.TBL_ID = TBLS.TBL_ID and TBLS.TBL_NAME = ? inner join 
 DBS on TBLS.DB_ID = DBS.DB_ID and DBS.NAME = ? inner join 
 PARTITION_KEY_VALS FILTER0 on FILTER0.PART_ID = 
 PARTITIONS.PART_ID and FILTER0.INTEGER_IDX = 0 inner join 
 PARTITION_KEY_VALS FILTER1 on FILTER1.PART_ID = 
 PARTITIONS.PART_ID and FILTER1.INTEGER_IDX = 1 where ( (((case when 
 TBLS.TBL_NAME = ? and DBS.NAME = ? then cast(FILTER0.PART_KEY_VAL 
 as decimal(21,0)) else null end) = ?) and ((case when TBLS.TBL_NAME = ? 
 and DBS.NAME = ? then cast(FILTER1.PART_KEY_VAL as decimal(21,0)) 
 else null end) = ?)) ). 
 at 
 org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
  
 at 
 org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
 at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:300)
  
 at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:211)
  
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1915)
  
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1909)
  
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
  
 at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1909)
  
 at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1882)
  
 org.postgresql.util.PSQLException: ERROR: invalid input syntax for type 
 numeric: __DEFAULT_BINSRC__ 
 15/04/06 08:41:20 INFO metastore.ObjectStore: JDO filter pushdown cannot be 
 used: Filtering is supported only on partition keys of type string 
 15/04/06 08:41:20 ERROR metastore.ObjectStore: 
 javax.jdo.JDOException: Exception thrown when executing query 
 at 
 org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
  
 at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) 
 at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesNoTxn(ObjectStore.java:1700)
  
 at 
 

[jira] [Resolved] (HIVE-10289) Support filter on non-first partition key and non-string partition key

2015-08-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved HIVE-10289.
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: hbase-metastore-branch

Patch committed to hbase-metastore branch. Thanks Alan for review!

 Support filter on non-first partition key and non-string partition key
 --

 Key: HIVE-10289
 URL: https://issues.apache.org/jira/browse/HIVE-10289
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Metastore, Metastore
Affects Versions: hbase-metastore-branch
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: hbase-metastore-branch

 Attachments: HIVE-10289.1.patch, HIVE-10289.2.patch, 
 HIVE-10289.3.patch


 Currently, partition filtering only handles the first partition key, and the 
 type of this partition key must be string. In order to remove this 
 limitation, several improvements are required:
 1. Change the serialization format for partition keys. Currently partition keys 
 are serialized into a delimited string, which sorts in string order rather than 
 according to the actual type of the partition key. We use BinarySortableSerDe 
 for this purpose.
 2. For filter conditions not on the initial partition key, push them into an HBase 
 RowFilter. The RowFilter will deserialize the partition key and evaluate the 
 filter condition.
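The motivation for point 1 can be shown in two lines (a minimal illustration, not BinarySortableSerDe itself): integer partition keys serialized as plain delimited strings sort lexicographically, which breaks range scans:

```python
int_keys = [2, 10, 1]
as_strings = sorted(str(k) for k in int_keys)
assert as_strings == ["1", "10", "2"]   # "10" sorts before "2" as text
assert sorted(int_keys) == [1, 2, 10]   # the order a typed range scan needs
```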



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11595:

Attachment: HIVE-11595.02.patch

Updated patch

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
 HIVE-11595.02.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9005) HiveSever2 error with Illegal Operation state transition from CLOSED to ERROR

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709851#comment-14709851
 ] 

Ashutosh Chauhan commented on HIVE-9005:


[~vgumashta] Would you like to review this?

 HiveSever2 error with Illegal Operation state transition from CLOSED to 
 ERROR
 ---

 Key: HIVE-9005
 URL: https://issues.apache.org/jira/browse/HIVE-9005
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Binglin Chang
 Attachments: HIVE-9005.1.patch


 {noformat}
 2014-12-02 11:25:40,855 WARN  [HiveServer2-Background-Pool: Thread-17]: 
 ql.Driver (DriverContext.java:shutdown(137)) - Shutting down task : 
 Stage-1:MAPRED
 2014-12-02 11:25:41,898 INFO  [HiveServer2-Background-Pool: Thread-30]: 
 exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for 
 Stage-1: number of mappers: 0; number of reducers: 0
 2014-12-02 11:25:41,942 WARN  [HiveServer2-Background-Pool: Thread-30]: 
 mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
 org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
 org.apache.hadoop.mapreduce.TaskCounter instead
 2014-12-02 11:25:41,942 INFO  [HiveServer2-Background-Pool: Thread-30]: 
 exec.Task (SessionState.java:printInfo(536)) - 2014-12-02 11:25:41,939 
 Stage-1 map = 0%,  reduce = 0%
 2014-12-02 11:25:41,945 WARN  [HiveServer2-Background-Pool: Thread-30]: 
 mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
 org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
 org.apache.hadoop.mapreduce.TaskCounter instead
 2014-12-02 11:25:41,952 ERROR [HiveServer2-Background-Pool: Thread-30]: 
 exec.Task (SessionState.java:printError(545)) - Ended Job = 
 job_1413717733669_207982 with errors
 2014-12-02 11:25:41,954 ERROR [Thread-39]: exec.Task 
 (SessionState.java:printError(545)) - Error during job, obtaining debugging 
 information...
 2014-12-02 11:25:41,957 ERROR [HiveServer2-Background-Pool: Thread-30]: 
 ql.Driver (SessionState.java:printError(545)) - FAILED: Operation cancelled
 2014-12-02 11:25:41,957 INFO  [HiveServer2-Background-Pool: Thread-30]: 
 ql.Driver (SessionState.java:printInfo(536)) - MapReduce Jobs Launched:
 2014-12-02 11:25:41,960 WARN  [HiveServer2-Background-Pool: Thread-30]: 
 mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group 
 FileSystemCounters is deprecated. Use 
 org.apache.hadoop.mapreduce.FileSystemCounter instead
 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
 ql.Driver (SessionState.java:printInfo(536)) - Stage-Stage-1:  HDFS Read: 0 
 HDFS Write: 0 FAIL
 2014-12-02 11:25:41,961 INFO  [HiveServer2-Background-Pool: Thread-30]: 
 ql.Driver (SessionState.java:printInfo(536)) - Total MapReduce CPU Time 
 Spent: 0 msec
 2014-12-02 11:25:41,965 ERROR [HiveServer2-Background-Pool: Thread-30]: 
 operation.Operation (SQLOperation.java:run(205)) - Error running hive query:
 org.apache.hive.service.cli.HiveSQLException: Illegal Operation state 
 transition from CLOSED to ERROR
   at 
 org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:91)
   at 
 org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:97)
   at 
 org.apache.hive.service.cli.operation.Operation.setState(Operation.java:116)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:161)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:71)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1589)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:504)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:215)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}
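The failure at the bottom of the log is a race: the job is cancelled and the operation moves to CLOSED, then the background thread tries to mark it ERROR. A minimal sketch of such a transition table (illustrative Python; the states and allowed edges here are simplified, not Hive's exact OperationState graph) reproduces the same rejection:

```python
VALID_TRANSITIONS = {
    "INITIALIZED": {"PENDING", "RUNNING", "CANCELED", "CLOSED"},
    "RUNNING": {"FINISHED", "CANCELED", "ERROR", "CLOSED"},
    "CLOSED": set(),  # terminal: nothing, not even ERROR, is legal afterwards
}

def transition(current, new):
    # Validate the edge before moving, like Operation.setState does.
    if new not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(
            "Illegal Operation state transition from %s to %s" % (current, new))
    return new

assert transition("RUNNING", "ERROR") == "ERROR"
try:
    transition("CLOSED", "ERROR")  # the race in the report: close won first
except ValueError as e:
    assert "CLOSED to ERROR" in str(e)
```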



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11547) beeline does not continue running the script after an error occurs while beeline --force=true is already set.

2015-08-24 Thread Wei Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709865#comment-14709865
 ] 

Wei Huang commented on HIVE-11547:
--

Tried the HIVE-11203 patch and it works only in interactive mode.
It does not work if we use the -f option to run the script from a file.

 beeline does not continue running the script after an error occurs while 
 beeline --force=true is already set.
 ---

 Key: HIVE-11547
 URL: https://issues.apache.org/jira/browse/HIVE-11547
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 1.2.0
 Environment: HDP 2.3 on Virtual box 
Reporter: Wei Huang

 If you execute beeline to run a SQL script file, using the following command
  beeline -f <query file name>
 beeline exits after the first error, i.e. when a query fails, beeline quits 
 to the CLI.
 The beeline --force=true option seems to have a bug: it does not continue 
 running the script after an error occurs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11599) Add metastore command to dump its configs

2015-08-24 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709871#comment-14709871
 ] 

Sushanth Sowmyan commented on HIVE-11599:
-

+1 to intent, this would be most useful.

 Add metastore command to dump its configs
 --

 Key: HIVE-11599
 URL: https://issues.apache.org/jira/browse/HIVE-11599
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 1.0.0
Reporter: Eugene Koifman

 We should have an equivalent of the Hive CLI set command on the Metastore (and 
 likely HS2) which can dump out all the properties this particular process is 
 running with.
 cc [~thejas]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11552) implement basic methods for getting/putting file metadata

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709873#comment-14709873
 ] 

Sergey Shelukhin commented on HIVE-11552:
-

There's an issue with the ASF LDAP server somewhere, will commit when I can...

 implement basic methods for getting/putting file metadata
 -

 Key: HIVE-11552
 URL: https://issues.apache.org/jira/browse/HIVE-11552
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: hbase-metastore-branch

 Attachments: HIVE-11552.01.patch, HIVE-11552.nogen.patch, 
 HIVE-11552.nogen.patch, HIVE-11552.patch


 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11595:

Attachment: HIVE-11595.03.patch

rebased the patch

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
 HIVE-11595.02.patch, HIVE-11595.03.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2015-08-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11581:

Attachment: HIVE-11581.4.patch

 HiveServer2 should store connection params in ZK when using dynamic service 
 discovery for simpler client connection string.
 ---

 Key: HIVE-11581
 URL: https://issues.apache.org/jira/browse/HIVE-11581
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 1.3.0, 2.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
 HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch


 Currently, the client needs to specify several parameters based on which an 
 appropriate connection is created with the server. In case of dynamic service 
 discovery, when multiple HS2 instances are running, it is much more usable 
 for the server to add its config parameters to ZK, which the driver can use to 
 configure the connection, instead of the JDBC/ODBC user adding those in the 
 connection string.
 However, at minimum, the client will need to specify the ZooKeeper ensemble and 
 that the JDBC driver should use ZooKeeper:
 {noformat}
 beeline> !connect 
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
 {noformat} 
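The server-published side could look like a semicolon-delimited key=value payload in the HS2 znode that the driver parses into connection config. A minimal sketch (illustrative only; the property names and payload format here are assumptions, not the format the patch actually uses):

```python
def parse_znode_data(data):
    # Split the znode payload into connection parameters; entries without
    # "=" are ignored rather than treated as errors.
    params = {}
    for pair in data.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            params[key] = value
    return params

znode = ("hive.server2.thrift.bind.host=host1;"
         "hive.server2.thrift.port=10000;"
         "hive.server2.authentication=NONE")
conf = parse_znode_data(znode)
assert conf["hive.server2.thrift.port"] == "10000"
```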



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11217) CTAS statements throws error, when the table is stored as ORC File format and select clause has NULL/VOID type column

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708894#comment-14708894
 ] 

Ashutosh Chauhan commented on HIVE-11217:
-

Patch seems reasonable to me. [~prasanth_j] what do you think ?

 CTAS statements throws error, when the table is stored as ORC File format and 
 select clause has NULL/VOID type column 
 --

 Key: HIVE-11217
 URL: https://issues.apache.org/jira/browse/HIVE-11217
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11217.1.patch, HIVE-11217.2.patch


 If you try to use a create-table-as-select (CTAS) statement to create an ORC 
 file format based table, then you can't use NULL as a column value in the select 
 clause: 
 CREATE TABLE empty (x int);
 CREATE TABLE orc_table_with_null 
 STORED AS ORC 
 AS 
 SELECT 
 x,
 null
 FROM empty;
 Error: 
 {quote}
 347084 [main] ERROR hive.ql.exec.DDLTask  - 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.IllegalArgumentException: Unknown primitive type VOID
   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:643)
   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4242)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:285)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:952)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
   at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
   at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
   at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
   at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
   at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.IllegalArgumentException: Unknown primitive type VOID
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:530)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.init(OrcStruct.java:195)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcStruct.createObjectInspector(OrcStruct.java:534)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcSerde.initialize(OrcSerde.java:106)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:519)
   at 
 org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:345)
   at 
 org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:292)
   at 
 org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:194)
   at 

[jira] [Updated] (HIVE-11625) Map instances with null keys are not written to Parquet tables

2015-08-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11625:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-8120

 Map instances with null keys are not written to Parquet tables
 --

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:baz,1:bar}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 dropped:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:bar}
 {code}
 This is because entries with null keys are explicitly skipped in 
 {{DataWritableWriter}}, [see 
 here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].
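The skipping behavior can be mimicked in a couple of lines (an illustrative Python sketch, not the actual DataWritableWriter code): dropping entries whose key is null turns Hive's in-memory map into exactly the result observed when reading the Parquet table back.

```python
def write_map_dropping_null_keys(entries):
    # Mimics the skip: entries with a null (None) key never reach the writer.
    return {k: v for k, v in entries.items() if k is not None}

# map(null,'foo', 1,'bar', null,'baz') -- the second null-keyed entry wins,
# matching the {null:baz,1:bar} output shown above.
hive_map = {None: "baz", 1: "bar"}
assert write_map_dropping_null_keys(hive_map) == {1: "bar"}
```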



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11383) Upgrade Hive to Calcite 1.4

2015-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11383:
---
Attachment: HIVE-11383.11.patch

 Upgrade Hive to Calcite 1.4
 ---

 Key: HIVE-11383
 URL: https://issues.apache.org/jira/browse/HIVE-11383
 Project: Hive
  Issue Type: Bug
Reporter: Julian Hyde
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11383.1.patch, HIVE-11383.10.patch, 
 HIVE-11383.11.patch, HIVE-11383.2.patch, HIVE-11383.3.patch, 
 HIVE-11383.3.patch, HIVE-11383.3.patch, HIVE-11383.4.patch, 
 HIVE-11383.5.patch, HIVE-11383.6.patch, HIVE-11383.7.patch, 
 HIVE-11383.8.patch, HIVE-11383.8.patch, HIVE-11383.9.patch


 CLEAR LIBRARY CACHE
 Upgrade Hive to Calcite 1.4.0-incubating.
 There is currently a snapshot release, which is close to what will be in 1.4. 
 I have checked that Hive compiles against the new snapshot, fixing one issue. 
 The patch is attached.
 Next step is to validate that Hive runs against the new Calcite, and post any 
 issues to the Calcite list or log Calcite JIRA cases. [~jcamachorodriguez], 
 can you please do that?
 [~pxiong], I gather you are dependent on CALCITE-814, which will be fixed in 
 the new Calcite version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11450) Resources are not cleaned up properly at multiple places

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708905#comment-14708905
 ] 

Hive QA commented on HIVE-11450:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751973/HIVE-11450.4.patch

{color:green}SUCCESS:{color} +1 9377 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5049/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5049/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5049/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751973 - PreCommit-HIVE-TRUNK-Build

 Resources are not cleaned up properly at multiple places
 

 Key: HIVE-11450
 URL: https://issues.apache.org/jira/browse/HIVE-11450
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Nezih Yigitbasi
Assignee: Nezih Yigitbasi
 Attachments: HIVE-11450.2.patch, HIVE-11450.3.patch, 
 HIVE-11450.4.patch, HIVE-11450.patch


 I noticed that various resources aren't properly cleaned up in various 
 classes. To be specific,
 * Some streams aren't properly cleaned up in 
 {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and 
 {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}}
 * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up 
 in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}}
 * {{Statement}} and {{ResultSet}} aren't properly cleaned up in  
 {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11625) Map instances with null keys are not properly handled for Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated HIVE-11625:
--
Description: 
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:baz,1:bar}
{code}
However, when written into Parquet tables, map entries with null as keys are 
either dropped or cause exceptions. Below is the result of Hive 0.14.0 and 
0.13.1:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
 AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:bar}
{code}
And Hive 1.2.1 throws exception:
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing writable (null)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing writable (null)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: java.lang.RuntimeException: Parquet record is malformed: empty 
fields are illegal, the field should be ommited completely instead
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:64)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
at 
parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
at 
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:123)
at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:42)
at 
org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:111)
at 
org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:124)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:753)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 9 more
Caused by: parquet.io.ParquetEncodingException: empty fields are illegal, the 
field should be ommited completely instead
at 
parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endField(MessageColumnIO.java:244)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeMap(DataWritableWriter.java:228)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeValue(DataWritableWriter.java:116)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeGroupFields(DataWritableWriter.java:89)
at 
org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:60)
... 23 more

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
Hive Runtime Error while processing writable (null)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing writable 

[jira] [Commented] (HIVE-11625) Map instances with null keys are not properly handled for Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709076#comment-14709076
 ] 

Cheng Lian commented on HIVE-11625:
---

Sorry, according to the following statements in the parquet-format spec:
{quote}
The {{key}} field encodes the map's key type. This field must have repetition 
{{required}} and must always be present.
{quote}
Map keys written to Parquet must not be null.

Then I think the question here is whether we should silently ignore null keys 
when writing a map to a Parquet table, as Hive 0.14.0 does, or throw an 
exception (probably a more descriptive one than the one mentioned in the 
ticket description), as Hive 1.2.1 does.

 Map instances with null keys are not properly handled for Parquet tables
 

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:baz,1:bar}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 either dropped or cause exceptions. Below is the result of Hive 0.14.0 and 
 0.13.1:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:bar}
 {code}
 And Hive 1.2.1 throws exception:
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing writable (null)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing writable (null)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
   ... 8 more
 Caused by: java.lang.RuntimeException: Parquet record is malformed: empty 
 fields are illegal, the field should be ommited completely instead
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
   at 
 parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
   at 
 parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:123)
   at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:42)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:111)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:124)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:753)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
   ... 9 more
 Caused by: parquet.io.ParquetEncodingException: empty fields are illegal, the 
 field should be ommited completely instead
   at 
 parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endField(MessageColumnIO.java:244)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeMap(DataWritableWriter.java:228)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeValue(DataWritableWriter.java:116)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeGroupFields(DataWritableWriter.java:89)
   at 
 

[jira] [Commented] (HIVE-11625) Map instances with null keys are not properly handled for Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709081#comment-14709081
 ] 

Cheng Lian commented on HIVE-11625:
---

Updated ticket description according to my comment above.

 Map instances with null keys are not properly handled for Parquet tables
 

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:"baz",1:"bar"}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 either dropped or cause exceptions. Below is the result of Hive 0.14.0 and 
 0.13.1:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:"bar"}
 {code}
 And Hive 1.2.1 throws exception:
 {noformat}
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing writable (null)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing writable (null)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
   ... 8 more
 Caused by: java.lang.RuntimeException: Parquet record is malformed: empty 
 fields are illegal, the field should be ommited completely instead
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:64)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
   at 
 parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
   at 
 parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:123)
   at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:42)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:111)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper.write(ParquetRecordWriterWrapper.java:124)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:753)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
   ... 9 more
 Caused by: parquet.io.ParquetEncodingException: empty fields are illegal, the 
 field should be ommited completely instead
   at 
 parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.endField(MessageColumnIO.java:244)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeMap(DataWritableWriter.java:228)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeValue(DataWritableWriter.java:116)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeGroupFields(DataWritableWriter.java:89)
   at 
 org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:60)
   ... 23 more
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error while processing writable (null)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at 

[jira] [Commented] (HIVE-11625) Map instances with null keys are not written to Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708885#comment-14708885
 ] 

Cheng Lian commented on HIVE-11625:
---

I meant to open this issue as a Parquet bug, but it seems that the Parquet 
support in Hive code base diverges a lot from parquet-hive. Fixes made in Hive 
were not backported to parquet-hive. For example, [the most recent master 
version of 
parquet-hive|https://github.com/apache/parquet-mr/blob/04f524d5ad91b1cdda66dfde4089f2f83f4528aa/parquet-hive/parquet-hive-storage-handler/src/main/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java]
 doesn't support writing maps, decimals, or timestamps, while all these data 
types are [supported in Hive 
1.2.1|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237]
 and earlier versions.

 Map instances with null keys are not written to Parquet tables
 --

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:"baz",1:"bar"}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 dropped:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:"bar"}
 {code}
 This is because entries with null keys are explicitly skipped in 
 {{DataWritableWriter}}, [see 
 here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11176) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to [Ljava.lang.Object;

2015-08-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11176:

Summary: Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
[Ljava.lang.Object;  (was: aused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
[Ljava.lang.Object;)

 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
 [Ljava.lang.Object;
 

 Key: HIVE-11176
 URL: https://issues.apache.org/jira/browse/HIVE-11176
 Project: Hive
  Issue Type: Bug
  Components: Hive, Tez
Affects Versions: 1.0.0, 1.2.0
 Environment: Hive 1.2 and Tez 0.7
Reporter: Soundararajan Velu
Priority: Critical
 Attachments: HIVE-11176.1.patch.txt


 Unreachable code: 
 hive/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/StandardStructObjectInspector.java
 // With Data
   @Override
   @SuppressWarnings(unchecked)
   public Object getStructFieldData(Object data, StructField fieldRef) {
 if (data == null) {
   return null;
 }
 // We support both List<Object> and Object[]
 // so we have to do differently.
 boolean isArray = !(data instanceof List);
 if (!isArray && !(data instanceof List)) {
   return data;
 }
 The if condition above translates to 
 if(!true && true), so the code section cannot be reached. 
 This causes a lot of class cast exceptions while using Tez or ORC file 
 formats or a custom JSON serde; strangely, this happens only while using Tez. 
 Changed the code to 
  boolean isArray = data.getClass().isArray();
 if (!isArray && !(data instanceof List)) {
   return data;
 }
 Even then, lazystructs get passed as fields causing downstream cast 
 exceptions like lazystruct cannot be cast to Text etc...
 So I changed the method to something like this,
  // With Data
   @Override
   @SuppressWarnings(unchecked)
   public Object getStructFieldData(Object data, StructField fieldRef) {
 if (data == null) {
   return null;
 }
 if (data instanceof LazyBinaryStruct) {
 data = ((LazyBinaryStruct) data).getFieldsAsList();
 }
 // We support both List<Object> and Object[]
 // so we have to do differently.
 boolean isArray = data.getClass().isArray();
 if (!isArray && !(data instanceof List)) {
   return data;
 }
 This is causing ArrayIndexOutOfBoundsException and other typecast 
 exceptions in object inspectors.
 Please help.
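The reachability problem described above can be made concrete with a minimal sketch. This is an assumed simplification, not the actual `StandardStructObjectInspector` code; the `classify` helper and its return labels are hypothetical, introduced only so the dead branch is observable:

```java
// Sketch: with isArray defined as !(data instanceof List), the branch
// (!isArray && !(data instanceof List)) can never be true, so plain
// objects are never returned as-is; data.getClass().isArray() fixes that.
import java.util.Arrays;
import java.util.List;

public class StructFieldSketch {
    static String classify(Object data) {
        boolean isArray = !(data instanceof List);        // buggy definition
        if (!isArray && !(data instanceof List)) {        // !isArray implies instanceof List
            return "raw";                                 // unreachable with the buggy definition
        }
        boolean fixedIsArray = data.getClass().isArray(); // proposed fix
        if (!fixedIsArray && !(data instanceof List)) {
            return "raw";                                 // now reachable for plain objects
        }
        return data instanceof List ? "list" : "array-or-other";
    }

    public static void main(String[] args) {
        System.out.println(classify(new Object[]{1, 2}));
        System.out.println(classify(Arrays.asList(1, 2)));
        System.out.println(classify("plain"));
    }
}
```

Running the sketch shows that a plain object only reaches the "raw" branch via the fixed check, matching the reporter's observation that the original condition is dead code.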



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10631) create_table_core method has invalid update for Fast Stats

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708908#comment-14708908
 ] 

Ashutosh Chauhan commented on HIVE-10631:
-

+1

 create_table_core method has invalid update for Fast Stats
 --

 Key: HIVE-10631
 URL: https://issues.apache.org/jira/browse/HIVE-10631
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.0.0
Reporter: Dongwook Kwon
Assignee: Aaron Tokhy
Priority: Minor
 Attachments: HIVE-10631-branch-1.0.patch, HIVE-10631.patch


 HiveMetaStore.create_table_core method calls 
 MetaStoreUtils.updateUnpartitionedTableStatsFast when hive.stats.autogather 
 is on; however, for a partitioned table, this call scans the warehouse dir 
 and doesn't seem to use the result. 
 Fast Stats was implemented by HIVE-3959
 https://github.com/apache/hive/blob/branch-1.0/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1363
 From create_table_core method
 {code}
 if (HiveConf.getBoolVar(hiveConf, 
 HiveConf.ConfVars.HIVESTATSAUTOGATHER) && 
 !MetaStoreUtils.isView(tbl)) {
   if (tbl.getPartitionKeysSize() == 0)  { // Unpartitioned table
 MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, 
 madeDir);
   } else { // Partitioned table with no partitions.
 MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, 
 true);
   }
 }
 {code}
 Particularly Line 1363: // Partitioned table with no partitions.
 {code}
 MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, true);
 {code}
 This call ends up calling Warehouse.getFileStatusesForUnpartitionedTable and 
 then does nothing in MetaStoreUtils.updateUnpartitionedTableStatsFast, 
 because the newDir flag is always true.
 The impact of this bug is minor with an HDFS warehouse 
 location (hive.metastore.warehouse.dir), but it could be big with an S3 
 warehouse location, especially for large existing partitions.
 The impact is also heightened by HIVE-6727 when the warehouse location is S3: 
 basically, it could scan the wrong S3 directory recursively and do nothing 
 with the result. I will add more detail of these cases in comments.
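The wasted-work pattern described above can be sketched as follows. This is a hypothetical simplification of the behavior the reporter describes, not the actual MetaStoreUtils code; `updateStatsFast`, `scanDir`, and their signatures are made-up stand-ins:

```java
// Sketch: the directory listing (expensive on S3) happens before the
// newDir check, so calling with newDir == true still pays the scan cost
// and then discards the result without updating any stats.
public class StatsFastSketch {
    static int scanCount = 0;

    // Stand-in for Warehouse.getFileStatusesForUnpartitionedTable.
    static long[] scanDir(String dir) {
        scanCount++;                          // each call is a full listing
        return new long[]{3, 1024};           // {numFiles, totalSize}, made up
    }

    // Assumed shape of updateUnpartitionedTableStatsFast per the description.
    static boolean updateStatsFast(String dir, boolean newDir) {
        long[] stats = scanDir(dir);          // always executed
        if (newDir) {
            return false;                     // nothing persisted; scan wasted
        }
        // ... would persist stats here ...
        return true;
    }

    public static void main(String[] args) {
        boolean updated = updateStatsFast("s3://warehouse/tbl", true);
        System.out.println("updated=" + updated + " scans=" + scanCount);
    }
}
```

Under this model, the "partitioned table with no partitions" path always passes newDir = true, so every create pays one directory scan for nothing.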



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11383) Upgrade Hive to Calcite 1.4

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709013#comment-14709013
 ] 

Hive QA commented on HIVE-11383:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751984/HIVE-11383.11.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9377 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_mapjoin_reduce
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5050/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5050/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5050/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751984 - PreCommit-HIVE-TRUNK-Build

 Upgrade Hive to Calcite 1.4
 ---

 Key: HIVE-11383
 URL: https://issues.apache.org/jira/browse/HIVE-11383
 Project: Hive
  Issue Type: Bug
Reporter: Julian Hyde
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11383.1.patch, HIVE-11383.10.patch, 
 HIVE-11383.11.patch, HIVE-11383.2.patch, HIVE-11383.3.patch, 
 HIVE-11383.3.patch, HIVE-11383.3.patch, HIVE-11383.4.patch, 
 HIVE-11383.5.patch, HIVE-11383.6.patch, HIVE-11383.7.patch, 
 HIVE-11383.8.patch, HIVE-11383.8.patch, HIVE-11383.9.patch


 CLEAR LIBRARY CACHE
 Upgrade Hive to Calcite 1.4.0-incubating.
 There is currently a snapshot release, which is close to what will be in 1.4. 
 I have checked that Hive compiles against the new snapshot, fixing one issue. 
 The patch is attached.
 Next step is to validate that Hive runs against the new Calcite, and post any 
 issues to the Calcite list or log Calcite Jira cases. [~jcamachorodriguez], 
 can you please do that.
 [~pxiong], I gather you are dependent on CALCITE-814, which will be fixed in 
 the new Calcite version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709015#comment-14709015
 ] 

Hive QA commented on HIVE-11624:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751991/HIVE-11624.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5051/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5051/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5051/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5051/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at a16bbd4 HIVE-11176 : Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
[Ljava.lang.Object; (Navis via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at a16bbd4 HIVE-11176 : Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct cannot be cast to 
[Ljava.lang.Object; (Navis via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751991 - PreCommit-HIVE-TRUNK-Build

 Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]
 -

 Key: HIVE-11624
 URL: https://issues.apache.org/jira/browse/HIVE-11624
 Project: Hive
  Issue Type: Sub-task
Reporter: Ke Jia
Assignee: Ke Jia
 Attachments: HIVE-11624.patch


 In the old CLI, it uses hive.cli.print.header from the hive configuration 
 to force execution of a script. We need to support the previous configuration 
 using beeline functionality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV

2015-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11573:
---
Attachment: HIVE-11573.5.patch

 PointLookupOptimizer can be pessimistic at a low nDV
 

 Key: HIVE-11573
 URL: https://issues.apache.org/jira/browse/HIVE-11573
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: TODOC2.0
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch, 
 HIVE-11573.3.patch, HIVE-11573.4.patch, HIVE-11573.5.patch


 The PointLookupOptimizer can turn off some of the optimizations due to its 
 use of tuple IN() clauses.
 Limit the application of the optimizer for very low nDV cases and extract the 
 sub-clause as a pre-condition during runtime, to trigger the simple column 
 predicate index lookups.
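The extraction step described above can be illustrated with a small sketch. This is not Hive's implementation; it only shows the idea that from a tuple predicate like (a, b) IN ((1, 2), (3, 4)) one can derive a weaker per-column pre-condition such as a IN (1, 3), which simple column-predicate index lookups can use while the original tuple check is kept for exactness:

```java
// Sketch: derive per-column value sets from a list of IN() tuples.
// a IN {values of column 0} is a necessary condition for the tuple IN().
import java.util.LinkedHashSet;
import java.util.Set;

public class PointLookupSketch {
    // Returns the set of values the col-th column takes across all tuples.
    static Set<Integer> columnValues(int[][] tuples, int col) {
        Set<Integer> values = new LinkedHashSet<>();
        for (int[] t : tuples) {
            values.add(t[col]);
        }
        return values;
    }

    public static void main(String[] args) {
        int[][] tuples = {{1, 2}, {3, 4}};   // (a, b) IN ((1, 2), (3, 4))
        System.out.println("a IN " + columnValues(tuples, 0));
        System.out.println("b IN " + columnValues(tuples, 1));
    }
}
```

The per-column sets are supersets of the legal values, so ANDing them in front of the tuple IN() never changes the result but gives index lookups a simple predicate to work with.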



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV

2015-08-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709054#comment-14709054
 ] 

Jesus Camacho Rodriguez commented on HIVE-11573:


[~gopalv], I added a new test in HIVE-11573.5.patch to verify that partition 
pruning is working fine.

I have a comment about the patch. I think we should not store the original 
predicate in the Filter operator if {{hive.optimize.point.lookup.extract}} is 
set to true (line 155 in PointLookupOptimizer). We added that line in 
HIVE-11461 so we do not get regressions with partition pruner, but with your 
patch, we shouldn't see that issue if extract is true. What do you think?

cc'd [~ashutoshc]

 PointLookupOptimizer can be pessimistic at a low nDV
 

 Key: HIVE-11573
 URL: https://issues.apache.org/jira/browse/HIVE-11573
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: TODOC2.0
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch, 
 HIVE-11573.3.patch, HIVE-11573.4.patch, HIVE-11573.5.patch


 The PointLookupOptimizer can turn off some of the optimizations due to its 
 use of tuple IN() clauses.
 Limit the application of the optimizer for very low nDV cases and extract the 
 sub-clause as a pre-condition during runtime, to trigger the simple column 
 predicate index lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10785) Support aggregate push down through joins

2015-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10785:
---
Assignee: Ashutosh Chauhan  (was: Jesus Camacho Rodriguez)

 Support aggregate push down through joins
 -

 Key: HIVE-10785
 URL: https://issues.apache.org/jira/browse/HIVE-10785
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Ashutosh Chauhan

 Enable {{AggregateJoinTransposeRule}} in CBO that pushes Aggregate through 
 Join operators. The rule has been extended in Calcite 1.4 to cover complex 
 cases e.g. Aggregate operators comprising UDAF. The decision on whether to 
 push the Aggregate through Join or not should be cost-driven.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11625) Map instances with null keys are not properly handled for Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated HIVE-11625:
--
Summary: Map instances with null keys are not properly handled for Parquet 
tables  (was: Map instances with null keys are not written to Parquet tables)

 Map instances with null keys are not properly handled for Parquet tables
 

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:"baz",1:"bar"}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 dropped:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:"bar"}
 {code}
 This is because entries with null keys are explicitly skipped in 
 {{DataWritableWriter}}, [see 
 here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].
 This issue can be fixed by moving [the value writing 
 block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236]
  out of [the key writing 
 block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11625) Map instances with null keys are not written to Parquet tables

2015-08-24 Thread Cheng Lian (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated HIVE-11625:
--
Description: 
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when written into Parquet tables, map entries with null as keys are 
dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
 AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in 
{{DataWritableWriter}}, [see 
here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

This issue can be fixed by moving [the value writing 
block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236]
 out of [the key writing 
block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].

  was:
Hive allows maps with null keys:
{code:sql}
hive> select map(null, 'foo', 1, 'bar', null, 'baz');
{null:"baz",1:"bar"}
{code}
However, when written into Parquet tables, map entries with null as keys are 
dropped:
{code:sql}
hive> CREATE TABLE map_test STORED AS PARQUET
 AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
...
hive> SELECT * from map_test;
{1:"bar"}
{code}
This is because entries with null keys are explicitly skipped in 
{{DataWritableWriter}}, [see 
here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].


 Map instances with null keys are not written to Parquet tables
 --

 Key: HIVE-11625
 URL: https://issues.apache.org/jira/browse/HIVE-11625
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0, 0.13.1, 1.0.1, 1.1.1, 1.2.1
Reporter: Cheng Lian

 Hive allows maps with null keys:
 {code:sql}
 hive> select map(null, 'foo', 1, 'bar', null, 'baz');
 {null:"baz",1:"bar"}
 {code}
 However, when written into Parquet tables, map entries with null as keys are 
 dropped:
 {code:sql}
 hive> CREATE TABLE map_test STORED AS PARQUET
  AS SELECT MAP(null, 'foo', 1, 'bar', null, 'baz');
 ...
 hive> SELECT * from map_test;
 {1:"bar"}
 {code}
 This is because entries with null keys are explicitly skipped in 
 {{DataWritableWriter}}, [see 
 here|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].
 This issue can be fixed by moving [the value writing 
 block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L230-L236]
  out of [the key writing 
 block|https://github.com/apache/hive/blob/release-1.2.1/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java#L223-L237].
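The proposed restructuring can be sketched in miniature. This is a hypothetical simplification, not the actual DataWritableWriter.writeMap; the string-building "writer" is a made-up stand-in so the effect of moving the value write out of the key-null check is observable:

```java
// Sketch: before the fix, the value-writing block sat inside the key != null
// check, so a null key silently dropped the value too; moving it out (as
// proposed) writes the value regardless of whether the key is null.
import java.util.LinkedHashMap;
import java.util.Map;

public class MapWriteSketch {
    // Collects "written" fields into a string so behavior is testable.
    static String writeMap(Map<String, String> map) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (e.getKey() != null) {
                out.append("key=").append(e.getKey()).append(';');
            }
            // Value write moved out of the key-null check, per the proposal.
            if (e.getValue() != null) {
                out.append("value=").append(e.getValue()).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put(null, "baz");
        m.put("1", "bar");
        System.out.println(writeMap(m));
    }
}
```

Note that in real Parquet writing, emitting a value without its key raises the schema question that the "empty fields are illegal" exception above hints at; the sketch only shows the control-flow change the ticket proposes.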



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709112#comment-14709112
 ] 

Hive QA commented on HIVE-11573:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751994/HIVE-11573.5.patch

{color:green}SUCCESS:{color} +1 9379 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5052/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5052/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5052/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751994 - PreCommit-HIVE-TRUNK-Build

 PointLookupOptimizer can be pessimistic at a low nDV
 

 Key: HIVE-11573
 URL: https://issues.apache.org/jira/browse/HIVE-11573
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
  Labels: TODOC2.0
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch, 
 HIVE-11573.3.patch, HIVE-11573.4.patch, HIVE-11573.5.patch


 The PointLookupOptimizer can turn off some of the optimizations due to its 
 use of tuple IN() clauses.
 Limit the application of the optimizer for very low nDV cases and extract the 
 sub-clause as a pre-condition during runtime, to trigger the simple column 
 predicate index lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9583) Rolling upgrade of Hive MetaStore Server

2015-08-24 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-9583.

   Resolution: Fixed
Fix Version/s: 1.2.2

(Marking as fixed on the 1.2 line, since per Thiruvel, all the tasks inside 
this are done, and were done as of 1.2.0)

 Rolling upgrade of Hive MetaStore Server
 

 Key: HIVE-9583
 URL: https://issues.apache.org/jira/browse/HIVE-9583
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog, Metastore
Affects Versions: 0.14.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: hcatalog, metastore
 Fix For: 1.2.2


 This is an umbrella JIRA to track all rolling upgrade JIRAs w.r.t MetaStore 
 server. This will be helpful for users deploying Metastore server and 
 connecting to it with HCatalog or Hive CLI interface.





[jira] [Commented] (HIVE-11628) DB type detection code is failing on Oracle 12

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709939#comment-14709939
 ] 

Sergey Shelukhin commented on HIVE-11628:
-

+1

 DB type detection code is failing on Oracle 12
 --

 Key: HIVE-11628
 URL: https://issues.apache.org/jira/browse/HIVE-11628
 Project: Hive
  Issue Type: Bug
  Components: Metastore
 Environment: Oracle 12
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 2.0.0

 Attachments: HIVE-11628.patch


 DB type detection code is failing when using Oracle 12 as the backing store.
 When determining qualification for direct SQL, the following message is seen 
 in the logs:
 {noformat}
 2015-08-14 01:15:16,020 INFO  [pool-6-thread-109]: 
 metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:init(131)) - Using 
 direct SQL, underlying DB is OTHER
 {noformat}
 Currently in org/apache/hadoop/hive/metastore/MetaStoreDirectSql, there is a 
 code snippet:
 {code}
   private DB determineDbType() {
 DB dbType = DB.OTHER;
 if (runDbCheck("SET @@session.sql_mode=ANSI_QUOTES", "MySql")) {
   dbType = DB.MYSQL;
 } else if (runDbCheck("SELECT version from v$instance", "Oracle")) {
   dbType = DB.ORACLE;
 } else if (runDbCheck("SELECT @@version", "MSSQL")) {
   dbType = DB.MSSQL;
 } else {
   // TODO: maybe we should use getProductName to identify all the DBs
   String productName = getProductName();
   if (productName != null && productName.toLowerCase().contains("derby")) {
 dbType = DB.DERBY;
   }
 }
 return dbType;
   }
 {code}
 The code relies on access to v$instance in order to identify the backend DB 
 as Oracle, but this can fail if users are not granted select privileges on v$ 
 tables. An alternate way, specified in the [Oracle Database Reference 
 pages|http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_4224.htm],
 works.
 I will attach a potential patch that should work.
 Without the patch, the workaround would be to grant select privileges on 
 v$ tables.
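For reference, JDBC's DatabaseMetaData#getDatabaseProductName exposes the product name without requiring select privileges on any table, which is the direction HIVE-11123 takes. A hedged sketch of name-based detection follows; the enum and the matching rules are illustrative assumptions, not the actual patch.

```java
public class DbTypeProbe {
    enum DB { MYSQL, ORACLE, MSSQL, DERBY, POSTGRES, OTHER }

    // Hedged sketch: classify the backend from the JDBC product name, e.g.
    //   DatabaseMetaData md = conn.getMetaData();
    //   DB type = detect(md.getDatabaseProductName());
    // The substring rules below are illustrative, not Hive's real mapping.
    static DB detect(String productName) {
        if (productName == null) {
            return DB.OTHER;
        }
        String p = productName.toLowerCase();
        if (p.contains("mysql")) return DB.MYSQL;
        if (p.contains("oracle")) return DB.ORACLE;
        if (p.contains("microsoft sql server")) return DB.MSSQL;
        if (p.contains("derby")) return DB.DERBY;
        if (p.contains("postgresql")) return DB.POSTGRES;
        return DB.OTHER;
    }
}
```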





[jira] [Commented] (HIVE-11628) DB type detection code is failing on Oracle 12

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710019#comment-14710019
 ] 

Ashutosh Chauhan commented on HIVE-11628:
-

I think HIVE-11123 has a better fix for this.

 DB type detection code is failing on Oracle 12
 --

 Key: HIVE-11628
 URL: https://issues.apache.org/jira/browse/HIVE-11628
 Project: Hive
  Issue Type: Bug
  Components: Metastore
 Environment: Oracle 12
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 2.0.0

 Attachments: HIVE-11628.patch


 DB type detection code is failing when using Oracle 12 as the backing store.
 When determining qualification for direct SQL, the following message is seen 
 in the logs:
 {noformat}
 2015-08-14 01:15:16,020 INFO  [pool-6-thread-109]: 
 metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:init(131)) - Using 
 direct SQL, underlying DB is OTHER
 {noformat}
 Currently in org/apache/hadoop/hive/metastore/MetaStoreDirectSql, there is a 
 code snippet:
 {code}
   private DB determineDbType() {
 DB dbType = DB.OTHER;
 if (runDbCheck("SET @@session.sql_mode=ANSI_QUOTES", "MySql")) {
   dbType = DB.MYSQL;
 } else if (runDbCheck("SELECT version from v$instance", "Oracle")) {
   dbType = DB.ORACLE;
 } else if (runDbCheck("SELECT @@version", "MSSQL")) {
   dbType = DB.MSSQL;
 } else {
   // TODO: maybe we should use getProductName to identify all the DBs
   String productName = getProductName();
   if (productName != null && productName.toLowerCase().contains("derby")) {
 dbType = DB.DERBY;
   }
 }
 return dbType;
   }
 {code}
 The code relies on access to v$instance in order to identify the backend DB 
 as Oracle, but this can fail if users are not granted select privileges on v$ 
 tables. An alternate way, specified in the [Oracle Database Reference 
 pages|http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_4224.htm],
 works.
 I will attach a potential patch that should work.
 Without the patch, the workaround would be to grant select privileges on 
 v$ tables.





[jira] [Updated] (HIVE-11629) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter expressions for full outer join and right outer join

2015-08-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11629:
---
Attachment: HIVE-11629.01.patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter 
 expressions for full outer join and right outer join
 --

 Key: HIVE-11629
 URL: https://issues.apache.org/jira/browse/HIVE-11629
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11629.01.patch








[jira] [Commented] (HIVE-11422) Join a ACID table with non-ACID table fail with MR

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710037#comment-14710037
 ] 

Hive QA commented on HIVE-11422:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752065/HIVE-11422.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9378 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5054/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5054/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5054/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752065 - PreCommit-HIVE-TRUNK-Build

 Join a ACID table with non-ACID table fail with MR
 --

 Key: HIVE-11422
 URL: https://issues.apache.org/jira/browse/HIVE-11422
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Transactions
Affects Versions: 1.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11422.1.patch


 The following script fail on MR mode:
 {code}
 CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) 
 CLUSTERED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC TBLPROPERTIES('transactional'='true'); 
 INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I');
 CREATE TABLE orc_table (k1 INT, f1 STRING) 
 CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS 
 STORED AS ORC; 
 INSERT OVERWRITE TABLE orc_table VALUES (1, 'x');
 SET hive.execution.engine=mr; 
 SET hive.auto.convert.join=false; 
 SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
 SELECT t1.*, t2.* FROM orc_table t1 
 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1;
 {code}
 Stack:
 {code}
 Error: java.io.IOException: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:169)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas(AcidUtils.java:368)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1211)
   at 
 org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1129)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
   ... 9 more
 {code}
 The script passes in the 1.2.0 release, however.
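The NPE originates where ACID delta metadata deserialization receives a null list, because CombineHiveInputFormat builds the split without the delta information the ORC reader expects. A standalone, hypothetical sketch of the defensive pattern (this is not Hive's actual AcidUtils.deserializeDeltas signature or the committed fix):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DeltaGuard {
    // Hypothetical sketch: if no delta metadata was serialized into the split,
    // the deserializer receives null; treating that as "no deltas" avoids the
    // NullPointerException deep inside getRecordReader().
    static List<String> deserializeDeltas(List<String> serialized) {
        if (serialized == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(serialized);
    }
}
```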





[jira] [Commented] (HIVE-11617) Explain plan for multiple lateral views is very slow

2015-08-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710074#comment-14710074
 ] 

Aihua Xu commented on HIVE-11617:
-

I will break the task into a runtime performance change and an explain output 
change (in subtasks) so that I can make sure the runtime change affects only 
performance, not the results.

 Explain plan for multiple lateral views is very slow
 

 Key: HIVE-11617
 URL: https://issues.apache.org/jira/browse/HIVE-11617
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-11617.patch


 The following explain job will be very slow or never finish if there are many 
 lateral views involved. High CPU usage is also noticed.
 {noformat}
 EXPLAIN
 SELECT
 *
 from
 (
 SELECT * FROM table1 
 ) x
 LATERAL VIEW json_tuple(...) x1 
 LATERAL VIEW json_tuple(...) x2 
 ...
 {noformat}
 From jstack, the job is busy with the preorder tree traversal. 
 {noformat}
 at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
 at java.util.regex.Matcher.reset(Matcher.java:308)
 at java.util.regex.Matcher.init(Matcher.java:228)
 at java.util.regex.Pattern.matcher(Pattern.java:1088)
 at org.apache.hadoop.hive.ql.lib.RuleRegExp.cost(RuleRegExp.java:67)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:72)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
 at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
 at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
 ... (dozens more identical PreOrderWalker.walk frames follow)
 {noformat}
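The hot frames show a regex being re-matched for rule dispatch at every node of a deep preorder walk. A hedged, standalone sketch of one common remedy, memoizing match results per operator-stack string (this is not Hive's actual RuleRegExp/Dispatcher API, and the pattern and stack format below are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class CachedRuleCost {
    // Hedged sketch: re-running a regex over the operator-name stack at every
    // visited node costs O(nodes x rules x stack length) when the walk is deep.
    // Caching the result per stack string avoids re-matching identical stacks.
    private final Pattern rule;
    private final Map<String, Boolean> cache = new HashMap<>();

    CachedRuleCost(String regex) {
        this.rule = Pattern.compile(regex);
    }

    boolean matches(String operatorStack) {
        return cache.computeIfAbsent(operatorStack, s -> rule.matcher(s).find());
    }
}
```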

[jira] [Commented] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-08-24 Thread Deepesh Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710109#comment-14710109
 ] 

Deepesh Khandelwal commented on HIVE-11123:
---

[~sinchii] thanks for the patch! What version of Oracle did you use for testing 
this? We saw some issues with existing code (without your patch) on Oracle 12 
whereas it worked fine on Oracle 11. Just want to make sure that it works as 
expected between different Oracle versions.

 Fix how to confirm the RDBMS product name at Metastore.
 ---

 Key: HIVE-11123
 URL: https://issues.apache.org/jira/browse/HIVE-11123
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 1.2.0
 Environment: PostgreSQL
Reporter: Shinichi Yamashita
Assignee: Shinichi Yamashita
 Attachments: HIVE-11123.1.patch, HIVE-11123.2.patch, 
 HIVE-11123.3.patch, HIVE-11123.4.patch


 I use PostgreSQL to Hive Metastore. And I saw the following message at 
 PostgreSQL log.
 {code}
  2015-06-26 10:58:15.488 JST ERROR:  syntax error at or near "@@" at 
 character 5
  2015-06-26 10:58:15.488 JST STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
  2015-06-26 10:58:15.489 JST ERROR:  relation "v$instance" does not exist 
 at character 21
  2015-06-26 10:58:15.489 JST STATEMENT:  SELECT version FROM v$instance
  2015-06-26 10:58:15.490 JST ERROR:  column "version" does not exist at 
 character 10
  2015-06-26 10:58:15.490 JST STATEMENT:  SELECT @@version
 {code}
 When Hive CLI and Beeline embedded mode are carried out, this message is 
 output to PostgreSQL log.
 These queries are called from MetaStoreDirectSql#determineDbType. If we 
 use MetaStoreDirectSql#getProductName instead, we need not call these queries.





[jira] [Commented] (HIVE-11628) DB type detection code is failing on Oracle 12

2015-08-24 Thread Deepesh Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710111#comment-14710111
 ] 

Deepesh Khandelwal commented on HIVE-11628:
---

Yes, I agree, HIVE-11123 is a more robust fix. If that works fine between 
Oracle versions (11, 12), then we don't need this. I have posted a question for 
[~sinchii] about his test environment.

 DB type detection code is failing on Oracle 12
 --

 Key: HIVE-11628
 URL: https://issues.apache.org/jira/browse/HIVE-11628
 Project: Hive
  Issue Type: Bug
  Components: Metastore
 Environment: Oracle 12
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 2.0.0

 Attachments: HIVE-11628.patch


 DB type detection code is failing when using Oracle 12 as the backing store.
 When determining qualification for direct SQL, the following message is seen 
 in the logs:
 {noformat}
 2015-08-14 01:15:16,020 INFO  [pool-6-thread-109]: 
 metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:init(131)) - Using 
 direct SQL, underlying DB is OTHER
 {noformat}
 Currently in org/apache/hadoop/hive/metastore/MetaStoreDirectSql, there is a 
 code snippet:
 {code}
   private DB determineDbType() {
 DB dbType = DB.OTHER;
 if (runDbCheck("SET @@session.sql_mode=ANSI_QUOTES", "MySql")) {
   dbType = DB.MYSQL;
 } else if (runDbCheck("SELECT version from v$instance", "Oracle")) {
   dbType = DB.ORACLE;
 } else if (runDbCheck("SELECT @@version", "MSSQL")) {
   dbType = DB.MSSQL;
 } else {
   // TODO: maybe we should use getProductName to identify all the DBs
   String productName = getProductName();
   if (productName != null && productName.toLowerCase().contains("derby")) {
 dbType = DB.DERBY;
   }
 }
 return dbType;
   }
 {code}
 The code relies on access to v$instance in order to identify the backend DB 
 as Oracle, but this can fail if users are not granted select privileges on v$ 
 tables. An alternate way, specified in the [Oracle Database Reference 
 pages|http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_4224.htm],
 works.
 I will attach a potential patch that should work.
 Without the patch, the workaround would be to grant select privileges on 
 v$ tables.





[jira] [Updated] (HIVE-11628) DB type detection code is failing on Oracle 12

2015-08-24 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-11628:
--
Attachment: HIVE-11628.patch

Attaching the patch for review.

 DB type detection code is failing on Oracle 12
 --

 Key: HIVE-11628
 URL: https://issues.apache.org/jira/browse/HIVE-11628
 Project: Hive
  Issue Type: Bug
  Components: Metastore
 Environment: Oracle 12
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-11628.patch


 DB type detection code is failing when using Oracle 12 as the backing store.
 When determining qualification for direct SQL, the following message is seen 
 in the logs:
 {noformat}
 2015-08-14 01:15:16,020 INFO  [pool-6-thread-109]: 
 metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:init(131)) - Using 
 direct SQL, underlying DB is OTHER
 {noformat}
 Currently in org/apache/hadoop/hive/metastore/MetaStoreDirectSql, there is a 
 code snippet:
 {code}
   private DB determineDbType() {
 DB dbType = DB.OTHER;
 if (runDbCheck("SET @@session.sql_mode=ANSI_QUOTES", "MySql")) {
   dbType = DB.MYSQL;
 } else if (runDbCheck("SELECT version from v$instance", "Oracle")) {
   dbType = DB.ORACLE;
 } else if (runDbCheck("SELECT @@version", "MSSQL")) {
   dbType = DB.MSSQL;
 } else {
   // TODO: maybe we should use getProductName to identify all the DBs
   String productName = getProductName();
   if (productName != null && productName.toLowerCase().contains("derby")) {
 dbType = DB.DERBY;
   }
 }
 return dbType;
   }
 {code}
 The code relies on access to v$instance in order to identify the backend DB 
 as Oracle, but this can fail if users are not granted select privileges on v$ 
 tables. An alternate way, specified in the [Oracle Database Reference 
 pages|http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_4224.htm],
 works.
 I will attach a potential patch that should work.
 Without the patch, the workaround would be to grant select privileges on 
 v$ tables.





[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709976#comment-14709976
 ] 

Siddharth Seth commented on HIVE-11515:
---

[~navis] - I think the patch is good to go in except for the log line, which 
should be an error.

However, I don't see this fixing an issue - I believe the condition mentioned 
is already handled. Up to you if you want to commit this.

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Tez
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could sometimes see a race condition in DPP. It is 
 hard to reproduce, but it seemed related to the fact that prune() is called 
 from a thread pool. With some delay in the queue, events from fast tasks 
 arrive before prune() is called.





[jira] [Commented] (HIVE-11618) Correct the SARG api to reunify the PredicateLeaf.Type INTEGER and LONG

2015-08-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710080#comment-14710080
 ] 

Sergio Peña commented on HIVE-11618:


Thanks [~owen.omalley] for the patch.
+1

The patch looks good. I understand how you want to keep this code simple by 
returning one type for a group of same-primitive values. Just one small piece 
of feedback: what about adding some comments to 
{{ConvertAstToSearchArg.getType}} and {{PredicateLeaf.Type}} for future 
reference about the simplicity? Other developers might see this lack of data 
types and be eager to add those. 

 Correct the SARG api to reunify the PredicateLeaf.Type INTEGER and LONG
 ---

 Key: HIVE-11618
 URL: https://issues.apache.org/jira/browse/HIVE-11618
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-11618.patch


 The Parquet binding leaked implementation details into the generic SARG api. 
 Rather than make all users of the SARG api deal with each of the specific 
 types, reunify the INTEGER and LONG types. 
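The reunification idea can be shown with a minimal illustration. The enum values and the method below are illustrative, not the exact PredicateLeaf API: the point is that all integer-family literals map to one LONG leaf type (and both floating-point widths to one FLOAT), so readers like ORC and Parquet never see byte/short/int/long as distinct predicate-leaf types.

```java
public class SargTypes {
    // Hedged sketch of the HIVE-11618 idea; not the real PredicateLeaf.Type list.
    enum Type { LONG, FLOAT, STRING, BOOLEAN }

    // Map a literal to its coarse SARG type: one type per primitive family.
    static Type getType(Object literal) {
        if (literal instanceof Byte || literal instanceof Short
            || literal instanceof Integer || literal instanceof Long) {
            return Type.LONG;
        }
        if (literal instanceof Float || literal instanceof Double) {
            return Type.FLOAT;
        }
        if (literal instanceof Boolean) {
            return Type.BOOLEAN;
        }
        return Type.STRING;
    }
}
```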





[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2015-08-24 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: (was: HIVE-11548.1.patch)

 HCatLoader should support predicate pushdown.
 -

 Key: HIVE-11548
 URL: https://issues.apache.org/jira/browse/HIVE-11548
 Project: Hive
  Issue Type: New Feature
  Components: HCatalog
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan

 When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
 that support predicate pushdown (such as ORC, with 
 {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
 actually pushed down into the storage layer.
 The forthcoming patch should allow for filter-pushdown, if any of the 
 partitions being scanned with {{HCatLoader}} support the functionality. The 
 patch should technically allow the same for users of {{HCatInputFormat}}, but 
 I don't currently have a neat interface to build a compound 
 predicate-expression. Will add this separately, if required.





[jira] [Assigned] (HIVE-11633) import tool should print help by default

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11633:
---

Assignee: Sergey Shelukhin

 import tool should print help by default
 

 Key: HIVE-11633
 URL: https://issues.apache.org/jira/browse/HIVE-11633
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 It took me a while to figure out that I need to supply some command to make 
 import work, and I had to read the sources... it should output help by default





[jira] [Commented] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2015-08-24 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710312#comment-14710312
 ] 

Vaibhav Gumashta commented on HIVE-11581:
-

Failure is unrelated.

 HiveServer2 should store connection params in ZK when using dynamic service 
 discovery for simpler client connection string.
 ---

 Key: HIVE-11581
 URL: https://issues.apache.org/jira/browse/HIVE-11581
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 1.3.0, 2.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
 HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch


 Currently, the client needs to specify several parameters based on which an 
 appropriate connection is created with the server. In the case of dynamic 
 service discovery, when multiple HS2 instances are running, it is much more 
 usable for the server to add its config parameters to ZK, which the driver 
 can use to configure the connection, instead of the jdbc/odbc user adding 
 those in the connection string.
 However, at minimum, the client will need to specify the ZooKeeper ensemble 
 and that she wants the JDBC driver to use ZooKeeper:
 {noformat}
 beeline> !connect 
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
 {noformat} 
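The publish/parse mechanism can be sketched as a round trip of key=value pairs stored in the znode data. This is a hedged, standalone illustration: the key names and the semicolon-delimited serialization format are made-up assumptions, not HiveServer2's actual znode payload.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class HS2ZkPayload {
    // Hedged sketch: the server serializes its connection parameters into the
    // znode data, and the JDBC driver parses them back to build the connection.
    static String serialize(Map<String, String> conf) {
        StringJoiner sj = new StringJoiner(";");
        conf.forEach((k, v) -> sj.add(k + "=" + v));
        return sj.toString();
    }

    static Map<String, String> parse(String payload) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String kv : payload.split(";")) {
            int i = kv.indexOf('=');
            out.put(kv.substring(0, i), kv.substring(i + 1));
        }
        return out;
    }
}
```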





[jira] [Commented] (HIVE-11599) Add metastore command to dump its configs

2015-08-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710322#comment-14710322
 ] 

Thejas M Nair commented on HIVE-11599:
--

[~ekoifman] Would writing the hive config to the logs on metastore startup 
meet the needs?


 Add metastore command to dump its configs
 --

 Key: HIVE-11599
 URL: https://issues.apache.org/jira/browse/HIVE-11599
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 1.0.0
Reporter: Eugene Koifman

 We should have an equivalent of the Hive CLI set command on the Metastore 
 (and likely HS2) which can dump out all the properties this particular 
 process is running with.
 cc [~thejas]





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-08-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Component/s: CBO

 Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
 --

 Key: HIVE-11634
 URL: https://issues.apache.org/jira/browse/HIVE-11634
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 Currently, we do not support partition pruning for the following scenario:
 {code}
 create table pcr_t1 (key int, value string) partitioned by (ds string);
 insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
 where key < 20 order by key;
 explain extended select ds from pcr_t1 where struct(ds, key) in 
 (struct('2000-04-08',1), struct('2000-04-09',2));
 {code}
 If we run the above query, we see that all the partitions of table pcr_t1 are 
 present in the filter predicate, whereas we could prune partition 
 (ds='2000-04-10'). 
 The optimization is to rewrite the above query into 2 IN clauses, one 
 containing the partition columns and the other containing the non-partition 
 columns, as follows:
 {code}
 explain extended select ds from pcr_t1 where (struct(key) IN (struct(1), 
 struct(2))) and (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'));
 {code}
 This is an extension of the idea presented in HIVE-11573.
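The rewrite can be sketched as a tuple-splitting step. This is a standalone illustration hard-wired to the (ds, key) example, not Hive optimizer code: each IN tuple contributes its partition value to one IN list and its non-partition value to another, and the conjunction of the two IN lists is implied by the original tuple IN, so it is safe to use for pruning.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StructInSplit {
    // Hedged sketch: split (partCol, nonPartCol) IN-tuples into per-column value
    // sets, so a partition-only IN predicate can be handed to the pruner.
    static Map<String, Set<Object>> split(List<Object[]> tuples) {
        Set<Object> dsValues = new LinkedHashSet<>();
        Set<Object> keyValues = new LinkedHashSet<>();
        for (Object[] t : tuples) {
            dsValues.add(t[0]);   // partition column value
            keyValues.add(t[1]);  // non-partition column value
        }
        Map<String, Set<Object>> byColumn = new LinkedHashMap<>();
        byColumn.put("ds", dsValues);
        byColumn.put("key", keyValues);
        return byColumn;
    }
}
```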





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-08-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.1.patch

Initial draft, more test cases to follow in patch#2. Lets see how the test runs 
go with patch#1

 Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
 --

 Key: HIVE-11634
 URL: https://issues.apache.org/jira/browse/HIVE-11634
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11634.1.patch


 Currently, we do not support partition pruning for the following scenario:
 {code}
 create table pcr_t1 (key int, value string) partitioned by (ds string);
 insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
 where key < 20 order by key;
 explain extended select ds from pcr_t1 where struct(ds, key) in 
 (struct('2000-04-08',1), struct('2000-04-09',2));
 {code}
 If we run the above query, we see that all the partitions of table pcr_t1 are 
 present in the filter predicate, whereas we could prune partition 
 (ds='2000-04-10'). 
 The optimization is to rewrite the above query into 2 IN clauses, one 
 containing the partition columns and the other containing the non-partition 
 columns, as follows:
 {code}
 explain extended select ds from pcr_t1 where (struct(key) IN (struct(1), 
 struct(2))) and (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'));
 {code}
 This is an extension of the idea presented in HIVE-11573.





[jira] [Commented] (HIVE-11614) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ctas after order by has problem

2015-08-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710413#comment-14710413
 ] 

Laljo John Pullokkaran commented on HIVE-11614:
---

[~pxiong] 1) We need to find out why col[1] ended up fully qualified. 2) Does 
this happen only for the return path, and not when CBO is disabled?

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): ctas after 
 order by has problem
 -

 Key: HIVE-11614
 URL: https://issues.apache.org/jira/browse/HIVE-11614
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11614.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11622) Creating an Avro table with a complex map-typed column leads to incorrect column type.

2015-08-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HIVE-11622.

Resolution: Duplicate

This is a duplicate of HIVE-11288 which is already fixed. Thanks.

 Creating an Avro table with a complex map-typed column leads to incorrect 
 column type.
 --

 Key: HIVE-11622
 URL: https://issues.apache.org/jira/browse/HIVE-11622
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 1.1.0
Reporter: Alexander Behm
Assignee: Jimmy Xiang
  Labels: AvroSerde

 In the following CREATE TABLE, the map-typed column ends up with the wrong 
 type. I suspect some problem with inferring the Avro schema from the column 
 definitions, but I am not sure.
 Reproduction:
 {code}
 hive> create table t (c map<string,array<int>>) stored as avro;
 OK
 Time taken: 0.101 seconds
 hive> desc t;
 OK
 c  array<map<string,int>>  from deserializer   
 Time taken: 0.135 seconds, Fetched: 1 row(s)
 {code}
 Note how the type shown in DESCRIBE is not the type originally passed in the 
 CREATE TABLE.
 However, *sometimes* the DESCRIBE shows the correct output. You may also try 
 these steps which produce a similar problem to increase the chance of hitting 
 this issue:
 {code}
 hive> create table t (c array<map<string,int>>) stored as avro;
 OK
 Time taken: 0.063 seconds
 hive> desc t;
 OK
 c  map<string,array<int>>  from deserializer   
 Time taken: 0.152 seconds, Fetched: 1 row(s)
 {code}
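For reference, the Avro schema the AvroSerDe is expected to infer for a column declared `c map<string,array<int>>` looks roughly like the following (the record and field names here are illustrative, not necessarily what Hive generates verbatim); the bug amounts to the map/array nesting being swapped somewhere during this inference:

```json
{
  "type": "record",
  "name": "t",
  "fields": [
    { "name": "c",
      "type": { "type": "map",
                "values": { "type": "array", "items": "int" } } }
  ]
}
```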



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710245#comment-14710245
 ] 

Sergey Shelukhin commented on HIVE-11595:
-

[~prasanth_j] failures are unrelated :)

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
 HIVE-11595.02.patch, HIVE-11595.03.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom(bytes), but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8007) Clean up Thrift definitions

2015-08-24 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710182#comment-14710182
 ] 

Lars Francke commented on HIVE-8007:


Oh...no... I did not regenerate the files as I could not get Thrift to build :(

At least I was correct about the build failure being unrelated ;-)

I'll try again to get Thrift working, thanks for the heads up!

 Clean up Thrift definitions
 ---

 Key: HIVE-8007
 URL: https://issues.apache.org/jira/browse/HIVE-8007
 Project: Hive
  Issue Type: Improvement
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-8007.1.patch, HIVE-8007.2.patch, HIVE-8007.3.patch


 This patch changes the following:
 * Currently the thrift file uses {{//}} to denote comments. Thrift 
 understands the {{/** ... */}} syntax and converts that into documentation in 
 the generated code. This patch changes the syntax accordingly.
 * Change tabs to spaces
 * Consistent indentation
 * Minor whitespace and/or formatting issues
 There should be no changes to functionality at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11548) HCatLoader should support predicate pushdown.

2015-08-24 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-11548:

Attachment: HIVE-11548.1.patch

Corrected the bad code. Submitting for re-test.

 HCatLoader should support predicate pushdown.
 -

 Key: HIVE-11548
 URL: https://issues.apache.org/jira/browse/HIVE-11548
 Project: Hive
  Issue Type: New Feature
  Components: HCatalog
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-11548.1.patch


 When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
 that support predicate pushdown (such as ORC, with 
 {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
 actually pushed down into the storage layer.
 The forthcoming patch should allow for filter-pushdown, if any of the 
 partitions being scanned with {{HCatLoader}} support the functionality. The 
 patch should technically allow the same for users of {{HCatInputFormat}}, but 
 I don't currently have a neat interface to build a compound 
 predicate-expression. Will add this separately, if required.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11630) Use string interning with HiveConf to be more space efficient

2015-08-24 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11630:

Description: The giant enum HiveConf#ConfVars has String fields varname and 
description. Each new session on HiveServer2 creates a new conf object, which 
means wastefully creating the varname and description string for each enum 
object.   (was: The giant enum HiveConf#ConfVars has a String field varname. 
Each new session on HiveServer2 creates a new conf object, which means 
wastefully creating the varname string for each enum object. )

 Use string interning with HiveConf to be more space efficient 
 -

 Key: HIVE-11630
 URL: https://issues.apache.org/jira/browse/HIVE-11630
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.1.1
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 The giant enum HiveConf#ConfVars has String fields varname and description. 
 Each new session on HiveServer2 creates a new conf object, which means 
 wastefully creating the varname and description string for each enum object. 
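The fix suggested here boils down to `String.intern()`: instead of each session holding its own copy of every varname/description, all copies resolve to one canonical instance in the JVM string pool. A minimal standalone sketch of the effect (not Hive's actual code; the property name is just an example):

```java
// Sketch (not Hive's code): shows why interning the per-session copies of
// ConfVars varname/description strings saves memory.
public class InternDemo {
    public static void main(String[] args) {
        // Each HiveServer2 session effectively builds fresh copies of the
        // same metadata strings, like this:
        String a = new String("hive.metastore.uris");
        String b = new String("hive.metastore.uris");
        System.out.println(a == b);                   // false: two objects, same contents
        // intern() maps both to one canonical instance in the string pool.
        System.out.println(a.intern() == b.intern()); // true: one shared object
    }
}
```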



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11635) import tool fails on unsecure cluster

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710336#comment-14710336
 ] 

Sergey Shelukhin commented on HIVE-11635:
-

Running with --all. Everything before that point was imported, but the 
exception still should not be output.

 import tool fails on unsecure cluster
 -

 Key: HIVE-11635
 URL: https://issues.apache.org/jira/browse/HIVE-11635
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin

 {noformat}
 Copying kerberos related items
 2015-08-24 20:28:51,292 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MDelegationToken and subclasses 
 resulted in no possible candidates
 Required table missing : `DELEGATION_TOKENS` in Catalog "" Schema "". 
 DataNucleus requires this table to perform its persistence operations. Either 
 your MetaData is incorrect, or you need to enable 
 datanucleus.autoCreateTables
 org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table 
 missing : `DELEGATION_TOKENS` in Catalog "" Schema "". DataNucleus requires 
 this table to perform its persistence operations. Either your MetaData is 
 incorrect, or you need to enable datanucleus.autoCreateTables
   at 
 org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:485)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
   at 
 org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
   at 
 org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
   at 
 org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
   at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
   at org.datanucleus.store.query.Query.execute(Query.java:1654)
   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getAllTokenIdentifiers(ObjectStore.java:6888)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.copyKerberos(HBaseImport.java:474)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.run(HBaseImport.java:249)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.main(HBaseImport.java:81)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 ...
 {noformat}
 2015-08-24 20:28:51,298 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MMasterKey and subclasses resulted 
 in no possible candidates
 Required table missing : `MASTER_KEYS` in Catalog "" Schema "". DataNucleus 
 requires this table to perform its persistence operations. Either your 
 MetaData is incorrect, or you need to enable datanucleus.autoCreateTables
 org.da
 {noformat}
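As the DataNucleus message itself suggests, one workaround, if those tables are genuinely expected to exist, is to let DataNucleus create them via a hive-site.xml fragment like the one below; whether that is appropriate here, versus simply not querying delegation-token tables on a non-secure cluster, is what this ticket needs to decide:

```xml
<!-- Illustrative workaround only: lets DataNucleus create missing
     metastore tables such as DELEGATION_TOKENS / MASTER_KEYS. -->
<property>
  <name>datanucleus.autoCreateTables</name>
  <value>true</value>
</property>
```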



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11635) import tool fails on non-secure cluster

2015-08-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710363#comment-14710363
 ] 

Alan Gates commented on HIVE-11635:
---

Do you want to fix this or want me to?  I can, but it will be a few days before 
I get to it.

 import tool fails on non-secure cluster
 ---

 Key: HIVE-11635
 URL: https://issues.apache.org/jira/browse/HIVE-11635
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin

 {noformat}
 Copying kerberos related items
 2015-08-24 20:28:51,292 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MDelegationToken and subclasses 
 resulted in no possible candidates
 Required table missing : `DELEGATION_TOKENS` in Catalog "" Schema "". 
 DataNucleus requires this table to perform its persistence operations. Either 
 your MetaData is incorrect, or you need to enable 
 datanucleus.autoCreateTables
 org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table 
 missing : `DELEGATION_TOKENS` in Catalog "" Schema "". DataNucleus requires 
 this table to perform its persistence operations. Either your MetaData is 
 incorrect, or you need to enable datanucleus.autoCreateTables
   at 
 org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:485)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
   at 
 org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
   at 
 org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
   at 
 org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
   at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
   at org.datanucleus.store.query.Query.execute(Query.java:1654)
   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getAllTokenIdentifiers(ObjectStore.java:6888)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.copyKerberos(HBaseImport.java:474)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.run(HBaseImport.java:249)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.main(HBaseImport.java:81)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 ...
 {noformat}
 2015-08-24 20:28:51,298 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MMasterKey and subclasses resulted 
 in no possible candidates
 Required table missing : `MASTER_KEYS` in Catalog "" Schema "". DataNucleus 
 requires this table to perform its persistence operations. Either your 
 MetaData is incorrect, or you need to enable datanucleus.autoCreateTables
 org.da
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11629) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter expressions for full outer join and right outer join

2015-08-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710424#comment-14710424
 ] 

Laljo John Pullokkaran commented on HIVE-11629:
---

[~jcamachorodriguez] Could you take a look at this one first?

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter 
 expressions for full outer join and right outer join
 --

 Key: HIVE-11629
 URL: https://issues.apache.org/jira/browse/HIVE-11629
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11629.01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-08-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Description: 
Currently, we do not support partition pruning for the following scenario
{code}
create table pcr_t1 (key int, value string) partitioned by (ds string);
insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
where key < 20 order by key;
insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
where key < 20 order by key;
insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
where key < 20 order by key;
explain extended select ds from pcr_t1 where struct(ds, key) in 
(struct('2000-04-08',1), struct('2000-04-09',2));
{code}

If we run the above query, we see that all the partitions of table pcr_t1 are 
present in the filter predicate, whereas we can prune partition 
(ds='2000-04-10'). 

The optimization is to rewrite the above query into two IN clauses, one containing 
partition columns and the other containing non-partition columns, as follows.
{code}
explain extended select ds from pcr_t1 where (struct(key) IN (struct(1), 
struct(2))) and (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'));
{code}

This is an extension of the idea presented in HIVE-11573.
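The rewrite is safe for pruning because splitting the struct IN-list per column yields a strictly weaker (necessary) condition, so it can only over-approximate the matching rows. A small standalone sketch of that argument (not Hive's implementation; the values come from the example above):

```java
// Sketch (not Hive's code): splitting a struct IN-list into per-column
// IN-lists yields a necessary condition, so the rewrite can only
// over-approximate the matching rows -- which makes it safe for pruning.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StructInSplit {
    public static void main(String[] args) {
        List<String[]> tuples = Arrays.asList(
            new String[]{"2000-04-08", "1"},
            new String[]{"2000-04-09", "2"});

        // Per-column allowed values, e.g. ds IN ('2000-04-08','2000-04-09').
        Set<String> dsValues = new HashSet<>();
        Set<String> keyValues = new HashSet<>();
        for (String[] t : tuples) { dsValues.add(t[0]); keyValues.add(t[1]); }

        // Every tuple satisfying the original predicate satisfies the rewrite.
        for (String[] t : tuples) {
            if (!(dsValues.contains(t[0]) && keyValues.contains(t[1]))) {
                throw new AssertionError("rewrite must be a superset");
            }
        }
        // The ds IN-list alone already excludes ds='2000-04-10', which is
        // exactly the partition the optimizer can now prune.
        System.out.println(dsValues.contains("2000-04-10")); // false
    }
}
```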

 Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
 --

 Key: HIVE-11634
 URL: https://issues.apache.org/jira/browse/HIVE-11634
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 Currently, we do not support partition pruning for the following scenario
 {code}
 create table pcr_t1 (key int, value string) partitioned by (ds string);
 insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
 where key < 20 order by key;
 insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
 where key < 20 order by key;
 explain extended select ds from pcr_t1 where struct(ds, key) in 
 (struct('2000-04-08',1), struct('2000-04-09',2));
 {code}
 If we run the above query, we see that all the partitions of table pcr_t1 are 
 present in the filter predicate, whereas we can prune partition 
 (ds='2000-04-10'). 
 The optimization is to rewrite the above query into two IN clauses, one 
 containing partition columns and the other containing non-partition columns, 
 as follows.
 {code}
 explain extended select ds from pcr_t1 where (struct(key) IN (struct(1), 
 struct(2))) and (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'));
 {code}
 This is an extension of the idea presented in HIVE-11573.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11633) import tool should print help by default

2015-08-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710360#comment-14710360
 ] 

Alan Gates commented on HIVE-11633:
---

+1

 import tool should print help by default
 

 Key: HIVE-11633
 URL: https://issues.apache.org/jira/browse/HIVE-11633
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-11633.patch


 It took me a while to figure out that I need to supply some command to make 
 import work, and I had to read the sources... it should output help by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11633) import tool should print help by default

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11633:

Attachment: HIVE-11633.patch

 import tool should print help by default
 

 Key: HIVE-11633
 URL: https://issues.apache.org/jira/browse/HIVE-11633
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-11633.patch


 It took me a while to figure out that I need to supply some command to make 
 import work, and I had to read the sources... it should output help by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11633) import tool should print help by default

2015-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710302#comment-14710302
 ] 

Sergey Shelukhin commented on HIVE-11633:
-

[~alangates] fyi

 import tool should print help by default
 

 Key: HIVE-11633
 URL: https://issues.apache.org/jira/browse/HIVE-11633
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-11633.patch


 It took me a while to figure out that I need to supply some command to make 
 import work, and I had to read the sources... it should output help by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2015-08-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710317#comment-14710317
 ] 

Thejas M Nair commented on HIVE-11581:
--

+1 for new patch

 HiveServer2 should store connection params in ZK when using dynamic service 
 discovery for simpler client connection string.
 ---

 Key: HIVE-11581
 URL: https://issues.apache.org/jira/browse/HIVE-11581
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 1.3.0, 2.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
 HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch


 Currently, the client needs to specify several parameters based on which an 
 appropriate connection is created with the server. In case of dynamic service 
 discovery, when multiple HS2 instances are running, it is much more usable 
 for the server to add its config parameters to ZK which the driver can use to 
 configure the connection, instead of the jdbc/odbc user adding those in 
 connection string.
 However, at minimum, client will need to specify zookeeper ensemble and that 
 she wants the JDBC driver to use ZooKeeper:
 {noformat}
 beeline> !connect 
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
 {noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11599) Add metastore command to dump its configs

2015-08-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710327#comment-14710327
 ] 

Eugene Koifman commented on HIVE-11599:
---

That would be a good start, but maybe not enough. Most of the time logs are 
rolled and archived, and what you get from customers are today's logs, not the 
ones from when the Metastore was launched. So having this on-demand is better.

 Add metastore command to dump its configs
 --

 Key: HIVE-11599
 URL: https://issues.apache.org/jira/browse/HIVE-11599
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 1.0.0
Reporter: Eugene Koifman

 We should have an equivalent of the Hive CLI set command on the Metastore (and 
 likely HS2) which can dump out all the properties this particular process is 
 running with.
 cc [~thejas]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11635) import tool fails on unsecure cluster

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11635:

Description: 
{noformat}
Copying kerberos related items
2015-08-24 20:28:51,292 WARN  [main] DataNucleus.Query: Query for candidates of 
org.apache.hadoop.hive.metastore.model.MDelegationToken and subclasses resulted 
in no possible candidates
Required table missing : `DELEGATION_TOKENS` in Catalog "" Schema "". 
DataNucleus requires this table to perform its persistence operations. Either 
your MetaData is incorrect, or you need to enable datanucleus.autoCreateTables
org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table 
missing : `DELEGATION_TOKENS` in Catalog "" Schema "". DataNucleus requires 
this table to perform its persistence operations. Either your MetaData is 
incorrect, or you need to enable datanucleus.autoCreateTables
at 
org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:485)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
at 
org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
at 
org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
at 
org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
at 
org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
at 
org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
at org.datanucleus.store.query.Query.execute(Query.java:1654)
at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getAllTokenIdentifiers(ObjectStore.java:6888)
at 
org.apache.hadoop.hive.metastore.hbase.HBaseImport.copyKerberos(HBaseImport.java:474)
at 
org.apache.hadoop.hive.metastore.hbase.HBaseImport.run(HBaseImport.java:249)
at 
org.apache.hadoop.hive.metastore.hbase.HBaseImport.main(HBaseImport.java:81)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
...
{noformat}
2015-08-24 20:28:51,298 WARN  [main] DataNucleus.Query: Query for candidates of 
org.apache.hadoop.hive.metastore.model.MMasterKey and subclasses resulted in no 
possible candidates
Required table missing : `MASTER_KEYS` in Catalog "" Schema "". DataNucleus 
requires this table to perform its persistence operations. Either your MetaData 
is incorrect, or you need to enable datanucleus.autoCreateTables
org.da
{noformat}

  was:
{noformat}
Copying kerberos related items
2015-08-24 20:28:51,292 WARN  [main] DataNucleus.Query: Query for candidates of 
org.apache.hadoop.hive.metastore.model.MDelegationToken and subclasses resulted 
in no possible candidates
Required table missing : `DELEGATION_TOKENS` in Catalog "" Schema "". 
DataNucleus requires this table to perform its persistence operations. Either 
your MetaData is incorrect, or you need to enable datanucleus.autoCreateTables
org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table 
missing : `DELEGATION_TOKENS` in Catalog "" Schema "". DataNucleus requires 
this table to perform its persistence operations. Either your MetaData is 
incorrect, or you need to enable datanucleus.autoCreateTables
at 
org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:485)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
at 
org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
at 

[jira] [Updated] (HIVE-11635) import tool fails on non-secure cluster

2015-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11635:

Summary: import tool fails on non-secure cluster  (was: import tool fails 
on unsecure cluster)

 import tool fails on non-secure cluster
 ---

 Key: HIVE-11635
 URL: https://issues.apache.org/jira/browse/HIVE-11635
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Sergey Shelukhin

 {noformat}
 Copying kerberos related items
 2015-08-24 20:28:51,292 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MDelegationToken and subclasses 
 resulted in no possible candidates
 Required table missing : `DELEGATION_TOKENS` in Catalog "" Schema "". 
 DataNucleus requires this table to perform its persistence operations. Either 
 your MetaData is incorrect, or you need to enable 
 datanucleus.autoCreateTables
 org.datanucleus.store.rdbms.exceptions.MissingTableException: Required table 
 missing : `DELEGATION_TOKENS` in Catalog "" Schema "". DataNucleus requires 
 this table to perform its persistence operations. Either your MetaData is 
 incorrect, or you need to enable datanucleus.autoCreateTables
   at 
 org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:485)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
   at 
 org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
   at 
 org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
   at 
 org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
   at 
 org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull(JDOQLQuery.java:947)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:370)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
   at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
   at org.datanucleus.store.query.Query.execute(Query.java:1654)
   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getAllTokenIdentifiers(ObjectStore.java:6888)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.copyKerberos(HBaseImport.java:474)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.run(HBaseImport.java:249)
   at 
 org.apache.hadoop.hive.metastore.hbase.HBaseImport.main(HBaseImport.java:81)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {noformat}
 ...
 {noformat}
 2015-08-24 20:28:51,298 WARN  [main] DataNucleus.Query: Query for candidates 
 of org.apache.hadoop.hive.metastore.model.MMasterKey and subclasses resulted 
 in no possible candidates
Required table missing : `MASTER_KEYS` in Catalog "" Schema "". DataNucleus 
requires this table to perform its persistence operations. Either your 
MetaData is incorrect, or you need to enable "datanucleus.autoCreateTables"
 org.da
 {noformat}





[jira] [Updated] (HIVE-11504) Predicate pushing down doesn't work for float type for Parquet

2015-08-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11504:

Attachment: HIVE-11504.1.patch

Hi [~owen.omalley] [~spena], let's use the first version to resolve this jira. 
Any thoughts?

 Predicate pushing down doesn't work for float type for Parquet
 --

 Key: HIVE-11504
 URL: https://issues.apache.org/jira/browse/HIVE-11504
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-11504.1.patch, HIVE-11504.1.patch, 
 HIVE-11504.2.patch, HIVE-11504.2.patch, HIVE-11504.3.patch, HIVE-11504.patch


 The predicate builder should use the PrimitiveTypeName type on the Parquet side 
 to construct the predicate leaf, instead of the type provided by PredicateLeaf.





[jira] [Updated] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]

2015-08-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11624:

Attachment: HIVE-11624.1-beeline-cli.patch

rename the patch with branch name

 Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]
 -

 Key: HIVE-11624
 URL: https://issues.apache.org/jira/browse/HIVE-11624
 Project: Hive
  Issue Type: Sub-task
Reporter: Ke Jia
Assignee: Ke Jia
 Attachments: HIVE-11624.1-beeline-cli.patch, HIVE-11624.patch


 In the old CLI, it uses hive.cli.print.header from the Hive configuration to 
 print column headers when executing a script. We need to support the previous 
 configuration using the beeline functionality.





[jira] [Assigned] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work

2015-08-24 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran reassigned HIVE-11445:
-

Assignee: Laljo John Pullokkaran  (was: Pengcheng Xiong)

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby 
 distinct does not work
 -

 Key: HIVE-11445
 URL: https://issues.apache.org/jira/browse/HIVE-11445
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-11445.01.patch








[jira] [Commented] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for ReduceSink operator

2015-08-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710418#comment-14710418
 ] 

Laljo John Pullokkaran commented on HIVE-11623:
---

[~jcamachorodriguez] Could you look at this one first?

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the 
 tableAlias for ReduceSink operator
 

 Key: HIVE-11623
 URL: https://issues.apache.org/jira/browse/HIVE-11623
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11623.01.patch, HIVE-11623.02.patch








[jira] [Reopened] (HIVE-11611) A bad performance regression issue with Parquet happens if Hive does not select any columns

2015-08-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reopened HIVE-11611:
-

Hi [~spena], I think even if we bump up to the latest version of Parquet, we 
still need to change the code back to the original one. I'd like to reopen this jira.

 A bad performance regression issue with Parquet happens if Hive does not 
 select any columns
 ---

 Key: HIVE-11611
 URL: https://issues.apache.org/jira/browse/HIVE-11611
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 2.0.0
Reporter: Sergio Peña
Assignee: Ferdinand Xu
 Attachments: HIVE-11611.patch


 A possible performance issue may happen with the below code when using a 
 query like this {{SELECT count(1) FROM parquetTable}}.
 {code}
 if (!ColumnProjectionUtils.isReadAllColumns(configuration) &&
     !indexColumnsWanted.isEmpty()) {
   MessageType requestedSchemaByUser =
       getSchemaByIndex(tableSchema, columnNamesList, indexColumnsWanted);
   return new ReadContext(requestedSchemaByUser, contextMetadata);
 } else {
   return new ReadContext(tableSchema, contextMetadata);
 }
 {code}
 If no columns or indexes are selected, the above code will read the full 
 schema from Parquet even though Hive does not do anything with those values.
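The guard described above can be sketched outside Hive. The following is a hypothetical illustration with plain Java lists, not the actual Hive/Parquet ReadContext API: when no column values are consumed, request one cheap column instead of the full table schema.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Hive's actual code: decide which columns to request
// from the file format, given the projection pushdown information.
class ProjectionSketch {
    static List<String> columnsToRead(boolean readAllColumns,
                                      List<String> tableColumns,
                                      List<String> wantedColumns) {
        if (readAllColumns) {
            return tableColumns;                         // e.g. SELECT *
        }
        if (wantedColumns.isEmpty()) {
            // e.g. SELECT count(1): no column values are used, so reading the
            // full schema (the regression above) is wasted I/O; a single cheap
            // column is enough to count rows.
            return Collections.singletonList(tableColumns.get(0));
        }
        return wantedColumns;                            // normal projection
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("id", "name", "payload");
        System.out.println(columnsToRead(false, cols, Collections.emptyList()));
    }
}
```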





[jira] [Commented] (HIVE-11628) DB type detection code is failing on Oracle 12

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710432#comment-14710432
 ] 

Hive QA commented on HIVE-11628:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752075/HIVE-11628.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9377 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5057/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5057/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5057/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752075 - PreCommit-HIVE-TRUNK-Build

 DB type detection code is failing on Oracle 12
 --

 Key: HIVE-11628
 URL: https://issues.apache.org/jira/browse/HIVE-11628
 Project: Hive
  Issue Type: Bug
  Components: Metastore
 Environment: Oracle 12
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 2.0.0

 Attachments: HIVE-11628.patch


 DB type detection code is failing when using Oracle 12 as backing store.
 When determining qualification for direct SQL, in the logs following message 
 is seen:
 {noformat}
 2015-08-14 01:15:16,020 INFO  [pool-6-thread-109]: 
 metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:init(131)) - Using 
 direct SQL, underlying DB is OTHER
 {noformat}
 Currently in org/apache/hadoop/hive/metastore/MetaStoreDirectSql, there is a 
 code snippet:
 {code}
   private DB determineDbType() {
     DB dbType = DB.OTHER;
     if (runDbCheck("SET @@session.sql_mode=ANSI_QUOTES", "MySql")) {
       dbType = DB.MYSQL;
     } else if (runDbCheck("SELECT version from v$instance", "Oracle")) {
       dbType = DB.ORACLE;
     } else if (runDbCheck("SELECT @@version", "MSSQL")) {
       dbType = DB.MSSQL;
     } else {
       // TODO: maybe we should use getProductName to identify all the DBs
       String productName = getProductName();
       if (productName != null && productName.toLowerCase().contains("derby")) {
         dbType = DB.DERBY;
       }
     }
     return dbType;
   }
 {code}
 The code relies on access to v$instance in order to identify the backend DB 
 as Oracle, but this can fail if users are not granted select privileges on v$ 
 tables. An alternate way, specified on the [Oracle Database Reference 
 pages|http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_4224.htm], 
 works.
 I will attach a potential patch that should work.
 Without the patch the workaround here would be to grant select privileges on 
 v$ tables.
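The alternate detection route can also be sketched via the JDBC product name (java.sql.DatabaseMetaData.getDatabaseProductName()), which needs no select grants on v$ tables. This is a hypothetical sketch; the enum mirrors the snippet above, but the matching strings are assumptions, not Hive's actual code:

```java
// Hypothetical sketch: classify the backing store from the JDBC product name
// instead of probing v$instance, which fails when the user lacks select
// privileges on v$ tables.
class DbTypeSketch {
    enum DB { MYSQL, ORACLE, MSSQL, DERBY, OTHER }

    static DB fromProductName(String productName) {
        if (productName == null) {
            return DB.OTHER;
        }
        String p = productName.toLowerCase();
        if (p.contains("mysql")) return DB.MYSQL;
        if (p.contains("oracle")) return DB.ORACLE;
        if (p.contains("microsoft sql server")) return DB.MSSQL;
        if (p.contains("derby")) return DB.DERBY;
        return DB.OTHER;
    }
}
```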





[jira] [Updated] (HIVE-11631) TFetchResultsResp hasMoreRows and startRowOffset not returning actual values

2015-08-24 Thread Jenny Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jenny Kim updated HIVE-11631:
-
Description: 
hasMoreRows always returns False
startRowOffset always appears to be 0

  was:
This was originally reported in https://jira.cloudera.com/browse/CDH-8904 but 
appears to still be broken.

hasMoreRows always returns False, and startRowOffset always appears to be 0

Summary: TFetchResultsResp hasMoreRows and startRowOffset not returning 
actual values  (was: TFetchResultsResp hasMoreRow and startRowOffset not 
returning actual values)

 TFetchResultsResp hasMoreRows and startRowOffset not returning actual values
 

 Key: HIVE-11631
 URL: https://issues.apache.org/jira/browse/HIVE-11631
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.1.1
Reporter: Jenny Kim

 hasMoreRows always returns False
 startRowOffset always appears to be 0





[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710168#comment-14710168
 ] 

Hive QA commented on HIVE-11595:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752073/HIVE-11595.03.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9377 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5055/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5055/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5055/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752073 - PreCommit-HIVE-TRUNK-Build

 refactor ORC footer reading to make it usable from outside
 --

 Key: HIVE-11595
 URL: https://issues.apache.org/jira/browse/HIVE-11595
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
 HIVE-11595.02.patch, HIVE-11595.03.patch


 If the ORC footer is read from cache, we want to parse it without having the 
 reader, opening a file, etc. I thought it would be as simple as protobuf 
 parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
 needs to be accessible via something like parseFrom(ByteBuffer), or similar.





[jira] [Commented] (HIVE-11515) Still some possible race condition in DynamicPartitionPruner

2015-08-24 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710436#comment-14710436
 ] 

Navis commented on HIVE-11515:
--

[~sseth] If it's already fixed, there seems to be no need to commit this. Thanks!

 Still some possible race condition in DynamicPartitionPruner
 

 Key: HIVE-11515
 URL: https://issues.apache.org/jira/browse/HIVE-11515
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Tez
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-11515.1.patch.txt


 Even after HIVE-9976, I could still see a race condition in DPP sometimes. It 
 is hard to reproduce, but it seemed related to the fact that prune() is called 
 by a thread pool. With some delay in the queue, events from fast tasks arrive 
 before prune() is called.





[jira] [Updated] (HIVE-11357) ACID enable predicate pushdown for insert-only delta file 2

2015-08-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11357:
--
Attachment: HIVE-11357.patch

 ACID enable predicate pushdown for insert-only delta file 2
 ---

 Key: HIVE-11357
 URL: https://issues.apache.org/jira/browse/HIVE-11357
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-11357.patch


 HIVE-11320 missed a case.  That fix enabled PPD for insert-only delta files 
 when a base file is present.  It won't work if only delta files are present.
 see {{OrcInputFormat.getReader(InputSplit inputSplit, Options options)}}
 which only calls {{setSearchArgument()}} if there is a base file.





[jira] [Updated] (HIVE-10215) Large IN() clauses: deep hashCode performance during optimizer pass

2015-08-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10215:
--
Fix Version/s: 1.2.0

 Large IN() clauses: deep hashCode performance during optimizer pass
 ---

 Key: HIVE-10215
 URL: https://issues.apache.org/jira/browse/HIVE-10215
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-10215.1.patch


 The logical optimizer uses several maps and sets, which are exceedingly 
 expensive for large IN() clauses: several parts of the query walk over the 
 lists without short-circuiting during hashCode(), while equals() is faster 
 because it short-circuits via less expensive operators.
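One standard mitigation for this pattern, sketched here with plain java.util types (not Hive's actual ExprNode classes), is to pay the deep hashCode of an immutable key exactly once, while leaving the short-circuiting equals() untouched:

```java
import java.util.List;

// Hypothetical sketch: an immutable wrapper whose deep hashCode is computed
// once at construction, so repeated map/set operations stop re-walking the
// large IN() list; equals() keeps its cheap short-circuiting behaviour.
final class CachedHashKey {
    private final List<?> parts;
    private final int hash;   // deep hashCode, computed exactly once

    CachedHashKey(List<?> parts) {
        this.parts = parts;
        this.hash = parts.hashCode();
    }

    @Override public int hashCode() { return hash; }

    @Override public boolean equals(Object o) {
        return o instanceof CachedHashKey
            && parts.equals(((CachedHashKey) o).parts);
    }
}
```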





[jira] [Updated] (HIVE-10163) CommonMergeJoinOperator calls WritableComparator.get() in the inner loop

2015-08-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10163:
--
Fix Version/s: 1.2.0

 CommonMergeJoinOperator calls WritableComparator.get() in the inner loop
 

 Key: HIVE-10163
 URL: https://issues.apache.org/jira/browse/HIVE-10163
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gunther Hagleitner
  Labels: JOIN, Performance
 Fix For: 1.2.0

 Attachments: HIVE-10163.1.patch, HIVE-10163.2.patch, 
 HIVE-10163.3.patch, mergejoin-comparekeys.png, mergejoin-parallel-bt.png, 
 mergejoin-parallel-lock.png


 The CommonMergeJoinOperator wastes CPU looking up the correct comparator for 
 each WritableComparable in each row.
 {code}
 @SuppressWarnings("rawtypes")
   private int compareKeys(List<Object> k1, List<Object> k2) {
     int ret = 0;
     ...
     ret = WritableComparator.get(key_1.getClass()).compare(key_1, key_2);
     if (ret != 0) {
       return ret;
     }
 }
 {code}
 !mergejoin-parallel-lock.png!
 !mergejoin-comparekeys.png!
 The slow part of that get() is deep within {{ReflectionUtils.setConf}}, where 
 it tries to use reflection to set the Comparator config for each row being 
 compared.
 !mergejoin-parallel-bt.png!
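The usual fix for this shape of problem is to hoist the factory call out of the per-row loop and memoize it per key class. A hypothetical sketch with plain java.util comparators (Hadoop's WritableComparator is deliberately not used here, so the memoization idea stands alone):

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: look the comparator up at most once per key class and
// reuse it, instead of calling an expensive, reflection-backed factory for
// every row being compared.
class ComparatorCache {
    private final Map<Class<?>, Comparator<Object>> cache = new HashMap<>();

    @SuppressWarnings({"unchecked", "rawtypes"})
    Comparator<Object> forClass(Class<?> keyClass) {
        // The real code would call its comparator factory here; the point is
        // that computeIfAbsent makes that call at most once per class.
        return cache.computeIfAbsent(keyClass,
            c -> (Comparator) Comparator.naturalOrder());
    }

    int compareKeys(List<Object> k1, List<Object> k2) {
        int n = Math.min(k1.size(), k2.size());
        for (int i = 0; i < n; i++) {
            int ret = forClass(k1.get(i).getClass()).compare(k1.get(i), k2.get(i));
            if (ret != 0) {
                return ret;    // short-circuit on the first differing key part
            }
        }
        return Integer.compare(k1.size(), k2.size());
    }
}
```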





[jira] [Comment Edited] (HIVE-11599) Add metastore command to dump its configs

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710531#comment-14710531
 ] 

Ashutosh Chauhan edited comment on HIVE-11599 at 8/25/15 3:41 AM:
--

I agree with [~ekoifman] that logging at startup is the bare minimum we can do, 
but what would be really useful is a command like {{bin/hive --metastore 
--printConf}} to print the configuration of a running metastore on the console.


was (Author: ashutoshc):
I agree with [~ekoifman]  logging at startup is bare minimum we can do, but 
really useful will be to have a command like {{bin/hive --metastore --printConf 
}} to print configuration of running metastore on console.

 Add metastore command to dump its configs
 --

 Key: HIVE-11599
 URL: https://issues.apache.org/jira/browse/HIVE-11599
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 1.0.0
Reporter: Eugene Koifman

 We should have equivalent of Hive CLI set command on Metastore (and likely 
 HS2) which can dump out all properties this particular process is running 
 with.
 cc [~thejas]
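What such a dump could produce can be sketched with plain java.util.Properties; the output shape (sorted key=value lines) is an assumption, as is any {{--printConf}} flag wiring, since neither exists in Hive yet:

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical sketch of a "--printConf" style dump: the process's effective
// configuration, sorted by key, one key=value pair per line.
class ConfDump {
    static String dump(Properties props) {
        Map<String, String> sorted = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            sorted.put(name, props.getProperty(name));
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```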





[jira] [Commented] (HIVE-11599) Add metastore command to dump its configs

2015-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710531#comment-14710531
 ] 

Ashutosh Chauhan commented on HIVE-11599:
-

I agree with [~ekoifman] that logging at startup is the bare minimum we can do, 
but what would be really useful is a command like {{bin/hive --metastore 
--printConf}} to print the configuration of a running metastore on the console.

 Add metastore command to dump its configs
 --

 Key: HIVE-11599
 URL: https://issues.apache.org/jira/browse/HIVE-11599
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 1.0.0
Reporter: Eugene Koifman

 We should have equivalent of Hive CLI set command on Metastore (and likely 
 HS2) which can dump out all properties this particular process is running 
 with.
 cc [~thejas]





[jira] [Commented] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2015-08-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710592#comment-14710592
 ] 

Lefty Leverenz commented on HIVE-11581:
---

Does this need documentation?  If so, please add a TODOC1.3 label.  (No doc 
needed for the HiveConf.java changes -- the patch just moves some parameters 
around in the file.)

* [HiveServer2 Clients | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients]

 HiveServer2 should store connection params in ZK when using dynamic service 
 discovery for simpler client connection string.
 ---

 Key: HIVE-11581
 URL: https://issues.apache.org/jira/browse/HIVE-11581
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 1.3.0, 2.0.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
 HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch


 Currently, the client needs to specify several parameters based on which an 
 appropriate connection is created with the server. In the case of dynamic 
 service discovery, when multiple HS2 instances are running, it is much more 
 usable for the server to add its config parameters to ZK, which the driver can 
 use to configure the connection, instead of the jdbc/odbc user adding those to 
 the connection string.
 However, at minimum, client will need to specify zookeeper ensemble and that 
 she wants the JDBC driver to use ZooKeeper:
 {noformat}
 beeline> !connect 
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
 {noformat} 





[jira] [Updated] (HIVE-11637) Support hive.cli.print.current.db in new CLI[beeline-cli branch]

2015-08-24 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-11637:

Attachment: HIVE-11637.1-beeline-cli.patch

 Support hive.cli.print.current.db in new CLI[beeline-cli branch]
 

 Key: HIVE-11637
 URL: https://issues.apache.org/jira/browse/HIVE-11637
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-11637.1-beeline-cli.patch








[jira] [Commented] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]

2015-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710566#comment-14710566
 ] 

Hive QA commented on HIVE-11624:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752118/HIVE-11624.1-beeline-cli.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9235 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/21/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-BEELINE-Build/21/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-BEELINE-Build-21/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752118 - PreCommit-HIVE-BEELINE-Build

 Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]
 -

 Key: HIVE-11624
 URL: https://issues.apache.org/jira/browse/HIVE-11624
 Project: Hive
  Issue Type: Sub-task
Reporter: Ke Jia
Assignee: Ke Jia
 Attachments: HIVE-11624.1-beeline-cli.patch, HIVE-11624.patch


 In the old CLI, it uses hive.cli.print.header from the Hive configuration to 
 print column headers when executing a script. We need to support the previous 
 configuration using the beeline functionality.


