Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-29 Thread Sergio Pena
Congratulations :)

On Thu, Jan 29, 2015 at 10:23 AM, Chaoyu Tang ct...@cloudera.com wrote:

 Congratulations to everyone.

 On Thu, Jan 29, 2015 at 10:05 AM, Aihua Xu a...@cloudera.com wrote:

  +1. Cong~ everyone!
 
  On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com
 wrote:
 
  Congratulations everyone !
 
  On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote:
 
  I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
  O'Malley and Prasanth Jayachandran have been elected to the Hive Project
  Management Committee. Please join me in congratulating these new PMC
  members!
 
  Thanks.
 
  - Carl
 
 
 
 
  --
  Philippe Kernévez
 
 
 
  Technical Director (Switzerland),
  pkerne...@octo.com
  +41 79 888 33 32
 
  Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
  OCTO Technology http://www.octo.com
 
 
 



[jira] [Commented] (HIVE-8136) Reduce table locking

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297131#comment-14297131
 ] 

Hive QA commented on HIVE-8136:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695251/HIVE-8136.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.metastore.TestMetaStoreAuthorization.testMetaStoreAuthorization
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2571/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2571/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2571/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695251 - PreCommit-HIVE-TRUNK-Build

 Reduce table locking
 

 Key: HIVE-8136
 URL: https://issues.apache.org/jira/browse/HIVE-8136
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: HIVE-8136.patch


 When using ZK for concurrency control, some statements require an exclusive 
 table lock even when they are atomic, such as setting a table's location.
 This JIRA is to analyze the scope of statements like ALTER TABLE and see if 
 we can reduce the locking required.
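
The analysis this JIRA asks for amounts to mapping each statement kind to the weakest lock that keeps it safe. A toy Java sketch of that mapping follows; the statement strings, the enum, and the particular lock choices are all illustrative, not Hive's actual lock manager:

```java
public class LockModeSketch {
    enum LockMode { SHARED, EXCLUSIVE }

    // Illustrative mapping from statement kind to the table lock it needs.
    // Which statements can actually be downgraded is exactly the open
    // question in HIVE-8136; these choices are examples, not Hive behavior.
    public static LockMode requiredLock(String statement) {
        switch (statement) {
            case "ALTER TABLE ... SET LOCATION":
                return LockMode.EXCLUSIVE; // moves data out from under concurrent readers
            case "ALTER TABLE ... ADD PARTITION":
                return LockMode.SHARED;    // existing partitions remain readable
            default:
                return LockMode.EXCLUSIVE; // conservative default for unanalyzed statements
        }
    }

    public static void main(String[] args) {
        System.out.println(requiredLock("ALTER TABLE ... ADD PARTITION")); // SHARED
    }
}
```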



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296967#comment-14296967
 ] 

Xuefu Zhang commented on HIVE-9487:
---

+1

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296975#comment-14296975
 ] 

Xuefu Zhang commented on HIVE-9487:
---

The patch here seems to contain more changes (such as the itests/hive-jmh 
folder) than are shown on RB. [~vanzin], could you check?

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.

2015-01-29 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297139#comment-14297139
 ] 

Thejas M Nair commented on HIVE-9500:
-

[~aihuaxu] Thanks for clarifying that the input is actually in Avro format. 
In what part of query processing does the error happen? Is it during some 
internal serialization that LazySimpleSerde is getting used? If that is the 
case, maybe we should fix Hive to use a better serde there.



 Support nested structs over 24 levels.
 --

 Key: HIVE-9500
 URL: https://issues.apache.org/jira/browse/HIVE-9500
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
  Labels: SerDe

 A customer has a deeply nested Avro structure and is receiving the following 
 error when performing queries:
 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
 org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
 supported for LazySimpleSerde is 23 Unable to work with level 24
 Currently we support up to 24 levels of nested structs when 
 hive.serialization.extend.nesting.levels is set to true, while customers 
 have a requirement to support more than that. 
 It would be better to make the supported number of levels configurable, or to 
 remove the limit completely (i.e., support any number of levels). 
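
Making the limit configurable, as the description suggests, reduces to a simple bounds check. A minimal sketch follows; the class, the method, and the -1-means-unlimited convention are assumptions for illustration, not Hive's actual configuration handling:

```java
public class NestingLimitSketch {
    // Default mirrors the 24-level limit mentioned in the error above.
    static final int DEFAULT_MAX_DEPTH = 24;

    // Returns true if a struct nested `depth` levels deep can be handled.
    // A configured limit of -1 is taken here to mean "unlimited", matching
    // the JIRA's suggestion to remove the cap entirely (an assumption).
    public static boolean isDepthSupported(int depth, int configuredMax) {
        if (configuredMax < 0) {
            return true; // unlimited
        }
        return depth <= configuredMax;
    }

    public static void main(String[] args) {
        System.out.println(isDepthSupported(24, DEFAULT_MAX_DEPTH)); // true
        System.out.println(isDepthSupported(25, DEFAULT_MAX_DEPTH)); // false
        System.out.println(isDepthSupported(100, -1));               // true
    }
}
```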





Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-01-29 Thread Sergio Pena


 On Jan. 28, 2015, 5:23 a.m., cheng xu wrote:
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java,
   lines 218-225
  https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line218
 
  How about the following code snippet?
  recordConsumer.startField(fieldName, i);
  if (i % 2 == 0) {
    writeValue(keyElement, keyInspector, fieldType);
  } else {
    writeValue(valueElement, valueInspector, fieldType);
  }
  recordConsumer.endField(fieldName, i);
 
 Sergio Pena wrote:
 The parquet API does not accept NULL values inside startField/endField. 
 This is why I had to check if key or value are nulls before starting the 
 field. Or in the change I did, we check for null values everywhere, and then 
 call startField/endField on writePrimitive. You can see the 
 TestDataWritableWriter.testMapType() method for how null values should work. 
 
 This is how Parquet adds the map value 'key3 = null':
 
 startGroup();
   startField("key", 0);
     addString("key3");
   endField("key", 0);
 endGroup();
 
 cheng xu wrote:
 I see. Parquet does not handle null values well in the 
 startField/endField methods. Sorry for missing this point.
 How about this?
 {noformat}
 Object elementValue = (i % 2 == 0) ? keyElement : valueElement;
 if (elementValue == null) {
   // the field can not be NULL
   continue;
 }
 ObjectInspector elementInspector = (i % 2 == 0) ? keyInspector : valueInspector;
 recordConsumer.startField(fieldName, i);
 writeValue(elementValue, elementInspector, fieldType);
 recordConsumer.endField(fieldName, i);
 {noformat}

Thanks Ferd. I liked your change.
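
The contract discussed above — never call startField/endField around a null — can be sketched with a stand-in consumer that simply records its calls. Everything below (the RecordingConsumer class, method names, string-only values) is illustrative, not Parquet's or Hive's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class MapWriteSketch {
    // Stand-in for Parquet's RecordConsumer: records the call sequence so
    // the startField/endField pairing can be inspected afterwards.
    static class RecordingConsumer {
        final List<String> calls = new ArrayList<>();
        void startField(String name, int index) { calls.add("start:" + name + ":" + index); }
        void addString(String value) { calls.add("add:" + value); }
        void endField(String name, int index) { calls.add("end:" + name + ":" + index); }
    }

    // Write one map entry: key at field index 0, value at field index 1.
    // A null key or value is skipped entirely, since null is not allowed
    // inside a startField/endField pair.
    static void writeEntry(RecordingConsumer consumer, String key, String value) {
        writeField(consumer, "key", 0, key);
        writeField(consumer, "value", 1, value);
    }

    private static void writeField(RecordingConsumer consumer, String name, int index, String v) {
        if (v == null) {
            return; // omit the field altogether rather than write a null
        }
        consumer.startField(name, index);
        consumer.addString(v);
        consumer.endField(name, index);
    }

    // Returns the recorded call sequence for one map entry.
    public static List<String> describe(String key, String value) {
        RecordingConsumer c = new RecordingConsumer();
        writeEntry(c, key, value);
        return c.calls;
    }

    public static void main(String[] args) {
        // For 'key3 = null' only the key field is emitted, matching the
        // startField("key", 0) / addString("key3") / endField("key", 0)
        // sequence shown earlier in the thread.
        System.out.println(describe("key3", null));
    }
}
```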


On Jan. 28, 2015, 5:23 a.m., cheng xu wrote:
  Hi Sergio, thank you for your changes. I have a few new comments left.
 
 Sergio Pena wrote:
 Thanks Ferd for your comments.
 I'll wait for your feedback before updating the other changes to see how 
 we can make this code better.
 
 cheng xu wrote:
 Thank you for your reply. I prefer the previous one because it matches 
 the method name better. For the WriteMap method, I have one little suggestion 
 for the code. Please see my inline comments.

Thanks Ferd for your comments. I uploaded another patch.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review69935
---


On Jan. 27, 2015, 6:47 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30281/
 ---
 
 (Updated Jan. 27, 2015, 6:47 p.m.)
 
 
 Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
 
 
 Bugs: HIVE-9333
 https://issues.apache.org/jira/browse/HIVE-9333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch moves the ParquetHiveSerDe.serialize() implementation to 
 DataWritableWriter class in order to save time in materializing data on 
 serialize().
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
  ea4109d358f7c48d1e2042e5da299475de4a0a29 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
  060b1b722d32f3b2f88304a1a73eb249e150294b 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
  41b5f1c3b0ab43f734f8a211e3e03d5060c75434 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
 a693aff18516d133abf0aae4847d3fe00b9f1c96 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
  667d3671547190d363107019cd9a2d105d26d336 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 007a665529857bcec612f638a157aa5043562a15 
   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30281/diff/
 
 
 Testing
 ---
 
 The tests run were the following:
 
 1. JMH (Java microbenchmark)
 
 This benchmark called parquet serialize/write methods using text writable 
 objects. 
 
  Class.method                 Before Change (ops/s)   After Change (ops/s)
  -------------------------------------------------------------------------
  ParquetHiveSerDe.serialize    19,113                  249,528   (19x speed increase)
  DataWritableWriter.write       5,033                    5,201   (3.34% speed increase)
 
 
 2. Write 20 million rows (~1GB file) from Text to Parquet
 
 I wrote a ~1Gb 

Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-01-29 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/
---

(Updated Jan. 29, 2015, 5:12 p.m.)


Review request for hive, Ryan Blue, cheng xu, and Dong Chen.


Changes
---

Patch with Ferd's recommended changes.
I am also checking for the inspector category on writeValue() in order to pass 
the correct object inspector to the rest of the methods. I think this makes the 
other methods cleaner.


Bugs: HIVE-9333
https://issues.apache.org/jira/browse/HIVE-9333


Repository: hive-git


Description
---

This patch moves the ParquetHiveSerDe.serialize() implementation to 
DataWritableWriter class in order to save time in materializing data on 
serialize().


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 
ea4109d358f7c48d1e2042e5da299475de4a0a29 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
 060b1b722d32f3b2f88304a1a73eb249e150294b 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 
41b5f1c3b0ab43f734f8a211e3e03d5060c75434 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
 e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
a693aff18516d133abf0aae4847d3fe00b9f1c96 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
 667d3671547190d363107019cd9a2d105d26d336 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
007a665529857bcec612f638a157aa5043562a15 
  serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/30281/diff/


Testing
---

The tests run were the following:

1. JMH (Java microbenchmark)

This benchmark called parquet serialize/write methods using text writable 
objects. 

Class.method                 Before Change (ops/s)   After Change (ops/s)
-------------------------------------------------------------------------
ParquetHiveSerDe.serialize    19,113                  249,528   (19x speed increase)
DataWritableWriter.write       5,033                    5,201   (3.34% speed increase)


2. Write 20 million rows (~1GB file) from Text to Parquet

I wrote a ~1GB file in Textfile format, then converted it to Parquet format 
using the following statement:
CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;

Time (s) it took to write the whole file BEFORE changes: 93.758 s
Time (s) it took to write the whole file AFTER changes: 83.903 s

This is about a 10% speed increase.


Thanks,

Sergio Pena



[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]

2015-01-29 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297147#comment-14297147
 ] 

Brock Noland commented on HIVE-9211:


Hi [~chengxiang li],

When we [moved over to a non-SNAPSHOT version of 
Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab]
 I used a tarball which does not include the hadoop jars in the spark assembly. 
This can be seen by extracting the spark assembly in [our 
tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz].
 As such you'll see I put {{\$\{test.hive.hadoop.classpath\}}} in the classpath 
to replace the missing hadoop jars from the spark assembly.

As such, I have the following questions:

# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version 
of spark? We can have a snapshot on the spark branch but not on trunk.



 Research on build mini HoS cluster on YARN for unit test[Spark Branch]
 --

 Key: HIVE-9211
 URL: https://issues.apache.org/jira/browse/HIVE-9211
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5
 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, 
 HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch


 HoS on YARN is a common use case in production environments, so we should 
 enable unit tests for it. 





[jira] [Comment Edited] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]

2015-01-29 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297147#comment-14297147
 ] 

Brock Noland edited comment on HIVE-9211 at 1/29/15 5:14 PM:
-

Hi [~chengxiang li],

When we [moved over to a non-SNAPSHOT version of 
Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab]
 I used a tarball which does not include the hadoop jars in the spark assembly. 
This can be seen by extracting the spark assembly in [our 
tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.2.0-bin-hadoop2-without-hive.tgz].
 As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath 
to replace the missing hadoop jars from the spark assembly.

As such, I have the following questions:

# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version 
of spark? We can have a snapshot on the spark branch but not on trunk.




was (Author: brocknoland):
Hi [~chengxiang li],

When we [moved over to a none SNAPSHOT version of 
Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab]
 I used a tarball which does not include the hadoop jars in the spark assembly. 
This can been seen by extracting the spark assembly in [our 
tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz].
 As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath 
to replace the missing hadoop jars from the spark assembly.

As such, I have the following questions:

# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version 
of spark? We can have a snapshot on the spark branch but not on trunk.



 Research on build mini HoS cluster on YARN for unit test[Spark Branch]
 --

 Key: HIVE-9211
 URL: https://issues.apache.org/jira/browse/HIVE-9211
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5
 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, 
 HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch


 HoS on YARN is a common use case in production environments, so we should 
 enable unit tests for it. 





[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9489:

Attachment: HIVE-9489.3.patch

Incorporated additional changes suggested by Lefty. 
Lefty, even for fixes and feedback, it's better late than never! Thanks again 
for looking into it!


 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated into the 
 wikidoc.





[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9489:

Attachment: HIVE-9489.3.patch

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch, HIVE-9489.3.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated into the 
 wikidoc.





[jira] [Comment Edited] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]

2015-01-29 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297147#comment-14297147
 ] 

Brock Noland edited comment on HIVE-9211 at 1/29/15 5:13 PM:
-

Hi [~chengxiang li],

When we [moved over to a non-SNAPSHOT version of 
Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab]
 I used a tarball which does not include the hadoop jars in the spark assembly. 
This can be seen by extracting the spark assembly in [our 
tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz].
 As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath 
to replace the missing hadoop jars from the spark assembly.

As such, I have the following questions:

# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version 
of spark? We can have a snapshot on the spark branch but not on trunk.




was (Author: brocknoland):
Hi [~chengxiang li],

When we [moved over to a none SNAPSHOT version of 
Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab]
 I used a tarball which does not include the hadoop jars in the spark assembly. 
This can been seen by extracting the spark assembly in [our 
tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz].
 As such you'll see I put {{\$\{test.hive.hadoop.classpath\}}} in the classpath 
to replace the missing hadoop jars from the spark assembly.

As such, I have the following questions:

# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version 
of spark? We can have a snapshot on the spark branch but not on trunk.



 Research on build mini HoS cluster on YARN for unit test[Spark Branch]
 --

 Key: HIVE-9211
 URL: https://issues.apache.org/jira/browse/HIVE-9211
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5
 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, 
 HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch


 HoS on YARN is a common use case in production environments, so we should 
 enable unit tests for it. 





[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9489:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk.
Thanks for the reviews [~leftylev] [~ashutoshc]

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch, HIVE-9489.3.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated into the 
 wikidoc.





[jira] [Commented] (HIVE-9451) Add max size of column dictionaries to ORC metadata

2015-01-29 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297178#comment-14297178
 ] 

Owen O'Malley commented on HIVE-9451:
-

We should also record the stripe size that was used as the file was written. 
That gives a strict upper bound on the amount of memory used by the writer.

 Add max size of column dictionaries to ORC metadata
 ---

 Key: HIVE-9451
 URL: https://issues.apache.org/jira/browse/HIVE-9451
 Project: Hive
  Issue Type: Improvement
Reporter: Owen O'Malley

 To predict the amount of memory required to read an ORC file, we need to know 
 the size of the dictionaries for the columns that we are reading. I propose 
 adding the number of bytes for each column's dictionary to the stripe's 
 column statistics. The file's column statistics would have the maximum 
 dictionary size for each column.
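
The proposed file-level statistic is a per-column maximum over the stripe-level dictionary sizes. A minimal sketch of that aggregation follows; the array-based representation is illustrative, not ORC's actual metadata layout:

```java
import java.util.Arrays;

public class DictStatsSketch {
    // stripes[s][c] = dictionary size in bytes for column c in stripe s.
    // The file-level statistic is the per-column maximum over all stripes,
    // an upper bound on dictionary memory needed to read any single stripe.
    public static long[] maxDictionarySizes(long[][] stripes) {
        int columns = stripes[0].length;
        long[] max = new long[columns];
        for (long[] stripe : stripes) {
            for (int c = 0; c < columns; c++) {
                max[c] = Math.max(max[c], stripe[c]);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        long[][] stripes = { {100, 4096}, {250, 1024}, {90, 2048} };
        System.out.println(Arrays.toString(maxDictionarySizes(stripes))); // [250, 4096]
    }
}
```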





[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9489:

Attachment: (was: HIVE-9489.3.patch)

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated into the 
 wikidoc.





[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296582#comment-14296582
 ] 

Hive QA commented on HIVE-9473:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694865/HIVE-9473.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7407 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2564/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2564/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2564/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}


ATTACHMENT ID: 12694865 - PreCommit-HIVE-TRUNK-Build

 sql std auth should disallow built-in udfs that allow any java methods to be 
 called
 ---

 Key: HIVE-9473
 URL: https://issues.apache.org/jira/browse/HIVE-9473
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-9473.1.patch


 As mentioned in HIVE-8893, some UDFs can be used to execute arbitrary Java 
 methods. This should be disallowed when SQL standard authorization is used.
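
One simple way to enforce such a restriction is a denylist consulted at authorization time. The sketch below is a hedged illustration: the class is hypothetical, the UDF names are examples of reflective built-ins, and the actual check and list live in Hive's SQL standard authorizer, not here:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class UdfDenylistSketch {
    // Illustrative denylist of built-in UDFs that can invoke arbitrary
    // Java methods; the real set is defined by Hive, not this sketch.
    static final Set<String> DISALLOWED = new HashSet<>(
            Arrays.asList("reflect", "reflect2", "java_method"));

    // Returns true if the named UDF may be used under the policy.
    public static boolean isAllowed(String udfName) {
        return !DISALLOWED.contains(udfName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("concat"));      // true
        System.out.println(isAllowed("java_method")); // false
    }
}
```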





[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Attachment: HIVE-9471.3.patch

Here's the same, with the LENGTH stream suppressed.

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.





[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Status: Open  (was: Patch Available)

Modifying the comment for the second null-check.

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.





[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Attachment: (was: HIVE-9471.3.patch)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive







[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-29 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296631#comment-14296631
 ] 

Jesus Camacho Rodriguez commented on HIVE-9431:
---

[~jpullokkaran], the test failures are not related to the patch (HIVE-9498). I 
think it can go in. Thanks

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, 
 HIVE-9431.03.patch, HIVE-9431.patch








[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Status: Open  (was: Patch Available)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive







[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Status: Patch Available  (was: Open)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive







[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Moustafa Aboul Atta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Moustafa Aboul Atta updated HIVE-9507:
--
Attachment: parial_log.log

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9507.1.patch.txt, parial_log.log


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder, however if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.
 Here's the partial log:
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Status: Running (Executing on YARN cluster with App id 
 application_1422267635031_0618)
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: -/-
 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 
 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+5)/13   
 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-1)/13
 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-3)/13
 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-5)/13
 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-9)/13
 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-10)/13   
 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-11)/13   
 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-12)/13   
 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-13)/13   
 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-15)/13   
 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-16)/13   
 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-17)/13   
 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-18)/13   
 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+0,-19)/13   
 2015-01-29 10:15:51,331 ERROR SessionState 
 (SessionState.java:printError(833)) - Status: Failed
 2015-01-29 10:15:51,417 ERROR SessionState 
 (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
 vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
 taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while 

[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Moustafa Aboul Atta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Moustafa Aboul Atta updated HIVE-9507:
--
Description: 
I have tweets stored with Avro on HDFS, using the default Twitter status 
(tweet) schema.
There's an object called entities that contains arrays of structs.
When I run
 
{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception attached as partial_log.log; however, if I add
{{WHERE entities.media IS NOT NULL}}
it runs perfectly.



  was:
I have tweets stored with avro on hdfs with the default twitter status (tweet) 
schema.
There's an object called entities that contains arrays of structs.
When I run
 
{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception found hereunder, however if I add
{{WHERE entities.media IS NOT NULL}}
it runs perfectly.

Here's the partial log:
2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) - 
Status: Running (Executing on YARN cluster with App id 
application_1422267635031_0618)

2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: -/-  
2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0/13 
2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0/13 
2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0/13 
2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0/13 
2015-01-29 10:15:12,354 INFO  log.PerfLogger 
(PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 
from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+5)/13 
2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6)/13 
2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6)/13 
2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-1)/13  
2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-3)/13  
2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-5)/13  
2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-6)/13  
2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-6)/13  
2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-8)/13  
2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-8)/13  
2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-8)/13  
2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-8)/13  
2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-9)/13  
2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-10)/13 
2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-11)/13 
2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-12)/13 
2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-13)/13 
2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-14)/13 
2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-14)/13 
2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-15)/13 
2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-16)/13 
2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-17)/13 
2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+6,-18)/13 
2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) - 
Map 1: 0(+0,-19)/13 
2015-01-29 10:15:51,331 ERROR SessionState (SessionState.java:printError(833)) 
- Status: Failed
2015-01-29 10:15:51,417 ERROR SessionState (SessionState.java:printError(833)) 
- Vertex failed, vertexName=Map 1, vertexId=vertex_1422267635031_0618_1_00, 
diagnostics=[Task failed, taskId=task_1422267635031_0618_1_00_00, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row 
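The null tolerance being requested can be illustrated with a small standalone 
sketch. Note this is a hypothetical illustration of the desired behavior, not 
Hive's actual INLINE implementation: when the array of structs is null, the 
exploding step should emit zero rows instead of failing the task.

```java
import java.util.ArrayList;
import java.util.List;

public class NullTolerantInline {

    // Hypothetical illustration of the null tolerance being requested for
    // INLINE: a null array of structs yields zero rows instead of an error.
    // This is not Hive's actual implementation.
    static List<Object[]> inline(List<Object[]> arrayOfStructs) {
        if (arrayOfStructs == null) {
            return new ArrayList<>();  // tolerate null: no rows, no failure
        }
        return arrayOfStructs;         // one output row per struct
    }

    public static void main(String[] args) {
        List<Object[]> media = new ArrayList<>();
        media.add(new Object[]{"photo", "http://example.com/pic1"});

        System.out.println(inline(null).size());   // 0
        System.out.println(inline(media).size());  // 1
    }
}
```

With a guard like this, the {{WHERE entities.media IS NOT NULL}} workaround 
would no longer be necessary.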

[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Moustafa Aboul Atta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Moustafa Aboul Atta updated HIVE-9507:
--
Priority: Minor  (was: Major)

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9507.1.patch.txt


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder, however if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.

[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296557#comment-14296557
 ] 

Lefty Leverenz commented on HIVE-9489:
--

+1

... although two more quibbles in "@return true if the udf is deterministic" 
could be fixed: "udf" should be "UDF", and "non deterministic" should be 
"non-deterministic". Sorry I missed them the first time.

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch


 It is not clearly described, when a UDF should be marked as deterministic, 
 stateful or distinctLike.
 Adding javadoc for now. This information should also be incorporated in the 
 wikidoc.
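To make the annotation's intent concrete, here is a minimal self-contained 
sketch. The {{UDFType}} annotation below is a local stand-in mirroring the 
elements of {{org.apache.hadoop.hive.ql.udf.UDFType}} (deterministic, stateful, 
distinctLike), declared locally so the example runs without Hive on the 
classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class UdfTypeSketch {

    // Local stand-in mirroring org.apache.hadoop.hive.ql.udf.UDFType,
    // so this sketch compiles without Hive on the classpath.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface UDFType {
        boolean deterministic() default true;
        boolean stateful() default false;
        boolean distinctLike() default false;
    }

    // A UDF like rand() returns a different value per call, so it is marked
    // non-deterministic to keep the optimizer from caching or folding it.
    @UDFType(deterministic = false)
    static class RandLikeUdf { }

    public static void main(String[] args) {
        UDFType t = RandLikeUdf.class.getAnnotation(UDFType.class);
        System.out.println(t.deterministic()); // false
        System.out.println(t.stateful());      // false
    }
}
```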





[jira] [Commented] (HIVE-8136) Reduce table locking

2015-01-29 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296607#comment-14296607
 ] 

Ferdinand Xu commented on HIVE-8136:


Currently, the following ALTER TABLE write types acquire an exclusive lock 
(DDL_EXCLUSIVE):
RENAMECOLUMN
ADDCLUSTERSORTCOLUMN
ADDFILEFORMAT
DROPPROPS
REPLACECOLS
ARCHIVE
UNARCHIVE
ALTERPROTECTMODE
ALTERPARTITIONPROTECTMODE
ALTERLOCATION
DROPPARTITION
RENAMEPARTITION
ADDSKEWEDBY
ALTERSKEWEDLOCATION
ALTERBUCKETNUM
ALTERPARTITION
ADDCOLS
RENAME
TRUNCATE
MERGEFILES

The following acquire a shared lock:
  ADDSERDE
  ADDPARTITION
  ADDSERDEPROPS
  ADDPROPS

These acquire no lock:
  COMPACT
  TOUCH

Changing the table structure requires an exclusive lock, and most of the cases 
above take one because they modify the table or partition structure. For adding 
clustering and sort columns, however, a shared lock should suffice, for the 
following reason:
{quote}
The CLUSTERED BY and SORTED BY creation commands do not affect how data is 
inserted into a table – only how it is read. This means that users must be 
careful to insert data correctly by specifying the number of reducers to be 
equal to the number of buckets, and using CLUSTER BY and SORT BY commands in 
their query.
{quote}
For property changes that don't affect the structure of the table, I think we 
could use no lock; that can be a follow-up JIRA. Any thoughts about it, 
[~brocknoland]?
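The proposed three-level policy above can be sketched as a small standalone 
program. The enum and method names here are hypothetical illustrations, not 
Hive's actual internal classes: structural changes default to exclusive, the 
four additive operations take a shared lock, and COMPACT/TOUCH take none.

```java
import java.util.EnumSet;
import java.util.Set;

public class AlterTableLockSketch {

    enum LockMode { EXCLUSIVE, SHARED, NONE }

    // A representative subset of the ALTER TABLE write types listed above;
    // hypothetical names, not Hive's actual enum.
    enum AlterOp {
        RENAMECOLUMN, ADDCLUSTERSORTCOLUMN, ALTERLOCATION, DROPPARTITION,
        ADDSERDE, ADDPARTITION, ADDSERDEPROPS, ADDPROPS,
        COMPACT, TOUCH
    }

    // Additive operations that don't change layout: shared lock is enough.
    private static final Set<AlterOp> SHARED = EnumSet.of(
        AlterOp.ADDSERDE, AlterOp.ADDPARTITION,
        AlterOp.ADDSERDEPROPS, AlterOp.ADDPROPS);

    // Operations that acquire no lock at all.
    private static final Set<AlterOp> NONE = EnumSet.of(
        AlterOp.COMPACT, AlterOp.TOUCH);

    static LockMode lockFor(AlterOp op) {
        if (NONE.contains(op)) return LockMode.NONE;
        if (SHARED.contains(op)) return LockMode.SHARED;
        return LockMode.EXCLUSIVE;  // structural changes default to exclusive
    }

    public static void main(String[] args) {
        System.out.println(lockFor(AlterOp.ALTERLOCATION)); // EXCLUSIVE
        System.out.println(lockFor(AlterOp.ADDPROPS));      // SHARED
        System.out.println(lockFor(AlterOp.TOUCH));         // NONE
    }
}
```

Relaxing a case (e.g. ADDCLUSTERSORTCOLUMN to SHARED) would then be a one-line 
change to the sets rather than scattered conditionals.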

 Reduce table locking
 

 Key: HIVE-8136
 URL: https://issues.apache.org/jira/browse/HIVE-8136
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu

 When using ZK for concurrency control, some statements require an exclusive 
 table lock when they are atomic. Such as setting a tables location.
 This JIRA is to analyze the scope of statements like ALTER TABLE and see if 
 we can reduce the locking required.





[jira] [Created] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Moustafa Aboul Atta (JIRA)
Moustafa Aboul Atta created HIVE-9507:
-

 Summary: Make LATERAL VIEW inline(expression) mytable tolerant 
to nulls
 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta


I have tweets stored with avro on hdfs with the default twitter status (tweet) 
schema.
There's an object called entities that contains arrays of structs.
When I run
 
{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception found hereunder, however if I add
{{WHERE entities.media IS NOT NULL}}
it runs perfectly.

Here's the partial log:
2015-01-29 10:15:51,331 ERROR SessionState (SessionState.java:printError(833)) 
- Status: Failed
2015-01-29 10:15:51,417 ERROR SessionState (SessionState.java:printError(833)) 
- Vertex failed, vertexName=Map 1, vertexId=vertex_1422267635031_0618_1_00, 
diagnostics=[Task failed, taskId=task_1422267635031_0618_1_00_00, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row 
{metadata:{result_type:recent,iso_language_code:it},query_id:4013,data_source_type:1,search_date:1422300806,created_at:Mon
 Jan 26 04:31:11 + 

[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Assignee: Navis
  Status: Patch Available  (was: Open)

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
Assignee: Navis
 Attachments: HIVE-9507.1.patch.txt


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder; however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.
 Here's the partial log:
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Status: Running (Executing on YARN cluster with App id 
 application_1422267635031_0618)
 2015-01-29 10:15:00,879 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: -/-
 2015-01-29 10:15:02,526 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:05,551 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:08,722 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0/13   
 2015-01-29 10:15:12,354 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 
 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
 2015-01-29 10:15:12,354 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+5)/13   
 2015-01-29 10:15:12,557 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:15,691 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6)/13   
 2015-01-29 10:15:18,892 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-1)/13
 2015-01-29 10:15:19,094 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-3)/13
 2015-01-29 10:15:19,304 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-5)/13
 2015-01-29 10:15:19,507 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:22,641 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-6)/13
 2015-01-29 10:15:24,704 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:27,735 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:30,957 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:34,095 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-8)/13
 2015-01-29 10:15:35,138 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-9)/13
 2015-01-29 10:15:36,503 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-10)/13   
 2015-01-29 10:15:36,710 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-11)/13   
 2015-01-29 10:15:37,971 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-12)/13   
 2015-01-29 10:15:39,800 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-13)/13   
 2015-01-29 10:15:41,175 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:44,414 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-14)/13   
 2015-01-29 10:15:45,447 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-15)/13   
 2015-01-29 10:15:47,413 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-16)/13   
 2015-01-29 10:15:47,618 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-17)/13   
 2015-01-29 10:15:49,568 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+6,-18)/13   
 2015-01-29 10:15:51,099 INFO  SessionState (SessionState.java:printInfo(824)) 
 - Map 1: 0(+0,-19)/13   
 2015-01-29 10:15:51,331 ERROR SessionState 
 (SessionState.java:printError(833)) - Status: Failed
 2015-01-29 10:15:51,417 ERROR SessionState 
 (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, 
 vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, 
 taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 
 failed, info=[Error: Failure while running task:java.lang.RuntimeException: 
 
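A null-tolerant inline() would presumably treat a null array the same as an empty one, emitting no rows instead of failing mid-task. A minimal, self-contained sketch of that guard in plain Java (class and method names are hypothetical; this is not the actual HIVE-9507 patch):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class InlineSketch {
    // Expand an array of structs (modeled here as plain Objects) into one
    // output row per element. A null array yields zero rows instead of an
    // exception, which is the tolerance the reporter is asking for.
    static List<Object> inline(List<?> array) {
        if (array == null) {
            return Collections.emptyList();  // tolerate a NULL entities.media
        }
        List<Object> rows = new ArrayList<>();
        for (Object struct : array) {
            if (struct != null) {
                rows.add(struct);  // each non-null struct becomes a row
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(inline(null).size());              // prints 0
        System.out.println(inline(List.of("a", "b")).size()); // prints 2
    }
}
```

With such a guard in place, the `WHERE entities.media IS NOT NULL` workaround would no longer be needed.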

[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls

2015-01-29 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.1.patch.txt

 Make LATERAL VIEW inline(expression) mytable tolerant to nulls
 

 Key: HIVE-9507
 URL: https://issues.apache.org/jira/browse/HIVE-9507
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, UDF
Affects Versions: 0.14.0
 Environment: hdp 2.2
 Windows server 2012 R2 64-bit
Reporter: Moustafa Aboul Atta
 Attachments: HIVE-9507.1.patch.txt


 I have tweets stored with avro on hdfs with the default twitter status 
 (tweet) schema.
 There's an object called entities that contains arrays of structs.
 When I run
  
 {{SELECT mytable.*}}
 {{FROM tweets}}
 {{LATERAL VIEW INLINE(entities.media) mytable}}
 I get the exception found hereunder; however, if I add
 {{WHERE entities.media IS NOT NULL}}
 it runs perfectly.

Re: Review Request 30254: HIVE-9444

2015-01-29 Thread Jesús Camacho Rodríguez


 On Jan. 28, 2015, 10:45 p.m., John Pullokkaran wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java, 
  line 155
  https://reviews.apache.org/r/30254/diff/1/?file=833501#file833501line155
 
  How are we carrying forward the assumptions? 
  ClusterBy, DistributeBy, OrderBy... is empty?

OrderBy, SortBy, and ClusterBy are covered by the condition that if there is an 
RS in the tree, its order is empty (line 154 in the patched code).

DistributeBy is covered by the condition that if there is an RS in the tree, its 
partitionCols are empty (lines 150,151 in the patched code).


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30254/#review70102
---


On Jan. 25, 2015, 1:11 p.m., Jesús Camacho Rodríguez wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30254/
 ---
 
 (Updated Jan. 25, 2015, 1:11 p.m.)
 
 
 Review request for hive and John Pullokkaran.
 
 
 Bugs: HIVE-9444
 https://issues.apache.org/jira/browse/HIVE-9444
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-9444
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java 
 c9848dacd1a02db321583c2b91eb6d7317c295ff 
 
 Diff: https://reviews.apache.org/r/30254/diff/
 
 
 Testing
 ---
 
 Existing tests.
 
 
 Thanks,
 
 Jesús Camacho Rodríguez
 




[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Status: Patch Available  (was: Open)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Attachment: HIVE-9471.3.patch

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.





[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Fix Version/s: (was: 0.15.0)
   1.2.0

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during 
 rolling upgrades of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is 5 minutes, say), and then update 
 the server.
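The proposed lifetime check can be sketched in a few lines of plain Java. This is an illustrative model only, with a clock injected for determinism; the class and method names are hypothetical and this is not the HIVE-9508 patch:

```java
public class LifetimeClient {
    private final long lifetimeMs;  // 0 disables the lifetime (the default)
    private long connectedAtMs;
    private int connects = 0;

    LifetimeClient(long lifetimeMs, long nowMs) {
        this.lifetimeMs = lifetimeMs;
        connect(nowMs);
    }

    private void connect(long nowMs) {
        connectedAtMs = nowMs;
        connects++;  // a real client would open a fresh Thrift socket here
    }

    // Called before each metastore RPC: if the socket has outlived its
    // configured lifetime, drop it and reconnect (via the VIP, so the new
    // connection can land on an already-updated server).
    void beforeCall(long nowMs) {
        if (lifetimeMs > 0 && nowMs - connectedAtMs >= lifetimeMs) {
            connect(nowMs);
        }
    }

    int connectCount() {
        return connects;
    }

    public static void main(String[] args) {
        LifetimeClient c = new LifetimeClient(5 * 60_000L, 0);  // 5 min lifetime
        c.beforeCall(60_000);       // 1 min in: connection kept
        c.beforeCall(6 * 60_000);   // 6 min in: lifetime exceeded, reconnect
        System.out.println(c.connectCount());  // prints 2
    }
}
```

Because the check only fires on the next call, existing connections drain naturally within one lifetime, which is what makes the "wait, then update the server" rolling-upgrade procedure work.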





[jira] [Updated] (HIVE-8136) Reduce table locking

2015-01-29 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8136:
---
Status: Patch Available  (was: In Progress)

 Reduce table locking
 

 Key: HIVE-8136
 URL: https://issues.apache.org/jira/browse/HIVE-8136
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: HIVE-8136.patch


 When using ZK for concurrency control, some statements require an exclusive 
 table lock even though they are atomic, such as setting a table's location.
 This JIRA is to analyze the scope of statements like ALTER TABLE and see if 
 we can reduce the locking required.





[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HIVE-9508:
---
Attachment: HIVE-9508.1.patch

Attaching basic patch. The connection lifetime is disabled by default so 
existing users should not be affected.

 MetaStore client socket connection should have a lifetime
 -

 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
  Labels: metastore, rolling_upgrade
 Fix For: 1.2.0

 Attachments: HIVE-9508.1.patch


 Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
 server until the connection is closed or there is a problem. I would like to 
 introduce the concept of a MetaStore client socket lifetime: the MS client 
 will reconnect once the socket lifetime is reached. This will help during 
 rolling upgrades of the Metastore.
 When there are multiple Metastore servers behind a VIP (load balancer), it is 
 easy to take one server out of rotation, wait 10+ minutes for all existing 
 connections to die down (if the lifetime is 5 minutes, say), and then update 
 the server.





[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296653#comment-14296653
 ] 

Hive QA commented on HIVE-9489:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695157/HIVE-9489.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7405 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2565/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2565/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2565/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695157 - PreCommit-HIVE-TRUNK-Build

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch


 It is not clearly described, when a UDF should be marked as deterministic, 
 stateful or distinctLike.
 Adding javadoc for now. This information should also be incorporated in the 
 wikidoc.





[jira] [Created] (HIVE-9508) MetaStore client socket connection should have a lifetime

2015-01-29 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-9508:
--

 Summary: MetaStore client socket connection should have a lifetime
 Key: HIVE-9508
 URL: https://issues.apache.org/jira/browse/HIVE-9508
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Metastore
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.15.0


Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore 
server until the connection is closed or there is a problem. I would like to 
introduce the concept of a MetaStore client socket lifetime: the MS client will 
reconnect once the socket lifetime is reached. This will help during rolling 
upgrades of the Metastore.

When there are multiple Metastore servers behind a VIP (load balancer), it is 
easy to take one server out of rotation, wait 10+ minutes for all existing 
connections to die down (if the lifetime is 5 minutes, say), and then update 
the server.





[jira] [Commented] (HIVE-9416) Get rid of Extract Operator

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296707#comment-14296707
 ] 

Hive QA commented on HIVE-9416:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695166/HIVE-9416.6.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2566/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2566/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2566/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695166 - PreCommit-HIVE-TRUNK-Build

 Get rid of Extract Operator
 ---

 Key: HIVE-9416
 URL: https://issues.apache.org/jira/browse/HIVE-9416
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, 
 HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.patch


 {{Extract Operator}} has been there for legacy reasons, but there is no 
 functionality it provides which can't be provided by {{Select Operator}}. 
 Instead of having two operators, one being a subset of the other, we should 
 just get rid of {{Extract}} and simplify our codebase.





[jira] [Updated] (HIVE-8136) Reduce table locking

2015-01-29 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8136:
---
Attachment: HIVE-8136.patch

 Reduce table locking
 

 Key: HIVE-8136
 URL: https://issues.apache.org/jira/browse/HIVE-8136
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: HIVE-8136.patch


 When using ZK for concurrency control, some statements require an exclusive 
 table lock even though they are atomic, such as setting a table's location.
 This JIRA is to analyze the scope of statements like ALTER TABLE and see if 
 we can reduce the locking required.





[jira] [Commented] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297013#comment-14297013
 ] 

Hive QA commented on HIVE-9471:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695233/HIVE-9471.3.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2569/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2569/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2569/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695233 - PreCommit-HIVE-TRUNK-Build

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.





[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296780#comment-14296780
 ] 

Hive QA commented on HIVE-5472:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695173/HIVE-5472.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7406 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2567/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2567/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2567/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695173 - PreCommit-HIVE-TRUNK-Build

 support a simple scalar which returns the current timestamp
 ---

 Key: HIVE-5472
 URL: https://issues.apache.org/jira/browse/HIVE-5472
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: N Campbell
Assignee: Jason Dere
 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, 
 HIVE-5472.4.patch


 ISO SQL has two forms of these functions, LOCALTIMESTAMP and CURRENT_TIMESTAMP, 
 where the former is a TIMESTAMP WITHOUT TIME ZONE and the latter a TIMESTAMP 
 WITH TIME ZONE.
 select cast ( unix_timestamp() as timestamp ) from T
 Implement a function which computes LOCALTIMESTAMP, i.e. the current timestamp 
 in the user's session time zone.
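The session-zone semantics being requested can be illustrated with java.time. This is illustrative only, not Hive code: one fixed instant yields different LOCAL-TIMESTAMP-style wall-clock values depending on the session's zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class LocalTimestampSketch {
    // Render an instant as a zone-less wall-clock value in the session's
    // time zone, which is what a LOCALTIMESTAMP-style function returns.
    static LocalDateTime localTimestamp(Instant now, ZoneId sessionZone) {
        return LocalDateTime.ofInstant(now, sessionZone);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-01-29T18:00:00Z");
        // Same instant, two sessions, two local values:
        System.out.println(localTimestamp(now, ZoneId.of("America/Los_Angeles")));
        // prints 2015-01-29T10:00
        System.out.println(localTimestamp(now, ZoneId.of("UTC")));
        // prints 2015-01-29T18:00
    }
}
```

The `cast(unix_timestamp() as timestamp)` workaround quoted above collapses this distinction, which is why a dedicated function is being proposed.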



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9252) Linking custom SerDe jar to table definition.

2015-01-29 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu reassigned HIVE-9252:
--

Assignee: Ferdinand Xu

 Linking custom SerDe jar to table definition.
 -

 Key: HIVE-9252
 URL: https://issues.apache.org/jira/browse/HIVE-9252
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Niels Basjes
Assignee: Ferdinand Xu

 In HIVE-6047 the option was created to attach a jar file to the 
 definition of a function. (See: [Language Manual DDL: Permanent 
 Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions]
  )
 I propose to add something similar that can be used when defining an external 
 table that relies on a custom SerDe (I expect to usually only need the 
 Deserializer).
 Something like this:
 {code}
 CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
 ...
 STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] 
 [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
 {code}
 Using this you can define (and share!) a Hive table on top of a custom 
 file format without needing the IT operations people to deploy a custom 
 SerDe jar file on all nodes.





Re: [VOTE] Apache Hive 1.0 Release Candidate 1

2015-01-29 Thread Alan Gates
+1.  Downloaded it, checked out the signatures, did a build, checked 
there were no snapshot dependencies.


Alan.


Vikram Dixit K mailto:vikram.di...@gmail.com
January 27, 2015 at 14:28
Apache Hive 1.0 Release Candidate 1 is available here:
http://people.apache.org/~vikram/hive/apache-hive-1.0-rc1/

Maven artifacts are available here:
https://repository.apache.org/content/repositories/orgapachehive-1020/

Source tag for RC1 is at:
http://svn.apache.org/repos/asf/hive/branches/branch-1.0/

Voting will conclude in 72 hours.

Hive PMC Members: Please test and vote.

Thanks

Vikram.






[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297200#comment-14297200
 ] 

Marcelo Vanzin commented on HIVE-9487:
--

Hmm, weird. I definitely did not touch those. Maybe some merge issue, I'll take 
a look.

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file

2015-01-29 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297216#comment-14297216
 ] 

Alan Gates commented on HIVE-9317:
--

I think we're ok without this in 1.0.  It's already been in several releases.  
If we need to roll a new RC, I agree this should go in.

 move Microsoft copyright to NOTICE file
 ---

 Key: HIVE-9317
 URL: https://issues.apache.org/jira/browse/HIVE-9317
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.15.0, 1.0.0

 Attachments: hive-9327.txt


 There are a set of files that still have the Microsoft copyright notices. 
 Those notices need to be moved into NOTICES and replaced with the standard 
 Apache headers.
 {code}
 ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java
 ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java
 {code}





[jira] [Commented] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml

2015-01-29 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297297#comment-14297297
 ] 

Ashutosh Chauhan commented on HIVE-8307:


I will put up a patch to remove comments from serde properties. 

 null character in columns.comments schema property breaks jobconf.xml
 -

 Key: HIVE-8307
 URL: https://issues.apache.org/jira/browse/HIVE-8307
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0, 0.13.1
Reporter: Carl Laird

 It would appear that the fix for 
 https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character 
 to show up in job config xml files:
 I get the following when trying to insert into an elasticsearch backed table:
 [Fatal Error] :336:51: Character reference "&#0" is an invalid XML character.
 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 Exception in thread "main" java.lang.RuntimeException: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
 at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
 at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
 at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
 at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; 
 Character reference "&#0" is an invalid XML character.
 at 
 com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
 at 
 com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
 at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
 ... 11 more
 Execution failed with exit status: 1
 Line 336 of jobconf.xml:
 <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>
 See 
 https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ 
 for more discussion.
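The failure is a consequence of XML 1.0 itself: the NUL character is not a legal XML character, even as a character reference, so any conforming parser rejects a jobconf.xml containing it. A minimal, self-contained sketch using the JDK's built-in parser (independent of Hive) reproduces the parse error:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

public class NullCharRef {
    public static void main(String[] args) throws Exception {
        // A property value made of NUL character references, as in the broken jobconf.xml.
        String xml = "<property><name>columns.comments</name><value>&#0;&#0;</value></property>";
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            System.out.println("parsed");
        } catch (SAXParseException e) {
            // XML 1.0 forbids U+0000, so parsing fails before Hive ever sees the value.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```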





[jira] [Commented] (HIVE-9392) JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296867#comment-14296867
 ] 

Hive QA commented on HIVE-9392:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695176/HIVE-9392.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7407 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2568/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2568/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2568/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695176 - PreCommit-HIVE-TRUNK-Build

 JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to 
 column names having duplicated fqColumnName
 

 Key: HIVE-9392
 URL: https://issues.apache.org/jira/browse/HIVE-9392
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth Jayachandran
Priority: Critical
 Fix For: 0.15.0

 Attachments: HIVE-9392.1.patch, HIVE-9392.2.patch


 In JoinStatsRule.process the join column statistics are stored in the HashMap 
 joinedColStats. The key used, which is ColStatistics.fqColName, is 
 duplicated between join columns in the same vertex; as a result distinctVals 
 ends up having duplicated values, which negatively affects the join 
 cardinality estimation.
 The duplicate keys are usually named KEY.reducesinkkey0.
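The collision described above is ordinary HashMap behavior: a later put for the same key silently replaces the earlier entry. A minimal sketch (the NDV numbers here are hypothetical) illustrates why duplicated fqColName keys corrupt the lookup:

```java
import java.util.HashMap;
import java.util.Map;

public class DuplicateFqColName {
    public static void main(String[] args) {
        // Two different join columns that both end up with the internal
        // name "KEY.reducesinkkey0" collide in the map.
        Map<String, Long> joinedColStats = new HashMap<>();
        joinedColStats.put("KEY.reducesinkkey0", 1000L); // NDV of the first join column
        joinedColStats.put("KEY.reducesinkkey0", 5L);    // second column silently overwrites it
        System.out.println(joinedColStats.size());       // prints 1: one entry survives, not two
        // Any later NDV lookup for the first column now returns the wrong value,
        // skewing the estimated join cardinality.
    }
}
```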





[jira] [Updated] (HIVE-9500) Support nested structs over 24 levels.

2015-01-29 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-9500:
---
Description: 
Customer has a deeply nested Avro structure and is receiving the following error 
when performing queries.

15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
supported for LazySimpleSerde is 23 Unable to work with level 24
Currently we support up to 24 levels of nested structs when 
hive.serialization.extend.nesting.levels is set to true, while customers 
require support for more than that. 

It would be better to make the supported levels configurable or to remove the 
limit completely (i.e., support any number of levels). 

  was:
Currently we support up to 24 levels of nested structs when 
hive.serialization.extend.nesting.levels is set to true, while the customers 
have the requirement to support more than that. 

It would be better to make the supported levels configurable or completely 
removed (i.e., we can support any number of levels). 


 Support nested structs over 24 levels.
 --

 Key: HIVE-9500
 URL: https://issues.apache.org/jira/browse/HIVE-9500
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
  Labels: SerDe

 Customer has a deeply nested Avro structure and is receiving the following 
 error when performing queries.
 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
 org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
 supported for LazySimpleSerde is 23 Unable to work with level 24
 Currently we support up to 24 levels of nested structs when 
 hive.serialization.extend.nesting.levels is set to true, while customers 
 require support for more than that. 
 It would be better to make the supported levels configurable or to remove the 
 limit completely (i.e., support any number of levels). 





[jira] [Updated] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]

2015-01-29 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-9211:

Attachment: HIVE-9211.4-spark.patch

[~brocknoland], what code base is our current Spark installation built upon? I 
ran into some inconsistent jar dependency issues in testing, and updating the 
Spark installation based on the latest Spark branch-1.2 code fixed them. The 
Hive spark branch now depends on Hadoop 2.6.0 for hadoop2, so we may need to 
build Spark consistently with it.

 Research on build mini HoS cluster on YARN for unit test[Spark Branch]
 --

 Key: HIVE-9211
 URL: https://issues.apache.org/jira/browse/HIVE-9211
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5
 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, 
 HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch


 HoS on YARN is a common use case in product environment, we'd better enable 
 unit test for this case. 





[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297050#comment-14297050
 ] 

Hive QA commented on HIVE-9211:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695277/HIVE-9211.4-spark.patch

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 7404 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/691/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/691/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-691/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase

Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-29 Thread Chaoyu Tang
Congratulations to everyone.

On Thu, Jan 29, 2015 at 10:05 AM, Aihua Xu a...@cloudera.com wrote:

 +1. Cong~ everyone!

 On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com wrote:

 Congratulations everyone !

 On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
 Management Committee. Please join me in congratulating these new PMC
 members!

 Thanks.

 - Carl




 --
 Philippe Kernévez



 Directeur technique (Suisse),
 pkerne...@octo.com
 +41 79 888 33 32

 Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
 OCTO Technology http://www.octo.com





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-29 Thread Philippe Kernévez
Congratulations everyone !

On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
 Management Committee. Please join me in congratulating these new PMC
 members!

 Thanks.

 - Carl




-- 
Philippe Kernévez



Directeur technique (Suisse),
pkerne...@octo.com
+41 79 888 33 32

Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.com


Re: Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]

2015-01-29 Thread Chao Sun


 On Jan. 29, 2015, 4:20 a.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java,
   line 295
  https://reviews.apache.org/r/30388/diff/1/?file=839499#file839499line295
 
  childrenBackupTasks or backChildrenTasks? I suggest more consistent 
  variable/method names. Since the noun is task, I suggest child.

Good point. Will change.


 On Jan. 29, 2015, 4:20 a.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java, line 110
  https://reviews.apache.org/r/30388/diff/1/?file=839504#file839504line110
 
  In Spark branch - For Spark

Will change.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30388/#review70150
---


On Jan. 29, 2015, 1:05 a.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30388/
 ---
 
 (Updated Jan. 29, 2015, 1:05 a.m.)
 
 
 Review request for hive and Xuefu Zhang.
 
 
 Bugs: HIVE-9103
 https://issues.apache.org/jira/browse/HIVE-9103
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch adds a backup task to the map join task. The backup task, which uses 
 common join, will be triggered
 in case the map join task fails.
 
 Note that, no matter how many map joins there are in the SparkTask, we will 
 only generate one backup task.
 This means that if the original task failed at the very last map join, the 
 whole task will be re-executed.
 
 The handling of backup task is a little bit different from what MR does, 
 mostly because we convert JOIN to
 MAPJOIN during the operator plan optimization phase, at which time no 
 task/work exist yet. In the patch, we
 cloned the whole operator tree before the JOIN operator is converted. The 
 operator tree will be processed
 and generate a separate work tree for a separate backup SparkTask.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
  69004dc 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java
  79c3e02 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java 
 d57ceff 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
  9ff47c7 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java
  6e0ac38 
   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 773cfbd 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java
  f7586a4 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 3a7477a 
   ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 
   ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a 
 
 Diff: https://reviews.apache.org/r/30388/diff/
 
 
 Testing
 ---
 
 auto_join25.q
 
 
 Thanks,
 
 Chao Sun
 




Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-29 Thread Aihua Xu
+1. Cong~ everyone!
On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com wrote:

 Congratulations everyone !
 
 On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote:
 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen 
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project 
 Management Committee. Please join me in congratulating these new PMC 
 members!
 
 Thanks.
 
 - Carl
 
 
 
 -- 
 Philippe Kernévez
 
 
 
 Directeur technique (Suisse), 
 pkerne...@octo.com
 +41 79 888 33 32
 
 Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
 OCTO Technology http://www.octo.com



[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file

2015-01-29 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297319#comment-14297319
 ] 

Owen O'Malley commented on HIVE-9317:
-

+1 to not rolling a new RC specifically for this one. I just want to make sure 
it goes into any new RCs.

 move Microsoft copyright to NOTICE file
 ---

 Key: HIVE-9317
 URL: https://issues.apache.org/jira/browse/HIVE-9317
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.15.0, 1.0.0

 Attachments: hive-9327.txt


 There are a set of files that still have the Microsoft copyright notices. 
 Those notices need to be moved into NOTICES and replaced with the standard 
 Apache headers.
 {code}
 ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java
 ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java
 {code}





Re: Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]

2015-01-29 Thread Chao Sun

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30388/
---

(Updated Jan. 29, 2015, 6:51 p.m.)


Review request for hive and Xuefu Zhang.


Changes
---

Regenerated golden files (mostly plan changes due to the backup task), and added 
auto_join25.q. Also addressed initial feedback from the review board.


Bugs: HIVE-9103
https://issues.apache.org/jira/browse/HIVE-9103


Repository: hive-git


Description
---

This patch adds a backup task to the map join task. The backup task, which uses 
common join, will be triggered
in case the map join task fails.

Note that, no matter how many map joins there are in the SparkTask, we will 
only generate one backup task.
This means that if the original task failed at the very last map join, the 
whole task will be re-executed.

The handling of backup task is a little bit different from what MR does, mostly 
because we convert JOIN to
MAPJOIN during the operator plan optimization phase, at which time no task/work 
exist yet. In the patch, we
cloned the whole operator tree before the JOIN operator is converted. The 
operator tree will be processed
and generate a separate work tree for a separate backup SparkTask.
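The fallback described above can be sketched as a try-primary-then-backup pattern. This is an illustrative sketch under stated assumptions, not the actual SparkTask code: the Task interface and method names here are hypothetical.

```java
public class BackupTaskSketch {
    // Hypothetical stand-in for a Hive task; the real SparkTask API differs.
    interface Task {
        boolean execute();
    }

    // If the optimized map-join task fails, fall back to the single common-join
    // backup task, re-executing the whole stage.
    static boolean runWithBackup(Task mapJoinTask, Task commonJoinBackup) {
        if (mapJoinTask.execute()) {
            return true; // optimized plan succeeded, backup never runs
        }
        return commonJoinBackup.execute();
    }

    public static void main(String[] args) {
        boolean ok = runWithBackup(() -> false /* map join fails */, () -> true);
        System.out.println(ok); // prints true: the backup absorbed the failure
    }
}
```

Note the design trade-off stated above: because only one backup task exists per SparkTask, a failure at the last of several map joins still re-executes the entire stage.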


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties f583aaf 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
 69004dc 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java
 79c3e02 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java 
d57ceff 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 9ff47c7 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java
 6e0ac38 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
773cfbd 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java 
f7586a4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 
  ql/src/test/results/clientpositive/spark/auto_join0.q.out 7f8eb63 
  ql/src/test/results/clientpositive/spark/auto_join1.q.out b640b9d 
  ql/src/test/results/clientpositive/spark/auto_join10.q.out f01765c 
  ql/src/test/results/clientpositive/spark/auto_join11.q.out 69c10e6 
  ql/src/test/results/clientpositive/spark/auto_join12.q.out bc763ed 
  ql/src/test/results/clientpositive/spark/auto_join13.q.out 935ebf5 
  ql/src/test/results/clientpositive/spark/auto_join14.q.out 830314e 
  ql/src/test/results/clientpositive/spark/auto_join15.q.out 780540b 
  ql/src/test/results/clientpositive/spark/auto_join16.q.out f705339 
  ql/src/test/results/clientpositive/spark/auto_join17.q.out 3144db6 
  ql/src/test/results/clientpositive/spark/auto_join19.q.out f2b0140 
  ql/src/test/results/clientpositive/spark/auto_join2.q.out 2424cca 
  ql/src/test/results/clientpositive/spark/auto_join20.q.out 9258f3b 
  ql/src/test/results/clientpositive/spark/auto_join21.q.out aa8f6dd 
  ql/src/test/results/clientpositive/spark/auto_join22.q.out d49dda9 
  ql/src/test/results/clientpositive/spark/auto_join23.q.out a179d87 
  ql/src/test/results/clientpositive/spark/auto_join24.q.out cfb076e 
  ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a 
  ql/src/test/results/clientpositive/spark/auto_join26.q.out 58821e9 
  ql/src/test/results/clientpositive/spark/auto_join28.q.out d30133b 
  ql/src/test/results/clientpositive/spark/auto_join29.q.out 780c6cb 
  ql/src/test/results/clientpositive/spark/auto_join3.q.out 54e24f3 
  ql/src/test/results/clientpositive/spark/auto_join30.q.out 4c832e2 
  ql/src/test/results/clientpositive/spark/auto_join31.q.out 5980814 
  ql/src/test/results/clientpositive/spark/auto_join32.q.out 9629f53 
  ql/src/test/results/clientpositive/spark/auto_join4.q.out 3366f75 
  ql/src/test/results/clientpositive/spark/auto_join5.q.out b6d8798 
  ql/src/test/results/clientpositive/spark/auto_join8.q.out 5b6cc80 
  ql/src/test/results/clientpositive/spark/auto_join9.q.out 6daf348 
  ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
  ql/src/test/results/clientpositive/spark/auto_join_nulls.q.out 1f37c75 
  ql/src/test/results/clientpositive/spark/auto_join_stats.q.out 1fa1a74 
  ql/src/test/results/clientpositive/spark/auto_join_stats2.q.out c6473d3 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 3d465db 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_10.q.out fe7b96d 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_11.q.out f4e889a 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out c358721 
  

[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification

2015-01-29 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297317#comment-14297317
 ] 

Sushanth Sowmyan commented on HIVE-9501:


+1, works correctly now. I'm able to see the dbname and tablename and to 
filter on them appropriately.

The test failure reported is unrelated; will go ahead and commit.

 DbNotificationListener doesn't include dbname in create database notification 
 and does not include tablename in create table notification
 -

 Key: HIVE-9501
 URL: https://issues.apache.org/jira/browse/HIVE-9501
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-9501.patch


 This is a hold over from the JMS stuff, where create database is sent on the 
 general topic and create table on the db topic.  But since 
 DbNotificationListener isn't for JMS, keeping this semantic doesn't make 
 sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification

2015-01-29 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9501:
---
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

 DbNotificationListener doesn't include dbname in create database notification 
 and does not include tablename in create table notification
 -

 Key: HIVE-9501
 URL: https://issues.apache.org/jira/browse/HIVE-9501
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 1.2.0

 Attachments: HIVE-9501.patch


 This is a hold over from the JMS stuff, where create database is sent on the 
 general topic and create table on the db topic.  But since 
 DbNotificationListener isn't for JMS, keeping this semantic doesn't make 
 sense.





[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification

2015-01-29 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297321#comment-14297321
 ] 

Sushanth Sowmyan commented on HIVE-9501:


Committed to trunk. Thanks, Alan.

(Doc note: No docs required for this as well - DbNotificationListener is 
internal, and this is adding additional info that is needed for filters on it to 
work correctly.)

 DbNotificationListener doesn't include dbname in create database notification 
 and does not include tablename in create table notification
 -

 Key: HIVE-9501
 URL: https://issues.apache.org/jira/browse/HIVE-9501
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 1.2.0

 Attachments: HIVE-9501.patch


 This is a hold over from the JMS stuff, where create database is sent on the 
 general topic and create table on the db topic.  But since 
 DbNotificationListener isn't for JMS, keeping this semantic doesn't make 
 sense.





[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-29 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9103:
---
Attachment: HIVE-9103.2-spark.patch

Regenerated golden files (mostly plan change due to the backup task), and added 
auto_join25.q. Also addressed initial feedback from review board.

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch, HIVE-9103.2-spark.patch


 In MR, backup task can be executed if the original task, which probably 
 contains certain (join) optimization fails. This JIRA is to track this topic 
 for Spark. We need to determine if we need this and implement if necessary.
 This is a followup of HIVE-9099.





[jira] [Updated] (HIVE-9454) Test failures due to new Calcite version

2015-01-29 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9454:
--
Attachment: HIVE-9454.02.patch

New patch based on Julien's, containing also the changes on golden files using 
Calcite-1.0.0-RC2.

 Test failures due to new Calcite version
 

 Key: HIVE-9454
 URL: https://issues.apache.org/jira/browse/HIVE-9454
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
 Attachments: HIVE-9454.02.patch, HIVE-9454.1.patch


 A bunch of failures have started appearing in patches which seem unrelated. I 
 am thinking we've picked up a new version of Calcite. E.g.:
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2488/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_auto_join12/
 {noformat}
 Running: diff -a 
 /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/auto_join12.q.out
  
 /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/auto_join12.q.out
 32c32
  < $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
  ---
  > $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
  35c35
  < $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src 
  ---
  > $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src 
  39c39
  < $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
  ---
  > $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
  54c54
  < $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src 
  ---
  > $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src 
 {noformat}





[jira] [Commented] (HIVE-9445) Revert HIVE-5700 - enforce single date format for partition column storage

2015-01-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297384#comment-14297384
 ] 

Sergey Shelukhin commented on HIVE-9445:


Hmm, looks like I missed the Java part of the change that was not merely a code 
move. Partition spec validation should not have been thrown out... Let me file 
a JIRA to add it back.
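(As an aside, a minimal sketch of what the restored validation needs to catch - the class and method names below are invented for illustration, not the actual HIVE-9509 code: a DATE partition value should parse strictly as yyyy-MM-dd, so values like NOT_DATE and 20150121 from the dump below would be rejected up front.)

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class PartitionSpecCheck {
    // Strict ISO parse: "uuuu" with STRICT resolution rejects lenient dates.
    private static final DateTimeFormatter ISO_DATE =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    /** Returns true iff the value is acceptable for a DATE partition column. */
    public static boolean isValidDatePartitionValue(String value) {
        try {
            LocalDate.parse(value, ISO_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidDatePartitionValue("2015-01-23")); // true
        System.out.println(isValidDatePartitionValue("NOT_DATE"));   // false
        System.out.println(isValidDatePartitionValue("20150121"));   // false
    }
}
```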

 Revert HIVE-5700 - enforce single date format for partition column storage
 --

 Key: HIVE-9445
 URL: https://issues.apache.org/jira/browse/HIVE-9445
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0, 0.14.1
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Blocker
 Fix For: 0.15.0

 Attachments: HIVE-9445.1.patch, HIVE-9445.1.patch


 HIVE-5700 has the following issues:
 * HIVE-8730 - fails mysql upgrades
 * Does not upgrade all metadata, e.g. {{PARTITIONS.PART_NAME}} See comments 
 in HIVE-5700.
 * Completely corrupts postgres, see below.
 With a postgres metastore on 0.12, I executed the following:
 {noformat}
 CREATE TABLE HIVE5700_DATE_PARTED (line string) PARTITIONED BY (ddate date);
 CREATE TABLE HIVE5700_STRING_PARTED (line string) PARTITIONED BY (ddate 
 string);
 ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='NOT_DATE');
 ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150121');
 ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150122');
 ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='2015-01-23');
 ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='NOT_DATE');
 ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150121');
 ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150122');
 ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='2015-01-23');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_DATE_PARTED PARTITION (ddate='NOT_DATE');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_DATE_PARTED PARTITION (ddate='20150121');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_DATE_PARTED PARTITION (ddate='20150122');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_DATE_PARTED PARTITION (ddate='2015-01-23');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_STRING_PARTED PARTITION (ddate='NOT_DATE');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_STRING_PARTED PARTITION (ddate='20150121');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_STRING_PARTED PARTITION (ddate='20150122');
 LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE 
 HIVE5700_STRING_PARTED PARTITION (ddate='2015-01-23');
 hive> show partitions HIVE5700_DATE_PARTED;
 OK
 ddate=20150121
 ddate=20150122
 ddate=2015-01-23
 ddate=NOT_DATE
 Time taken: 0.052 seconds, Fetched: 4 row(s)
 hive> show partitions HIVE5700_STRING_PARTED;
 OK
 ddate=20150121
 ddate=20150122
 ddate=2015-01-23
 ddate=NOT_DATE
 Time taken: 0.051 seconds, Fetched: 4 row(s)
 {noformat}
 I then took a dump of the database named {{postgres-pre-upgrade.sql}} and the 
 data in the dump looks good:
 {noformat}
 [root@hive5700-1-1 ~]# egrep -A9 '^COPY "PARTITIONS"|^COPY 
 "PARTITION_KEY_VALS"' postgres-pre-upgrade.sql 
 COPY "PARTITIONS" ("PART_ID", "CREATE_TIME", "LAST_ACCESS_TIME", "PART_NAME", 
 "SD_ID", "TBL_ID") FROM stdin;
 3  1421943647  0   ddate=NOT_DATE    6   2
 4  1421943647  0   ddate=20150121    7   2
 5  1421943648  0   ddate=20150122    8   2
 6  1421943664  0   ddate=NOT_DATE    9   3
 7  1421943664  0   ddate=20150121    10  3
 8  1421943665  0   ddate=20150122    11  3
 9  1421943694  0   ddate=2015-01-23  12  2
 10 1421943695  0   ddate=2015-01-23  13  3
 \.
 --
 COPY "PARTITION_KEY_VALS" ("PART_ID", "PART_KEY_VAL", "INTEGER_IDX") FROM 
 stdin;
 3  NOT_DATE    0
 4  20150121    0
 5  20150122    0
 6  NOT_DATE    0
 7  20150121    0
 8  20150122    0
 9  2015-01-23  0
 10 2015-01-23  0
 \.
 {noformat}
 I then upgraded to 0.13 and subsequently upgraded the MS with the following 
 command: {{schematool -dbType postgres -upgradeSchema -verbose}}
 The file {{postgres-post-upgrade.sql}} is the post-upgrade db dump. As you 
 can see the data is completely corrupt.
 {noformat}
 [root@hive5700-1-1 ~]# egrep -A9 '^COPY "PARTITIONS"|^COPY 
 "PARTITION_KEY_VALS"' postgres-post-upgrade.sql 
 COPY "PARTITIONS" ("PART_ID", "CREATE_TIME", "LAST_ACCESS_TIME", "PART_NAME", 
 "SD_ID", "TBL_ID") FROM stdin;
 3 1421943647  0   ddate=NOT_DATE  6   2
 4 1421943647  0   ddate=20150121  7   2
 5 

[jira] [Created] (HIVE-9509) Restore partition spec validation removed by HIVE-9445

2015-01-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-9509:
--

 Summary: Restore partition spec validation removed by HIVE-9445
 Key: HIVE-9509
 URL: https://issues.apache.org/jira/browse/HIVE-9509
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Commented] (HIVE-9273) Add option to fire metastore event on insert

2015-01-29 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297398#comment-14297398
 ] 

Sushanth Sowmyan commented on HIVE-9273:



a) I like that you changed the return type to FireResponseType from void - that 
allows for future growth if we need to ACK anything.
b) In the FireEventRequest thrift definition, I wondered about whether 
tableName should really be optional, but I think that is important for future 
listener events which might not map to table events exactly. But reasoning 
along that line, shouldn't dbName also be optional? We could have 
warehouse-level events we might want to fire.
c) Given that FireEventRequestData data in FireEventRequest is marked as 
optional, I think there should be a null-guard in 
HiveMetaStore.fire_listener_event when switching on 
rqst.getData().getSetField().
d) This can be tackled as a separate bug, but we should fire a FireEventRequest 
from HCatalog appends as well.

[~ashutoshc], could I have a backup review on the changes this patch makes to 
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java? To me, the changes 
look reasonable, but I'm unsure whether this is exhaustive in all the places we 
would need to change to ensure we trigger this event for new files/data being 
added to a table which does not result in a metadata change (i.e., append cases).
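A minimal sketch of the null-guard in point (c) - the types here are simplified stand-ins for the generated Thrift classes, not the real Hive metastore API:

```java
public class FireEventGuard {
    // Simplified stand-ins for the generated Thrift types; names are illustrative only.
    enum EventKind { INSERT_DATA }

    static class FireEventRequestData {
        EventKind setField;
        EventKind getSetField() { return setField; }
    }

    static class FireEventRequest {
        FireEventRequestData data;                // optional in the IDL, so it may be null
        FireEventRequestData getData() { return data; }
    }

    /** Describes the event, guarding the optional field before switching on it. */
    public static String describe(FireEventRequest rqst) {
        FireEventRequestData data = rqst.getData();
        if (data == null || data.getSetField() == null) {
            // Without this guard, rqst.getData().getSetField() NPEs on an empty request.
            return "no event data";
        }
        switch (data.getSetField()) {
            case INSERT_DATA: return "insert";
            default:          return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new FireEventRequest())); // no event data
    }
}
```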

 Add option to fire metastore event on insert
 

 Key: HIVE-9273
 URL: https://issues.apache.org/jira/browse/HIVE-9273
 Project: Hive
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-9273.patch


 HIVE-9271 adds the ability for the client to request firing metastore events. 
  This can be used in the MoveTask to fire events when an insert is done that 
 does not add partitions to a table.





[jira] [Moved] (HIVE-9510) Throwing null point exception , when get join distinct row count from RelMdUtil.java class

2015-01-29 Thread Julian Hyde (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde moved CALCITE-578 to HIVE-9510:
---

Affects Version/s: (was: 1.0.0-incubating)
 Workflow: no-reopen-closed, patch-avail  (was: jira)
  Key: HIVE-9510  (was: CALCITE-578)
  Project: Hive  (was: Calcite)

 Throwing null point exception , when get join distinct row count from 
 RelMdUtil.java class
 --

 Key: HIVE-9510
 URL: https://issues.apache.org/jira/browse/HIVE-9510
 Project: Hive
  Issue Type: Bug
Reporter: asko
Assignee: Julian Hyde
 Attachments: log3_cbo5


 Setting log level in logging.properties file as following:
 handlers=java.util.logging.ConsoleHandler
 .level=INFO
 org.apache.calcite.plan.RelOptPlanner.level=ALL
 java.util.logging.ConsoleHandler.level=ALL
 Running Q3 in TPCH-full after modifying it in order to test join reorder,
 but the run failed.
 QL:
 set  hive.cbo.enable=true;
 --ANALYZE TABLE customer COMPUTE STATISTICS for columns;
 --ANALYZE TABLE orders COMPUTE STATISTICS for columns;
 --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;
 --Q3
 -- the query
 select 
   l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
 o_shippriority 
 from 
   lineitem l join orders o 
 on l.l_orderkey = o.o_orderkey
   join customer c
 on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
 where 
   o_orderdate < '1995-03-15' and l_shipdate < '1995-03-15' 
 group by l_orderkey, o_orderdate, o_shippriority 
 order by revenue desc, o_orderdate 
 limit 10;
 LOG:
 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner 
 fireRule
 FINE: call#15: Apply rule [FilterProjectTransposeRule] to 
 [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
 '1995-03-15')), 
 rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveFilter#138
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveProject#139
 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner 
 notifyTransformation
 FINE: call#15: Rule FilterProjectTransposeRule arguments 
 [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
 '1995-03-15')), 
 rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
  produced HiveProject#139
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HepRelVertex#140
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveProject#141
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HepRelVertex#142
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
 Foreign Key relation:
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
 HiveJoin(condition=[=($0, $4)], joinType=[inner])
   HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
 l_shipdate=[$10])
  HiveFilter(condition=[<($10, '1995-03-15')])
   HiveTableScan(table=[[default.lineitem]])
   HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
 o_shippriority=[$7])
  HiveFilter(condition=[<($4, '1995-03-15')])
   HiveTableScan(table=[[default.orders]])
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign 
 Key join:
   fkSide = 1
   FKInfo:FKInfo(rowCount=1.00,ndv=-1.00)
   PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00)
   isPKSideSimple:false
   NDV Scaling Factor:1.00
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
 Foreign Key relation:
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
 HiveJoin(condition=[=($8, $5)], joinType=[inner])
   HiveJoin(condition=[=($0, $4)], joinType=[inner])
 HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
 l_shipdate=[$10])
    HiveFilter(condition=[<($10, '1995-03-15')])
 HiveTableScan(table=[[default.lineitem]])
 HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
 o_shippriority=[$7])
    HiveFilter(condition=[<($4, '1995-03-15')])
 HiveTableScan(table=[[default.orders]])
   HiveProject(c_custkey=[$0], c_mktsegment=[$6])
 HiveFilter(condition=[=($6, 'BUILDING')])
   HiveTableScan(table=[[default.customer]])
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign 
 Key join:
   fkSide = 1
   FKInfo:FKInfo(rowCount=1.00,ndv=-1.00)
   PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00)
   isPKSideSimple:false
   NDV Scaling Factor:1.00
 Jan 29, 2015 

[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp

2015-01-29 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297406#comment-14297406
 ] 

Jason Dere commented on HIVE-5472:
--

Looks like these test failures have been failing in other precommit runs as 
well. Doesn't look to be related.

 support a simple scalar which returns the current timestamp
 ---

 Key: HIVE-5472
 URL: https://issues.apache.org/jira/browse/HIVE-5472
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: N Campbell
Assignee: Jason Dere
 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, 
 HIVE-5472.4.patch


 ISO-SQL has two forms of these functions:
 local and current timestamp, where the former is a TIMESTAMP WITHOUT TIME ZONE 
 and the latter WITH TIME ZONE.
 select cast ( unix_timestamp() as timestamp ) from T
 Implement a function which computes LOCAL TIMESTAMP, which would be the 
 current timestamp for the user's session time zone.
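To illustrate the distinction the description draws - a sketch using java.time, not the proposed UDF itself: both values denote the same instant in the session zone, but only the CURRENT_TIMESTAMP-style one carries the zone offset.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;

public class SessionTimestamps {
    /** LOCALTIMESTAMP-style value: wall-clock time in the session zone, no zone attached. */
    public static LocalDateTime localTimestamp(Instant now, ZoneId sessionZone) {
        return LocalDateTime.ofInstant(now, sessionZone);
    }

    /** CURRENT_TIMESTAMP-style value: same instant, but the zone offset is retained. */
    public static OffsetDateTime currentTimestamp(Instant now, ZoneId sessionZone) {
        return now.atZone(sessionZone).toOffsetDateTime();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-01-29T12:00:00Z");
        ZoneId zone = ZoneId.of("America/Los_Angeles");
        System.out.println(localTimestamp(now, zone));    // 2015-01-29T04:00
        System.out.println(currentTimestamp(now, zone));  // 2015-01-29T04:00-08:00
    }
}
```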





[jira] [Commented] (HIVE-9510) Throwing null point exception , when get join distinct row count from RelMdUtil.java class

2015-01-29 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297412#comment-14297412
 ] 

Julian Hyde commented on HIVE-9510:
---

I moved this from CALCITE to HIVE because even though the error comes from 
Calcite code, it should be investigated as a Hive issue. The likely cause is 
that Hive did not set up its metadata provider correctly.
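A hypothetical illustration of that failure mode (the names here are invented, not Calcite's actual API): when no metadata provider handles a query, the handler chain yields null, and unguarded arithmetic on the unboxed result throws a NullPointerException unless callers supply a fallback:

```java
public class RowCountGuard {
    /** Stand-in for a metadata query: returns null when no provider handles it. */
    interface Metadata {
        Double distinctRowCount();
    }

    /** Null-safe unboxing with a fallback; unboxing a null Double directly would NPE. */
    public static double distinctRowCountOr(Metadata md, double fallback) {
        Double ndv = md.distinctRowCount();
        return ndv == null ? fallback : ndv;
    }

    public static void main(String[] args) {
        System.out.println(distinctRowCountOr(() -> null, 1.0));  // 1.0
        System.out.println(distinctRowCountOr(() -> 42.0, 1.0));  // 42.0
    }
}
```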

 Throwing null point exception , when get join distinct row count from 
 RelMdUtil.java class
 --

 Key: HIVE-9510
 URL: https://issues.apache.org/jira/browse/HIVE-9510
 Project: Hive
  Issue Type: Bug
Reporter: asko
Assignee: Julian Hyde
 Attachments: log3_cbo5


 Setting log level in logging.properties file as following:
 handlers=java.util.logging.ConsoleHandler
 .level=INFO
 org.apache.calcite.plan.RelOptPlanner.level=ALL
 java.util.logging.ConsoleHandler.level=ALL
 Running Q3 in TPCH-full after modifying it in order to test join reorder,
 but the run failed.
 QL:
 set  hive.cbo.enable=true;
 --ANALYZE TABLE customer COMPUTE STATISTICS for columns;
 --ANALYZE TABLE orders COMPUTE STATISTICS for columns;
 --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;
 --Q3
 -- the query
 select 
   l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
 o_shippriority 
 from 
   lineitem l join orders o 
 on l.l_orderkey = o.o_orderkey
   join customer c
 on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
 where 
   o_orderdate < '1995-03-15' and l_shipdate < '1995-03-15' 
 group by l_orderkey, o_orderdate, o_shippriority 
 order by revenue desc, o_orderdate 
 limit 10;
 LOG:
 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner 
 fireRule
 FINE: call#15: Apply rule [FilterProjectTransposeRule] to 
 [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
 '1995-03-15')), 
 rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveFilter#138
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveProject#139
 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner 
 notifyTransformation
 FINE: call#15: Rule FilterProjectTransposeRule arguments 
 [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
 '1995-03-15')), 
 rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
  produced HiveProject#139
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HepRelVertex#140
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HiveProject#141
 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
 FINEST: new HepRelVertex#142
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
 Foreign Key relation:
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
 HiveJoin(condition=[=($0, $4)], joinType=[inner])
   HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
 l_shipdate=[$10])
  HiveFilter(condition=[<($10, '1995-03-15')])
   HiveTableScan(table=[[default.lineitem]])
   HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
 o_shippriority=[$7])
  HiveFilter(condition=[<($4, '1995-03-15')])
   HiveTableScan(table=[[default.orders]])
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign 
 Key join:
   fkSide = 1
   FKInfo:FKInfo(rowCount=1.00,ndv=-1.00)
   PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00)
   isPKSideSimple:false
   NDV Scaling Factor:1.00
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
 Foreign Key relation:
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
 HiveJoin(condition=[=($8, $5)], joinType=[inner])
   HiveJoin(condition=[=($0, $4)], joinType=[inner])
 HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
 l_shipdate=[$10])
    HiveFilter(condition=[<($10, '1995-03-15')])
 HiveTableScan(table=[[default.lineitem]])
 HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
 o_shippriority=[$7])
    HiveFilter(condition=[<($4, '1995-03-15')])
 HiveTableScan(table=[[default.orders]])
   HiveProject(c_custkey=[$0], c_mktsegment=[$6])
 HiveFilter(condition=[=($6, 'BUILDING')])
   HiveTableScan(table=[[default.customer]])
 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign 
 Key join:
   fkSide = 1
   FKInfo:FKInfo(rowCount=1.00,ndv=-1.00)
   PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00)
   isPKSideSimple:false
   NDV 

[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml

2015-01-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8307:
---
Attachment: HIVE-8307.patch

Patch to remove comments from serde properties. Expecting more failures for 
golden file updates. [~hagleitn] Can you take a look?

 null character in columns.comments schema property breaks jobconf.xml
 -

 Key: HIVE-8307
 URL: https://issues.apache.org/jira/browse/HIVE-8307
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Carl Laird
 Attachments: HIVE-8307.patch


 It would appear that the fix for 
 https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character 
 to show up in job config xml files:
 I get the following when trying to insert into an elasticsearch backed table:
 [Fatal Error] :336:51: Character reference "&#0" is an invalid XML character.
 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 Exception in thread "main" java.lang.RuntimeException: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
 at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
 at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
 at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
 at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; 
 Character reference "&#0" is an invalid XML character.
 at 
 com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
 at 
 com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
 at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
 ... 11 more
 Execution failed with exit status: 1
 Line 336 of jobconf.xml:
 <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>
 See 
 https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ 
 for more discussion.
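The root cause is easy to reproduce: the character reference &#0; is not a legal XML 1.0 character, so any conforming parser rejects the generated jobconf. A self-contained check (illustrative only, not Hive code):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class NullCharInXml {
    /** Returns true iff the given XML parses; a conforming parser rejects &#0;. */
    public static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // Xerces reports "[Fatal Error]" on stderr and throws SAXParseException here.
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain comment value is fine; a null-character reference breaks the job conf.
        System.out.println(parses("<property><name>columns.comments</name><value>ok</value></property>"));   // true
        System.out.println(parses("<property><name>columns.comments</name><value>&#0;</value></property>")); // false
    }
}
```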





[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml

2015-01-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8307:
---
 Assignee: Ashutosh Chauhan
Affects Version/s: 0.14.0
   Status: Patch Available  (was: Open)

 null character in columns.comments schema property breaks jobconf.xml
 -

 Key: HIVE-8307
 URL: https://issues.apache.org/jira/browse/HIVE-8307
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.1, 0.14.0, 0.13.0
Reporter: Carl Laird
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8307.patch


 It would appear that the fix for 
 https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character 
 to show up in job config xml files:
 I get the following when trying to insert into an elasticsearch backed table:
 [Fatal Error] :336:51: Character reference "&#0" is an invalid XML character.
 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 Exception in thread "main" java.lang.RuntimeException: 
 org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character 
 reference "&#0" is an invalid XML character.
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
 at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
 at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
 at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
 at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
 at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; 
 Character reference "&#0" is an invalid XML character.
 at 
 com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
 at 
 com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
 at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
 at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
 ... 11 more
 Execution failed with exit status: 1
 Line 336 of jobconf.xml:
 <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>
 See 
 https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ 
 for more discussion.





Review Request 30422: remove comments from serde properties.

2015-01-29 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30422/
---

Review request for hive and Gunther Hagleitner.


Bugs: HIVE-8307
https://issues.apache.org/jira/browse/HIVE-8307


Repository: hive-git


Description
---

remove comments from serde properties.


Diffs
-

  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
612f927 
  ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3204af8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out cea9eb5 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out b696e83 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out c58fa36 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 772ccec 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out ea7b8ff 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 24011a3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 969189f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out f458c33 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 52a69e8 
  ql/src/test/results/clientpositive/binary_output_format.q.out a0e8e83 
  ql/src/test/results/clientpositive/bucket1.q.out 13ec735 
  ql/src/test/results/clientpositive/bucket2.q.out 32a77c3 
  ql/src/test/results/clientpositive/bucket3.q.out ff7173e 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 69a61d4 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out fc55855 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 48e9f10 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out 695feb1 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out b3929f3 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out cd81f9e 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out ef45b4a 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out 62edc1d 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out bd79ff2 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out b6a9ad2 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out b8e4b41 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out 493e038 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 3a4b2b5 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out 537f19f 
  ql/src/test/results/clientpositive/bucketmapjoin13.q.out a296197 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 6d48156 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 01d7cc9 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out e4c87fa 
  ql/src/test/results/clientpositive/char_serde.q.out 8f6f8ce 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out e431b0f 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out de21af8 
  ql/src/test/results/clientpositive/date_serde.q.out ff09f70 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
a209ae9 
  ql/src/test/results/clientpositive/filter_join_breaktask.q.out 3631412 
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 71a6578 
  ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 
4414b79 
  ql/src/test/results/clientpositive/groupby_ppr.q.out 4fdcbfd 
  ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out cd3454c 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out 4e5c96f 
  ql/src/test/results/clientpositive/input23.q.out 0bd543b 
  ql/src/test/results/clientpositive/input42.q.out 95e8553 
  ql/src/test/results/clientpositive/input_part1.q.out b71faff 
  ql/src/test/results/clientpositive/input_part2.q.out 77da2eb 
  ql/src/test/results/clientpositive/input_part7.q.out 6094f9c 
  ql/src/test/results/clientpositive/input_part9.q.out 6e60679 
  ql/src/test/results/clientpositive/join17.q.out 26aabcf 
  ql/src/test/results/clientpositive/join26.q.out 148479a 
  ql/src/test/results/clientpositive/join32.q.out 9a24d8c 
  ql/src/test/results/clientpositive/join32_lessSize.q.out 20858cb 
  ql/src/test/results/clientpositive/join33.q.out 9a24d8c 
  ql/src/test/results/clientpositive/join34.q.out a20e49f 
  ql/src/test/results/clientpositive/join35.q.out 937539c 
  ql/src/test/results/clientpositive/join9.q.out 8421036 
  ql/src/test/results/clientpositive/join_filters_overlap.q.out 00ca0e5 
  ql/src/test/results/clientpositive/join_map_ppr.q.out 349c9f5 
  

Re: Review Request 30422: remove comments from serde properties.

2015-01-29 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30422/
---

(Updated Jan. 29, 2015, 7:41 p.m.)


Review request for hive and Gunther Hagleitner.


Bugs: HIVE-8307
https://issues.apache.org/jira/browse/HIVE-8307


Repository: hive-git


Description
---

remove comments from serde properties.


Diffs
-

  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
612f927 
  ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3204af8 
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out cea9eb5 
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out b696e83 
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out c58fa36 
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 772ccec 
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out ea7b8ff 
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 24011a3 
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 969189f 
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out f458c33 
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 52a69e8 
  ql/src/test/results/clientpositive/binary_output_format.q.out a0e8e83 
  ql/src/test/results/clientpositive/bucket1.q.out 13ec735 
  ql/src/test/results/clientpositive/bucket2.q.out 32a77c3 
  ql/src/test/results/clientpositive/bucket3.q.out ff7173e 
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 69a61d4 
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out fc55855 
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 48e9f10 
  ql/src/test/results/clientpositive/bucketcontext_2.q.out 695feb1 
  ql/src/test/results/clientpositive/bucketcontext_3.q.out b3929f3 
  ql/src/test/results/clientpositive/bucketcontext_4.q.out cd81f9e 
  ql/src/test/results/clientpositive/bucketcontext_5.q.out ef45b4a 
  ql/src/test/results/clientpositive/bucketcontext_6.q.out 62edc1d 
  ql/src/test/results/clientpositive/bucketcontext_7.q.out bd79ff2 
  ql/src/test/results/clientpositive/bucketcontext_8.q.out b6a9ad2 
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out b8e4b41 
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out 493e038 
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 3a4b2b5 
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out 537f19f 
  ql/src/test/results/clientpositive/bucketmapjoin13.q.out a296197 
  ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
  ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
  ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
  ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 6d48156 
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out 01d7cc9 
  ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
  ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out e4c87fa 
  ql/src/test/results/clientpositive/char_serde.q.out 8f6f8ce 
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out e431b0f 
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out de21af8 
  ql/src/test/results/clientpositive/date_serde.q.out ff09f70 
  ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out 
a209ae9 
  ql/src/test/results/clientpositive/filter_join_breaktask.q.out 3631412 
  ql/src/test/results/clientpositive/groupby_map_ppr.q.out 71a6578 
  ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 
4414b79 
  ql/src/test/results/clientpositive/groupby_ppr.q.out 4fdcbfd 
  ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out cd3454c 
  ql/src/test/results/clientpositive/groupby_sort_6.q.out 4e5c96f 
  ql/src/test/results/clientpositive/input23.q.out 0bd543b 
  ql/src/test/results/clientpositive/input42.q.out 95e8553 
  ql/src/test/results/clientpositive/input_part1.q.out b71faff 
  ql/src/test/results/clientpositive/input_part2.q.out 77da2eb 
  ql/src/test/results/clientpositive/input_part7.q.out 6094f9c 
  ql/src/test/results/clientpositive/input_part9.q.out 6e60679 
  ql/src/test/results/clientpositive/join17.q.out 26aabcf 
  ql/src/test/results/clientpositive/join26.q.out 148479a 
  ql/src/test/results/clientpositive/join32.q.out 9a24d8c 
  ql/src/test/results/clientpositive/join32_lessSize.q.out 20858cb 
  ql/src/test/results/clientpositive/join33.q.out 9a24d8c 
  ql/src/test/results/clientpositive/join34.q.out a20e49f 
  ql/src/test/results/clientpositive/join35.q.out 937539c 
  ql/src/test/results/clientpositive/join9.q.out 8421036 
  ql/src/test/results/clientpositive/join_filters_overlap.q.out 00ca0e5 
  ql/src/test/results/clientpositive/join_map_ppr.q.out 349c9f5 
  

[jira] [Commented] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-29 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297426#comment-14297426
 ] 

Mithun Radhakrishnan commented on HIVE-9471:


Unrelated test-failures, methinks. 

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.
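 The failure mode in the stack trace can be sketched as a range check over a
 stream's byte extent. This is an illustrative sketch only, with hypothetical
 names, not the actual ORC {{InStream}} code: a seek target of 0 computed from
 the row-group index falls before the stream's data and trips the check.

```java
// Illustrative sketch -- NOT ORC's actual InStream. It shows the kind of
// bounds check that produces "Seek ... is outside of the data" when a seek
// target (here 0, as in the stack trace) falls before the stream's bytes.
public class SeekBoundsSketch {
    private final long base;    // absolute offset where this stream's bytes begin
    private final long length;  // number of bytes belonging to this stream

    public SeekBoundsSketch(long base, long length) {
        this.base = base;
        this.length = length;
    }

    /** Validate an absolute seek target against the stream's byte range. */
    public void seek(long position) {
        if (position < base || position > base + length) {
            throw new IllegalArgumentException(
                "Seek to " + position + " is outside of the data");
        }
        // real code would reposition the underlying buffer here
    }

    public static void main(String[] args) {
        SeekBoundsSketch stream = new SeekBoundsSketch(100, 50);
        stream.seek(120); // fine: inside [100, 150]
        try {
            stream.seek(0); // the failure mode seen at row-group boundaries
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage()); // prints: Seek to 0 is outside of the data
        }
    }
}
```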



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9509) Restore partition spec validation removed by HIVE-9445

2015-01-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9509:
---
Status: Patch Available  (was: Open)

 Restore partition spec validation removed by HIVE-9445
 --

 Key: HIVE-9509
 URL: https://issues.apache.org/jira/browse/HIVE-9509
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-9509.patch








[jira] [Updated] (HIVE-9509) Restore partition spec validation removed by HIVE-9445

2015-01-29 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9509:
---
Attachment: HIVE-9509.patch

[~ashutoshc] [~brocknoland] can you take a look?

 Restore partition spec validation removed by HIVE-9445
 --

 Key: HIVE-9509
 URL: https://issues.apache.org/jira/browse/HIVE-9509
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-9509.patch








[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297457#comment-14297457
 ] 

Marcelo Vanzin commented on HIVE-9487:
--

I failed git branch management 101. New patch should be correct.

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin updated HIVE-9487:
-
Attachment: HIVE-9487.2-spark.patch

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Updated] (HIVE-8379) NanoTimeUtils performs some work needlessly

2015-01-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8379:
--
Status: Patch Available  (was: Open)

 NanoTimeUtils performs some work needlessly
 ---

 Key: HIVE-8379
 URL: https://issues.apache.org/jira/browse/HIVE-8379
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-8379.1.patch


 Portions of the math done with the constants can be pre-computed:
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java#L70





[jira] [Updated] (HIVE-8379) NanoTimeUtils performs some work needlessly

2015-01-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8379:
--
Attachment: HIVE-8379.1.patch

Patch attached that makes the code more readable by using constant names
specific to nano time.

I ran some JMH micro-benchmarks; the times look almost the same for both
approaches.
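The pre-computation suggested in the ticket can be sketched as follows. The
constant and method names here are hypothetical, for illustration only; they
are not the actual NanoTimeUtils fields:

```java
// Illustrative sketch of hoisting constant math out of a hot conversion path.
// Names are hypothetical, not the actual NanoTimeUtils fields.
public class NanoTimeSketch {
    static final long NANOS_PER_SECOND = 1_000_000_000L;
    static final long SECONDS_PER_DAY = 24L * 60L * 60L;
    // Computed once at class load instead of re-multiplied on every call:
    static final long NANOS_PER_DAY = NANOS_PER_SECOND * SECONDS_PER_DAY;

    static long daysToNanos(long days) {
        // before: return days * NANOS_PER_SECOND * 60 * 60 * 24;  (per-call math)
        return days * NANOS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(daysToNanos(1)); // prints 86400000000000
    }
}
```

As the benchmark above suggests, the JIT likely folds these constants either
way, so the main win is readability.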

 NanoTimeUtils performs some work needlessly
 ---

 Key: HIVE-8379
 URL: https://issues.apache.org/jira/browse/HIVE-8379
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-8379.1.patch


 Portions of the math done with the constants can be pre-computed:
 https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java#L70





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-29 Thread Brock Noland
Congratulations!! :)

On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
 Management Committee. Please join me in congratulating these new PMC
 members!

 Thanks.

 - Carl



[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class

2015-01-29 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-9510:
---
Description: 
Setting log level in logging.properties file as following:
{noformat}
handlers=java.util.logging.ConsoleHandler.level=INFO
org.apache.calcite.plan.RelOptPlanner.level=ALL
java.util.logging.ConsoleHandler.level=ALL
{noformat}

Running Q3 in TPC-H full after modifying it, in order to test join reordering,
but the run failed.
QL:
{code:sql}
set  hive.cbo.enable=true;
--ANALYZE TABLE customer COMPUTE STATISTICS for columns;
--ANALYZE TABLE orders COMPUTE STATISTICS for columns;
--ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;

--Q3
-- the query
select 
  l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
o_shippriority 
from 
  lineitem l join orders o 
on l.l_orderkey = o.o_orderkey
  join customer c
on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
where 
  o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
group by l_orderkey, o_orderdate, o_shippriority 
order by revenue desc, o_orderdate 
limit 10;
{code}

LOG:






  was:
Setting log level in logging.properties file as following:
{noformat}
handlers=java.util.logging.ConsoleHandler.level=INFO
org.apache.calcite.plan.RelOptPlanner.level=ALL
java.util.logging.ConsoleHandler.level=ALL
{noformat}

Running Q3 in TPC-H full after modifying it, in order to test join reordering,
but the run failed.
QL:
set  hive.cbo.enable=true;
--ANALYZE TABLE customer COMPUTE STATISTICS for columns;
--ANALYZE TABLE orders COMPUTE STATISTICS for columns;
--ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;

--Q3
-- the query
{code:sql}
select 
  l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
o_shippriority 
from 
  lineitem l join orders o 
on l.l_orderkey = o.o_orderkey
  join customer c
on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
where 
  o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
group by l_orderkey, o_orderdate, o_shippriority 
order by revenue desc, o_orderdate 
limit 10;
{code}

LOG:
Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner fireRule
FINE: call#15: Apply rule [FilterProjectTransposeRule] to 
[rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
'1995-03-15')), 
rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
FINEST: new HiveFilter#138
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
FINEST: new HiveProject#139
Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner 
notifyTransformation
FINE: call#15: Rule FilterProjectTransposeRule arguments 
[rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=<($2, 
'1995-03-15')), 
rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)]
 produced HiveProject#139
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
FINEST: new HepRelVertex#140
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
FINEST: new HiveProject#141
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init
FINEST: new HepRelVertex#142
15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
Foreign Key relation:
15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
HiveJoin(condition=[=($0, $4)], joinType=[inner])
  HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
l_shipdate=[$10])
HiveFilter(condition=[>($10, '1995-03-15')])
  HiveTableScan(table=[[default.lineitem]])
  HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
o_shippriority=[$7])
HiveFilter(condition=[<($4, '1995-03-15')])
  HiveTableScan(table=[[default.orders]])

15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key 
join:
fkSide = 1
FKInfo:FKInfo(rowCount=1.00,ndv=-1.00)
PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00)
isPKSideSimple:false
NDV Scaling Factor:1.00

15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - 
Foreign Key relation:
15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: 
HiveJoin(condition=[=($8, $5)], joinType=[inner])
  HiveJoin(condition=[=($0, $4)], joinType=[inner])
HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], 
l_shipdate=[$10])
  HiveFilter(condition=[>($10, '1995-03-15')])
HiveTableScan(table=[[default.lineitem]])
HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], 
o_shippriority=[$7])
  HiveFilter(condition=[<($4, '1995-03-15')])
HiveTableScan(table=[[default.orders]])
  HiveProject(c_custkey=[$0], c_mktsegment=[$6])
HiveFilter(condition=[=($6, 'BUILDING')])
  

[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class

2015-01-29 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-9510:
---
Description: 
Setting log level in logging.properties file as following:
{noformat}
handlers=java.util.logging.ConsoleHandler.level=INFO
org.apache.calcite.plan.RelOptPlanner.level=ALL
java.util.logging.ConsoleHandler.level=ALL
{noformat}

Running Q3 in TPC-H full after modifying it, in order to test join reordering,
but the run failed.
QL:
{code:sql}
set  hive.cbo.enable=true;
--ANALYZE TABLE customer COMPUTE STATISTICS for columns;
--ANALYZE TABLE orders COMPUTE STATISTICS for columns;
--ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;

--Q3
-- the query
select 
  l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
o_shippriority 
from 
  lineitem l join orders o 
on l.l_orderkey = o.o_orderkey
  join customer c
on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
where 
  o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
group by l_orderkey, o_orderdate, o_shippriority 
order by revenue desc, o_orderdate 
limit 10;
{code}

LOG:
see log.txt





  was:
Setting log level in logging.properties file as following:
{noformat}
handlers=java.util.logging.ConsoleHandler.level=INFO
org.apache.calcite.plan.RelOptPlanner.level=ALL
java.util.logging.ConsoleHandler.level=ALL
{noformat}

Running Q3 in TPC-H full after modifying it, in order to test join reordering,
but the run failed.
QL:
{code:sql}
set  hive.cbo.enable=true;
--ANALYZE TABLE customer COMPUTE STATISTICS for columns;
--ANALYZE TABLE orders COMPUTE STATISTICS for columns;
--ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;

--Q3
-- the query
select 
  l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
o_shippriority 
from 
  lineitem l join orders o 
on l.l_orderkey = o.o_orderkey
  join customer c
on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
where 
  o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
group by l_orderkey, o_orderdate, o_shippriority 
order by revenue desc, o_orderdate 
limit 10;
{code}

LOG:







 Throwing null pointer exception when getting join distinct row count from 
 RelMdUtil.java class
 --

 Key: HIVE-9510
 URL: https://issues.apache.org/jira/browse/HIVE-9510
 Project: Hive
  Issue Type: Bug
Reporter: asko
Assignee: Julian Hyde
 Attachments: log.txt, log3_cbo5


 Setting log level in logging.properties file as following:
 {noformat}
 handlers=java.util.logging.ConsoleHandler.level=INFO
 org.apache.calcite.plan.RelOptPlanner.level=ALL
 java.util.logging.ConsoleHandler.level=ALL
 {noformat}
 Running Q3 in TPC-H full after modifying it, in order to test join reordering,
 but the run failed.
 QL:
 {code:sql}
 set  hive.cbo.enable=true;
 --ANALYZE TABLE customer COMPUTE STATISTICS for columns;
 --ANALYZE TABLE orders COMPUTE STATISTICS for columns;
 --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;
 --Q3
 -- the query
 select 
   l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
 o_shippriority 
 from 
   lineitem l join orders o 
 on l.l_orderkey = o.o_orderkey
   join customer c
 on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
 where 
   o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
 group by l_orderkey, o_orderdate, o_shippriority 
 order by revenue desc, o_orderdate 
 limit 10;
 {code}
 LOG:
 see log.txt





[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class

2015-01-29 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-9510:
---
Attachment: log.txt

 Throwing null pointer exception when getting join distinct row count from 
 RelMdUtil.java class
 --

 Key: HIVE-9510
 URL: https://issues.apache.org/jira/browse/HIVE-9510
 Project: Hive
  Issue Type: Bug
Reporter: asko
Assignee: Julian Hyde
 Attachments: log.txt, log3_cbo5


 Setting log level in logging.properties file as following:
 {noformat}
 handlers=java.util.logging.ConsoleHandler.level=INFO
 org.apache.calcite.plan.RelOptPlanner.level=ALL
 java.util.logging.ConsoleHandler.level=ALL
 {noformat}
 Running Q3 in TPC-H full after modifying it, in order to test join reordering,
 but the run failed.
 QL:
 {code:sql}
 set  hive.cbo.enable=true;
 --ANALYZE TABLE customer COMPUTE STATISTICS for columns;
 --ANALYZE TABLE orders COMPUTE STATISTICS for columns;
 --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;
 --Q3
 -- the query
 select 
   l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
 o_shippriority 
 from 
   lineitem l join orders o 
 on l.l_orderkey = o.o_orderkey
   join customer c
 on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
 where 
   o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
 group by l_orderkey, o_orderdate, o_shippriority 
 order by revenue desc, o_orderdate 
 limit 10;
 {code}
 LOG:





[jira] [Created] (HIVE-9511) Switch Tez to 0.6.0

2015-01-29 Thread Damien Carol (JIRA)
Damien Carol created HIVE-9511:
--

 Summary: Switch Tez to 0.6.0
 Key: HIVE-9511
 URL: https://issues.apache.org/jira/browse/HIVE-9511
 Project: Hive
  Issue Type: Improvement
Reporter: Damien Carol


Tez 0.6.0 has been released.
Investigate switching to version 0.6.0.





[jira] [Assigned] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given

2015-01-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-9430:
-

Assignee: Sergio Peña

 NullPointerException on ALTER TABLE ADD PARTITION if no value given
 ---

 Key: HIVE-9430
 URL: https://issues.apache.org/jira/browse/HIVE-9430
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Danny Lade
Assignee: Sergio Peña

 ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException:
 {code:java}
 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver 
 (SessionState.java:printError(545)) - FAILED: NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 Therefore there is currently no way to add a partition to an already existing 
 table:
 {code:SQL}
 alter table XXX add partition (YYY = 'VALUE');
 FAILED: SemanticException table is not partitioned but partition spec exists: 
 {YYY=VALUE}
 {code}
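 The missing-value case above can be guarded explicitly. The following is a
 hypothetical sketch of such a null check, not Hive's actual
 DDLSemanticAnalyzer code; all names are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not Hive's DDLSemanticAnalyzer): the kind of null
// guard that turns the NPE into a proper semantic error when a partition
// column is named without a value, e.g. ADD PARTITION (yyy).
public class PartitionSpecGuard {
    static void validatePartitionValues(Map<String, String> partSpec) {
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            if (e.getValue() == null) {
                throw new IllegalArgumentException(
                    "no value given for partition column " + e.getKey());
            }
        }
        // further per-value validation would follow here
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("yyy", null); // ADD PARTITION (yyy) -- no value supplied
        try {
            validatePartitionValues(spec);
            throw new AssertionError("expected a validation error");
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage()); // prints: no value given for partition column yyy
        }
    }
}
```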





[jira] [Updated] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given

2015-01-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-9430:
--
Attachment: HIVE-9430.1.patch

 NullPointerException on ALTER TABLE ADD PARTITION if no value given
 ---

 Key: HIVE-9430
 URL: https://issues.apache.org/jira/browse/HIVE-9430
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Danny Lade
Assignee: Sergio Peña
 Attachments: HIVE-9430.1.patch


 ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException:
 {code:java}
 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver 
 (SessionState.java:printError(545)) - FAILED: NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 Therefore there is currently no way to add a partition to an already existing 
 table:
 {code:SQL}
 alter table XXX add partition (YYY = 'VALUE');
 FAILED: SemanticException table is not partitioned but partition spec exists: 
 {YYY=VALUE}
 {code}





[jira] [Updated] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given

2015-01-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-9430:
--
Status: Patch Available  (was: Open)

 NullPointerException on ALTER TABLE ADD PARTITION if no value given
 ---

 Key: HIVE-9430
 URL: https://issues.apache.org/jira/browse/HIVE-9430
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Danny Lade
Assignee: Sergio Peña
 Attachments: HIVE-9430.1.patch


 ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException:
 {code:java}
 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver 
 (SessionState.java:printError(545)) - FAILED: NullPointerException null
 java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999)
 at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680)
 at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 Therefore there is currently no way to add a partition to an already existing 
 table:
 {code:SQL}
 alter table XXX add partition (YYY = 'VALUE');
 FAILED: SemanticException table is not partitioned but partition spec exists: 
 {YYY=VALUE}
 {code}
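The stack trace points at validatePartitionValues dereferencing a partition value that is null when a column is listed without an `= value` clause. A minimal sketch of the kind of null guard that would turn the NPE into a descriptive error (class name and message here are illustrative, not the actual Hive patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionSpecCheck {
    // Reject a partition spec entry whose value is missing, e.g.
    // ALTER TABLE xxx ADD PARTITION (yyy), instead of hitting an NPE.
    public static void validatePartitionValues(Map<String, String> partSpec) {
        for (Map.Entry<String, String> e : partSpec.entrySet()) {
            if (e.getValue() == null) {
                throw new IllegalArgumentException(
                    "partition spec is invalid; no value given for column "
                    + e.getKey());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("yyy", null);          // mirrors ADD PARTITION (yyy)
        validatePartitionValues(spec);  // throws a readable error, not NPE
    }
}
```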



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.

2015-01-29 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297809#comment-14297809
 ] 

Thejas M Nair commented on HIVE-9500:
-

[~aihuaxu] Is it failing for create table with avro format? Does avro use 
lazysimpleserde?


 Support nested structs over 24 levels.
 --

 Key: HIVE-9500
 URL: https://issues.apache.org/jira/browse/HIVE-9500
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
  Labels: SerDe

 The customer has a deeply nested Avro structure and is receiving the following 
 error when performing queries:
 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
 org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
 supported for LazySimpleSerde is 23 Unable to work with level 24
 Currently we support up to 24 levels of nested structs when 
 hive.serialization.extend.nesting.levels is set to true, while customers 
 require support for more than that. 
 It would be better to make the supported level configurable or to remove the 
 limit completely (i.e., support any number of levels). 
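The 23/24 cap comes from LazySimpleSerDe encoding each nesting level with its own single-byte separator, so the supported depth is bounded by the size of its separator table (8 by default, 24 when hive.serialization.extend.nesting.levels is set). A rough sketch of that scheme, with illustrative byte values rather than the serde's actual table:

```java
public class NestingSeparators {
    // One distinct separator byte per nesting level; the table size caps
    // the supported depth. Byte values here are illustrative only.
    public static byte[] buildSeparators(boolean extendedNesting) {
        int levels = extendedNesting ? 24 : 8;
        byte[] separators = new byte[levels];
        for (int i = 0; i < levels; i++) {
            // ^A between top-level fields, ^B inside collections, ^C inside
            // map entries, then further control characters for deeper levels.
            separators[i] = (byte) (i + 1);
        }
        return separators;
    }
}
```

Making the depth configurable would amount to sizing this table from a setting instead of a hard-coded constant, which is what the improvement above asks for.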



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class

2015-01-29 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-9510:
---
Component/s: CBO

 Throwing null pointer exception when getting join distinct row count from 
 RelMdUtil.java class
 --

 Key: HIVE-9510
 URL: https://issues.apache.org/jira/browse/HIVE-9510
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: asko
Assignee: Julian Hyde
 Attachments: log.txt, log3_cbo5


 Setting log level in logging.properties file as following:
 {noformat}
 handlers=java.util.logging.ConsoleHandler
 .level=INFO
 org.apache.calcite.plan.RelOptPlanner.level=ALL
 java.util.logging.ConsoleHandler.level=ALL
 {noformat}
 Running Q3 in TPCH-full after modifying it, in order to test join reorder,
 but the run failed.
 QL:
 {code:sql}
 set  hive.cbo.enable=true;
 --ANALYZE TABLE customer COMPUTE STATISTICS for columns;
 --ANALYZE TABLE orders COMPUTE STATISTICS for columns;
 --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns;
 --Q3
 -- the query
 select 
   l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, 
 o_shippriority 
 from 
   lineitem l join orders o 
 on l.l_orderkey = o.o_orderkey
   join customer c
 on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey 
 where 
   o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' 
 group by l_orderkey, o_orderdate, o_shippriority 
 order by revenue desc, o_orderdate 
 limit 10;
 {code}
 LOG:
 see log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9498) Update golden files of join38 & subquery_in on trunk due to 9327

2015-01-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297729#comment-14297729
 ] 

Prasanth Jayachandran commented on HIVE-9498:
-

+1

 Update golden files of join38 & subquery_in on trunk due to 9327
 

 Key: HIVE-9498
 URL: https://issues.apache.org/jira/browse/HIVE-9498
 Project: Hive
  Issue Type: Task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-9498.patch


 Missed updating golden files for these tests while committing HIVE-9327



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297755#comment-14297755
 ] 

Hive QA commented on HIVE-9487:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695341/HIVE-9487.2-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7361 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/693/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/693/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-693/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695341 - PreCommit-HIVE-SPARK-Build

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference

2015-01-29 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297776#comment-14297776
 ] 

Xuefu Zhang commented on HIVE-9468:
---

parquet_types.q
{code}
< 1 121 1   8   1.174970197678  2.062159062730128
---
> 1 121 1   8   1.174970197678  2.0621590627301285
238c238
< 3 120 1   7   1.171428578240531   1.8
---
> 3 120 1   7   1.171428578240531   1.7996
{code}

 Test groupby3_map_skew.q fails due to decimal precision difference
 --

 Key: HIVE-9468
 URL: https://issues.apache.org/jira/browse/HIVE-9468
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Xuefu Zhang

 From test run, 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
  
 {code}
 Running: diff -a 
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out
  
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
 162c162
 < 130091.0	260.182	256.10355987055016	98.0	0.0	142.92680950752379	143.06995106518903	20428.07288	20469.0109
 ---
 > 130091.0	260.182	256.10355987055016	98.0	0.0	142.9268095075238	143.06995106518906	20428.07288	20469.0109
 {code}
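The last-digit drift shown in the diff above is characteristic of double aggregation: floating-point addition is not associative, so combining partial sums in a different order (as can happen between engines, or between map-side and reduce-side aggregation) changes the low-order bits of the result. A standalone illustration of the effect, not Hive code:

```java
public class FpOrder {
    // Summing the same values left-to-right vs. right-to-left can change
    // the low-order bits of a double result, because rounding happens
    // after every intermediate addition.
    public static double sumLeft(double[] xs) {
        double s = 0.0;
        for (int i = 0; i < xs.length; i++) {
            s += xs[i];
        }
        return s;
    }

    public static double sumRight(double[] xs) {
        double s = 0.0;
        for (int i = xs.length - 1; i >= 0; i--) {
            s += xs[i];
        }
        return s;
    }
}
```

For example, {0.1, 0.2, 0.3} sums to 0.6000000000000001 one way and 0.6 the other, which is exactly the kind of ulp-level mismatch a golden-file diff flags.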



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9514) schematool is broken in hive 1.0.0

2015-01-29 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297827#comment-14297827
 ] 

Thejas M Nair commented on HIVE-9514:
-

I ran all metastore tool unit tests for this and they pass. Also manually 
verified schema initialization and upgrade with derby, and ran queries with 
hive.metastore.schema.verification=true.



 schematool is broken in hive 1.0.0
 --

 Key: HIVE-9514
 URL: https://issues.apache.org/jira/browse/HIVE-9514
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.0.0

 Attachments: HIVE-9514.1.patch


 Schematool gives the following error:
 {code}
 bin/schematool -dbType derby -initSchema
 Starting metastore schema initialization to 1.0
 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified 
 for initialization: 1.0
 {code}
 Metastore schema hasn't changed from 0.14.0 to 1.0.0. So there is no need for 
 new .sql files for 1.0.0. However, schematool needs to be made aware of the 
 metastore schema equivalence.
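Since the 1.0.0 metastore schema is identical to 0.14.0's, the tool only needs to map the new release string onto the schema version it is equivalent to before looking up initialization or upgrade scripts. A hedged sketch of that idea (the real fix lives in Hive's schema tool code; the class and method names here are illustrative):

```java
public class SchemaVersionMap {
    // Release strings whose metastore schema did not change map to the
    // earlier schema version they are equivalent to; everything else
    // passes through unchanged. Mapping values are assumptions based on
    // the issue description above.
    public static String equivalentSchemaVersion(String hiveVersion) {
        switch (hiveVersion) {
            case "1.0":
            case "1.0.0":
                // Schema unchanged between 0.14.0 and 1.0.0, so reuse
                // the 0.14.0 .sql scripts instead of failing.
                return "0.14.0";
            default:
                return hiveVersion;
        }
    }
}
```

With such a mapping in place, `schematool -dbType derby -initSchema` would resolve "1.0" to the 0.14.0 scripts rather than reporting an unknown version.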



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297605#comment-14297605
 ] 

Hive QA commented on HIVE-9103:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695326/HIVE-9103.2-spark.patch

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 7358 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_rearrange
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/692/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/692/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-692/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695326 - PreCommit-HIVE-SPARK-Build

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch, HIVE-9103.2-spark.patch


 In MR, a backup task can be executed if the original task, which probably 
 contains certain (join) optimizations, fails. This JIRA tracks the same topic 
 for Spark: we need to determine whether we need this and implement it if 
 necessary. This is a follow-up of HIVE-9099.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9514) schematool is broken in hive 1.0.0

2015-01-29 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9514:

Attachment: HIVE-9514.1.patch

 schematool is broken in hive 1.0.0
 --

 Key: HIVE-9514
 URL: https://issues.apache.org/jira/browse/HIVE-9514
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.0.0

 Attachments: HIVE-9514.1.patch


 Schematool gives the following error:
 {code}
 bin/schematool -dbType derby -initSchema
 Starting metastore schema initialization to 1.0
 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified 
 for initialization: 1.0
 {code}
 Metastore schema hasn't changed from 0.14.0 to 1.0.0. So there is no need for 
 new .sql files for 1.0.0. However, schematool needs to be made aware of the 
 metastore schema equivalence.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9512) HIVE-9327 causing regression in stats annotation

2015-01-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297643#comment-14297643
 ] 

Prasanth Jayachandran commented on HIVE-9512:
-

[~jcamachorodriguez] Any idea why?

 HIVE-9327 causing regression in stats annotation
 

 Key: HIVE-9512
 URL: https://issues.apache.org/jira/browse/HIVE-9512
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran

 HIVE-9327 causes a regression in a statistics annotation test case. The 
 regression can be seen here:
 https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/annotate_stats_select.q.out#L1065
 The expected data size is 194, but 0 is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9512) HIVE-9327 causing regression in stats annotation

2015-01-29 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-9512:
---

 Summary: HIVE-9327 causing regression in stats annotation
 Key: HIVE-9512
 URL: https://issues.apache.org/jira/browse/HIVE-9512
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran


HIVE-9327 causes a regression in a statistics annotation test case. The 
regression can be seen here:

https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/annotate_stats_select.q.out#L1065

The expected data size is 194, but 0 is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.

2015-01-29 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297676#comment-14297676
 ] 

Aihua Xu commented on HIVE-9500:


It fails at exactly the same place as the one in HIVE-3253, during SerDe 
initialization for both the table creation and the query.

 Support nested structs over 24 levels.
 --

 Key: HIVE-9500
 URL: https://issues.apache.org/jira/browse/HIVE-9500
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
  Labels: SerDe

 The customer has a deeply nested Avro structure and is receiving the following 
 error when performing queries:
 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException 
 org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
 supported for LazySimpleSerde is 23 Unable to work with level 24
 Currently we support up to 24 levels of nested structs when 
 hive.serialization.extend.nesting.levels is set to true, while customers 
 require support for more than that. 
 It would be better to make the supported level configurable or to remove the 
 limit completely (i.e., support any number of levels). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9498) Update golden files of join38 & subquery_in on trunk due to 9327

2015-01-29 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9498:
---
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 Update golden files of join38 & subquery_in on trunk due to 9327
 

 Key: HIVE-9498
 URL: https://issues.apache.org/jira/browse/HIVE-9498
 Project: Hive
  Issue Type: Task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.2.0

 Attachments: HIVE-9498.patch


 Missed updating golden files for these tests while committing HIVE-9327



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9392) JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName

2015-01-29 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297825#comment-14297825
 ] 

Prasanth Jayachandran commented on HIVE-9392:
-

There is another case where the data size becomes 0. I suspect it is caused by 
HIVE-9512.

 JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to 
 column names having duplicated fqColumnName
 

 Key: HIVE-9392
 URL: https://issues.apache.org/jira/browse/HIVE-9392
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth Jayachandran
Priority: Critical
 Fix For: 0.15.0

 Attachments: HIVE-9392.1.patch, HIVE-9392.2.patch


 In JoinStatsRule.process, the join column statistics are stored in the HashMap 
 joinedColStats. The key used, which is ColStatistics.fqColName, is duplicated 
 between join columns in the same vertex; as a result, distinctVals ends up 
 having duplicated values, which negatively affects the join cardinality 
 estimation.
 The duplicate keys are usually named KEY.reducesinkkey0.
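The problem described above is the classic HashMap-overwrite pattern: if two join columns from different relations both land on the key KEY.reducesinkkey0, the second put silently replaces the first column's NDV, so cardinality estimation divides by the wrong distinct count. A minimal illustration (not the actual Hive code; key-qualifying scheme is an assumed example of a fix):

```java
import java.util.HashMap;
import java.util.Map;

public class StatsKeyDemo {
    // With unqualified keys, the second put overwrites the first NDV.
    public static long ndvWithCollision() {
        Map<String, Long> joinedColStats = new HashMap<>();
        joinedColStats.put("KEY.reducesinkkey0", 1000L); // NDV of one side's key
        joinedColStats.put("KEY.reducesinkkey0", 10L);   // other side clobbers it
        return joinedColStats.get("KEY.reducesinkkey0"); // only 10 survives
    }

    // Qualifying the key with an operator/alias prefix keeps both entries.
    public static int entriesWithQualifiedKeys() {
        Map<String, Long> joinedColStats = new HashMap<>();
        joinedColStats.put("RS_1:KEY.reducesinkkey0", 1000L);
        joinedColStats.put("RS_2:KEY.reducesinkkey0", 10L);
        return joinedColStats.size(); // both NDVs retained
    }
}
```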



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

