[jira] [Commented] (HIVE-9474) truncate table changes permissions on the target

2015-01-28 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295202#comment-14295202
 ] 

Aihua Xu commented on HIVE-9474:


The test failures are unrelated to the change.

 truncate table changes permissions on the target
 

 Key: HIVE-9474
 URL: https://issues.apache.org/jira/browse/HIVE-9474
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor
 Fix For: 0.15.0

 Attachments: HIVE-9474.1.patch, HIVE-9474.2.patch, HIVE-9474.3.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 Create a table test:
 hive> create table test(key string);
 Change the /user/hive/warehouse/test permission to something other than the 
 default, like 777:
 hive> dfs -chmod 777 /user/hive/warehouse/test;
 hive> dfs -ls -d /user/hive/warehouse/test;
 drwxrwxrwx   - axu wheel 68 2015-01-26 18:45 /user/hive/warehouse/test
 Then truncate the table:
 hive> truncate table test;
 The permission goes back to the default:
 hive> dfs -ls -d /user/hive/warehouse/test;
 drwxr-xr-x   - axu wheel 68 2015-01-27 10:09 /user/hive/warehouse/test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]

2015-01-28 Thread Tim Robertson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295212#comment-14295212
 ] 

Tim Robertson commented on HIVE-7387:
-

This also affects anyone trying to use a custom UDF from the Hive CLI when the 
UDF depends on newer Guava methods.
Suggest reopening this as a valid issue.
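To confirm which Guava version actually wins on the CLI classpath, a small diagnostic like the following can help. This is a hypothetical helper (ClasspathProbe is not part of Hive or any patch here); it only asks the classloader where it resolves a class from.

```java
// Diagnostic sketch: report where the current classloader resolves a class
// from, e.g. to see which Guava jar supplies HashFunction on a Hive CLI
// classpath. Hypothetical helper, not part of Hive.
public class ClasspathProbe {
    /** Returns the URL of the resolved .class resource, or null if not found. */
    static String locate(String className) {
        String resource = className.replace('.', '/') + ".class";
        java.net.URL url = ClasspathProbe.class.getClassLoader().getResource(resource);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // On a Hive CLI classpath this would print the winning jar, e.g. a
        // URL ending in guava-11.0.2.jar!/com/google/common/hash/HashFunction.class
        System.out.println(locate("com.google.common.hash.HashFunction"));
    }
}
```

Running this inside the UDF (or via `hive --debug`) shows immediately whether Hadoop's Guava 11 shadows the version the UDF was built against.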

 Guava version conflict between hadoop and spark [Spark-Branch]
 --

 Key: HIVE-7387
 URL: https://issues.apache.org/jira/browse/HIVE-7387
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-7387-spark.patch


 The Guava conflict happens in the Hive driver compile stage, as shown in the 
 following exception stack trace. The conflict occurs while initiating a Spark 
 RDD in SparkClient: the Hive driver has both Guava 11 (from the Hadoop 
 classpath) and the Spark assembly jar (which contains Guava 14 classes) on 
 its classpath. Spark invokes HashFunction.hashInt, a method that does not 
 exist in Guava 11; evidently the Guava 11 HashFunction is loaded into the 
 JVM, which leads to a NoSuchMethodError while initiating the Spark RDD.
 {code}
 java.lang.NoSuchMethodError: 
 com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
   at 
 org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
   at 
 org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
   at 
 org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
   at 
 org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
   at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
   at 
 org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
   at 
 org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
   at 
 org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
   at 
 org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
   at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:75)
   at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:92)
   at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:661)
   at org.apache.spark.storage.BlockManager.put(BlockManager.scala:546)
   at 
 org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:812)
   at 
 org.apache.spark.broadcast.HttpBroadcast.init(HttpBroadcast.scala:52)
   at 
 org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:35)
   at 
 org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:29)
   at 
 org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
   at org.apache.spark.SparkContext.broadcast(SparkContext.scala:776)
   at org.apache.spark.rdd.HadoopRDD.init(HadoopRDD.scala:112)
   at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:527)
   at 
 org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:307)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.createRDD(SparkClient.java:204)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:167)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:32)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:159)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
 {code}
 NO PRECOMMIT TESTS. This is for spark branch only.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9495) Map Side aggregation affecting map performance

2015-01-28 Thread Anand Sridharan (JIRA)
Anand Sridharan created HIVE-9495:
-

 Summary: Map Side aggregation affecting map performance
 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
Hortonworks Hadoop 2.2
Reporter: Anand Sridharan


When running a simple aggregation query with hive.map.aggr=true, map tasks 
take far longer in Hive 0.14 than with hive.map.aggr=false.

e.g.
Consider the query:
INSERT OVERWRITE TABLE lineitem_tgt_agg SELECT alias.a0 as a0, alias.a2 as a1, 
alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 FROM (SELECT alias.a0 as a0, 
SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as 
a4 FROM (SELECT lineitem_sf500.l_orderkey as a0, CAST(lineitem_sf500.l_quantity 
* lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + 
lineitem_sf500.l_tax) AS DOUBLE) as a1, lineitem_sf500.l_quantity as a2, 
CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
lineitem_sf500.l_discount AS DOUBLE) as a3, CAST(lineitem_sf500.l_quantity * 
lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax AS DOUBLE) as a4 FROM 
lineitem_sf500) alias GROUP BY alias.a0) alias;

The above query was run with ~376GB of data / ~3 billion records in the source.
It takes ~10 minutes with hive.map.aggr=false.
With map side aggregation set to true, the map tasks don't complete even after 
an hour.
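For context, map-side aggregation means each mapper pre-aggregates rows in an in-memory hash table keyed by the group-by columns before shuffling partial results. A rough per-mapper sketch (hypothetical names, not Hive's actual GroupByOperator code):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of map-side (hash) aggregation: accumulate partial sums
// in a hash table keyed by the group-by column. Hypothetical names, not Hive's
// GroupByOperator implementation.
public class MapSideAggSketch {
    static Map<Long, Double> hashAggregate(long[] orderKeys, double[] values) {
        Map<Long, Double> partials = new HashMap<>();
        for (int i = 0; i < orderKeys.length; i++) {
            // one hash lookup + update per input row; with billions of rows
            // this per-row cost is the kind of hotspot a profiler would show
            partials.merge(orderKeys[i], values[i], Double::sum);
        }
        return partials;
    }

    public static void main(String[] args) {
        System.out.println(hashAggregate(new long[]{1, 2, 1}, new double[]{10, 5, 2.5}));
    }
}
```

When keys are mostly distinct (as with l_orderkey here), the hash table grows large and flushes often, which can make the hash path slower than plain shuffle-and-sort aggregation.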






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9253) MetaStore server should support timeout for long running requests

2015-01-28 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-9253:

Attachment: HIVE-9253.4.patch

Updated to V4 to address the RB comments. Thank you [~leftylev], [~brocknoland] for 
your review and feedback.

Regarding the client setting the timeout value, I left some reply comments in 
RB. 
A {{SessionPropertiesListener}} is added to handle client requests to change the 
timeout. A client can use {{set 
metaconf:hive.metastore.server.running.method.timeout=500s}} to change it. 
If this solution is OK, we may need to document it for users.
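For readers unfamiliar with the approach, a server-side request timeout of this kind can be sketched roughly as follows (hypothetical names, not the actual patch): record a per-thread deadline when a request starts, and have long-running code check it periodically.

```java
// Rough sketch of a per-request server-side timeout (hypothetical names, not
// the HIVE-9253 patch itself): a thread-local deadline set at request start,
// checked periodically by long-running work.
public class DeadlineSketch {
    private static final ThreadLocal<Long> DEADLINE_MS = new ThreadLocal<>();

    /** Called when the server starts handling a request. */
    static void start(long timeoutMs) {
        DEADLINE_MS.set(System.currentTimeMillis() + timeoutMs);
    }

    /** Called from long-running loops; aborts the request once past the deadline. */
    static void check() {
        Long deadline = DEADLINE_MS.get();
        if (deadline != null && System.currentTimeMillis() > deadline) {
            throw new IllegalStateException("request exceeded its timeout");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        start(50);
        check();                 // well within the deadline: no exception
        Thread.sleep(100);
        try {
            check();             // past the deadline: throws
        } catch (IllegalStateException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```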

 MetaStore server should support timeout for long running requests
 -

 Key: HIVE-9253
 URL: https://issues.apache.org/jira/browse/HIVE-9253
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Dong Chen
Assignee: Dong Chen
 Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, 
 HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.patch


 In the description of HIVE-7195, one issue is that the MetaStore client 
 timeout is quite dumb: the client will time out, and the server has no idea 
 the client is gone.
 The server should support a timeout when a request from a client runs for a 
 long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9302) Beeline add jar local to client

2015-01-28 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-9302:
---
Attachment: HIVE-9302.2.patch

 Beeline add jar local to client
 ---

 Key: HIVE-9302
 URL: https://issues.apache.org/jira/browse/HIVE-9302
 Project: Hive
  Issue Type: New Feature
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, 
 HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, 
 postgresql-9.3.jdbc3.jar


 At present if a beeline user uses {{add jar}} the path they give is actually 
 on the HS2 server. It'd be great to allow beeline users to add local jars as 
 well.
 It might be useful to do this in the jdbc driver itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 29807: HIVE-9253: MetaStore server should support timeout for long running requests

2015-01-28 Thread Dong Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29807/
---

(Updated Jan. 28, 2015, 8:58 a.m.)


Review request for hive.


Changes
---

Address comments from Lefty and Brock.


Repository: hive-git


Description
---

HIVE-9253: MetaStore server should support timeout for long running requests


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b 
  metastore/src/java/org/apache/hadoop/hive/metastore/Deadline.java 
PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/DeadlineException.java 
PRE-CREATION 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
fc6f067 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
574141c 
  metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
01ad36a 
  
metastore/src/java/org/apache/hadoop/hive/metastore/SessionPropertiesListener.java
 PRE-CREATION 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestDeadline.java 
PRE-CREATION 
  
metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTimeout.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/29807/diff/


Testing
---

UT passed


Thanks,

Dong Chen



[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]

2015-01-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9477:
-
Status: Patch Available  (was: Open)

 No error thrown when global limit optimization failed to find enough number 
 of rows [Spark Branch]
 --

 Key: HIVE-9477
 URL: https://issues.apache.org/jira/browse/HIVE-9477
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-9477.1-spark.patch


 MR will throw an error in such a case and rerun the query with the 
 optimization disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294952#comment-14294952
 ] 

Hive QA commented on HIVE-9489:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694904/HIVE-9489.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7403 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2546/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2546/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2546/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694904 - PreCommit-HIVE-TRUNK-Build

 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated in the 
 wikidoc.
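To illustrate what the javadoc needs to convey, here is a minimal stand-in for the annotation (the real one is org.apache.hadoop.hive.ql.udf.UDFType; this sketch redefines it locally so the example is self-contained):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Minimal local stand-in for Hive's UDFType annotation, shown only to
// illustrate how the deterministic/stateful/distinctLike flags are declared
// and read. Not the real org.apache.hadoop.hive.ql.udf.UDFType.
public class UdfTypeSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface UDFType {
        boolean deterministic() default true;
        boolean stateful() default false;
        boolean distinctLike() default false;
    }

    // A rand()-like UDF: each call may return a different value, so the
    // optimizer must not constant-fold it or reuse a cached result.
    @UDFType(deterministic = false)
    static class UDFRandLike {}

    public static void main(String[] args) {
        UDFType t = UDFRandLike.class.getAnnotation(UDFType.class);
        System.out.println("deterministic = " + t.deterministic());
    }
}
```

The optimizer reads these flags via reflection, which is why getting the semantics documented (deterministic vs. stateful vs. distinctLike) matters for UDF authors.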



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance

2015-01-28 Thread Anand Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Sridharan updated HIVE-9495:
--
Attachment: profiler_screenshot.PNG

Profiler screenshot showing GroupByOperator.processHashAggr as the hotspot.

 Map Side aggregation affecting map performance
 --

 Key: HIVE-9495
 URL: https://issues.apache.org/jira/browse/HIVE-9495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: RHEL 6.4
 Hortonworks Hadoop 2.2
Reporter: Anand Sridharan
 Attachments: profiler_screenshot.PNG


 When running a simple aggregation query with hive.map.aggr=true, map tasks 
 take far longer in Hive 0.14 than with hive.map.aggr=false.
 e.g.
 Consider the query:
 INSERT OVERWRITE TABLE lineitem_tgt_agg SELECT alias.a0 as a0, alias.a2 as 
 a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 FROM (SELECT alias.a0 as 
 a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, 
 SUM(alias.a4) as a4 FROM (SELECT lineitem_sf500.l_orderkey as a0, 
 CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - 
 lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) AS DOUBLE) as a1, 
 lineitem_sf500.l_quantity as a2, CAST(lineitem_sf500.l_quantity * 
 lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount AS DOUBLE) as a3, 
 CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * 
 lineitem_sf500.l_tax AS DOUBLE) as a4 FROM lineitem_sf500) alias GROUP BY 
 alias.a0) alias;
 The above query was run with ~376GB of data / ~3 billion records in the source.
 It takes ~10 minutes with hive.map.aggr=false.
 With map side aggregation set to true, the map tasks don't complete even 
 after an hour.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation

2015-01-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294894#comment-14294894
 ] 

Lefty Leverenz commented on HIVE-9489:
--

Hmph.  Not many typos for me to find.  ;)

{{+   * Certain optimizations should not be applied if UDF is not 
deterministic}}

... needs a period at end of line.

{{+   * don't apply for such UDFS, as they need to be invoked for each record.}}

... UDFs, not UDFS.

{{+   * A UDF is considered distinctLike if the udf can be evaluated on just 
the}}

... udf should be UDF.


 add javadoc for UDFType annotation
 --

 Key: HIVE-9489
 URL: https://issues.apache.org/jira/browse/HIVE-9489
 Project: Hive
  Issue Type: Bug
  Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 1.2.0

 Attachments: HIVE-9489.1.patch


 It is not clearly described when a UDF should be marked as deterministic, 
 stateful, or distinctLike.
 Adding javadoc for now. This information should also be incorporated in the 
 wikidoc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294883#comment-14294883
 ] 

Hive QA commented on HIVE-9486:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694894/HIVE-9486.1.patch.txt

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2545/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2545/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2545/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2545/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java'
Reverted 
'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java'
Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py'
Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py'
Reverted 
'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote'
Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp'
Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp'
Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h'
Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h'
Reverted 
'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp'
Reverted 'metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb'
Reverted 'metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb'
Reverted 
'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FireEventRequest.java'
Reverted 
'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedInfo.java'
Reverted 
'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java'
Reverted 'metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php'
Reverted 'metastore/src/gen/thrift/gen-php/metastore/Types.php'
Reverted 'metastore/if/hive_metastore.thrift'
Reverted 
'itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageDeserializer.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONMessageDeserializer.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONInsertMessage.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONMessageFactory.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/InsertMessage.java'
Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageFactory.java'
Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status 

[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted

2015-01-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294911#comment-14294911
 ] 

Lefty Leverenz commented on HIVE-8966:
--

Any documentation needed?

 Delta files created by hive hcatalog streaming cannot be compacted
 --

 Key: HIVE-8966
 URL: https://issues.apache.org/jira/browse/HIVE-8966
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
 Environment: hive
Reporter: Jihong Liu
Assignee: Alan Gates
Priority: Critical
 Fix For: 1.0.0

 Attachments: HIVE-8966-branch-1.patch, HIVE-8966.2.patch, 
 HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.6.patch, 
 HIVE-8966.patch


 Hive HCatalog streaming also creates a file named bucket_n_flush_length in 
 each delta directory, where n is the bucket number. compactor.CompactorMR 
 thinks this file also needs to be compacted, but of course it cannot be, so 
 compactor.CompactorMR does not continue with the compaction. 
 In a test, after the bucket_n_flush_length file was removed, the alter table 
 partition compact finished successfully. If that file is not deleted, nothing 
 is compacted. 
 This is probably a very severe bug. Both 0.13 and 0.14 have this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]

2015-01-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9477:
-
Attachment: HIVE-9477.1-spark.patch

Rerun query when global limit optimization fails.

 No error thrown when global limit optimization failed to find enough number 
 of rows [Spark Branch]
 --

 Key: HIVE-9477
 URL: https://issues.apache.org/jira/browse/HIVE-9477
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-9477.1-spark.patch


 MR will throw an error in such a case and rerun the query with the 
 optimization disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader

2015-01-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9486:

Attachment: HIVE-9486.2.patch.txt

 Use session classloader instead of application loader
 -

 Key: HIVE-9486
 URL: https://issues.apache.org/jira/browse/HIVE-9486
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt


 From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html
 Looks reasonable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 29898: HIVE-9298: Support reading alternate timestamp formats

2015-01-28 Thread Jason Dere


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  common/pom.xml, lines 59-63
  https://reviews.apache.org/r/29898/diff/2/?file=825966#file825966line59
 
  Since the joda jar will be shipped to task nodes, it needs to be added to 
  the hive-exec jar. I think we keep that list in one of the pom files. We 
  need to add this dependency there.

Do you mean the artifact set for the shaded JAR goal in ql/pom.xml? I'll take a 
look at doing this.


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  common/src/java/org/apache/hive/common/util/TimestampParser.java, line 76
  https://reviews.apache.org/r/29898/diff/2/?file=825967#file825967line76
 
  Name suggests this can be an instance object. If we do that way, than 
  we can avoid creating this object per invocation, which will be nice if 
  possible.

The way this is currently set up, the LazyTimestampObjectInspector (which I 
believe could be shared by different threads) points to a single 
TimestampParser. The Joda DateTimeFormatter is thread safe, so everything in 
parseTimestamp() should be thread safe except for mdt, which is why I was 
creating a new object. I guess mdt could be made thread safe by making it a 
thread-local instance. I'll make that change.
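The thread-local idea can be sketched like this, using SimpleDateFormat as a stand-in for the mutable per-parse object (mdt), since it has the same not-thread-safe property:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of the thread-local approach discussed above. SimpleDateFormat stands
// in for the mutable mdt object: it is not thread safe, so each thread gets
// its own instance instead of allocating one per parse call.
public class ThreadLocalParser {
    private static final ThreadLocal<SimpleDateFormat> FMT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

    static Date parse(String s) {
        try {
            return FMT.get().parse(s);   // safe: instance is confined to this thread
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad date: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2015-01-28"));
    }
}
```

The same pattern applies to any shared parser whose internal state is mutated during parsing.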


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  common/src/java/org/apache/hive/common/util/TimestampParser.java, line 127
  https://reviews.apache.org/r/29898/diff/2/?file=825967#file825967line127
 
  Can't we do Long.valueOf()? That will be faster than BD parsing, I 
  presume.

If we don't want to worry about fractional millisecond values, then we can do 
this. We're throwing away the fractional portion anyway, since Joda does not 
have precision finer than 1 ms. I'll change this.
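The suggested simplification might look like this (an illustrative sketch, not the actual patch): truncate at the decimal point and parse with Long.parseLong instead of going through BigDecimal.

```java
// Sketch of the suggestion above: when sub-millisecond precision is discarded
// anyway, a numeric timestamp string can be parsed by truncating at the
// decimal point and using Long.parseLong, avoiding BigDecimal entirely.
public class MillisParseSketch {
    static long parseMillis(String s) {
        int dot = s.indexOf('.');
        // drop the fractional part; Joda-Time has no precision finer than 1 ms
        return Long.parseLong(dot < 0 ? s : s.substring(0, dot));
    }

    public static void main(String[] args) {
        System.out.println(parseMillis("1422403200123.456"));
    }
}
```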


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java,
   line 114
  https://reviews.apache.org/r/29898/diff/2/?file=825979#file825979line114
 
  This is thrift generated file. Instead of hand modifying you need to 
  put this in thrift file and generate it via thrift compiler.

Whoops, missed that. Thanks for pointing it out, will fix.


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java, 
  lines 135-137
  https://reviews.apache.org/r/29898/diff/2/?file=825983#file825983line135
 
  I wonder why these and lastColumnTakesRest are not included in 
  LazyOIParams. Seems to me they should be included too. If you think 
  otherwise, it would be good to add a comment here about what distinguishes 
  these two sets of params.

So I thought these params had more to do with the SerDe and the handling of 
rows than with the actual values and ObjectInspector-related handling, which 
is why I left them out of the lazy OI params. Admittedly it does look a bit 
odd to bundle some of the params together and leave others out. If you think I 
should include them as well, I can do so.


 On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote:
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java, 
  line 649
  https://reviews.apache.org/r/29898/diff/2/?file=825983#file825983line649
 
  I think there is a helper method in apache commons (or guava) which can 
  let you do such parsing. Will be good to reuse that, if available.

Not sure if the commons/guava libs have something to escape commas (please 
correct me if I am wrong). I see that Hive uses opencsv, which handles 
CSV-style escaping; I will use that to parse the list.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29898/#review69930
---


On Jan. 20, 2015, 12:34 a.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29898/
 ---
 
 (Updated Jan. 20, 2015, 12:34 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Bugs: HIVE-9298
 https://issues.apache.org/jira/browse/HIVE-9298
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add new SerDe parameter timestamp.formats to specify alternate timestamp 
 patterns
 
 
 Diffs
 -
 
   common/pom.xml ede8aea 
   common/src/java/org/apache/hive/common/util/TimestampParser.java 
 PRE-CREATION 
   common/src/test/org/apache/hive/common/util/TestTimestampParser.java 
 PRE-CREATION 
   data/files/ts_formats.txt PRE-CREATION 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/DefaultHBaseKeyFactory.java
  98bc73f 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseLazyObjectFactory.java
  78f23cb 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/struct/AvroHBaseValueFactory.java
  a2ba827 
   
 

[jira] [Commented] (HIVE-9493) Failed job may not throw exceptions [Spark Branch]

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294876#comment-14294876
 ] 

Hive QA commented on HIVE-9493:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694934/HIVE-9493.1-spark.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7357 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/688/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/688/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-688/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694934 - PreCommit-HIVE-SPARK-Build

 Failed job may not throw exceptions [Spark Branch]
 --

 Key: HIVE-9493
 URL: https://issues.apache.org/jira/browse/HIVE-9493
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-9493.1-spark.patch


 Currently the remote driver assumes an exception will be thrown when a job 
 fails to run. This may not hold, since jobs are submitted asynchronously, so 
 we have to check the futures before deciding that a job was successful.
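The failure mode described above can be reproduced in miniature with plain java.util.concurrent: submission itself succeeds, and the failure only surfaces when the future is checked.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the asynchronous-failure point above: submit() does not throw for
// a job that will fail; the failure only appears when the Future is checked.
public class AsyncFailureSketch {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<Void> failing = () -> { throw new RuntimeException("job failed"); };
        Future<Void> job = pool.submit(failing);   // submission succeeds
        try {
            job.get();                             // only here does the failure surface
        } catch (Exception e) {
            System.out.println("caught: " + e.getCause().getMessage());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller that never calls get() (or otherwise inspects the future) would wrongly conclude the job succeeded, which is exactly the risk for the remote driver.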



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 29807: HIVE-9253: MetaStore server should support timeout for long running requests

2015-01-28 Thread Dong Chen


 On Jan. 21, 2015, 6:43 a.m., Lefty Leverenz wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, lines 372-374
  https://reviews.apache.org/r/29807/diff/2/?file=827704#file827704line372
 
  Shouldn't long / LONG be included in the names 
  hive.metastore.server.running.method.timeout / 
  METASTORE_SERVER_RUNNING_METHOD_TIMEOUT?
  
  Also, please specify the JIRA number (HIVE-9253) in this review 
  request, either under Bugs in the Information section or in the Summary, or 
  both.

Thanks for your review and suggestions, Lefty! 
I have renamed it in the new patch.


- Dong


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29807/#review68878
---


On Jan. 22, 2015, 8:22 a.m., Dong Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29807/
 ---
 
 (Updated Jan. 22, 2015, 8:22 a.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-9253: MetaStore server should support timeout for long running requests
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5e00575 
   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 caad948 
   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
 564ac8b 
   metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 
 01ad36a 
   metastore/src/java/org/apache/hadoop/hive/metastore/RuntimeTimeout.java 
 PRE-CREATION 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/RuntimeTimeoutException.java
  PRE-CREATION 
   
 metastore/src/java/org/apache/hadoop/hive/metastore/SessionPropertiesListener.java
  PRE-CREATION 
   
 metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTimeout.java
  PRE-CREATION 
   metastore/src/test/org/apache/hadoop/hive/metastore/TestRuntimeTimeout.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/29807/diff/
 
 
 Testing
 ---
 
 UT passed
 
 
 Thanks,
 
 Dong Chen
 




[jira] [Commented] (HIVE-9302) Beeline add jar local to client

2015-01-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294853#comment-14294853
 ] 

Ferdinand Xu commented on HIVE-9302:


Sorry, I meant to. 
There are two kinds of use cases. One is to add an existing known driver like 
the mysql or postgres driver; the currently supported drivers are postgres and 
mysql.
{noformat}
# beeline
beeline> !addlocaldriverjar /path/to/mysql-connector-java-5.1.27-bin.jar
beeline> !connect mysql://host:3306/testdb
{noformat}
The other is to add a customized driver.
{noformat}
# beeline
beeline> !addlocaldriverjar /path/to/DummyDriver-1.0-SNAPSHOT.jar
beeline> !addlocaldrivername org.apache.dummy.DummyDrive
beeline> !connect mysql://host:3306/testdb
{noformat}

 Beeline add jar local to client
 ---

 Key: HIVE-9302
 URL: https://issues.apache.org/jira/browse/HIVE-9302
 Project: Hive
  Issue Type: New Feature
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, 
 HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar


 At present if a beeline user uses {{add jar}} the path they give is actually 
 on the HS2 server. It'd be great to allow beeline users to add local jars as 
 well.
 It might be useful to do this in the jdbc driver itself.





[jira] [Commented] (HIVE-9460) LLAP: Fix some static vars in the operator pipeline

2015-01-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294854#comment-14294854
 ] 

Lefty Leverenz commented on HIVE-9460:
--

Doc note:  This adds configuration parameter *hive.execution.mode* to 
HiveConf.java, so it will need to be documented in the wiki when the LLAP 
branch gets merged to trunk.

Should we add a TODOC-LLAP label to keep track of these doc issues?

 LLAP: Fix some static vars in the operator pipeline
 ---

 Key: HIVE-9460
 URL: https://issues.apache.org/jira/browse/HIVE-9460
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-9460.1.patch


 There are a few static vars left in the operator pipeline. Can't have those 
 with multi-threaded execution...





[jira] [Commented] (HIVE-9273) Add option to fire metastore event on insert

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294880#comment-14294880
 ] 

Hive QA commented on HIVE-9273:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694889/HIVE-9273.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7405 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2544/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2544/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2544/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694889 - PreCommit-HIVE-TRUNK-Build

 Add option to fire metastore event on insert
 

 Key: HIVE-9273
 URL: https://issues.apache.org/jira/browse/HIVE-9273
 Project: Hive
  Issue Type: New Feature
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-9273.patch


 HIVE-9271 adds the ability for the client to request firing metastore events. 
  This can be used in the MoveTask to fire events when an insert is done that 
 does not add partitions to a table.





[jira] [Commented] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294985#comment-14294985
 ] 

Hive QA commented on HIVE-9477:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694949/HIVE-9477.1-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7354 tests executed
*Failed tests:*
{noformat}
TestSQLStdHiveAccessControllerHS2 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/689/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/689/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-689/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694949 - PreCommit-HIVE-SPARK-Build

 No error thrown when global limit optimization failed to find enough number 
 of rows [Spark Branch]
 --

 Key: HIVE-9477
 URL: https://issues.apache.org/jira/browse/HIVE-9477
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-9477.1-spark.patch


 MR will throw an error in such a case and rerun the query with the 
 optimization disabled.





[jira] [Updated] (HIVE-9292) CBO (Calcite Return Path): Inline GroupBy, Properties

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9292:
--
Attachment: HIVE-9292.06.patch

New patch; addressed [~jpullokkaran]'s comments to remove the usage of 
ParseContext in RewriteQueryUsingAggregateIndexCtx after HIVE-9327 went in.

 CBO (Calcite Return Path): Inline GroupBy, Properties
 -

 Key: HIVE-9292
 URL: https://issues.apache.org/jira/browse/HIVE-9292
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9292.01.patch, HIVE-9292.02.patch, 
 HIVE-9292.03.patch, HIVE-9292.04.patch, HIVE-9292.05.patch, 
 HIVE-9292.06.patch, HIVE-9292.patch, HIVE-9292.patch








Re: Review Request 29763: HIVE-9292

2015-01-28 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29763/
---

(Updated Jan. 28, 2015, 10:59 a.m.)


Review request for hive and John Pullokkaran.


Changes
---

New patch; addressed John's comments to remove the usage of ParseContext in 
RewriteQueryUsingAggregateIndexCtx after HIVE-9327 went in.


Bugs: HIVE-9292
https://issues.apache.org/jira/browse/HIVE-9292


Repository: hive-git


Description
---

CBO (Calcite Return Path): Inline GroupBy, Properties


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java
 3097385b92d4398ee57d3544354b383fe24719dd 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java
 69a5a4409164fc6cb725b315de08ec9d090b7f22 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 
dda4f75592209d88f25b5ca09ea9f32c77ea4ac6 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
c9a5ce53ffc3d5c791e0826be0cac771a4d20254 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 
0116c85979f02ea0f88bbf8085a7590694eb2dfb 

Diff: https://reviews.apache.org/r/29763/diff/


Testing
---

Existing tests.


Thanks,

Jesús Camacho Rodríguez



[jira] [Commented] (HIVE-9302) Beeline add jar local to client

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295029#comment-14295029
 ] 

Hive QA commented on HIVE-9302:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694940/HIVE-9302.2.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7415 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[1]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0]
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2548/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2548/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2548/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694940 - PreCommit-HIVE-TRUNK-Build

 Beeline add jar local to client
 ---

 Key: HIVE-9302
 URL: https://issues.apache.org/jira/browse/HIVE-9302
 Project: Hive
  Issue Type: New Feature
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, 
 HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, 
 postgresql-9.3.jdbc3.jar


 At present if a beeline user uses {{add jar}} the path they give is actually 
 on the HS2 server. It'd be great to allow beeline users to add local jars as 
 well.
 It might be useful to do this in the jdbc driver itself.





[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295450#comment-14295450
 ] 

Hive QA commented on HIVE-9253:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694989/HIVE-9253.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7407 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2551/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2551/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2551/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694989 - PreCommit-HIVE-TRUNK-Build

 MetaStore server should support timeout for long running requests
 -

 Key: HIVE-9253
 URL: https://issues.apache.org/jira/browse/HIVE-9253
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Dong Chen
Assignee: Dong Chen
 Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, 
 HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.patch


 In the description of HIVE-7195, one issue is that the MetaStore client timeout 
 is quite dumb: the client will time out, and the server has no idea the client 
 is gone.
 The server should support a timeout when a request from a client runs for a long 
 time.
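
A minimal sketch of the server-side idea (hypothetical class name, not the patch itself): record a deadline when a request starts, and have long-running code check it periodically so the server can abandon work whose client has likely given up.

```java
public class RequestTimeoutChecker {
    private final long deadlineMillis;

    // Deadline is fixed when the request begins.
    public RequestTimeoutChecker(long timeoutMillis) {
        this.deadlineMillis = System.currentTimeMillis() + timeoutMillis;
    }

    // Long-running loops call this periodically; past the deadline the
    // request is aborted instead of running on for a vanished client.
    public void checkTimeout() {
        if (System.currentTimeMillis() > deadlineMillis) {
            throw new RuntimeException("request exceeded its timeout");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RequestTimeoutChecker checker = new RequestTimeoutChecker(30);
        checker.checkTimeout(); // within budget: no exception
        Thread.sleep(60);
        boolean timedOut = false;
        try {
            checker.checkTimeout();
        } catch (RuntimeException e) {
            timedOut = true;
        }
        System.out.println("timedOut=" + timedOut); // timedOut=true
    }
}
```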





[jira] [Updated] (HIVE-9317) move Microsoft copyright to NOTICE file

2015-01-28 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-9317:

   Resolution: Fixed
Fix Version/s: 1.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed this. Thanks for the review, Alan.

 move Microsoft copyright to NOTICE file
 ---

 Key: HIVE-9317
 URL: https://issues.apache.org/jira/browse/HIVE-9317
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.15.0, 1.0.0

 Attachments: hive-9327.txt


 There are a set of files that still have the Microsoft copyright notices. 
 Those notices need to be moved into NOTICES and replaced with the standard 
 Apache headers.
 {code}
 ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java
 ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java
 {code}





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Xuefu Zhang
Congratulations to all!

--Xuefu

On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
 Management Committee. Please join me in congratulating these new PMC
 members!

 Thanks.

 - Carl



[jira] [Commented] (HIVE-9303) Parquet files are written with incorrect definition levels

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295780#comment-14295780
 ] 

Hive QA commented on HIVE-9303:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695040/HIVE-9303.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7400 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestPigHBaseStorageHandler - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2553/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2553/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2553/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695040 - PreCommit-HIVE-TRUNK-Build

 Parquet files are written with incorrect definition levels
 --

 Key: HIVE-9303
 URL: https://issues.apache.org/jira/browse/HIVE-9303
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Skye Wanderman-Milne
Assignee: Sergio Peña
 Attachments: HIVE-9303.1.patch


 The definition level, which determines which level of nesting is NULL, 
 appears to always be n or n-1, where n is the maximum definition level. This 
 means that only the innermost level of nesting can be NULL. This is only 
 relevant for Parquet files. For example:
 {code:sql}
 CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>)
 STORED AS TEXTFILE;
 INSERT OVERWRITE TABLE text_tbl
 SELECT IF(false, named_struct('b', named_struct('c', 1)), NULL)
 FROM tbl LIMIT 1;
 CREATE TABLE parq_tbl
 STORED AS PARQUET
 AS SELECT * FROM text_tbl;
 SELECT * FROM text_tbl;
 = NULL # right
 SELECT * FROM parq_tbl;
 = {"b":{"c":null}} # wrong
 {code}
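
For reference, how definition levels encode nesting can be sketched as follows (an illustration of the Parquet encoding, not Hive's writer): with three optional levels a, a.b, and a.b.c, the maximum definition level n is 3, and the recorded level counts how many of those levels are non-null.

```java
public class DefLevelDemo {
    // With three optional levels (a, a.b, a.b.c) the max definition level
    // n is 3; the level counts how many optional levels are non-null.
    static int definitionLevel(boolean aPresent, boolean bPresent, boolean cPresent) {
        if (!aPresent) return 0; // a itself is NULL
        if (!bPresent) return 1; // a.b is NULL
        if (!cPresent) return 2; // a.b.c is NULL
        return 3;                // value fully defined
    }

    public static void main(String[] args) {
        // The row inserted in the example above has a = NULL, i.e. level 0...
        System.out.println(definitionLevel(false, false, false)); // 0
        // ...but a writer that only ever emits n (3) or n-1 (2) records
        // level 2 instead, which reads back as {"b":{"c":null}}.
        System.out.println(definitionLevel(true, true, false));   // 2
    }
}
```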





[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]

2015-01-28 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9477:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Test failures above don't seem related to this patch in any way. Patch 
committed to spark branch. Thanks, Rui.

 No error thrown when global limit optimization failed to find enough number 
 of rows [Spark Branch]
 --

 Key: HIVE-9477
 URL: https://issues.apache.org/jira/browse/HIVE-9477
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
Priority: Blocker
 Fix For: spark-branch

 Attachments: HIVE-9477.1-spark.patch


 MR will throw an error in such a case and rerun the query with the 
 optimization disabled.





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Szehon Ho
Thanks and congrats to Vikram, Jason, Owen, and Prasanth !

On Wed, Jan 28, 2015 at 1:28 PM, Hari Subramaniyan 
hsubramani...@hortonworks.com wrote:

  Congrats everyone!


  Thanks

 Hari
  --
 *From:* cwsteinb...@gmail.com cwsteinb...@gmail.com on behalf of Carl
 Steinbach c...@apache.org
 *Sent:* Wednesday, January 28, 2015 1:15 PM
 *To:* dev@hive.apache.org; u...@hive.apache.org
 *Cc:* sze...@apache.org; vik...@apache.org; jd...@apache.org; Owen
 O'Malley; prasan...@apache.org
 *Subject:* [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit,
 Jason Dere, Owen O'Malley and Prasanth Jayachandran

   I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
  Management Committee. Please join me in congratulating these new PMC
 members!

  Thanks.

  - Carl



[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml

2015-01-28 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295829#comment-14295829
 ] 

Vikram Dixit K commented on HIVE-8807:
--

If I end up rolling out a new release and we have a patch for this by then, I 
will include this in the next roll-out.

Thanks
Vikram.

 Obsolete default values in webhcat-default.xml
 --

 Key: HIVE-8807
 URL: https://issues.apache.org/jira/browse/HIVE-8807
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Lefty Leverenz
 Fix For: 0.14.1


 The defaults for templeton.pig.path and templeton.hive.path are 0.11 in 
 webhcat-default.xml, but they ought to match current release numbers.
 The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml).





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Vaibhav Gumashta
Congratulations e’one!

—Vaibhav
On Jan 28, 2015, at 1:20 PM, Xuefu Zhang 
xzh...@cloudera.commailto:xzh...@cloudera.com wrote:

Congratulations to all!

--Xuefu

On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach 
c...@apache.orgmailto:c...@apache.org wrote:
I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen 
O'Malley and Prasanth Jayachandran have been elected to the Hive Project 
Management Committee. Please join me in congratulating these new PMC 
members!

Thanks.

- Carl




Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Hari Subramaniyan
Congrats everyone!


Thanks

Hari


From: cwsteinb...@gmail.com cwsteinb...@gmail.com on behalf of Carl Steinbach 
c...@apache.org
Sent: Wednesday, January 28, 2015 1:15 PM
To: dev@hive.apache.org; u...@hive.apache.org
Cc: sze...@apache.org; vik...@apache.org; jd...@apache.org; Owen O'Malley; 
prasan...@apache.org
Subject: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, 
Owen O'Malley and Prasanth Jayachandran

I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen 
O'Malley and Prasanth Jayachandran have been elected to the Hive Project 
Management Committee. Please join me in congratulating these new PMC 
members!

Thanks.

- Carl


[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295892#comment-14295892
 ] 

Laljo John Pullokkaran commented on HIVE-9431:
--

[~jcamachorodriguez] Could you rebase and resubmit the patch? The build 
seems to be failing with the current patch.

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.patch








[jira] [Commented] (HIVE-9482) Hive parquet timestamp compatibility

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295905#comment-14295905
 ] 

Hive QA commented on HIVE-9482:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695072/HIVE-9482.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7406 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_external_time
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2554/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2554/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2554/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695072 - PreCommit-HIVE-TRUNK-Build

 Hive parquet timestamp compatibility
 

 Key: HIVE-9482
 URL: https://issues.apache.org/jira/browse/HIVE-9482
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.15.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 0.15.0

 Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, 
 parquet_external_time.parq


 In the current Hive implementation, timestamps are stored in UTC (converted from 
 the current timezone), based on the original parquet timestamp spec.
 However, we find this is not compatible with other tools; after some 
 investigation, it is not the behavior of the other file formats, or even of some 
 databases (Hive's Timestamp is closer to a 'timestamp without timezone' 
 datatype).
 This is the first part of the fix, which restores compatibility with 
 parquet-timestamp files generated by external tools by skipping conversion on 
 reading.
 A later fix will change the write path to not convert, and stop the 
 read-conversion even for files written by Hive itself.
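
The read-side shift described above can be illustrated with plain java.time (an illustration of the behavior, not Hive's code): a value converted to UTC on write and re-interpreted in a different zone on read no longer matches the wall-clock time that was written.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampShiftDemo {
    public static void main(String[] args) {
        // A writer in America/Los_Angeles stores "2015-01-28 10:00:00"
        // after converting the local wall-clock time to UTC.
        LocalDateTime written = LocalDateTime.of(2015, 1, 28, 10, 0);
        Instant storedUtc =
            written.atZone(ZoneId.of("America/Los_Angeles")).toInstant();

        // A reader that interprets the stored instant in another zone sees
        // a shifted wall-clock value -- the incompatibility described above.
        LocalDateTime readBack = LocalDateTime.ofInstant(storedUtc, ZoneOffset.UTC);
        System.out.println(readBack); // 2015-01-28T18:00 -- 8 hours off

        // Skipping the conversion treats the value as "timestamp without
        // timezone": the reader gets back exactly the wall-clock written.
        System.out.println(written);  // 2015-01-28T10:00
    }
}
```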





[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9431:
--
Attachment: HIVE-9431.02.patch

Rebasing patch.

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.patch








[jira] [Commented] (HIVE-9485) Update trunk to 1.2.0-SNAPSHOT

2015-01-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295775#comment-14295775
 ] 

Thejas M Nair commented on HIVE-9485:
-

+1

 Update trunk to 1.2.0-SNAPSHOT
 --

 Key: HIVE-9485
 URL: https://issues.apache.org/jira/browse/HIVE-9485
 Project: Hive
  Issue Type: Task
Affects Versions: 1.2.0
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.2.0

 Attachments: HIVE-9485.1.patch


 As discussed on list, 0.14.1 will be 1.0 and 0.15 will be 1.1. As such we 
 should change trunk to 1.2.0.





[jira] [Comment Edited] (HIVE-9485) Update trunk to 1.2.0-SNAPSHOT

2015-01-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295775#comment-14295775
 ] 

Thejas M Nair edited comment on HIVE-9485 at 1/28/15 8:16 PM:
--

+1
Thanks [~brocknoland]!


was (Author: thejas):
+1

 Update trunk to 1.2.0-SNAPSHOT
 --

 Key: HIVE-9485
 URL: https://issues.apache.org/jira/browse/HIVE-9485
 Project: Hive
  Issue Type: Task
Affects Versions: 1.2.0
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 1.2.0

 Attachments: HIVE-9485.1.patch


 As discussed on list, 0.14.1 will be 1.0 and 0.15 will be 1.1. As such we 
 should change trunk to 1.2.0.





Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Chao Sun
Congrats!!!

On Wed, Jan 28, 2015 at 1:21 PM, Vaibhav Gumashta vgumas...@hortonworks.com
 wrote:

 Congratulations e’one!

 —Vaibhav
 On Jan 28, 2015, at 1:20 PM, Xuefu Zhang xzh...@cloudera.commailto:
 xzh...@cloudera.com wrote:

 Congratulations to all!

 --Xuefu

 On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.orgmailto:
 c...@apache.org wrote:
 I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
 O'Malley and Prasanth Jayachandran have been elected to the Hive Project
  Management Committee. Please join me in congratulating these new PMC
 members!

 Thanks.

 - Carl





-- 
Best,
Chao


Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/#review70062
---



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/29900/#comment115035

we should add a boolean false argument at the end here, so that it does not 
show up in the hive-default.xml.template file.  See hive.in.test for example. 

There are other hive.test params to be fixed similarly as well; we can do 
it as part of this one or in a separate jira.



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
https://reviews.apache.org/r/29900/#comment115039

SessionState can be shared across multiple query executions, in 
hiveserver2. One case where this happens is when one user in Hue opens multiple 
tabs and runs queries from each of them simultaneously.

This means that there can be race conditions where multiple get_timestamp 
invocations in a single query return different results because another 
query's compilation started in between. (This will happen once the lock 
around compile is removed in HS2.)

We need to store this in a truly query-specific variable. I am still 
thinking about what the best place for that is.



ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
https://reviews.apache.org/r/29900/#comment115041

I have seen at least one other place where a test timestamp is injected. It 
might make sense to use some kind of getTimestamp class that can be 
customized to return a specific timestamp. 
But this does not have to be addressed in this jira.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java
https://reviews.apache.org/r/29900/#comment115019

I think it would be good to clarify that all calls within a query return 
the same value. For example, if the query lifetime crosses a date boundary, 
you would not see two different dates for different records.

Maybe reword it to something like: Returns the current date as of the 
start of the query.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java
https://reviews.apache.org/r/29900/#comment115018

looks like we can consider this to be deterministic, since the value does 
not change within a query.
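The invariant discussed in these comments — capture the timestamp once, reuse it for every row — can be sketched as follows (class and field names here are made up for illustration; the actual patch keeps the value in SessionState and sets it from the Driver):

```java
import java.sql.Timestamp;

public class QueryTimestampSketch {
    // Captured once, at the start of query compilation/execution.
    private final Timestamp queryStart;

    public QueryTimestampSketch(Timestamp queryStart) {
        this.queryStart = queryStart;
    }

    // Called once per row; always returns the captured value, so a query
    // whose lifetime crosses a date boundary still sees a single timestamp.
    public Timestamp evaluate() {
        return queryStart;
    }

    public static void main(String[] args) throws InterruptedException {
        QueryTimestampSketch udf =
            new QueryTimestampSketch(new Timestamp(System.currentTimeMillis()));
        Timestamp first = udf.evaluate();
        Thread.sleep(50);  // rows evaluated later in the same query...
        System.out.println(first.equals(udf.evaluate())); // prints "true"
    }
}
```

This also makes the UDF deterministic in the optimizer's sense: within one query the value never changes.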



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
https://reviews.apache.org/r/29900/#comment115021

we should update the description to clarify that this is the timestamp at 
the beginning of query evaluation/execution; evaluation is probably the 
better word.


- Thejas Nair


On Jan. 19, 2015, 10:01 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29900/
 ---
 
 (Updated Jan. 19, 2015, 10:01 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-5472
 https://issues.apache.org/jira/browse/HIVE-5472
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
 from the SessionState.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 25cccd7 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 0226f28 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7c4ca7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java f45b20a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
   ql/src/test/results/clientpositive/current_date_timestamp.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 9ecb0a0 
 
 Diff: https://reviews.apache.org/r/29900/diff/
 
 
 Testing
 ---
 
 qfile test added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Commented] (HIVE-9482) Hive parquet timestamp compatibility

2015-01-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295911#comment-14295911
 ] 

Szehon Ho commented on HIVE-9482:
-

Test failures don't look related (these spark tests also failed in other 
builds). 

parquet_external_time will fail until the attached parquet file is checked in 
(/data/files/parquet_external_time.parq).

 Hive parquet timestamp compatibility
 

 Key: HIVE-9482
 URL: https://issues.apache.org/jira/browse/HIVE-9482
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.15.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 0.15.0

 Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, 
 parquet_external_time.parq


 In current Hive implementation, timestamps are stored in UTC (converted from 
 current timezone), based on original parquet timestamp spec.
 However, we find this is not compatible with other tools, and after some 
 investigation it is not the way other file formats, or even some databases, 
 behave (Hive Timestamp is more equivalent to a 'timestamp without time 
 zone' datatype).
 This is the first part of the fix, which will restore compatibility with 
 parquet-timestamp files generated by external tools by skipping conversion on 
 reading.
 Later fix will change the write path to not convert, and stop the 
 read-conversion even for files written by Hive itself.





[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-28 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin updated HIVE-9487:
-
Status: Patch Available  (was: Open)

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/
---

(Updated Jan. 28, 2015, 11:22 p.m.)


Review request for hive and Thejas Nair.


Changes
---

Update patch based on review comments


Bugs: HIVE-5472
https://issues.apache.org/jira/browse/HIVE-5472


Repository: hive-git


Description
---

Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
from the SessionState.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
 PRE-CREATION 
  ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
  ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 36c8743 

Diff: https://reviews.apache.org/r/29900/diff/


Testing
---

qfile test added


Thanks,

Jason Dere



Review Request 30385: Use SASL to establish the remote context connection.

2015-01-28 Thread Marcelo Vanzin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30385/
---

Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang.


Bugs: HIVE-9487
https://issues.apache.org/jira/browse/HIVE-9487


Repository: hive-git


Description
---

Instead of the insecure, ad-hoc auth mechanism currently used, perform
a SASL negotiation to establish trust. This requires the secret to be
distributed through some secure channel (just like before).

Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and
probably wouldn't work well here) also allows us to add encryption
without the need for SSL (yay?).

Only DIGEST-MD5 has been really tested. Supporting other mechanisms
will probably mean adding new callback handlers in the client and
server portions, but shouldn't be hard if desired.
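The negotiation described above can be sketched with nothing but the JDK's javax.security.sasl API. The service name "rsc", user "clientId", and secret below are made-up stand-ins; the real patch wires the SaslClient/SaslServer into the Netty RPC pipeline rather than an in-memory loop:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslServer;

public class SaslHandshakeSketch {
    static final String USER = "clientId";                // stand-in identity
    static final char[] SECRET = "secret".toCharArray();  // shared out of band

    // Both sides share one secret, so a single handler serves both; the
    // AuthorizeCallback only ever reaches the server side.
    static CallbackHandler handler() {
        return (Callback[] callbacks) -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(USER);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(SECRET);
                } else if (cb instanceof RealmCallback) {
                    RealmCallback rc = (RealmCallback) cb;
                    rc.setText(rc.getDefaultText());
                } else if (cb instanceof AuthorizeCallback) {
                    ((AuthorizeCallback) cb).setAuthorized(true);
                }
            }
        };
    }

    public static boolean handshake() throws Exception {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth");  // "auth-conf" would add encryption
        SaslServer server = Sasl.createSaslServer(
            "DIGEST-MD5", "rsc", "localhost", props, handler());
        SaslClient client = Sasl.createSaslClient(
            new String[] {"DIGEST-MD5"}, null, "rsc", "localhost",
            props, handler());

        // DIGEST-MD5 is server-first: exchange challenge/response until done.
        byte[] challenge = server.evaluateResponse(new byte[0]);
        while (!client.isComplete()) {
            byte[] response = client.evaluateChallenge(challenge);
            if (response != null && !server.isComplete()) {
                challenge = server.evaluateResponse(response);
            }
        }
        return client.isComplete() && server.isComplete();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handshake());  // true once both sides authenticate
    }
}
```

With "auth-conf" negotiated instead of "auth", the resulting wrap/unwrap calls would encrypt the payload, which is what makes SSL unnecessary here.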


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
d4d98d7c0c28cdb1d19c700e20537ef405be2e01 
  spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 
ce2f9b6b132dc47f899798e47d18a1f6b0dd707f 
  
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java 
3a7149341bac086e5efe931595143d3bebbdb5db 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
5f9be658a855cc15c576f1a98376fcd85475e3b7 
  
spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java
 0c29c9441fb3e9daf690510a2c9b5716671e2571 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 
2c858a121aaeca6af20f5e332de207694348a030 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java 
fffe24b3cbe6a5d7387e751adbc65f5b140c9089 
  
spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java
 eff640f7b24348043dbce734510698d9294579c6 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 
5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java 
PRE-CREATION 
  
spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java
 af534375a3ed86a3a9ad57c2f21a9a8bf6113714 
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java 
ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 

Diff: https://reviews.apache.org/r/30385/diff/


Testing
---

Unit tests.


Thanks,

Marcelo Vanzin



[jira] [Updated] (HIVE-5472) support a simple scalar which returns the current timestamp

2015-01-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5472:
-
Attachment: HIVE-5472.3.patch

Patch v3, incorporating review feedback.

 support a simple scalar which returns the current timestamp
 ---

 Key: HIVE-5472
 URL: https://issues.apache.org/jira/browse/HIVE-5472
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: N Campbell
Assignee: Jason Dere
 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch


 ISO-SQL has two forms of functions
 local and current timestamp where the former is a TIMESTAMP WITHOUT TIMEZONE 
 and the latter with TIME ZONE
 select cast ( unix_timestamp() as timestamp ) from T
 implement a function which computes LOCAL TIMESTAMP which would be the 
 current timestamp for the users session time zone.





[jira] [Commented] (HIVE-9292) CBO (Calcite Return Path): Inline GroupBy, Properties

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296082#comment-14296082
 ] 

Jesus Camacho Rodriguez commented on HIVE-9292:
---

Failures are not related to the patch (see HIVE-9498).

 CBO (Calcite Return Path): Inline GroupBy, Properties
 -

 Key: HIVE-9292
 URL: https://issues.apache.org/jira/browse/HIVE-9292
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9292.01.patch, HIVE-9292.02.patch, 
 HIVE-9292.03.patch, HIVE-9292.04.patch, HIVE-9292.05.patch, 
 HIVE-9292.06.patch, HIVE-9292.07.patch, HIVE-9292.patch, HIVE-9292.patch








[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9436:
---
   Resolution: Fixed
Fix Version/s: 1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk

 RetryingMetaStoreClient does not retry JDOExceptions
 

 Key: HIVE-9436
 URL: https://issues.apache.org/jira/browse/HIVE-9436
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.2.0

 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch


 RetryingMetaStoreClient has a bug in the following bit of code:
 {code}
 } else if ((e.getCause() instanceof MetaException) &&
 e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
   caughtException = (MetaException) e.getCause();
 } else {
   throw e.getCause();
 }
 {code}
 The bug here is that Java's String.matches matches the entire string against 
 the regex, and thus the match will fail if the message contains anything 
 before or after JDO[a-zA-Z]\*Exception. The solution, however, is very 
 simple: we should match (?s).\*JDO[a-zA-Z]\*Exception.\* instead.
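The anchoring behavior described above is easy to demonstrate outside Hive (the exception message below is an illustrative stand-in, not taken from a real stack trace):

```java
public class JdoRegexDemo {
    public static void main(String[] args) {
        String msg = "javax.jdo.JDODataStoreException: timeout";
        // String.matches() anchors the pattern to the WHOLE string, so the
        // bare class-name pattern never matches a message with surrounding text.
        boolean buggy = msg.matches("JDO[a-zA-Z]*Exception");
        // (?s) lets '.' cross line breaks; the leading/trailing .* allow
        // arbitrary text around the exception class name.
        boolean fixed = msg.matches("(?s).*JDO[a-zA-Z]*Exception.*");
        System.out.println(buggy + " " + fixed); // prints "false true"
    }
}
```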





[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan

2015-01-28 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-9504:
---
Attachment: (was: HIVE-9504.00.patch)

 [beeline] ZipException when using !scan
 ---

 Key: HIVE-9504
 URL: https://issues.apache.org/jira/browse/HIVE-9504
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 0.15.0

 Attachments: HIVE-9504.00.patch


 Noticed this while mucking around:
 {noformat}
 0: jdbc:hive2://localhost:1/ !scan
 java.util.zip.ZipException: error in opening zip file
 at java.util.zip.ZipFile.open(Native Method)
 at java.util.zip.ZipFile.init(ZipFile.java:220)
 at java.util.zip.ZipFile.init(ZipFile.java:150)
 at java.util.jar.JarFile.init(JarFile.java:166)
 at java.util.jar.JarFile.init(JarFile.java:130)
 at 
 org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579)
 at org.apache.hive.beeline.Commands.scan(Commands.java:278)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at 
 org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935)
 at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740)
 at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
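For illustration, the failure (and one defensive fix) can be reproduced with a file that is not actually a jar. The temp file below is a made-up stand-in for whatever entry !scan stumbled over; this is a sketch, not the actual HIVE-9504 patch:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarFile;

public class ScanSketch {
    public static void main(String[] args) throws IOException {
        // A ".jar" on the scan path that is not a valid zip archive.
        Path bogus = Files.createTempFile("not-a-jar", ".jar");
        Files.write(bogus, "plain text".getBytes());
        try (JarFile jar = new JarFile(bogus.toFile())) {
            System.out.println("opened " + jar.getName());
        } catch (IOException e) {
            // ZipException extends IOException; skipping the bad entry keeps
            // the driver scan alive instead of aborting the whole command.
            System.out.println("skipped: " + e.getClass().getSimpleName());
        }
    }
}
```

Running this prints "skipped: ZipException" instead of propagating the error up through the command handler.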





[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan

2015-01-28 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-9504:
---
Attachment: HIVE-9504.00.patch

 [beeline] ZipException when using !scan
 ---

 Key: HIVE-9504
 URL: https://issues.apache.org/jira/browse/HIVE-9504
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 0.15.0

 Attachments: HIVE-9504.00.patch


 Noticed this while mucking around:
 {noformat}
 0: jdbc:hive2://localhost:1/ !scan
 java.util.zip.ZipException: error in opening zip file
 at java.util.zip.ZipFile.open(Native Method)
 at java.util.zip.ZipFile.init(ZipFile.java:220)
 at java.util.zip.ZipFile.init(ZipFile.java:150)
 at java.util.jar.JarFile.init(JarFile.java:166)
 at java.util.jar.JarFile.init(JarFile.java:130)
 at 
 org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579)
 at org.apache.hive.beeline.Commands.scan(Commands.java:278)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at 
 org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935)
 at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740)
 at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}





[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan

2015-01-28 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-9504:
---
Status: Patch Available  (was: Open)

 [beeline] ZipException when using !scan
 ---

 Key: HIVE-9504
 URL: https://issues.apache.org/jira/browse/HIVE-9504
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 0.15.0

 Attachments: HIVE-9504.00.patch


 Noticed this while mucking around:
 {noformat}
 0: jdbc:hive2://localhost:1/ !scan
 java.util.zip.ZipException: error in opening zip file
 at java.util.zip.ZipFile.open(Native Method)
 at java.util.zip.ZipFile.init(ZipFile.java:220)
 at java.util.zip.ZipFile.init(ZipFile.java:150)
 at java.util.jar.JarFile.init(JarFile.java:166)
 at java.util.jar.JarFile.init(JarFile.java:130)
 at 
 org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579)
 at org.apache.hive.beeline.Commands.scan(Commands.java:278)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at 
 org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935)
 at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740)
 at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}





Re: Review Request 30385: Use SASL to establish the remote context connection.

2015-01-28 Thread Marcelo Vanzin


 On Jan. 29, 2015, 12:36 a.m., Xuefu Zhang wrote:
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java, line 
  20
  https://reviews.apache.org/r/30385/diff/1/?file=839319#file839319line20
 
  Nit: if you need to submit another patch, let's not auto reorg the 
  imports.

I changed this because someone broke it... now it's in line with the usual 
order you see in the rest of Hive code.


- Marcelo


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30385/#review70119
---


On Jan. 28, 2015, 11:22 p.m., Marcelo Vanzin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30385/
 ---
 
 (Updated Jan. 28, 2015, 11:22 p.m.)
 
 
 Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang.
 
 
 Bugs: HIVE-9487
 https://issues.apache.org/jira/browse/HIVE-9487
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Instead of the insecure, ad-hoc auth mechanism currently used, perform
 a SASL negotiation to establish trust. This requires the secret to be
 distributed through some secure channel (just like before).
 
 Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and
 probably wouldn't work well here) also allows us to add encryption
 without the need for SSL (yay?).
 
 Only DIGEST-MD5 has been really tested. Supporting other mechanisms
 will probably mean adding new callback handlers in the client and
 server portions, but shouldn't be hard if desired.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
 d4d98d7c0c28cdb1d19c700e20537ef405be2e01 
   spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 
 ce2f9b6b132dc47f899798e47d18a1f6b0dd707f 
   
 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java
  3a7149341bac086e5efe931595143d3bebbdb5db 
   
 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
 5f9be658a855cc15c576f1a98376fcd85475e3b7 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java
  0c29c9441fb3e9daf690510a2c9b5716671e2571 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 
 2c858a121aaeca6af20f5e332de207694348a030 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java 
 fffe24b3cbe6a5d7387e751adbc65f5b140c9089 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java
  eff640f7b24348043dbce734510698d9294579c6 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 
 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java 
 PRE-CREATION 
   
 spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java
  af534375a3ed86a3a9ad57c2f21a9a8bf6113714 
   spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java 
 ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 
 
 Diff: https://reviews.apache.org/r/30385/diff/
 
 
 Testing
 ---
 
 Unit tests.
 
 
 Thanks,
 
 Marcelo Vanzin
 




[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296148#comment-14296148
 ] 

Xuefu Zhang commented on HIVE-9487:
---

Patch looks good to me. I left a minor comment on RB. [~chengxiang li] Could 
you also take a look?

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-28 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9103:
---
Attachment: HIVE-9103-1.spark.patch

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch


 In MR, a backup task can be executed if the original task, which probably 
 contains certain (join) optimizations, fails. This JIRA is to track this topic 
 for Spark. We need to determine whether we need this and implement it if 
 necessary. This is a follow-up of HIVE-9099.





[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-28 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9103:
---
Attachment: (was: HIVE-9103-1.spark.patch)

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch


 In MR, a backup task can be executed if the original task, which probably 
 contains certain (join) optimizations, fails. This JIRA is to track this topic 
 for Spark. We need to determine whether we need this and implement it if 
 necessary. This is a follow-up of HIVE-9099.





[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296187#comment-14296187
 ] 

Xuefu Zhang commented on HIVE-9503:
---

I see. I guess the overhead is bearable. It gives a much better user experience 
than if we auto-convert the task and the query fails, leaving the user in the 
dark.

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
 Tez, and only to MR (which has the legacy conditional mapjoin)
 However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though the description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These configs should be updated to reflect this case.





[jira] [Comment Edited] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296178#comment-14296178
 ] 

Xuefu Zhang edited comment on HIVE-9503 at 1/29/15 1:01 AM:


Yeah, Hive has long since reached a point where the properties are confusing and 
sometimes contradictory and duplicative. These two properties plus 
hive.auto.convert.join are an example. The two properties are meant to be used 
together; ignoring one while honoring the other doesn't seem like a clean 
solution. While it's already a legacy for MR and Tez, I'd like to have a 
cleaner solution for Spark since we still have the chance. If we want to be 
consistent across engines, I'd rather fix Spark to be consistent with MR.


was (Author: xuefuz):
Yeah, Hive has long reached to a point where the properties are confusing and 
sometimes contradicting and duplicating. These two properties plus 
hive.auto.convert.join are an example. The two properties are meant to be used 
together. Ignoring one while taking the other doesn't seem to be a clean 
solution. While it's already a legacy for MR and Tez, I'd like to have a 
cleaner solution for Spark since we still have the chance. 

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
 Tez, and only to MR (which has the legacy conditional mapjoin)
 However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though the description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These configs should be updated to reflect this case.





[jira] [Assigned] (HIVE-9399) ppd_multi_insert.q generate same output in different order, when mapred.reduce.tasks is set to larger than 1

2015-01-28 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao reassigned HIVE-9399:
--

Assignee: Chao

 ppd_multi_insert.q generate same output in different order, when 
 mapred.reduce.tasks is set to larger than 1
 

 Key: HIVE-9399
 URL: https://issues.apache.org/jira/browse/HIVE-9399
 Project: Hive
  Issue Type: Test
Reporter: Chao
Assignee: Chao

 If running ppd_multi_insert.q with {{set mapred.reduce.tasks=3}}, the output 
 order is different, even with {{SORT_QUERY_RESULTS}}.





[jira] [Commented] (HIVE-9470) Use a generic writable object to run ColumnarStorageBench write/read tests

2015-01-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296201#comment-14296201
 ] 

Ferdinand Xu commented on HIVE-9470:


Thank you for your update.  +1

 Use a generic writable object to run ColumnarStorageBench write/read tests 
 --

 Key: HIVE-9470
 URL: https://issues.apache.org/jira/browse/HIVE-9470
 Project: Hive
  Issue Type: Improvement
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-9470.1.patch, HIVE-9470.2.patch


 The ColumnarStorageBench benchmark class uses a Parquet writable object 
 to run all write/read/serialize/deserialize tests. It would be better to use 
 a more generic writable object (like text writables) to get fairer benchmark 
 results across storage formats.
 Using Parquet writables may give Parquet an advantage when writing Parquet.





[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296198#comment-14296198
 ] 

Hive QA commented on HIVE-9487:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695117/HIVE-9487.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7359 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/690/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/690/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-690/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695117 - PreCommit-HIVE-SPARK-Build

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.





[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file

2015-01-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295858#comment-14295858
 ] 

Thejas M Nair commented on HIVE-9317:
-

This is not in 1.0 RC1. Do we need another RC for this? It looks like we do. 
cc [~vikram.dixit] 


 move Microsoft copyright to NOTICE file
 ---

 Key: HIVE-9317
 URL: https://issues.apache.org/jira/browse/HIVE-9317
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.15.0, 1.0.0

 Attachments: hive-9327.txt


 There are a set of files that still have the Microsoft copyright notices. 
 Those notices need to be moved into NOTICES and replaced with the standard 
 Apache headers.
 {code}
 ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java
 ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java
 ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java
 ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java
 {code}





Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/#review70083
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java
https://reviews.apache.org/r/29900/#comment115045

//you can use substring(0, 10) to get date part of the timestamp
String dtStr = 
SessionState.get().getQueryCurrentTimestamp().toString().substring(0,10);
dateVal = Date.valueOf(dtStr);
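As a standalone sketch of the substring approach suggested above (the class and method names here are illustrative, not part of the patch):

```java
import java.sql.Date;
import java.sql.Timestamp;

public class CurrentDateFromTimestamp {
    // Timestamp.toString() yields "yyyy-mm-dd hh:mm:ss.f...", so the first
    // 10 characters are exactly the date part.
    public static Date datePart(Timestamp ts) {
        return Date.valueOf(ts.toString().substring(0, 10));
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2015-01-28 23:22:05.123");
        System.out.println(datePart(ts)); // prints 2015-01-28
    }
}
```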


- Alexander Pivovarov


On Jan. 19, 2015, 10:01 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29900/
 ---
 
 (Updated Jan. 19, 2015, 10:01 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-5472
 https://issues.apache.org/jira/browse/HIVE-5472
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
 from the SessionState.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 25cccd7 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 0226f28 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7c4ca7 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java f45b20a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
   ql/src/test/results/clientpositive/current_date_timestamp.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 9ecb0a0 
 
 Diff: https://reviews.apache.org/r/29900/diff/
 
 
 Testing
 ---
 
 qfile test added
 
 
 Thanks,
 
 Jason Dere
 




[ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran

2015-01-28 Thread Carl Steinbach
I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen
O'Malley and Prasanth Jayachandran have been elected to the Hive Project
Management Committee. Please join me in congratulating these new PMC
members!

Thanks.

- Carl


[jira] [Created] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-9503:
---

 Summary: Update 'hive.auto.convert.join.noconditionaltask.*' 
descriptions
 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor


The 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
Tez, only to MR (which has the legacy conditional mapjoin).

However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
Spark, Tez, and MR, even though its description indicates it only applies if 
the above flag is on, which is true only for MR.

These configs should be updated to reflect this case.






[jira] [Updated] (HIVE-9188) BloomFilter in ORC row group index

2015-01-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-9188:

Attachment: HIVE-9188.5.patch

[~owen.omalley] The updated patch creates separate streams for the bloom 
filter. It keeps only row-group-level bloom filters; stripe-level bloom filters 
are dropped. The disk IO is merged while reading the row index.

[~gopalv] Addressed all your review comments. Additionally FileDump will 
aggregate the bloom filters to stripe level and will print the stats. You might 
want to use
{code}
hive --orcfiledump --rowindex=column_index_csv_list file_path
{code}

 BloomFilter in ORC row group index
 --

 Key: HIVE-9188
 URL: https://issues.apache.org/jira/browse/HIVE-9188
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Affects Versions: 0.15.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
  Labels: orcfile
 Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, 
 HIVE-9188.4.patch, HIVE-9188.5.patch


 BloomFilters are well-known probabilistic data structures for set membership 
 checking. We can use bloom filters in the ORC index for better row group 
 pruning. Currently, the ORC row group index uses min/max statistics to 
 eliminate row groups (and stripes) that do not satisfy the predicate condition 
 specified in the query. But in some cases, the efficiency of min/max based 
 elimination is not optimal (e.g., unsorted columns with a wide range of 
 entries). Bloom filters can be an effective and efficient alternative for row 
 group/split elimination for point queries or queries with an IN clause.
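To illustrate the pruning idea, here is a tiny generic bloom filter (this is a sketch with made-up parameters and hash mixing, not ORC's actual on-disk encoding): a lookup that returns false proves the key is absent, so the whole row group can be skipped.

```java
import java.util.BitSet;

public class RowGroupBloom {
    private final BitSet bits;
    private final int m, k; // m bits, k hash probes

    public RowGroupBloom(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // Derive k probe positions from two hashes (Kirsch-Mitzenmacher scheme).
    private int probe(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9e3779b9;
        return Math.floorMod(h1 + i * h2, m);
    }

    public void add(String key) {
        for (int i = 0; i < k; i++) bits.set(probe(key, i));
    }

    // false => key is definitely absent; true => key is probably present.
    public boolean mightContain(String key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(probe(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        RowGroupBloom bf = new RowGroupBloom(1 << 16, 4);
        bf.add("alice");
        bf.add("bob");
        System.out.println(bf.mightContain("alice")); // true: was added
    }
}
```

False positives are possible (a miss that probes only set bits), which is why bloom filters can only skip row groups, never confirm a match.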





[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296112#comment-14296112
 ] 

Szehon Ho commented on HIVE-9503:
-

Well, it's not overriding the meaning; the value means the same thing (size of 
small tables), except for the clause saying it depends on the first property. 
The name also makes sense, as we don't use a conditional task in Spark. So I 
think having a Spark-only property for the small-table size in mapjoin might be 
more confusing, as users would need to set both properties to get the same 
behavior across execution engines.


 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
 Tez, and only to MR (which has the legacy conditional mapjoin)
 However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though the description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These configs should be updated to reflect this case.





[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296140#comment-14296140
 ] 

Hive QA commented on HIVE-9431:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695096/HIVE-9431.02.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2557/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2557/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2557/

Messages:
{noformat}
 This message was trimmed, see log for full details 
As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:526:5: 
Decision can match input such as {AMPERSAND..BITWISEXOR, DIV..DIVIDE, 
EQUAL..EQUAL_NS, GREATERTHAN..GREATERTHANOREQUALTO, KW_AND, KW_ARRAY, 
KW_BETWEEN..KW_BOOLEAN, KW_CASE, KW_DOUBLE, KW_FLOAT, KW_IF, KW_IN, KW_INT, 
KW_LIKE, KW_MAP, KW_NOT, KW_OR, KW_REGEXP, KW_RLIKE, KW_SMALLINT, 
KW_STRING..KW_STRUCT, KW_TINYINT, KW_UNIONTYPE, KW_WHEN, 
LESSTHAN..LESSTHANOREQUALTO, MINUS..NOTEQUAL, PLUS, STAR, TILDE} using 
multiple alternatives: 1, 3

As a result, alternative(s) 3 were disabled for that input
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-exec ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-exec 
---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-exec ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-exec ---
[INFO] Compiling 2087 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/ql/target/classes
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,1]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,3]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,5]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,5]
  expected
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,17]
 ';' expected
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,23]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,24]
 ';' expected
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,1]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,3]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,5]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,7]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,17]
 ')' expected
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,23]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,24]
 ';' expected
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,1]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,4]
 illegal start of expression
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,7]
 illegal start of expression
[ERROR] 

Re: Review Request 30385: Use SASL to establish the remote context connection.

2015-01-28 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30385/#review70119
---



spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java
https://reviews.apache.org/r/30385/#comment115111

Nit: if you need to submit another patch, let's not auto reorg the imports.


- Xuefu Zhang


On Jan. 28, 2015, 11:22 p.m., Marcelo Vanzin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30385/
 ---
 
 (Updated Jan. 28, 2015, 11:22 p.m.)
 
 
 Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang.
 
 
 Bugs: HIVE-9487
 https://issues.apache.org/jira/browse/HIVE-9487
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Instead of the insecure, ad-hoc auth mechanism currently used, perform
 a SASL negotiation to establish trust. This requires the secret to be
 distributed through some secure channel (just like before).
 
 Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and
 probably wouldn't work well here) also allows us to add encryption
 without the need for SSL (yay?).
 
 Only DIGEST-MD5 has been really tested. Supporting other mechanisms
 will probably mean adding new callback handlers in the client and
 server portions, but shouldn't be hard if desired.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
 d4d98d7c0c28cdb1d19c700e20537ef405be2e01 
   spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java 
 ce2f9b6b132dc47f899798e47d18a1f6b0dd707f 
   
 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java
  3a7149341bac086e5efe931595143d3bebbdb5db 
   
 spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 
 5f9be658a855cc15c576f1a98376fcd85475e3b7 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java
  0c29c9441fb3e9daf690510a2c9b5716671e2571 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 
 2c858a121aaeca6af20f5e332de207694348a030 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java 
 fffe24b3cbe6a5d7387e751adbc65f5b140c9089 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java
  eff640f7b24348043dbce734510698d9294579c6 
   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 
 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c 
   
 spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java 
 PRE-CREATION 
   
 spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java
  af534375a3ed86a3a9ad57c2f21a9a8bf6113714 
   spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java 
 ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 
 
 Diff: https://reviews.apache.org/r/30385/diff/
 
 
 Testing
 ---
 
 Unit tests.
 
 
 Thanks,
 
 Marcelo Vanzin
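The DIGEST-MD5 negotiation described in the review above can be sketched with the JDK's javax.security.sasl API. This is an illustrative in-memory handshake with a made-up protocol name and shared secret, not the actual RSC patch; as the review notes, the secret must still be distributed through some secure channel.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslServer;

public class DigestMd5Demo {
    // Shared secret, assumed to be distributed out of band.
    private static final char[] SECRET = "shared-secret".toCharArray();

    private static CallbackHandler handler(String user) {
        return callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(user);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(SECRET);
                } else if (cb instanceof RealmCallback) {
                    RealmCallback rc = (RealmCallback) cb;
                    rc.setText(rc.getDefaultText());
                } else if (cb instanceof AuthorizeCallback) {
                    AuthorizeCallback ac = (AuthorizeCallback) cb;
                    ac.setAuthorized(ac.getAuthenticationID().equals(ac.getAuthorizationID()));
                }
            }
        };
    }

    /** Drives the challenge/response exchange in memory; true when both sides complete. */
    public static boolean negotiate() {
        try {
            Map<String, String> props = new HashMap<>();
            // Prefer confidentiality (encryption without SSL), fall back to auth-only.
            props.put(Sasl.QOP, "auth-conf,auth");

            SaslServer server = Sasl.createSaslServer(
                "DIGEST-MD5", "rsc", "localhost", props, handler("client"));
            SaslClient client = Sasl.createSaslClient(
                new String[] {"DIGEST-MD5"}, null, "rsc", "localhost", props, handler("client"));

            byte[] challenge = server.evaluateResponse(new byte[0]); // initial digest challenge
            for (int i = 0; i < 10 && (!client.isComplete() || !server.isComplete()); i++) {
                byte[] response = client.evaluateChallenge(challenge);
                if (server.isComplete()) break;
                challenge = server.evaluateResponse(response);
            }
            boolean ok = client.isComplete() && server.isComplete();
            client.dispose();
            server.dispose();
            return ok;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("negotiated: " + negotiate());
    }
}
```

Once the handshake completes, the negotiated QOP determines whether subsequent messages are wrapped (`SaslClient.wrap`/`unwrap`) for integrity or confidentiality.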
 




[jira] [Created] (HIVE-9504) [beeline] ZipException when using !scan

2015-01-28 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created HIVE-9504:
--

 Summary: [beeline] ZipException when using !scan
 Key: HIVE-9504
 URL: https://issues.apache.org/jira/browse/HIVE-9504
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 0.15.0


Notice this while mucking around:

{noformat}
0: jdbc:hive2://localhost:1/ !scan
java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.init(ZipFile.java:220)
at java.util.zip.ZipFile.init(ZipFile.java:150)
at java.util.jar.JarFile.init(JarFile.java:166)
at java.util.jar.JarFile.init(JarFile.java:130)
at 
org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128)
at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589)
at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579)
at org.apache.hive.beeline.Commands.scan(Commands.java:278)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935)
at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740)
at 
org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}
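A minimal defensive pattern for this kind of failure (illustrative only, not Beeline's actual patch): treat ZipException as "skip this file" while scanning driver candidates, instead of letting it abort the whole scan.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarFile;
import java.util.zip.ZipException;

public class SafeJarScan {
    /** True iff the file opens as a valid jar/zip; corrupt or non-zip files are skipped. */
    public static boolean isReadableJar(Path p) {
        try (JarFile jf = new JarFile(p.toFile())) {
            return true;
        } catch (ZipException e) {
            return false; // "error in opening zip file" -- not a real archive, skip it
        } catch (IOException e) {
            return false; // unreadable for other reasons -- also skip
        }
    }

    /** Writes a bogus .jar (plain text) and checks it is skipped rather than fatal. */
    public static boolean bogusJarIsSkipped() {
        try {
            Path bogus = Files.createTempFile("not-a-jar", ".jar");
            try {
                Files.write(bogus, "plain text, not a zip archive".getBytes());
                return !isReadableJar(bogus);
            } finally {
                Files.deleteIfExists(bogus);
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("bogus jar skipped: " + bogusJarIsSkipped()); // true
    }
}
```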





Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Thejas Nair


 On Jan. 28, 2015, 8:40 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java, line 1404
  https://reviews.apache.org/r/29900/diff/2/?file=825431#file825431line1404
 
  I have seen at least one other place where we have a test timestamp 
  getting injected. It might make sense to use some kind of getTimestamp class 
  that can be customized to give a specific timestamp. 
  But this does not have to be addressed in this jira.
 
 Jason Dere wrote:
 Where does this occur?

In 'show grants', it shows the timestamp when the grant was made; there is 
code to return -1 for the timestamp in test mode.


 On Jan. 28, 2015, 8:40 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java,
   line 32
  https://reviews.apache.org/r/29900/diff/2/?file=825432#file825432line32
 
  looks like we can consider this to be deterministic, since the value 
  does not change within a query.
 
 Jason Dere wrote:
 ok, will change. If we ever have a new descriptor to specify 
 deterministic within the same query but different for different queries, this 
 would fit the description. Would be needed for stuff like determining 
 suitability of queries for materialized views.

good point about materialized views


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/#review70062
---


On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29900/
 ---
 
 (Updated Jan. 28, 2015, 11:22 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-5472
 https://issues.apache.org/jira/browse/HIVE-5472
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
 from the SessionState.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
   ql/src/test/results/clientpositive/current_date_timestamp.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 36c8743 
 
 Diff: https://reviews.apache.org/r/29900/diff/
 
 
 Testing
 ---
 
 qfile test added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan

2015-01-28 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-9504:
---
Attachment: HIVE-9504.00.patch

 [beeline] ZipException when using !scan
 ---

 Key: HIVE-9504
 URL: https://issues.apache.org/jira/browse/HIVE-9504
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 0.15.0

 Attachments: HIVE-9504.00.patch


 Notice this while mucking around:
 {noformat}
 0: jdbc:hive2://localhost:1/ !scan
 java.util.zip.ZipException: error in opening zip file
 at java.util.zip.ZipFile.open(Native Method)
 at java.util.zip.ZipFile.init(ZipFile.java:220)
 at java.util.zip.ZipFile.init(ZipFile.java:150)
 at java.util.jar.JarFile.init(JarFile.java:166)
 at java.util.jar.JarFile.init(JarFile.java:130)
 at 
 org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589)
 at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579)
 at org.apache.hive.beeline.Commands.scan(Commands.java:278)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at 
 org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935)
 at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740)
 at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}





[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9431:
--
Attachment: HIVE-9431.03.patch

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, 
 HIVE-9431.03.patch, HIVE-9431.patch








[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296154#comment-14296154
 ] 

Xuefu Zhang commented on HIVE-9503:
---

I'm not sure of the difference between backup task and the conditional task 
that these two properties are referring to, but I don't feel we need a property 
to control whether to have a backup task. As long as we auto converted a join, 
we should have a backup task.

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
 Tez, and only to MR (which has the legacy conditional mapjoin)
 However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though the description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These configs should be updated to reflect this case.





[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9431:
--
Status: Open  (was: Patch Available)

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, 
 HIVE-9431.03.patch, HIVE-9431.patch








[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext

2015-01-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9431:
--
Status: Patch Available  (was: Open)

 CBO (Calcite Return Path): Removing AST from ParseContext
 -

 Key: HIVE-9431
 URL: https://issues.apache.org/jira/browse/HIVE-9431
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.15.0

 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, 
 HIVE-9431.03.patch, HIVE-9431.patch








[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296175#comment-14296175
 ] 

Chao commented on HIVE-9503:


In MR, both a conditional task AND a backup task are used, but for us only the 
backup task is needed, since no decision needs to be made (there is only one 
mapjoin task). If we always use a backup task for auto-converted joins, it will 
add overhead to plan compilation, because generating a backup task requires 
cloning the whole operator tree.

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or 
 Tez, and only to MR (which has the legacy conditional mapjoin)
 However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though the description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These configs should be updated to reflect this case.





[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml

2015-01-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296193#comment-14296193
 ] 

Thejas M Nair commented on HIVE-8807:
-

+1

 Obsolete default values in webhcat-default.xml
 --

 Key: HIVE-8807
 URL: https://issues.apache.org/jira/browse/HIVE-8807
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Lefty Leverenz
Assignee: Eugene Koifman
 Fix For: 1.0.0

 Attachments: HIVE8807.patch


 The defaults for templeton.pig.path and templeton.hive.path are 0.11 in 
 webhcat-default.xml, but they ought to match current release numbers.
 The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml).
 no precommit tests





[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]

2015-01-28 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin updated HIVE-9487:
-
Attachment: HIVE-9487.1-spark.patch

 Make Remote Spark Context secure [Spark Branch]
 ---

 Key: HIVE-9487
 URL: https://issues.apache.org/jira/browse/HIVE-9487
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Marcelo Vanzin
Assignee: Marcelo Vanzin
 Attachments: HIVE-9487.1-spark.patch


 The RSC currently uses an ad-hoc, insecure authentication mechanism. We 
 should instead use a proper auth mechanism and add encryption to the mix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference

2015-01-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296057#comment-14296057
 ] 

Szehon Ho commented on HIVE-9468:
-

One more: udaf_covar_pop.q

{noformat}
Running: diff -a 
/home/hiveptest/54.145.215.245-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_covar_pop.q.out
 
/home/hiveptest/54.145.215.245-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_covar_pop.q.out
91c91
< 3.625
---
> 3.624
{noformat}
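The one-ulp disagreements in these diffs are characteristic of floating-point reductions evaluated in different orders: double addition is not associative, so an engine that sums partitions in a different order (as Spark and MR may) can legitimately differ in the last digit. A minimal, self-contained illustration:

```java
public class FpOrder {
    public static void main(String[] args) {
        // Double addition is not associative: grouping changes the last bits.
        double leftToRight = (0.1 + 0.2) + 0.3; // 0.6000000000000001
        double rightToLeft = 0.1 + (0.2 + 0.3); // 0.6
        System.out.println(leftToRight == rightToLeft); // false
        // A distributed engine that reduces partial sums in a different
        // order can therefore produce results that differ in the final
        // decimal digit, as in the q.out diffs above.
    }
}
```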

 Test groupby3_map_skew.q fails due to decimal precision difference
 --

 Key: HIVE-9468
 URL: https://issues.apache.org/jira/browse/HIVE-9468
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Xuefu Zhang

 From test run, 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
  
 {code}
 Running: diff -a 
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out
  
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
 162c162
 < 130091.0  260.182  256.10355987055016  98.0  0.0  142.92680950752379  143.06995106518903  20428.07288  20469.0109
 ---
 > 130091.0  260.182  256.10355987055016  98.0  0.0  142.9268095075238  143.06995106518906  20428.07288  20469.0109
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9474) truncate table changes permissions on the target

2015-01-28 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-9474:

   Resolution: Fixed
Fix Version/s: (was: 0.15.0)
   1.2.0
   Status: Resolved  (was: Patch Available)

Committed to trunk, thanks Aihua!

 truncate table changes permissions on the target
 

 Key: HIVE-9474
 URL: https://issues.apache.org/jira/browse/HIVE-9474
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor
 Fix For: 1.2.0

 Attachments: HIVE-9474.1.patch, HIVE-9474.2.patch, HIVE-9474.3.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 Create a table test(a string); 
 hive> create table test(key string);
 Change the /user/hive/warehouse/test permission to something else other than 
 the default, like 777.
 hive> dfs -chmod 777 /user/hive/warehouse/test;
 hive> dfs -ls -d /user/hive/warehouse/test;
 drwxrwxrwx   - axu wheel 68 2015-01-26 18:45 /user/hive/warehouse/test
 Then truncate table test: 
 hive> truncate table test;
 The permission goes back to the default.
 hive> dfs -ls -d /user/hive/warehouse/test;
 drwxr-xr-x   - axu wheel 68 2015-01-27 10:09 /user/hive/warehouse/test
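The underlying pattern for a fix — capture the directory's permissions before the data directory is recreated, restore them afterwards — can be sketched with java.nio on a local POSIX filesystem. Hive itself would go through Hadoop's FileSystem API (getFileStatus/setPermission), so this is only an illustrative stand-in with made-up names:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TruncatePreservesPerms {
    // Recreate a directory while preserving its permissions, mirroring the
    // capture-then-restore pattern a truncate implementation needs.
    static void truncateDir(Path dir) throws IOException {
        Set<PosixFilePermission> saved = Files.getPosixFilePermissions(dir);
        // Naive truncate: drop and recreate the directory, which would
        // otherwise reset permissions to the process umask default...
        Files.delete(dir);
        Files.createDirectory(dir);
        // ...then restore what was captured before the delete.
        Files.setPosixFilePermissions(dir, saved);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("trunc-demo");
        Files.setPosixFilePermissions(dir,
                PosixFilePermissions.fromString("rwxrwxrwx"));
        truncateDir(dir);
        System.out.println(PosixFilePermissions.toString(
                Files.getPosixFilePermissions(dir))); // rwxrwxrwx
        Files.delete(dir);
    }
}
```

This assumes a POSIX filesystem; on non-POSIX platforms the permission calls throw UnsupportedOperationException.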



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296096#comment-14296096
 ] 

Sushanth Sowmyan commented on HIVE-9436:


The test failures now reported are down to 3, and they're unconnected to the 
issue being fixed here, so I will go ahead and commit this.

 RetryingMetaStoreClient does not retry JDOExceptions
 

 Key: HIVE-9436
 URL: https://issues.apache.org/jira/browse/HIVE-9436
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch


 RetryingMetaStoreClient has a bug in the following bit of code:
 {code}
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
   caughtException = (MetaException) e.getCause();
 } else {
   throw e.getCause();
 }
 {code}
 The bug here is that Java's String.matches matches the entire string against 
 the regex, so the match will fail if the message contains anything before 
 or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
 should match (?s).\*JDO[a-zA-Z]\*Exception.\* instead.
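 The anchoring behavior described above is easy to demonstrate; in this sketch 
 the message text is made up, but the two patterns are the ones quoted in the 
 description:

```java
public class MatchesIsAnchored {
    public static void main(String[] args) {
        // A hypothetical wrapped-exception message; the real metastore
        // text would similarly carry a prefix around the JDO class name.
        String msg = "MetaException(message:JDODataStoreException: timeout)";

        // String.matches() implicitly anchors the pattern to the whole
        // string, so the bare pattern from the buggy code never matches
        // a message with surrounding text:
        System.out.println(msg.matches("JDO[a-zA-Z]*Exception"));         // false

        // The proposed fix wraps the pattern in (?s).* ... .* so it can
        // match anywhere in the message (DOTALL handles embedded newlines):
        System.out.println(msg.matches("(?s).*JDO[a-zA-Z]*Exception.*")); // true
    }
}
```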



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296098#comment-14296098
 ] 

Xuefu Zhang commented on HIVE-9503:
---

If hive.auto.convert.join.noconditionaltask.size is used by Spark regardless of 
hive.auto.convert.join.noconditionaltask, we should probably have a different 
property. Reusing the same property while overwriting its meaning could confuse 
both existing and new users.

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 The 'hive.auto.convert.join.noconditionaltask' flag applies only to MR (which 
 has the legacy conditional mapjoin), not to Spark or Tez.
 However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though its description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These config descriptions should be updated to reflect this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions

2015-01-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296104#comment-14296104
 ] 

Sushanth Sowmyan commented on HIVE-9436:


[~vikram.dixit] - if you do respin an RC for 1.0, it would be useful to have 
this fix in as well - it's a simple change that fixes retries from the client 
and improves robustness. It isn't, however, a breaking bug as I see it, since 
it only matters when there are connection issues, where we give the client a 
chance to retry instead of failing directly.

 RetryingMetaStoreClient does not retry JDOExceptions
 

 Key: HIVE-9436
 URL: https://issues.apache.org/jira/browse/HIVE-9436
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.2.0

 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch


 RetryingMetaStoreClient has a bug in the following bit of code:
 {code}
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
   caughtException = (MetaException) e.getCause();
 } else {
   throw e.getCause();
 }
 {code}
 The bug here is that Java's String.matches matches the entire string against 
 the regex, so the match will fail if the message contains anything before 
 or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we 
 should match (?s).\*JDO[a-zA-Z]\*Exception.\* instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296105#comment-14296105
 ] 

Xuefu Zhang commented on HIVE-9468:
---

Yet another one: udaf_covar_samp.q
{code}
Running: diff -a 
/home/hiveptest/50.18.32.237-hiveptest-0/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_covar_samp.q.out
 
/home/hiveptest/50.18.32.237-hiveptest-0/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_covar_samp.q.out
91c91
< 4.833
---
> 4.832
{code}

 Test groupby3_map_skew.q fails due to decimal precision difference
 --

 Key: HIVE-9468
 URL: https://issues.apache.org/jira/browse/HIVE-9468
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Xuefu Zhang

 From test run, 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
  
 {code}
 Running: diff -a 
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out
  
 /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
 162c162
 < 130091.0  260.182  256.10355987055016  98.0  0.0  142.92680950752379  143.06995106518903  20428.07288  20469.0109
 ---
 > 130091.0  260.182  256.10355987055016  98.0  0.0  142.9268095075238  143.06995106518906  20428.07288  20469.0109
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9302) Beeline add commands to register local jdbc driver names and jars

2015-01-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296117#comment-14296117
 ] 

Ferdinand Xu commented on HIVE-9302:


Thanks [~thejas] for your update!

 Beeline add commands to register local jdbc driver names and jars
 -

 Key: HIVE-9302
 URL: https://issues.apache.org/jira/browse/HIVE-9302
 Project: Hive
  Issue Type: New Feature
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, 
 HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, 
 postgresql-9.3.jdbc3.jar


 At present, if a Beeline user uses {{add jar}}, the path they give is actually 
 on the HS2 server. It'd be great to allow Beeline users to add local JDBC 
 driver jars and register custom JDBC driver names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification

2015-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296133#comment-14296133
 ] 

Hive QA commented on HIVE-9501:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695060/HIVE-9501.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7403 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2556/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2556/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2556/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695060 - PreCommit-HIVE-TRUNK-Build

 DbNotificationListener doesn't include dbname in create database notification 
 and does not include tablename in create table notification
 -

 Key: HIVE-9501
 URL: https://issues.apache.org/jira/browse/HIVE-9501
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-9501.patch


 This is a holdover from the JMS implementation, where create database is sent 
 on the general topic and create table on the db topic.  But since 
 DbNotificationListener isn't for JMS, keeping this semantic doesn't make 
 sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296141#comment-14296141
 ] 

Chao commented on HIVE-9503:


For the backup task (HIVE-9103), I'm thinking about reusing 
hive.auto.convert.join.noconditionaltask to specify whether a backup task is 
needed.
This is slightly misleading, but we can add some description to the property. 
Thoughts?

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 The 'hive.auto.convert.join.noconditionaltask' flag applies only to MR (which 
 has the legacy conditional mapjoin), not to Spark or Tez.
 However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though its description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These config descriptions should be updated to reflect this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/#review70128
---

Ship it!


Ship It!

- Thejas Nair


On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29900/
 ---
 
 (Updated Jan. 28, 2015, 11:22 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-5472
 https://issues.apache.org/jira/browse/HIVE-5472
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
 from the SessionState.
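  A minimal sketch of the design point here — capturing the timestamp once per 
  session/query so every UDF invocation sees the same value — with 
  illustrative (non-Hive) class names:

```java
import java.sql.Timestamp;

public class SessionClock {
    // Captured once, analogous to the value the patch stores in
    // SessionState, so every UDF call in the query sees the same instant.
    private final Timestamp queryCurrentTimestamp;

    SessionClock() {
        this.queryCurrentTimestamp = new Timestamp(System.currentTimeMillis());
    }

    // What a current_timestamp() UDF would return: the frozen value,
    // not a fresh read of the wall clock.
    Timestamp currentTimestamp() {
        return queryCurrentTimestamp;
    }

    public static void main(String[] args) throws InterruptedException {
        SessionClock session = new SessionClock();
        Timestamp first = session.currentTimestamp();
        Thread.sleep(50); // rows processed later in the same query...
        Timestamp later = session.currentTimestamp();
        System.out.println(first.equals(later)); // true: stable within the query
    }
}
```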
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
   ql/src/test/results/clientpositive/current_date_timestamp.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 36c8743 
 
 Diff: https://reviews.apache.org/r/29900/diff/
 
 
 Testing
 ---
 
 qfile test added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Updated] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called

2015-01-28 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-9473:

Status: Patch Available  (was: Open)

 sql std auth should disallow built-in udfs that allow any java methods to be 
 called
 ---

 Key: HIVE-9473
 URL: https://issues.apache.org/jira/browse/HIVE-9473
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-9473.1.patch


 As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java 
 methods. This should be disallowed when sql standard authorization is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-28 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9103:
---
Status: Patch Available  (was: Open)

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch


 In MR, a backup task can be executed if the original task, which probably 
 contains a certain (join) optimization, fails. This JIRA is to track this 
 topic for Spark. We need to determine whether we need this and implement it 
 if necessary. This is a follow-up of HIVE-9099.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]

2015-01-28 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9103:
---
Attachment: HIVE-9103-1.spark.patch

 Support backup task for join related optimization [Spark Branch]
 

 Key: HIVE-9103
 URL: https://issues.apache.org/jira/browse/HIVE-9103
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chao
Priority: Blocker
 Attachments: HIVE-9103-1.spark.patch


 In MR, a backup task can be executed if the original task, which probably 
 contains a certain (join) optimization, fails. This JIRA is to track this 
 topic for Spark. We need to determine whether we need this and implement it 
 if necessary. This is a follow-up of HIVE-9099.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp

2015-01-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296168#comment-14296168
 ] 

Thejas M Nair commented on HIVE-5472:
-

+1

 support a simple scalar which returns the current timestamp
 ---

 Key: HIVE-5472
 URL: https://issues.apache.org/jira/browse/HIVE-5472
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.11.0
Reporter: N Campbell
Assignee: Jason Dere
 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch


 ISO SQL has two forms of these functions, LOCALTIMESTAMP and 
 CURRENT_TIMESTAMP, where the former is a TIMESTAMP WITHOUT TIME ZONE 
 and the latter WITH TIME ZONE.
 select cast ( unix_timestamp() as timestamp ) from T
 Implement a function which computes LOCALTIMESTAMP, which would be the 
 current timestamp for the user's session time zone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions

2015-01-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296178#comment-14296178
 ] 

Xuefu Zhang commented on HIVE-9503:
---

Yeah, Hive has long since reached a point where the properties are confusing 
and sometimes contradictory and duplicated. These two properties plus 
hive.auto.convert.join are an example. The two properties are meant to be used 
together; ignoring one while honoring the other doesn't seem to be a clean 
solution. While it's already a legacy for MR and Tez, I'd like to have a 
cleaner solution for Spark since we still have the chance. 

 Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
 

 Key: HIVE-9503
 URL: https://issues.apache.org/jira/browse/HIVE-9503
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Szehon Ho
Priority: Minor

 The 'hive.auto.convert.join.noconditionaltask' flag applies only to MR (which 
 has the legacy conditional mapjoin), not to Spark or Tez.
 However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to 
 Spark, Tez, and MR, even though its description indicates it only applies if 
 the above flag is on, which is true only for MR.
 These config descriptions should be updated to reflect this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]

2015-01-28 Thread Chao Sun

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30388/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-9103
https://issues.apache.org/jira/browse/HIVE-9103


Repository: hive-git


Description
---

This patch adds a backup task to the map join task. The backup task, which 
uses a common join, will be triggered in case the map join task fails.

Note that, no matter how many map joins there are in the SparkTask, we will 
only generate one backup task. This means that if the original task fails at 
the very last map join, the whole task will be re-executed.

The handling of the backup task is a little different from what MR does, mostly 
because we convert JOIN to MAPJOIN during the operator plan optimization phase, 
at which time no task/work exists yet. In the patch, we clone the whole 
operator tree before the JOIN operator is converted. The cloned tree is then 
processed to generate a separate work tree for a separate backup SparkTask.
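The failover described here is the classic primary/backup task pattern; a 
generic sketch (names are illustrative, not from the patch):

```java
import java.util.concurrent.Callable;

public class BackupTaskRunner {
    // Run the optimized task; if it throws, fall back to the backup plan.
    // Mirrors, in miniature, how a failed map-join SparkTask would hand
    // off to the cloned common-join task described above.
    static <T> T runWithBackup(Callable<T> primary, Callable<T> backup) throws Exception {
        try {
            return primary.call();
        } catch (Exception primaryFailure) {
            // The whole backup plan re-executes, even if the primary
            // failed only at its last join -- matching the note above
            // that a single backup task covers all map joins.
            return backup.call();
        }
    }

    public static void main(String[] args) throws Exception {
        String result = runWithBackup(
                () -> { throw new RuntimeException("map join exceeded memory"); },
                () -> "rows from common join");
        System.out.println(result); // rows from common join
    }
}
```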


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java
 69004dc 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java
 79c3e02 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java 
d57ceff 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java
 9ff47c7 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java
 6e0ac38 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
773cfbd 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java 
f7586a4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 
  ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a 

Diff: https://reviews.apache.org/r/30388/diff/


Testing
---

auto_join25.q


Thanks,

Chao Sun



Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp

2015-01-28 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29900/#review70130
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java
https://reviews.apache.org/r/29900/#comment115137

    You can remove the '= null'; object fields in Java are null by default.


- Alexander Pivovarov


On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29900/
 ---
 
 (Updated Jan. 28, 2015, 11:22 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-5472
 https://issues.apache.org/jira/browse/HIVE-5472
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add current_date/current_timestamp. The UDFs get the current_date/timestamp 
 from the SessionState.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca 
   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java 
 PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION 
   ql/src/test/results/clientpositive/current_date_timestamp.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/show_functions.q.out 36c8743 
 
 Diff: https://reviews.apache.org/r/29900/diff/
 
 
 Testing
 ---
 
 qfile test added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-28 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Attachment: (was: HIVE-9471.1.patch)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: data.txt, orc_bad_seek_failure_case.hive, 
 orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

2015-01-28 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-9471:
---
Status: Open  (was: Patch Available)

 Bad seek in uncompressed ORC, at row-group boundary.
 

 Key: HIVE-9471
 URL: https://issues.apache.org/jira/browse/HIVE-9471
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9471.1.patch, data.txt, 
 orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive


 Under at least one specific condition, using index-filters in ORC causes a 
 bad seek into the ORC row-group.
 {code:title=stacktrace}
 java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for 
 column 2 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
   at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
 ...
 Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 
 kind DATA to 0 is outside of the data
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
   at 
 org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
   at 
 org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
   at 
 org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
 {code}
 I'll attach the script to reproduce the problem herewith.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9496) Sl4j warning in hive command

2015-01-28 Thread Philippe Kernevez (JIRA)
Philippe Kernevez created HIVE-9496:
---

 Summary: Sl4j warning in hive command
 Key: HIVE-9496
 URL: https://issues.apache.org/jira/browse/HIVE-9496
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
 Environment: HDP 2.2.0 on CentOS.
With Horton Sand Box and my own cluster.
Reporter: Philippe Kernevez
Priority: Minor


Each time the 'hive' command is run, we get an SLF4J warning about multiple 
jars containing SLF4J classes.

This bug is similar to HIVE-6162, but doesn't seem to be solved.

Logging initialized using configuration in 
file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.2.0.0-1084/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.2.0.0-1084/hive/lib/hive-jdbc-0.14.0.2.2.0.0-1084-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6308) COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.

2015-01-28 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295305#comment-14295305
 ] 

Yongzhi Chen commented on HIVE-6308:


Thank you Szehon!

This fix treats creating Avro tables without column definitions in Hive the 
same as creating a table with all column definitions. 
It does not address Avro tables of this kind created before the fix.

Tested with the Hive command: analyze table compute statistics for column. 

 COLUMNS_V2 Metastore table not populated for tables created without an 
 explicit column list.
 

 Key: HIVE-6308
 URL: https://issues.apache.org/jira/browse/HIVE-6308
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.10.0
Reporter: Alexander Behm
Assignee: Yongzhi Chen
 Fix For: 1.2.0

 Attachments: HIVE-6308.1.patch


 Consider this example table:
 CREATE TABLE avro_test
 ROW FORMAT SERDE
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
 STORED as INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
 OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
 TBLPROPERTIES (
 'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc');
 When I try to run an ANALYZE TABLE for computing column stats on any of the 
 columns, then I get:
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 NoSuchObjectException(message:Column o_orderpriority for which stats 
 gathering is requested doesn't exist.)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.updateTableColumnStatistics(Hive.java:2280)
 at 
 org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:331)
 at 
 org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
 The root cause appears to be that the COLUMNS_V2 table in the Metastore isn't 
 populated properly during the table creation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds

2015-01-28 Thread Sergio Pena


 On Jan. 28, 2015, 5:23 a.m., cheng xu wrote:
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java,
   lines 218-225
  https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line218
 
   How about the following code snippet?
   recordConsumer.startField(fieldName, i);
   if (i % 2 == 0) {
 writeValue(keyElement, keyInspector, fieldType);
   } else {
 writeValue(valueElement, valueInspector, fieldType);
   }
   recordConsumer.endField(fieldName, i);

The Parquet API does not accept NULL values inside startField/endField. That is 
why I had to check whether the key or value is null before starting the field. 
Alternatively, in the change I made, we check for null values everywhere and then 
call startField/endField in writePrimitive. See the 
TestDataWritableWriter.testMapType() method for how null values should be handled. 

This is how Parquet adds the map entry 'key3 = null':

startGroup();
  startField("key", 0);
addString("key3");
  endField("key", 0);
endGroup();
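The rule described above can be sketched with a stubbed consumer. The `Consumer` interface below is a hypothetical stand-in for Parquet's `RecordConsumer` (only the call-ordering rule is modeled, not the real API): for a map entry with a null value, the key field is emitted normally and the value field is never opened with startField/endField.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Self-contained sketch of the null-handling rule: a field is never opened
// with startField/endField when its value is null.
public class NullSafeMapWriter {

    // Hypothetical stand-in for Parquet's RecordConsumer.
    interface Consumer {
        void startField(String name, int index);
        void endField(String name, int index);
        void addString(String value);
    }

    // Writes one map entry per key; the value field is skipped when null.
    static void writeMap(Map<String, String> map, Consumer c) {
        for (Map.Entry<String, String> e : map.entrySet()) {
            c.startField("key", 0);
            c.addString(e.getKey());
            c.endField("key", 0);
            if (e.getValue() != null) {   // null check BEFORE opening the field
                c.startField("value", 1);
                c.addString(e.getValue());
                c.endField("value", 1);
            }
        }
    }

    // Records the emitted call sequence so it can be inspected.
    static List<String> record(Map<String, String> map) {
        final List<String> calls = new ArrayList<>();
        writeMap(map, new Consumer() {
            public void startField(String n, int i) { calls.add("start:" + n); }
            public void endField(String n, int i)   { calls.add("end:" + n); }
            public void addString(String v)         { calls.add("add:" + v); }
        });
        return calls;
    }

    public static void main(String[] args) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("key3", null);            // the 'key3 = null' entry from above
        // Only the key field is emitted; the value field is never opened.
        System.out.println(record(m));  // prints [start:key, add:key3, end:key]
    }
}
```

This mirrors why the patch moves the null checks ahead of the startField calls: opening a field and handing it a null is rejected by the consumer, so the field must simply not be opened.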


 On Jan. 28, 2015, 5:23 a.m., cheng xu wrote:
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java,
   line 76
  https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line76
 
   Hi Sergio, I am a little confused about the purpose of pushing 
   startField/endField down. As the method name writeGroupFields indicates, 
   it writes the fields of a group one by one. My suggestion is to move 
   these two lines back. If I missed anything, please tell me your reasoning 
   behind this change.

See the comment regarding the writeMap() method. We can go back to the original 
implementation to make it look better, but writeMap() will not look very clean. 
The thing is that we cannot add null values inside startField/endField.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30281/#review69935
---


On Jan. 27, 2015, 6:47 p.m., Sergio Pena wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30281/
 ---
 
 (Updated Jan. 27, 2015, 6:47 p.m.)
 
 
 Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
 
 
 Bugs: HIVE-9333
 https://issues.apache.org/jira/browse/HIVE-9333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch moves the ParquetHiveSerDe.serialize() implementation to 
 DataWritableWriter class in order to save time in materializing data on 
 serialize().
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
  ea4109d358f7c48d1e2042e5da299475de4a0a29 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java
  060b1b722d32f3b2f88304a1a73eb249e150294b 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java
  41b5f1c3b0ab43f734f8a211e3e03d5060c75434 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
  e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 
 a693aff18516d133abf0aae4847d3fe00b9f1c96 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java
  667d3671547190d363107019cd9a2d105d26d336 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 007a665529857bcec612f638a157aa5043562a15 
   serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/30281/diff/
 
 
 Testing
 ---
 
 The tests run were the following:
 
 1. JMH (Java microbenchmark)
 
 This benchmark called parquet serialize/write methods using text writable 
 objects. 
 
  Class.method                Before Change (ops/s)  After Change (ops/s)
  --------------------------  ---------------------  --------------------
  ParquetHiveSerDe.serialize  19,113                 249,528  (19x speed increase)
  DataWritableWriter.write    5,033                  5,201    (3.34% speed increase)
 
 
 2. Write 20 million rows (~1GB file) from Text to Parquet
 
  I wrote a ~1GB file in TextFile format, then converted it to Parquet format using 
  the following statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;
 
 Time (s) it took to write the whole file BEFORE changes: 93.758 s
 Time (s) it took to write the whole file AFTER changes: 83.903 s
 
  That is about a 10% speed increase.
 
 
 Thanks,
 
 Sergio Pena
 



