[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2014-10-06 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160033#comment-14160033
 ] 

Navis commented on HIVE-8319:
-

[~ashutoshc] This is trivial enough but enables various extensions to 
hiveserver2. Could you review this?

 Add configuration for custom services in hiveserver2
 

 Key: HIVE-8319
 URL: https://issues.apache.org/jira/browse/HIVE-8319
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8319.1.patch.txt


 NO PRECOMMIT TESTS
 Register services with hiveserver2, for example: 
 {noformat}
 <property>
   <name>hive.server2.service.classes</name>
   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
 </property>
 <property>
   <name>azkaban.ssl.port</name>
   <name>...</name>
 </property>
 {noformat}
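A class listed in {{hive.server2.service.classes}} would presumably have to follow HiveServer2's service lifecycle. As a rough illustration only: the interface and class below are simplified stand-ins, not the actual org.apache.hive.service API, and the parsing helper is an assumption about how the comma-separated value would be split.

```java
// Hypothetical sketch of a HiveServer2 add-on service. The SimpleService
// interface and the class names below are simplified stand-ins, NOT the
// real org.apache.hive.service.Service API.
public class CustomServiceSketch {

    /** Minimal lifecycle a pluggable service would presumably follow. */
    public interface SimpleService {
        void init();   // read configuration, allocate resources
        void start();  // begin serving
        void stop();   // shut down cleanly
        String getName();
    }

    /** Illustrative add-on, analogous to the HiveStatus class in the config. */
    public static class HiveStatusService implements SimpleService {
        private boolean running = false;

        @Override public void init() { /* would read configuration here */ }
        @Override public void start() { running = true; }
        @Override public void stop() { running = false; }
        @Override public String getName() { return "HiveStatus"; }

        public boolean isRunning() { return running; }
    }

    /** Split the comma-separated config value into class names. */
    public static String[] parseServiceClasses(String configValue) {
        return configValue.trim().split("\\s*,\\s*");
    }
}
```

HiveServer2 would then presumably instantiate each named class reflectively and drive it through init/start/stop alongside its built-in services.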



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160055#comment-14160055
 ] 

Hive QA commented on HIVE-8352:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673055/HIVE-8352.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6739 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-196/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673055

 Enable windowing.q for spark
 

 Key: HIVE-8352
 URL: https://issues.apache.org/jira/browse/HIVE-8352
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
 hive-8385.patch


 We should enable windowing.q for basic windowing coverage. After checking out 
 the spark branch, we would build:
 {noformat}
 $ mvn clean install -DskipTests -Phadoop-2
 $ cd itests/
 $ mvn clean install -DskipTests -Phadoop-2
 {noformat}
 Then generate the windowing.q.out file:
 {noformat}
 $ cd qtest-spark/
 $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
 -Dtest.output.overwrite=true
 {noformat}
 Compare the output against MapReduce:
 {noformat}
 $ diff -y -W 150 
 ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
 ../../ql/src/test/results/clientpositive/windowing.q.out | less
 {noformat}
 And if everything looks good, add it to {{spark.query.files}} in 
 {{./itests/src/test/resources/testconfiguration.properties}}, 
 then submit the patch including the .q file.





[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160067#comment-14160067
 ] 

Hive QA commented on HIVE-7205:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673048/HIVE-7205.4.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6525 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1128/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673048

 Wrong results when union all of grouping followed by group by with 
 correlation optimization
 ---

 Key: HIVE-7205
 URL: https://issues.apache.org/jira/browse/HIVE-7205
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: dima machlin
Assignee: Navis
Priority: Critical
 Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, 
 HIVE-7205.3.patch.txt, HIVE-7205.4.patch.txt


 Use case: table TBL (a string, b string) contains a single row: 'a','a'.
 The following query:
 {code:sql}
 select b, sum(cc) from (
 select b,count(1) as cc from TBL group by b
 union all
 select a as b,count(1) as cc from TBL group by a
 ) z
 group by b
 {code}
 returns
 a 1
 a 1
 when hive.optimize.correlation=true.
 If we set hive.optimize.correlation=false,
 it returns the correct result: a 2.
 The plan with correlation optimization:
 {code:sql}
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM 
 (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
 TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR 
 (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY 
 (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION 
 (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) 
 (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL 
 a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
 (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum 
 (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias - Map Operator Tree:
 null-subquery1:z-subquery1:TBL 
   TableScan
 alias: TBL
 Select Operator
   expressions:
 expr: b
 type: string
   outputColumnNames: b
   Group By Operator
 aggregations:
   expr: count(1)
 bucketGroup: false
 keys:
   expr: b
   type: string
 mode: hash
 outputColumnNames: _col0, _col1
 Reduce Output Operator
   key expressions:
 expr: _col0
 type: string
   sort order: +
   Map-reduce partition columns:
 expr: _col0
 type: string
   tag: 0
   value expressions:
 expr: _col1
 type: bigint
 null-subquery2:z-subquery2:TBL 
   TableScan
 alias: TBL
 Select Operator
   expressions:
 expr: a
 type: string
   outputColumnNames: a
   Group By Operator
 aggregations:
   expr: count(1)
 bucketGroup: false
 keys:
   expr: a
   type: string
 mode: hash
 outputColumnNames: _col0, _col1
 Reduce Output Operator
   key 

[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160068#comment-14160068
 ] 

Vaibhav Gumashta commented on HIVE-8193:


[~thejas] None of the failures are related. Thanks!

 Hook HiveServer2 dynamic service discovery with session time out
 

 Key: HIVE-8193
 URL: https://issues.apache.org/jira/browse/HIVE-8193
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8193.1.patch


 For dynamic service discovery, if the HiveServer2 instance is removed from 
 ZooKeeper, currently, on the last client close, the server shuts down. 
 However, we need to ensure that this also happens when a session is closed on 
 timeout and no current sessions exist on this instance of HiveServer2.





[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160070#comment-14160070
 ] 

Vaibhav Gumashta commented on HIVE-8172:


cc [~thejas]

 HiveServer2 dynamic service discovery should let the JDBC client use default 
 ZooKeeper namespace
 

 Key: HIVE-8172
 URL: https://issues.apache.org/jira/browse/HIVE-8172
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Critical
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8172.1.patch


 Currently the client provides a URL like:
  
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
  
 When the zooKeeperNamespace param is not provided, the default value should be used.
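The intended behavior can be sketched as a fallback applied while parsing the session-variable tail of the URL. The default value "hiveserver2" and the parsing below are illustrative assumptions, not the actual Hive JDBC driver code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the intended fallback: when the JDBC URL's session-variable part
// omits zooKeeperNamespace, fall back to a default. The default value
// "hiveserver2" and this parsing are assumptions for illustration only.
public class ZkNamespaceSketch {
    public static final String DEFAULT_NAMESPACE = "hiveserver2"; // assumed default

    /** Parse ';'-separated key=value pairs from the URL's tail. */
    public static Map<String, String> parseSessionVars(String tail) {
        Map<String, String> vars = new HashMap<>();
        for (String part : tail.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                vars.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return vars;
    }

    /** Return the explicit namespace if given, else the default. */
    public static String zkNamespace(String urlTail) {
        return parseSessionVars(urlTail)
                .getOrDefault("zooKeeperNamespace", DEFAULT_NAMESPACE);
    }
}
```

With a fallback like this, a client could connect with just serviceDiscoveryMode=zooKeeper and omit the namespace entirely.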





[jira] [Updated] (HIVE-8324) Shim KerberosName (causes build failure on hadoop-1)

2014-10-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8324:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

The test failure is not related.

Patch committed to trunk and the 0.14 branch. Thanks for reviewing [~szehon]!

 Shim KerberosName (causes build failure on hadoop-1)
 

 Key: HIVE-8324
 URL: https://issues.apache.org/jira/browse/HIVE-8324
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Szehon Ho
Assignee: Vaibhav Gumashta
Priority: Blocker
 Fix For: 0.14.0

 Attachments: HIVE-8324.1.patch, HIVE-8324.2.patch


 Unfortunately even after HIVE-8265, there are still more compile failures.
 {code}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-service: Compilation failure: Compilation failure:
 [ERROR] 
 /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[35,54]
  cannot find symbol
 [ERROR] symbol:   class KerberosName
 [ERROR] location: package org.apache.hadoop.security.authentication.util
 [ERROR] 
 /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,7]
  cannot find symbol
 [ERROR] symbol:   class KerberosName
 [ERROR] location: class 
 org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
 [ERROR] 
 /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,43]
  cannot find symbol
 [ERROR] symbol:   class KerberosName
 [ERROR] location: class 
 org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
 [ERROR] 
 /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,7]
  cannot find symbol
 [ERROR] symbol:   class KerberosName
 [ERROR] location: class 
 org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
 [ERROR] 
 /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,43]
  cannot find symbol
 [ERROR] symbol:   class KerberosName
 [ERROR] location: class 
 org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction
 {code}





Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.

2014-10-06 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24136/
---

(Updated Oct. 6, 2014, 8:03 a.m.)


Review request for hive.


Changes
---

Rebase on trunk. Disable specific test methods for storage formats.


Bugs: HIVE-4329
https://issues.apache.org/jira/browse/HIVE-4329


Repository: hive-git


Description
---

HIVE-4329: HCatalog should use getHiveRecordWriter.


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
4fdb5c985108bb3225cf945024ae679745e5f3bc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java
 3a07b0ca7c1956d45e611005cbc5ba2464596471 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java
 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java
 4df912a935221e527c106c754ff233d212df9246 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1a7595fd6dd0a5ffbe529bc24015c482068233bf 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java
 2a883d6517bfe732b6a6dffa647d9d44e4145b38 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657cd1b16aec664aab3e22b430b304a3698d 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java
 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java
 b651cb323771843da43667016a7dd2c9d9a1ddac 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
 9b979395e47e54aac87487cb990824e3c3a2ee19 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java
 d83b003f9c16e78a39b3cc7ce810ff19f70848c2 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java
 5905b46178b510b3a43311739fea2b95f47b4ed7 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java
 b3ea76e6a79f94e09972bc060c06105f60087b71 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
 ee57f3fd126af2e36039f84686a4169ef6267593 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalDynamicPartitioned.java
 58764a5d093524a0a3566e6db817fdb4b2364ac8 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalNonPartitioned.java
 6e060c08ce03b71a4f2216f5137d73b468e5be46 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalPartitioned.java
 9f16b3b9811c2020adfb6a2da7eb76ac1bc8cfb9 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableDynamicPartitioned.java
 5b18739d0e9a92b94a6cc2647bc37d1aa0c0e5ca 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableNonPartitioned.java
 354ae109adbec93363a5f3813413dcc50bd8ffa3 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutablePartitioned.java
 a22a993c8f154fcbf2faaaea2ab1ce69c4f13717 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatNonPartitioned.java
 174a92f443cb5deeb4972f4016109ecedae8bd3e 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java
 a386415fb406bb0cda18f7913650874d6a236e21 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 36221b77d52474393668284d12877fd6b43c88d6 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoader.java
 5eabba151b6b39b8e251fbbce2ffd4b9f7b503c6 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderComplexSchema.java
 447f39fade0b5d562dd30915377a3ddf8dd422cd 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorer.java
 a380f619493c12c440679f501a401d0a61788838 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java
 0c3ec8bd93f2a50d2d44c2d892180142613dc68d 
  
hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestUtil.java
 8a652f0bb9323497bbcc7fd4a76f616ee8917c1e 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 2ad7330365b8327e6f1b78ad5b9760e252d1339b 

[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-10-06 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-4329:
-
Attachment: HIVE-4329.4.patch

Attaching a new patch rebased on master, incorporating the test utils from 
HIVE-7286 to disable specific test methods for given storage formats.

 HCatalog should use getHiveRecordWriter rather than getRecordWriter
 ---

 Key: HIVE-4329
 URL: https://issues.apache.org/jira/browse/HIVE-4329
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.14.0
 Environment: discovered in Pig, but it looks like the root cause 
 impacts all non-Hive users
Reporter: Sean Busbey
Assignee: David Chen
 Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
 HIVE-4329.3.patch, HIVE-4329.4.patch


 Attempting to write to an HCatalog-defined table backed by the AvroSerde fails 
 with the following stacktrace:
 {code}
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.io.LongWritable
   at 
 org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
   at 
 org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
   at 
 org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
   at 
 org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
   at 
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
 {code}
 The proximal cause of this failure is that the AvroContainerOutputFormat's 
 signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
 forces a NullWritable. I'm not sure of a general fix, other than redefining 
 HiveOutputFormat to mandate a WritableComparable.
 It looks like accepting WritableComparable is what's done in the other Hive 
 OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
 be changed, since it's ignoring the key. That way fixing things so 
 FileRecordWriterContainer can always use NullWritable could get spun into a 
 different issue?
 The underlying cause for failure to write to AvroSerde tables is that 
 AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
 fixing the above will just push the failure into the placeholder RecordWriter.
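The type mismatch described here can be reproduced in isolation. The two stub classes below are stand-ins for Hadoop's org.apache.hadoop.io types, and the writer mimics the cast that fails in the stack trace above; this is a minimal sketch of the failure mode, not the actual Hive code.

```java
// Stand-alone illustration of the failure mode: a writer internally typed
// for LongWritable receives a NullWritable key. LongWritable/NullWritable
// here are stubs standing in for the org.apache.hadoop.io classes.
public class KeyMismatchSketch {
    public static class LongWritable { long value; }
    public static class NullWritable { }

    // Mimics AvroContainerOutputFormat's anonymous writer, which casts the
    // incoming key to LongWritable before use.
    public static void write(Object key, Object value) {
        LongWritable k = (LongWritable) key; // throws if key is a NullWritable
        // ... would append (k, value) to the container file here
    }

    /** Returns true when the NullWritable key triggers the cast failure. */
    public static boolean failsWithNullWritableKey() {
        try {
            write(new NullWritable(), "row");
            return false;
        } catch (ClassCastException expected) {
            return true; // same ClassCastException as in the stack trace above
        }
    }
}
```

This is why loosening the key type (e.g. to WritableComparable, as suggested) or having the format ignore the key entirely would make the container's NullWritable key harmless.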





[jira] [Resolved] (HIVE-6692) Location for new table or partition should be a write entity

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis resolved HIVE-6692.
-
Resolution: Won't Fix

Because locations for a new table or partition have been decided, by policy, to be 
read entities (which still feels strange to me), the remaining part of the patch is 
only whether to use a qualified path or a simple string for path-type entities. I'll 
close this and open a new issue for that.

 Location for new table or partition should be a write entity
 

 Key: HIVE-6692
 URL: https://issues.apache.org/jira/browse/HIVE-6692
 Project: Hive
  Issue Type: Task
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6692.1.patch.txt


 Locations for create table and alter table add partition should be write 
 entities.





[jira] [Created] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)
Navis created HIVE-8357:
---

 Summary: Path type entities should use qualified path rather than 
string
 Key: HIVE-8357
 URL: https://issues.apache.org/jira/browse/HIVE-8357
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor








[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8357:

Attachment: HIVE-8357.1.patch.txt

Running a preliminary test; expecting many test failures.

 Path type entities should use qualified path rather than string
 ---

 Key: HIVE-8357
 URL: https://issues.apache.org/jira/browse/HIVE-8357
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8357.1.patch.txt








[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-8357:

Status: Patch Available  (was: Open)

 Path type entities should use qualified path rather than string
 ---

 Key: HIVE-8357
 URL: https://issues.apache.org/jira/browse/HIVE-8357
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8357.1.patch.txt








[jira] [Commented] (HIVE-8186) Self join may fail if one side has VCs and other doesn't

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160159#comment-14160159
 ] 

Hive QA commented on HIVE-8186:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673047/HIVE-8186.3.patch.txt

{color:green}SUCCESS:{color} +1 6525 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1129/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673047

 Self join may fail if one side has VCs and other doesn't
 

 Key: HIVE-8186
 URL: https://issues.apache.org/jira/browse/HIVE-8186
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8186.1.patch.txt, HIVE-8186.2.patch.txt, 
 HIVE-8186.3.patch.txt


 See comments. This also fails on trunk, although not on the original join_vc query.





[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160271#comment-14160271
 ] 

Hive QA commented on HIVE-7733:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673054/HIVE-7733.5.patch.txt

{color:red}ERROR:{color} -1 due to 54 failed/errored test(s), 6526 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned_json
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_field_garbage
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_explain_rewrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_explain_rewrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_subquery1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_compare_java_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_view_inputs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_as_select_with_partition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_failure6
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col0
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view7
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_select_column_with_subquery
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalidate_view1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_recursive_view
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1130/testReport
Console output: 

[jira] [Commented] (HIVE-7641) INSERT ... SELECT with no source table leads to NPE

2014-10-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160302#comment-14160302
 ] 

Xuefu Zhang commented on HIVE-7641:
---

Looking at the patch, it seems to make more sense to return an error in this 
case, to be consistent with a regular select x from table query, in which an 
error is given if the FROM table is missing.

 INSERT ... SELECT with no source table leads to NPE
 ---

 Key: HIVE-7641
 URL: https://issues.apache.org/jira/browse/HIVE-7641
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1
Reporter: Lenni Kuff
Assignee: Navis
 Attachments: HIVE-7641.1.patch.txt


 When no source table is provided for an INSERT statement, Hive fails with an NPE. 
 {code}
 0: jdbc:hive2://localhost:11050/default> create table test_tbl(i int);
 No rows affected (0.333 seconds)
 0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1;
 Error: Error while compiling statement: FAILED: NullPointerException null 
 (state=42000,code=4)
 -- Get an NPE even when using incorrect syntax (no TABLE keyword)
 0: jdbc:hive2://localhost:11050/default> insert into test_tbl select 1;
 Error: Error while compiling statement: FAILED: NullPointerException null 
 (state=42000,code=4)
 -- Works when a source table is provided
 0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1 
 from foo;
 No rows affected (5.751 seconds)
 {code}





[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160352#comment-14160352
 ] 

Hive QA commented on HIVE-4329:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673066/HIVE-4329.4.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6563 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.testPigPopulation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1131/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673066

 HCatalog should use getHiveRecordWriter rather than getRecordWriter
 ---

 Key: HIVE-4329
 URL: https://issues.apache.org/jira/browse/HIVE-4329
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.14.0
 Environment: discovered in Pig, but it looks like the root cause 
 impacts all non-Hive users
Reporter: Sean Busbey
Assignee: David Chen
 Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
 HIVE-4329.3.patch, HIVE-4329.4.patch


 Attempting to write to a HCatalog defined table backed by the AvroSerde fails 
 with the following stacktrace:
 {code}
 java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
 cast to org.apache.hadoop.io.LongWritable
   at 
 org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
   at 
 org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
   at 
 org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
   at 
 org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
   at 
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
 {code}
 The proximal cause of this failure is that the AvroContainerOutputFormat's 
 signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
 forces a NullWritable. I'm not sure of a general fix, other than redefining 
 HiveOutputFormat to mandate a WritableComparable.
 It looks like accepting WritableComparable is what's done in the other Hive 
 OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
 be changed, since it's ignoring the key. That way fixing things so 
 FileRecordWriterContainer can always use NullWritable could get spun into a 
 different issue?
 The underlying cause for failure to write to AvroSerde tables is that 
 AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
 fixing the above will just push the failure into the placeholder RecordWriter.
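A self-contained sketch of the mismatch described above (the classes here are stand-ins, not the real org.apache.hadoop.io or HCatalog types): a writer whose signature mandates a LongWritable key fails with ClassCastException when handed the NullWritable that FileRecordWriterContainer forces, while a writer accepting the common WritableComparable supertype and ignoring the key does not.

```java
// Stand-ins for the Hadoop writable hierarchy; the real code uses
// org.apache.hadoop.io.* and the HiveOutputFormat interfaces.
public class KeyTypeMismatch {
    interface WritableComparable {}
    static class LongWritable implements WritableComparable {}
    static class NullWritable implements WritableComparable {}

    // Analogous to AvroContainerOutputFormat's current writer: mandates LongWritable.
    static void writeExpectingLong(WritableComparable key) {
        LongWritable k = (LongWritable) key; // ClassCastException for NullWritable
    }

    // The proposed shape: accept any WritableComparable and ignore the key,
    // which is what AvroContainerOutputFormat effectively does anyway.
    static void writeIgnoringKey(WritableComparable key) {
        // key intentionally unused
    }
}
```

This is only an illustration of why widening the key type sidesteps the cast; the actual fix would live in the real OutputFormat signatures.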



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2014-10-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160428#comment-14160428
 ] 

Ashutosh Chauhan commented on HIVE-8319:


cc: [~thejas] , [~vgumashta]

 Add configuration for custom services in hiveserver2
 

 Key: HIVE-8319
 URL: https://issues.apache.org/jira/browse/HIVE-8319
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8319.1.patch.txt


 NO PRECOMMIT TESTS
 Register services to hiveserver2, for example, 
 {noformat}
 <property>
   <name>hive.server2.service.classes</name>
   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
 </property>
 <property>
   <name>azkaban.ssl.port</name>
   <name>...</name>
 </property>
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8137) Empty ORC file handling

2014-10-06 Thread Pankit Thapar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160456#comment-14160456
 ] 

Pankit Thapar commented on HIVE-8137:
-

[~gopalv], could you please comment on the failures above? I don't think they 
are due to my patch.
Also, could you please review the patch as well?


 Empty ORC file handling
 ---

 Key: HIVE-8137
 URL: https://issues.apache.org/jira/browse/HIVE-8137
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Pankit Thapar
 Fix For: 0.14.0

 Attachments: HIVE-8137.patch


 Hive 13 does not handle reading a zero-size ORC file properly. An ORC file 
 is supposed to have a post-script,
 which the ReaderImpl class tries to read to initialize the footer. 
 But in case the file is empty 
 or of zero size, it runs into an IndexOutOfBoundsException because 
 ReaderImpl tries to read it in its constructor.
 Code snippet: 
 // get length of PostScript
 int psLen = buffer.get(readSize - 1) & 0xff; 
 In the above code, readSize for an empty file is zero.
 I see that the ensureOrcFooter() method performs some sanity checks on the footer, 
 so either we can move the above code snippet into ensureOrcFooter() and throw 
 a "Malformed ORC file" exception, or we can create a dummy Reader that does 
 not initialize the footer and has hasNext() return false on the first call.
 Basically, I would like to know the correct way to handle an 
 empty ORC file in a mapred job:
 should we ignore it and not throw an exception, or throw an exception 
 saying the ORC file is malformed?
 Please let me know your thoughts on this.
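A minimal sketch of the guard being discussed (the method name and return convention are hypothetical, not Hive's actual ReaderImpl code): check the tail size before reading the one-byte postscript length, so a zero-size file is reported instead of raising IndexOutOfBoundsException.

```java
import java.nio.ByteBuffer;

// Hypothetical guard around the postscript read discussed above; this only
// illustrates the check, it is not how Hive's ReaderImpl is structured.
public class OrcTailCheck {
    /** Returns the postscript length byte, or -1 when the file tail is empty. */
    static int postscriptLength(ByteBuffer buffer, int readSize) {
        if (readSize <= 0) {
            // Zero-size file: no postscript to read. A caller could either
            // throw a "Malformed ORC file" error here or build a dummy Reader
            // whose hasNext() is always false.
            return -1;
        }
        return buffer.get(readSize - 1) & 0xff; // last byte = PostScript length
    }
}
```

Either policy (error vs. dummy reader) can then be decided in one place instead of in the constructor's raw buffer access.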



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken

2014-10-06 Thread Ken Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160480#comment-14160480
 ] 

Ken Williams commented on HIVE-6050:


I'm also looking for a workaround to this - I'm seeing the error when trying to 
connect to a 0.13 Hive.

 JDBC backward compatibility is broken
 -

 Key: HIVE-6050
 URL: https://issues.apache.org/jira/browse/HIVE-6050
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

 Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of 
 Hive 0.10 (TProtocolVersion=v1), will return the following exception:
 {noformat}
 java.sql.SQLException: Could not establish connection to 
 jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
 unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
   at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
   at java.sql.DriverManager.getConnection(DriverManager.java:571)
   at java.sql.DriverManager.getConnection(DriverManager.java:187)
   at 
 org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
   at 
 org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
   at 
 org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
 Caused by: org.apache.thrift.TApplicationException: Required field 
 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
   ... 37 more
 {noformat}
 On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
 which doesn't seem to be backward-compatible.  Look at the code path in the 
 generated file 'TOpenSessionReq.java', method 
 TOpenSessionReqStandardScheme.read():
 1. The method will call 'TProtocolVersion.findValue()' on the thrift 
 protocol's byte stream, which returns null if the client is sending an enum 
 value unknown to the server.  (v4 is unknown to server)
 2. The method will then call struct.validate(), which will throw the above 
 exception because of null version.  
 So it doesn't look like the current backward-compatibility scheme will work.
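The two steps above can be sketched as follows (this is a simplified stand-in, not the generated TOpenSessionReq code): an old server's enum has no constant for the new client's wire value, so findValue() yields null, and the required-field validation then rejects the struct.

```java
// Simplified stand-in for the generated Thrift code path described above:
// a v1-era server whose enum does not know the client's v4 value.
public class ProtocolCompat {
    enum TProtocolVersion {
        HIVE_CLI_SERVICE_PROTOCOL_V1; // server generated before v4 existed

        // Mirrors the generated findValue(): null for unknown wire values.
        static TProtocolVersion findValue(int value) {
            return (value >= 0 && value < values().length) ? values()[value] : null;
        }
    }

    // Mirrors struct.validate(): a required field must not be null.
    // Returns an error message, or null when the struct is valid.
    static String validate(TProtocolVersion clientProtocol) {
        if (clientProtocol == null) {
            return "Required field 'client_protocol' is unset!";
        }
        return null;
    }
}
```

This is why the failure surfaces as a validation error rather than a clean "unsupported version" negotiation.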



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8358:
--

 Summary: Constant folding should happen before predicate pushdown
 Key: HIVE-8358
 URL: https://issues.apache.org/jira/browse/HIVE-8358
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


So that partition pruning and transitive predicate propagation may take 
advantage of constant folding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8358:
---
Status: Patch Available  (was: Open)

 Constant folding should happen before predicate pushdown
 

 Key: HIVE-8358
 URL: https://issues.apache.org/jira/browse/HIVE-8358
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8358.patch


 So that partition pruning and transitive predicate propagation may take 
 advantage of constant folding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8358:
---
Attachment: HIVE-8358.patch

Running tests to see if there are any failures. Not ready for review yet.

 Constant folding should happen before predicate pushdown
 

 Key: HIVE-8358
 URL: https://issues.apache.org/jira/browse/HIVE-8358
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8358.patch


 So that partition pruning and transitive predicate propagation may take 
 advantage of constant folding.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160519#comment-14160519
 ] 

Brock Noland commented on HIVE-6050:


AFAIK there is no workaround at present. The server version must be higher than 
or equal to the client's.

 JDBC backward compatibility is broken
 -

 Key: HIVE-6050
 URL: https://issues.apache.org/jira/browse/HIVE-6050
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Carl Steinbach
Priority: Blocker

 Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of 
 Hive 0.10 (TProtocolVersion=v1), will return the following exception:
 {noformat}
 java.sql.SQLException: Could not establish connection to 
 jdbc:hive2://localhost:1/default: Required field 'client_protocol' is 
 unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
   at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158)
   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
   at java.sql.DriverManager.getConnection(DriverManager.java:571)
   at java.sql.DriverManager.getConnection(DriverManager.java:187)
   at 
 org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
   at 
 org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
   at 
 org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
   at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
   at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
   at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
   at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
 Caused by: org.apache.thrift.TApplicationException: Required field 
 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
   at 
 org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
   at 
 org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
   ... 37 more
 {noformat}
 On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, 
 which doesn't seem to be backward-compatible.  Look at the code path in the 
 generated file 'TOpenSessionReq.java', method 
 TOpenSessionReqStandardScheme.read():
 1. The method will call 'TProtocolVersion.findValue()' on the thrift 
 protocol's byte stream, which returns null if the client is sending an enum 
 value unknown to the server.  (v4 is unknown to server)
 2. The method will then call struct.validate(), which will throw the above 
 exception because of null version.  
 So it doesn't look like the current backward-compatibility scheme will work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160524#comment-14160524
 ] 

Brock Noland commented on HIVE-8352:


[~jxiang] does parallel.q pass for you locally, without 
test.output.overwrite=true? If it does pass, can you open a subtask of 
HIVE-7292 to investigate the flakiness? 

+1 pending resolution of parallel.q

 Enable windowing.q for spark
 

 Key: HIVE-8352
 URL: https://issues.apache.org/jira/browse/HIVE-8352
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
 hive-8385.patch


 We should enable windowing.q for basic windowing coverage. After checking out 
 the spark branch, we would build:
 {noformat}
 $ mvn clean install -DskipTests -Phadoop-2
 $ cd itests/
 $ mvn clean install -DskipTests -Phadoop-2
 {noformat}
 Then generate the windowing.q.out file:
 {noformat}
 $ cd qtest-spark/
 $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
 -Dtest.output.overwrite=true
 {noformat}
 Compare the output against MapReduce:
 {noformat}
 $ diff -y -W 150 
 ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
 ../../ql/src/test/results/clientpositive/windowing.q.out| less
 {noformat}
 And if everything looks good, add it to {{spark.query.files}} in 
 {{./itests/src/test/resources/testconfiguration.properties}}
 then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA
Frédéric TERRAZZONI created HIVE-8359:
-

 Summary: Map containing null values are not correctly written in 
Parquet files
 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI


Tried to write a map<string,string> column in a Parquet file. The table should 
contain:
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
we can see that the table is corrupted:
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 
Don't know how to attach it here though ... :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric TERRAZZONI updated HIVE-8359:
--
Description: 
Tried to write a map<string,string> column in a Parquet file. The table should 
contain:
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
we can see that the table is corrupted:
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 

  was:
Tried to write a map<string,string> column in a Parquet file. The table should 
contain:
{code}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{"key1":null,"key2":"val2"}
{"key3":"val3","key4":null}
{"key3":"val3","key4":null}
{code}
... and when you do a query like {code}SELECT * from mytable{code}
we can see that the table is corrupted:
{code}
{"key3":"val3"}
{"key4":"val3"}
{"key3":"val2"}
{"key4":"val3"}
{"key1":"val3"}
{code}

I've not been able to read the Parquet file in our software afterwards, and 
consequently I suspect it to be corrupted. 

For those who are interested, I generated this Parquet table from an Avro file. 
Don't know how to attach it here though ... :)


 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI

 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {"key1":null,"key2":"val2"}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {"key3":"val3"}
 {"key4":"val3"}
 {"key3":"val2"}
 {"key4":"val3"}
 {"key1":"val3"}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-10-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frédéric TERRAZZONI updated HIVE-8359:
--
Attachment: map_null_val.avro

Avro file containing the sample data. To reproduce the issue, just create a 
Hive table from this file and issue a 
{code}
CREATE TABLE broken_parquet_table STORED AS PARQUET
AS SELECT * FROM the_avro_table;

SELECT * FROM broken_parquet_table;
{code}

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
 Attachments: map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {"key1":null,"key2":"val2"}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {"key3":"val3"}
 {"key4":"val3"}
 {"key3":"val2"}
 {"key4":"val3"}
 {"key1":"val3"}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160536#comment-14160536
 ] 

Thejas M Nair commented on HIVE-8319:
-

This patch is making the Service interface public. We should mark it with the 
@public annotation in that case, and probably @unstable (or at least @evolving) 
as well.
The interface also needs some cleanup, so that unused functions are removed 
(such as register/unregister). We should also clarify the public/private API 
status of the classes within the org.apache.hive.service package, as users might 
also end up using classes like CompositeService. (I think we should mark them as 
@private unless it is clear that users would benefit from them and they can be 
kept stable.)
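To illustrate what such a pluggable service might look like (the Service interface below is a self-contained, hypothetical stand-in loosely modeled on org.apache.hive.service.Service, and HiveStatusService is invented for the example): HiveServer2 would reflectively instantiate each class named in hive.server2.service.classes and drive it through an init/start/stop lifecycle.

```java
// Hypothetical sketch of a custom HiveServer2 service; the interface is a
// stand-in for org.apache.hive.service.Service, not the real one.
public class CustomServiceSketch {
    interface Service {
        void init(java.util.Properties conf); // Hive would pass a HiveConf instead
        void start();
        void stop();
        String getName();
    }

    // Example plug-in, analogous to com.nexr.hive.service.HiveStatus above.
    public static class HiveStatusService implements Service {
        private boolean running;
        private String sslPort;

        public void init(java.util.Properties conf) {
            // Service-specific config, e.g. the azkaban.ssl.port property shown above.
            sslPort = conf.getProperty("azkaban.ssl.port");
        }
        public void start() { running = true; }
        public void stop()  { running = false; }
        public String getName() { return "HiveStatus"; }
        public boolean isRunning() { return running; }
        public String getSslPort() { return sslPort; }
    }

    // HiveServer2 would do roughly this for each comma-separated class name
    // listed in hive.server2.service.classes.
    static Service load(String className) throws Exception {
        return (Service) Class.forName(className).getDeclaredConstructor().newInstance();
    }
}
```

Whether register/unregister listener methods belong on the public surface is exactly the cleanup question raised above.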



 Add configuration for custom services in hiveserver2
 

 Key: HIVE-8319
 URL: https://issues.apache.org/jira/browse/HIVE-8319
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8319.1.patch.txt


 NO PRECOMMIT TESTS
 Register services to hiveserver2, for example, 
 {noformat}
 <property>
   <name>hive.server2.service.classes</name>
   <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
 </property>
 <property>
   <name>azkaban.ssl.port</name>
   <name>...</name>
 </property>
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8357) Path type entities should use qualified path rather than string

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160549#comment-14160549
 ] 

Hive QA commented on HIVE-8357:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673067/HIVE-8357.1.patch.txt

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 6524 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_udf_local_resource
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithDfsResource
org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithLocalResource
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1132/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12673067

 Path type entities should use qualified path rather than string
 ---

 Key: HIVE-8357
 URL: https://issues.apache.org/jira/browse/HIVE-8357
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-8357.1.patch.txt






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Status: Patch Available  (was: Open)

 CBO trunk merge: union11 test fails due to incorrect plan
 -

 Key: HIVE-8225
 URL: https://issues.apache.org/jira/browse/HIVE-8225
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
 HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
 HIVE-8225.inprogress.patch, HIVE-8225.patch


 The result changes to as if the union didn't have count() inside. The issue 
 can be fixed by using srcunion.value outside the subquery in count (replace 
 count(1) with count(srcunion.value)). Otherwise, it looks like count(1) node 
 from union-ed queries is not present in AST at all, which might cause this 
 result.
 -Interestingly, adding group by to each query in a union produces completely 
 weird result (count(1) is 309 for each key, whereas it should be 1 and the 
 logical incorrect value if internal count is lost is 500)- Nm, that groups 
 by table column called key, which is weird but is what Hive does



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26209: CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26209/
---

(Updated Oct. 6, 2014, 5:39 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

create a derived table with new proj and aggr to address it


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/PlanModifierForASTConv.java
 3d90ae7 
  ql/src/test/queries/clientpositive/cbo_correctness.q f7f0722 
  ql/src/test/results/clientpositive/cbo_correctness.q.out 3335d4d 
  ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 5920612 

Diff: https://reviews.apache.org/r/26209/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Status: Open  (was: Patch Available)

 CBO trunk merge: union11 test fails due to incorrect plan
 -

 Key: HIVE-8225
 URL: https://issues.apache.org/jira/browse/HIVE-8225
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
 HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
 HIVE-8225.inprogress.patch, HIVE-8225.patch


 The result changes to what it would be if the union didn't have count() inside. 
 The issue can be fixed by using srcunion.value outside the subquery in count 
 (replace count(1) with count(srcunion.value)). Otherwise, it looks like the 
 count(1) node from the union-ed queries is not present in the AST at all, which 
 might cause this result.
 -Interestingly, adding group by to each query in a union produces a completely 
 weird result (count(1) is 309 for each key, whereas it should be 1, and the 
 logically incorrect value if the internal count is lost is 500)- Nm, that groups 
 by the table column called key, which is weird but is what Hive does





[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan

2014-10-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8225:
--
Attachment: HIVE-8225.5.patch

 CBO trunk merge: union11 test fails due to incorrect plan
 -

 Key: HIVE-8225
 URL: https://issues.apache.org/jira/browse/HIVE-8225
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, 
 HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, 
 HIVE-8225.inprogress.patch, HIVE-8225.patch


 The result changes to what it would be if the union didn't have count() inside. 
 The issue can be fixed by using srcunion.value outside the subquery in count 
 (replace count(1) with count(srcunion.value)). Otherwise, it looks like the 
 count(1) node from the union-ed queries is not present in the AST at all, which 
 might cause this result.
 -Interestingly, adding group by to each query in a union produces a completely 
 weird result (count(1) is 309 for each key, whereas it should be 1, and the 
 logically incorrect value if the internal count is lost is 500)- Nm, that groups 
 by the table column called key, which is weird but is what Hive does





[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8340:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk and 0.14. Thanks for the patch [~xiaobingo]!

 HiveServer2 service doesn't stop backend jvm process, which prevents 
 follow-up service start.
 -

 Key: HIVE-8340
 URL: https://issues.apache.org/jira/browse/HIVE-8340
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
 HIVE-8340.4.patch


 On stopping the HS2 service from the services tab, it only kills the root 
 process and does not kill the child java process. As a result, resources are 
 not freed, and this throws an error on restarting from the command line.





[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160567#comment-14160567
 ] 

Vaibhav Gumashta commented on HIVE-8340:


Thanks for reviewing the configs [~leftylev]

 HiveServer2 service doesn't stop backend jvm process, which prevents 
 follow-up service start.
 -

 Key: HIVE-8340
 URL: https://issues.apache.org/jira/browse/HIVE-8340
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
 HIVE-8340.4.patch


 On stopping the HS2 service from the services tab, it only kills the root 
 process and does not kill the child java process. As a result, resources are 
 not freed, and this throws an error on restarting from the command line.





[jira] [Created] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Aswathy Chellammal Sreekumar (JIRA)
Aswathy Chellammal Sreekumar created HIVE-8360:
--

 Summary: Add cross cluster support for webhcat E2E tests
 Key: HIVE-8360
 URL: https://issues.apache.org/jira/browse/HIVE-8360
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
 Environment: Secure cluster
Reporter: Aswathy Chellammal Sreekumar


In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
since the realm name for user principals is not included in the kinit command. 
This patch concatenates the realm name to the user principal, thereby resulting 
in a successful kinit.
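The realm-concatenation idea above can be sketched as follows. This is only an illustration of the described fix, not the actual E2E harness code (which is Perl); the class name `KinitCmd`, the flags, and the realm value are hypothetical examples.

```java
// Hypothetical sketch: append the cluster realm to a bare user name before
// building the kinit command, so cross-realm runs authenticate correctly.
public class KinitCmd {
    public static String build(String user, String realm, String keytab) {
        // If the principal already carries a realm, leave it untouched.
        String principal = user.contains("@") ? user : user + "@" + realm;
        return "kinit -k -t " + keytab + " " + principal;
    }
}
```

A principal that already includes a realm is passed through unchanged, so existing single-realm runs keep working.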





[jira] [Updated] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-8360:
---
Attachment: AD-MIT.patch

Including the patch that implements cross-domain support in a secure cluster for 
E2E tests. Please review the same.

 Add cross cluster support for webhcat E2E tests
 ---

 Key: HIVE-8360
 URL: https://issues.apache.org/jira/browse/HIVE-8360
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
 Environment: Secure cluster
Reporter: Aswathy Chellammal Sreekumar
 Attachments: AD-MIT.patch


 In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
 since the realm name for user principals is not included in the kinit 
 command. This patch concatenates the realm name to the user principal, thereby 
 resulting in a successful kinit.





[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-8340:
-
Labels: TODOC14  (was: )

 HiveServer2 service doesn't stop backend jvm process, which prevents 
 follow-up service start.
 -

 Key: HIVE-8340
 URL: https://issues.apache.org/jira/browse/HIVE-8340
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
 HIVE-8340.4.patch


 On stopping the HS2 service from the services tab, it only kills the root 
 process and does not kill the child java process. As a result, resources are 
 not freed, and this throws an error on restarting from the command line.





[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160612#comment-14160612
 ] 

Lefty Leverenz commented on HIVE-8340:
--

Doc note:  This adds *hive.hadoop.classpath* to HiveConf.java, so it needs to 
be documented in the wiki. Although the parameter doesn't start with 
hive.server2..., it belongs in the HiveServer2 section:

* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]

 HiveServer2 service doesn't stop backend jvm process, which prevents 
 follow-up service start.
 -

 Key: HIVE-8340
 URL: https://issues.apache.org/jira/browse/HIVE-8340
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, 
 HIVE-8340.4.patch


 On stopping the HS2 service from the services tab, it only kills the root 
 process and does not kill the child java process. As a result, resources are 
 not freed, and this throws an error on restarting from the command line.





[jira] [Updated] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8336:
-
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch 0.14

 Update pom, now that Optiq is renamed to Calcite
 

 Key: HIVE-8336
 URL: https://issues.apache.org/jira/browse/HIVE-8336
 Project: Hive
  Issue Type: Bug
Reporter: Julian Hyde
Assignee: Gunther Hagleitner
 Fix For: 0.14.0

 Attachments: HIVE-8336.1.patch


 Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
 and OPTIQ-430.
 There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.





[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160619#comment-14160619
 ] 

Gunther Hagleitner commented on HIVE-8336:
--

[~leftylev] i've changed the name in hiveconf on commit.

 Update pom, now that Optiq is renamed to Calcite
 

 Key: HIVE-8336
 URL: https://issues.apache.org/jira/browse/HIVE-8336
 Project: Hive
  Issue Type: Bug
Reporter: Julian Hyde
Assignee: Gunther Hagleitner
 Fix For: 0.14.0

 Attachments: HIVE-8336.1.patch


 Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
 and OPTIQ-430.
 There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.





[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite

2014-10-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160622#comment-14160622
 ] 

Vikram Dixit K commented on HIVE-8336:
--

+1 for 0.14

 Update pom, now that Optiq is renamed to Calcite
 

 Key: HIVE-8336
 URL: https://issues.apache.org/jira/browse/HIVE-8336
 Project: Hive
  Issue Type: Bug
Reporter: Julian Hyde
Assignee: Gunther Hagleitner
 Fix For: 0.14.0

 Attachments: HIVE-8336.1.patch


 Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 
 and OPTIQ-430.
 There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 
 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml.





[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Open  (was: Patch Available)

The unit test is failing due to timing issues.

 Compactor cleaners can be starved on a busy table or partition.
 ---

 Key: HIVE-8258
 URL: https://issues.apache.org/jira/browse/HIVE-8258
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
 HIVE-8258.patch


 Currently the cleaning thread in the compactor does not run on a table or 
 partition while any locks are held on this partition.  This leaves it open to 
 starvation in the case of a busy table or partition.  It only needs to wait 
 until all locks on the table/partition at the time of the compaction have 
 expired.  Any jobs initiated after that (and thus any locks obtained) will be 
 for the new versions of the files.
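The waiting rule described above can be sketched as a small gate: record the highest lock id open at compaction time, and allow cleaning once every lock at or below that id has been released. This is a simplified illustration under assumed names (`CleanerGate`, `canClean`), not Hive's actual compactor code.

```java
// Sketch of the starvation fix: locks taken AFTER the compaction finished
// can only see the new file versions, so they must not block the cleaner.
public class CleanerGate {
    private final long highestLockIdAtCompaction;

    public CleanerGate(long highestLockIdAtCompaction) {
        this.highestLockIdAtCompaction = highestLockIdAtCompaction;
    }

    /** True once no lock from before (or at) the compaction is still open. */
    public boolean canClean(Iterable<Long> openLockIds) {
        for (long id : openLockIds) {
            if (id <= highestLockIdAtCompaction) {
                return false; // a pre-compaction reader may still need old files
            }
        }
        return true; // only post-compaction locks remain; safe to clean
    }
}
```

On a busy table the set of open locks is never empty, but the pre-compaction subset drains quickly, which removes the starvation.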





[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Attachment: HIVE-8258.4.patch

A new version of the patch that actually makes sure the cleaner goes through 
the loop rather than relying on timing and hoping it works out.

 Compactor cleaners can be starved on a busy table or partition.
 ---

 Key: HIVE-8258
 URL: https://issues.apache.org/jira/browse/HIVE-8258
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
 HIVE-8258.patch


 Currently the cleaning thread in the compactor does not run on a table or 
 partition while any locks are held on this partition.  This leaves it open to 
 starvation in the case of a busy table or partition.  It only needs to wait 
 until all locks on the table/partition at the time of the compaction have 
 expired.  Any jobs initiated after that (and thus any locks obtained) will be 
 for the new versions of the files.





[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Patch Available  (was: Open)

 Compactor cleaners can be starved on a busy table or partition.
 ---

 Key: HIVE-8258
 URL: https://issues.apache.org/jira/browse/HIVE-8258
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
 HIVE-8258.patch


 Currently the cleaning thread in the compactor does not run on a table or 
 partition while any locks are held on this partition.  This leaves it open to 
 starvation in the case of a busy table or partition.  It only needs to wait 
 until all locks on the table/partition at the time of the compaction have 
 expired.  Any jobs initiated after that (and thus any locks obtained) will be 
 for the new versions of the files.





[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Status: Open  (was: Patch Available)

 Hive on Tez sets mapreduce.framework.name to yarn-tez
 -

 Key: HIVE-8344
 URL: https://issues.apache.org/jira/browse/HIVE-8344
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch


 This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
 we don't switch back when the user specifies MR as the exec engine.





[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Status: Patch Available  (was: Open)

 Hive on Tez sets mapreduce.framework.name to yarn-tez
 -

 Key: HIVE-8344
 URL: https://issues.apache.org/jira/browse/HIVE-8344
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch


 This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
 we don't switch back when the user specifies MR as the exec engine.





[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8344:
-
Attachment: HIVE-8344.2.patch

 Hive on Tez sets mapreduce.framework.name to yarn-tez
 -

 Key: HIVE-8344
 URL: https://issues.apache.org/jira/browse/HIVE-8344
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch


 This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, 
 we don't switch back when the user specifies MR as the exec engine.





[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160650#comment-14160650
 ] 

Szehon Ho commented on HIVE-7375:
-

[~brocknoland] I had filed this some time back to try to catch hadoop-1 compile 
errors in precommit (at the time, trying to avoid having to fund an additional 
precommit machine cluster for hadoop-1). Are you thinking we can get funding 
for one more cluster for hadoop-1 in the near future, as HIVE-8351 suggests? 
If so, I can resolve this JIRA in favor of that one.

 Add option in test infra to compile in other profiles (like hadoop-1)
 -

 Key: HIVE-7375
 URL: https://issues.apache.org/jira/browse/HIVE-7375
 Project: Hive
  Issue Type: Test
Reporter: Szehon Ho
Assignee: Szehon Ho

 As we are seeing some commits breaking hadoop-1 compilation due to lack of 
 pre-commit coverage, it might be nice to add an option in the test infra to 
 compile with optional profiles as a pre-step before testing on the main profile.





[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160667#comment-14160667
 ] 

Brock Noland commented on HIVE-7375:


Yes, I think we can resolve this one in favor of HIVE-8351.

 Add option in test infra to compile in other profiles (like hadoop-1)
 -

 Key: HIVE-7375
 URL: https://issues.apache.org/jira/browse/HIVE-7375
 Project: Hive
  Issue Type: Test
Reporter: Szehon Ho
Assignee: Szehon Ho

 As we are seeing some commits breaking hadoop-1 compilation due to lack of 
 pre-commit coverage, it might be nice to add an option in the test infra to 
 compile with optional profiles as a pre-step before testing on the main profile.





[jira] [Created] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)
Harish Butani created HIVE-8361:
---

 Summary: NPE in PTFOperator when there are empty partitions
 Key: HIVE-8361
 URL: https://issues.apache.org/jira/browse/HIVE-8361
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani


Here is a simple query to reproduce this:
{code}
select sum(p_size) over (partition by p_mfgr )
from part where p_mfgr = 'some non existent mfgr';
{code}
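The failure mode behind the query above is a windowing function dereferencing a partition that was never populated because the WHERE clause matched no rows. A minimal sketch of the guard, under assumed names (`WindowSum`, `sumOverPartition`) rather than PTFOperator's actual code:

```java
import java.util.List;

// Illustration only: guard against a null/empty partition before aggregating,
// instead of blindly iterating it (which is the NPE pattern described above).
public class WindowSum {
    public static double sumOverPartition(List<Double> partition) {
        if (partition == null || partition.isEmpty()) {
            return 0.0; // empty partition: nothing to aggregate, no NPE
        }
        double sum = 0.0;
        for (double v : partition) {
            sum += v;
        }
        return sum;
    }
}
```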





[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-8361:

Status: Patch Available  (was: Open)

 NPE in PTFOperator when there are empty partitions
 --

 Key: HIVE-8361
 URL: https://issues.apache.org/jira/browse/HIVE-8361
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-8361.1.patch


 Here is a simple query to reproduce this:
 {code}
 select sum(p_size) over (partition by p_mfgr )
 from part where p_mfgr = 'some non existent mfgr';
 {code}





[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-8361:

Attachment: HIVE-8361.1.patch

 NPE in PTFOperator when there are empty partitions
 --

 Key: HIVE-8361
 URL: https://issues.apache.org/jira/browse/HIVE-8361
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-8361.1.patch


 Here is a simple query to reproduce this:
 {code}
 select sum(p_size) over (partition by p_mfgr )
 from part where p_mfgr = 'some non existent mfgr';
 {code}





[jira] [Updated] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-8292:
--
Attachment: HIVE-8292.1.patch

This patch addresses the regression but doesn't handle multiple inputs for SMB 
join.

 Reading from partitioned bucketed tables has high overhead in 
 MapOperator.cleanUpInputFileChangedOp
 ---

 Key: HIVE-8292
 URL: https://issues.apache.org/jira/browse/HIVE-8292
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: cn105
Reporter: Mostafa Mokhtar
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch


 Reading from bucketed partitioned tables has significantly higher overhead 
 compared to non-bucketed non-partitioned files.
 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
 5% of the CPU in 
 {code}
  Path onepath = normalizePath(onefile);
 {code}
 And 
 45% of the CPU in 
 {code}
  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
 {code}
 From the profiler 
 {code}
 Stack Trace                                                     Sample Count  Percentage(%)
 hive.ql.exec.tez.MapRecordSource.processRow(Object)                    5,327         62.348
   hive.ql.exec.vector.VectorMapOperator.process(Writable)              5,326         62.336
     hive.ql.exec.Operator.cleanUpInputFileChanged()                    4,851         56.777
       hive.ql.exec.MapOperator.cleanUpInputFileChangedOp()             4,849         56.753
         java.net.URI.relativize(URI)                                   3,903         45.681
           java.net.URI.relativize(URI, URI)                            3,903         45.681
             java.net.URI.normalize(String)                             2,169         25.386
             java.net.URI.equal(String, String)                           526          6.156
             java.net.URI.equalIgnoringCase(String, String)                 1          0.012
             java.lang.String.substring(int)                                1          0.012
         hive.ql.exec.MapOperator.normalizePath(String)                   506          5.922
         org.apache.commons.logging.impl.Log4JLogger.info(Object)          32          0.375
         java.net.URI.equals(Object)                                       12          0.14
         java.util.HashMap$KeySet.iterator()                                5          0.059
         java.util.HashMap.get(Object)                                      4          0.047
         java.util.LinkedHashMap.get(Object)                                3          0.035
       hive.ql.exec.Operator.cleanUpInputFileChanged()                      1          0.012
     hive.ql.exec.Operator.forward(Object, ObjectInspector)               473          5.536
     hive.ql.exec.mr.ExecMapperContext.inputFileChanged()                   1          0.012
 {code}
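Since the profile shows most of the cost in repeated `URI.relativize`/`normalize` calls for the same (parent, file) pairs, one mitigation is to memoize the match result. This is an illustrative sketch of that idea, not Hive's actual fix; the class name `PathMatchCache` is hypothetical.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Memoize the expensive URI.relativize comparison so repeated input-file
// changes across buckets of the same partition don't re-pay the URI cost.
public class PathMatchCache {
    private final Map<String, Boolean> cache = new HashMap<>();

    /** True when 'file' is NOT under 'parent'. URI.relativize returns its
     *  argument unchanged when no prefix relationship exists. */
    public boolean isOutside(URI parent, URI file) {
        String key = parent + "|" + file;
        Boolean hit = cache.get(key);
        if (hit != null) {
            return hit; // cache hit: skip relativize/normalize entirely
        }
        boolean outside = parent.relativize(file).equals(file);
        cache.put(key, outside);
        return outside;
    }
}
```

In a real operator the cache would need bounding or clearing between queries; the point here is only that the per-record URI work is avoidable.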





[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160683#comment-14160683
 ] 

Mostafa Mokhtar commented on HIVE-8292:
---

[~vikram.dixit]
Patch which addresses the regression attached.

 Reading from partitioned bucketed tables has high overhead in 
 MapOperator.cleanUpInputFileChangedOp
 ---

 Key: HIVE-8292
 URL: https://issues.apache.org/jira/browse/HIVE-8292
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: cn105
Reporter: Mostafa Mokhtar
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch


 Reading from bucketed partitioned tables has significantly higher overhead 
 compared to non-bucketed non-partitioned files.
 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
 5% of the CPU in 
 {code}
  Path onepath = normalizePath(onefile);
 {code}
 And 
 45% of the CPU in 
 {code}
  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
 {code}
 From the profiler 
 {code}
 Stack Trace                                                     Sample Count  Percentage(%)
 hive.ql.exec.tez.MapRecordSource.processRow(Object)                    5,327         62.348
   hive.ql.exec.vector.VectorMapOperator.process(Writable)              5,326         62.336
     hive.ql.exec.Operator.cleanUpInputFileChanged()                    4,851         56.777
       hive.ql.exec.MapOperator.cleanUpInputFileChangedOp()             4,849         56.753
         java.net.URI.relativize(URI)                                   3,903         45.681
           java.net.URI.relativize(URI, URI)                            3,903         45.681
             java.net.URI.normalize(String)                             2,169         25.386
             java.net.URI.equal(String, String)                           526          6.156
             java.net.URI.equalIgnoringCase(String, String)                 1          0.012
             java.lang.String.substring(int)                                1          0.012
         hive.ql.exec.MapOperator.normalizePath(String)                   506          5.922
         org.apache.commons.logging.impl.Log4JLogger.info(Object)          32          0.375
         java.net.URI.equals(Object)                                       12          0.14
         java.util.HashMap$KeySet.iterator()                                5          0.059
         java.util.HashMap.get(Object)                                      4          0.047
         java.util.LinkedHashMap.get(Object)                                3          0.035
       hive.ql.exec.Operator.cleanUpInputFileChanged()                      1          0.012
     hive.ql.exec.Operator.forward(Object, ObjectInspector)               473          5.536
     hive.ql.exec.mr.ExecMapperContext.inputFileChanged()                   1          0.012
 {code}





[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160690#comment-14160690
 ] 

Szehon Ho commented on HIVE-6500:
-

Hi [~leftylev], I had a question about docs. I came across an outdated wiki 
page that still mentions db as the only option; should that page be updated now 
that FS is supported? 
[https://cwiki.apache.org/confluence/display/Hive/StatsDev|https://cwiki.apache.org/confluence/display/Hive/StatsDev] 
It is actually not linked from the top, but it does seem useful. I'm not sure 
of the policy for these pages.

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Labels: TODOC14
 Fix For: 0.13.0

 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632]. Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values 
 implies increased memory load on the AM and the job history server.
 Whether these limits make sense or not is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
 Hive not rely on the counters feature of the framework, so that we can 
 evolve this feature without depending on framework support. Filesystem-based 
 counter collection is a step in that direction.
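The filesystem-based approach described above can be sketched as follows, under simplified assumptions: `java.nio` on a local directory stands in for HDFS, and the class/method names (`FsStats`, `publish`, `aggregate`) are illustrative, not Hive's actual stats publisher/aggregator API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Each task publishes its partial stat to its own file, so no shared counter
// (and none of the MR counter name/count limits) is involved; the client
// aggregates by listing the stats directory after the job finishes.
public class FsStats {
    public static void publish(Path statsDir, String taskId, long rowCount)
            throws IOException {
        Files.createDirectories(statsDir);
        // One file per task attempt avoids any write contention.
        Files.write(statsDir.resolve(taskId), Long.toString(rowCount).getBytes());
    }

    public static long aggregate(Path statsDir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(statsDir)) {
            for (Path f : files) {
                total += Long.parseLong(new String(Files.readAllBytes(f)));
            }
        }
        return total;
    }
}
```

Because the per-task files are plain data, the scheme scales with the filesystem rather than with AM/history-server memory, which is the point of the JIRA.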





[jira] [Created] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]

2014-10-06 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HIVE-8362:
-

 Summary: Investigate flaky test parallel.q [Spark Branch]
 Key: HIVE-8362
 URL: https://issues.apache.org/jira/browse/HIVE-8362
 Project: Hive
  Issue Type: Sub-task
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


Test parallel.q is flaky. It sometimes fails with an error like:

{noformat}
Failed tests: 
  TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected 
exception junit.framework.AssertionFailedError: Client Execution results failed 
with error code = 1
See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or 
check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ 
for specific test cases logs.
{noformat}





[jira] [Updated] (HIVE-8168) With dynamic partition enabled fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality using physical plan generation)

2014-10-06 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-8168:
-
Attachment: HIVE-8168.4.patch

Addressed [~mmokhtar]'s review comments.

 With dynamic partition enabled fact table selectivity is not taken into 
 account when generating the physical plan (Use CBO cardinality using physical 
 plan generation)
 --

 Key: HIVE-8168
 URL: https://issues.apache.org/jira/browse/HIVE-8168
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
Priority: Critical
  Labels: performance
 Fix For: vectorization-branch, 0.14.0

 Attachments: HIVE-8168.1.patch, HIVE-8168.2.patch, HIVE-8168.3.patch, 
 HIVE-8168.4.patch


 When calculating estimated row counts and data sizes during physical plan 
 generation, StatsRulesProcFactory doesn't know that there will be dynamic 
 partition pruning, and it is hard to know how many partitions will qualify at 
 runtime. As a result, with dynamic partition pruning enabled, query 32 can 
 run with 570 tasks compared to 70 tasks with dynamic partition pruning disabled 
 and actual partition filters on the fact table.
 The long-term solution for this issue is to use the cardinality estimates 
 from CBO, as they take into account join selectivity and such. Estimates from 
 CBO won't address the number of tasks used for the partitioned table, but 
 they will address the incorrect number of tasks used for the consequent 
 reducers, where the majority of the slowdown comes from.
 Plan dynamic partition pruning on 
 {code}
Map 5 
 Map Operator Tree:
 TableScan
   alias: ss
   filterExpr: ss_store_sk is not null (type: boolean)
   Statistics: Num rows: 550076554 Data size: 47370018896 
 Basic stats: COMPLETE Column stats: NONE
   Filter Operator
 predicate: ss_store_sk is not null (type: boolean)
 Statistics: Num rows: 275038277 Data size: 23685009448 
 Basic stats: COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {ss_store_sk} {ss_net_profit}
 1 
   keys:
 0 ss_sold_date_sk (type: int)
 1 d_date_sk (type: int)
   outputColumnNames: _col6, _col21
   input vertices:
 1 Map 1
   Statistics: Num rows: 302542112 Data size: 26053511168 
 Basic stats: COMPLETE Column stats: NONE
   Map Join Operator
 condition map:
  Inner Join 0 to 1
 condition expressions:
   0 {_col21}
   1 {s_county} {s_state}
 keys:
   0 _col6 (type: int)
   1 s_store_sk (type: int)
 outputColumnNames: _col21, _col80, _col81
 input vertices:
   1 Map 2
 Statistics: Num rows: 332796320 Data size: 
 28658862080 Basic stats: COMPLETE Column stats: NONE
 Map Join Operator
   condition map:
Left Semi Join 0 to 1
   condition expressions:
 0 {_col21} {_col80} {_col81}
 1 
   keys:
 0 _col81 (type: string)
 1 _col0 (type: string)
   outputColumnNames: _col21, _col80, _col81
   input vertices:
 1 Reducer 11
   Statistics: Num rows: 366075968 Data size: 
 31524749312 Basic stats: COMPLETE Column stats: NONE
   Select Operator
 expressions: _col81 (type: string), _col80 (type: 
 string), _col21 (type: float)
 outputColumnNames: _col81, _col80, _col21
 Statistics: Num rows: 366075968 Data size: 
 31524749312 Basic stats: COMPLETE Column stats: NONE
 Group By Operator
   aggregations: sum(_col21)
   keys: _col81 (type: string), _col80 (type: 
 string), '0' 

[jira] [Commented] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160716#comment-14160716
 ] 

Jimmy Xiang commented on HIVE-8352:
---

parallel.q is sometimes OK for me locally. Filed HIVE-8362 to look into the 
failure.

 Enable windowing.q for spark
 

 Key: HIVE-8352
 URL: https://issues.apache.org/jira/browse/HIVE-8352
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
 hive-8385.patch


 We should enable windowing.q for basic windowing coverage. After checking out 
 the spark branch, we would build:
 {noformat}
 $ mvn clean install -DskipTests -Phadoop-2
 $ cd itests/
 $ mvn clean install -DskipTests -Phadoop-2
 {noformat}
 Then generate the windowing.q.out file:
 {noformat}
 $ cd qtest-spark/
 $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
 -Dtest.output.overwrite=true
 {noformat}
 Compare the output against MapReduce:
 {noformat}
 $ diff -y -W 150 
 ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
 ../../ql/src/test/results/clientpositive/windowing.q.out| less
 {noformat}
 And if everything looks good, add it to {{spark.query.files}} in 
 {{./itests/src/test/resources/testconfiguration.properties}}
 then submit the patch including the .q file





[jira] [Created] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1

2014-10-06 Thread Szehon Ho (JIRA)
Szehon Ho created HIVE-8363:
---

 Summary: AccumuloStorageHandler compile failure hadoop-1
 Key: HIVE-8363
 URL: https://issues.apache.org/jira/browse/HIVE-8363
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.14.0
Reporter: Szehon Ho
Priority: Blocker


AccumuloStorageHandler fails to compile on hadoop-1.  It seems the signature 
of split() is not the same there.  Looks like we should use another utility 
method to fix this.

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hive-accumulo-handler: Compilation failure
[ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52]
 no suitable method found for split(java.lang.String,char)
[ERROR] method 
org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not 
applicable
{code}





[jira] [Created] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-8364:


 Summary: We're not waiting for all inputs in MapRecordProcessor on 
Tez
 Key: HIVE-8364
 URL: https://issues.apache.org/jira/browse/HIVE-8364
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: 0.14.0


Seems like this could be a race condition: We're blocking for some inputs to 
become available, but the main MR input is just assumed ready...





[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8364:
-
Attachment: HIVE-8364.1.patch

Proposed patch.

 We're not waiting for all inputs in MapRecordProcessor on Tez
 -

 Key: HIVE-8364
 URL: https://issues.apache.org/jira/browse/HIVE-8364
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: HIVE-8364.1.patch


 Seems like this could be a race condition: We're blocking for some inputs to 
 become available, but the main MR input is just assumed ready...





[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp

2014-10-06 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160729#comment-14160729
 ] 

Gopal V commented on HIVE-8292:
---

[~mmokhtar]: Probably better to just read exec context off 
mapOp.getExecContext().

 Reading from partitioned bucketed tables has high overhead in 
 MapOperator.cleanUpInputFileChangedOp
 ---

 Key: HIVE-8292
 URL: https://issues.apache.org/jira/browse/HIVE-8292
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
 Environment: cn105
Reporter: Mostafa Mokhtar
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch


 Reading from bucketed partitioned tables has significantly higher overhead 
 compared to non-bucketed non-partitioned files.
 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
 5% of the CPU in 
 {code}
  Path onepath = normalizePath(onefile);
 {code}
 and 45% of the CPU in 
 {code}
  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
 {code}
 From the profiler 
 {code}
 Stack Trace   Sample CountPercentage(%)
 hive.ql.exec.tez.MapRecordSource.processRow(Object)   5,327   62.348
hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326   62.336
   hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851   56.777
  hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849   56.753
  java.net.URI.relativize(URI) 3,903   45.681
 java.net.URI.relativize(URI, URI) 3,903   
 45.681
java.net.URI.normalize(String) 2,169   
 25.386
java.net.URI.equal(String, String) 
 526 6.156
java.net.URI.equalIgnoringCase(String, 
 String) 1   0.012
java.lang.String.substring(int)
 1   0.012
 hive.ql.exec.MapOperator.normalizePath(String)506 5.922
 org.apache.commons.logging.impl.Log4JLogger.info(Object)  32  
 0.375
  java.net.URI.equals(Object)  12  0.14
  java.util.HashMap$KeySet.iterator()  5   
 0.059
  java.util.HashMap.get(Object)4   
 0.047
  java.util.LinkedHashMap.get(Object)  3   
 0.035
  hive.ql.exec.Operator.cleanUpInputFileChanged()  1   0.012
   hive.ql.exec.Operator.forward(Object, ObjectInspector)  473 5.536
   hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1   0.012
 {code}
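The prefix check that dominates this profile can be sketched with plain java.net.URI. A minimal illustration (the class and method names below are illustrative, not Hive's actual code): relativize() returns its argument unchanged when the base is not a prefix, which is the equality trick the profiled code relies on, and caching the normalized alias URIs avoids re-running normalizePath on every input-file change.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class PathPrefixCheck {
    // Cache of normalized alias URIs: normalizePath() shows up at ~6% CPU in
    // the profile, so normalize each alias once instead of per file change.
    private static final Map<String, URI> cache = new HashMap<>();

    static URI normalized(String path) {
        return cache.computeIfAbsent(path, p -> URI.create(p).normalize());
    }

    // relativize() returns 'file' unchanged when 'base' is not a prefix of
    // it; inequality therefore means 'file' lives under the alias path.
    static boolean isUnderPrefix(URI base, URI file) {
        return !base.relativize(file).equals(file);
    }

    public static void main(String[] args) {
        URI alias = normalized("hdfs://nn:8020/warehouse/t1/");
        System.out.println(isUnderPrefix(alias,
                normalized("hdfs://nn:8020/warehouse/t1/ds=1/bucket_0"))); // true
        System.out.println(isUnderPrefix(alias,
                normalized("hdfs://nn:8020/warehouse/t2/ds=1/bucket_0"))); // false
    }
}
```

The relativize/normalize pair is what the profiler shows as the hot path; caching only the alias side helps because the file path changes on every split while the alias set is fixed.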





[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8364:
-
Status: Patch Available  (was: Open)

 We're not waiting for all inputs in MapRecordProcessor on Tez
 -

 Key: HIVE-8364
 URL: https://issues.apache.org/jira/browse/HIVE-8364
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: HIVE-8364.1.patch


 Seems like this could be a race condition: We're blocking for some inputs to 
 become available, but the main MR input is just assumed ready...





[jira] [Updated] (HIVE-8227) NPE w/ hive on tez when doing unions on empty tables

2014-10-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8227:
-
Fix Version/s: 0.14.0

 NPE w/ hive on tez when doing unions on empty tables
 

 Key: HIVE-8227
 URL: https://issues.apache.org/jira/browse/HIVE-8227
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.14.0

 Attachments: HIVE-8227.1.patch, HIVE-8227.2.patch


 We're looking at aliasToWork.values() to determine input paths etc. This can 
 contain nulls when we're scanning empty tables.





[jira] [Commented] (HIVE-8272) Query with particular decimal expression causes NPE during execution initialization

2014-10-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160762#comment-14160762
 ] 

Ashutosh Chauhan commented on HIVE-8272:


+1

 Query with particular decimal expression causes NPE during execution 
 initialization
 ---

 Key: HIVE-8272
 URL: https://issues.apache.org/jira/browse/HIVE-8272
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer, Physical Optimizer
Reporter: Matt McCline
Assignee: Jason Dere
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8272.1.patch


 Query:
 {code}
 select 
   cast(sum(dc)*100 as decimal(11,3)) as c1
   from somedecimaltable
   order by c1
   limit 100;
 {code}
 Fails during execution initialization due to *null* ExprNodeDesc.
 Noticed while trying to simplify a Vectorization issue and realized it was a 
 more general issue.
 {code}
 Caused by: java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
   ... 22 more
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:215)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:427)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
   ... 22 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
   at 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
   at 
 org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:148)
   ... 38 more
 {code}





[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.

2014-10-06 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8258:
-
Status: Open  (was: Patch Available)

Found an issue where this patch prevents the initiator from starting properly.

 Compactor cleaners can be starved on a busy table or partition.
 ---

 Key: HIVE-8258
 URL: https://issues.apache.org/jira/browse/HIVE-8258
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.13.1
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, 
 HIVE-8258.patch


 Currently the cleaning thread in the compactor does not run on a table or 
 partition while any locks are held on this partition.  This leaves it open to 
 starvation in the case of a busy table or partition.  It only needs to wait 
 until all locks on the table/partition at the time of the compaction have 
 expired.  Any jobs initiated after that (and thus any locks obtained) will be 
 for the new versions of the files.





[jira] [Created] (HIVE-8365) TPCDS query #7 fails with IndexOutOfBoundsException [Spark Branch]

2014-10-06 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-8365:
-

 Summary: TPCDS query #7 fails with IndexOutOfBoundsException 
[Spark Branch]
 Key: HIVE-8365
 URL: https://issues.apache.org/jira/browse/HIVE-8365
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang


Running TPCDS query #17, given below, results in an IndexOutOfBoundsException: 
{code}
14/10/06 12:24:05 ERROR executor.Executor: Exception in task 0.0 in stage 7.0 
(TID 2)
java.lang.IndexOutOfBoundsException: Index: 1902425, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at 
org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:820)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
at 
org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:51)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:114)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:139)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:92)
at 
scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:210)
at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
{code}

The query is:
{code}
select
  i_item_id,
  avg(ss_quantity) agg1,
  avg(ss_list_price) agg2,
  avg(ss_coupon_amt) agg3,
  avg(ss_sales_price) agg4
from
  store_sales,
  customer_demographics,
  date_dim,
  item,
  promotion
where
  ss_sold_date_sk = d_date_sk
  and ss_item_sk = i_item_sk
  and ss_cdemo_sk = cd_demo_sk
  and ss_promo_sk = p_promo_sk
  and cd_gender = 'F'
  and cd_marital_status = 'W'
  and cd_education_status = 'Primary'
  and (p_channel_email = 'N'
or p_channel_event = 'N')
  and d_year = 1998
  and ss_sold_date_sk between 2450815 and 2451179 -- partition key filter
group by
  i_item_id
order by
  i_item_id
limit 100;
{code}
Many other TPCDS queries give the same exception.





[jira] [Created] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8366:
--

 Summary: CBO fails if there is a table sample in subquery
 Key: HIVE-8366
 URL: https://issues.apache.org/jira/browse/HIVE-8366
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8366.patch

Bail out from cbo in such cases.





[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8366:
---
Attachment: HIVE-8366.patch

 CBO fails if there is a table sample in subquery
 

 Key: HIVE-8366
 URL: https://issues.apache.org/jira/browse/HIVE-8366
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8366.patch


 Bail out from cbo in such cases.





[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery

2014-10-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8366:
---
Status: Patch Available  (was: Open)

 CBO fails if there is a table sample in subquery
 

 Key: HIVE-8366
 URL: https://issues.apache.org/jira/browse/HIVE-8366
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: 0.14.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8366.patch


 Bail out from cbo in such cases.





[jira] [Commented] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160813#comment-14160813
 ] 

Lefty Leverenz commented on HIVE-6500:
--

Good catch, [~szehon].  Yes, the Newly Created Tables section of the StatsDev 
wikidoc needs to be updated, keeping in mind that releases 0.7 through 0.12 have 
jdbc:derby as the default for *hive.stats.dbclass*, so we can't just swap in 
the new default value.  Linking to/from *hive.stats.dbclass* in the 
Configuration Properties doc will help with future maintenance.

* [StatsDev -- Newly Created Tables | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-NewlyCreatedTables]
* [Configuration Properties -- hive.stats.dbclass | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass]

Also, the HiveConf.java description of *hive.stats.dbclass* omits the "fs" 
value.  I can correct that in the next patch for HIVE-6586, perhaps using the 
wiki description or a variant of it:

{quote}
The storage that stores temporary Hive statistics. In FS based statistics 
collection, each task writes statistics it has collected in a file on the 
filesystem, which will be aggregated after the job has finished. Supported 
values are fs (filesystem), jdbc(:.*), hbase, counter and custom (HIVE-6500).
{quote}

Suggested changes to that description:  (1) change "FS" to "filesystem (fs)", 
(2) remove or move "(HIVE-6500)" so it doesn't imply that HIVE-6500 added 
"custom", (3) change "jdbc(:.*)" to "jdbc:database" and explain that 
"database" can be derby, mysql, ... and what others -- is there a complete 
list anywhere?

P.S.  What do you mean by "It is actually not linked from the top"?  Top of 
what?  Maybe you mean it belongs on the Home page.  Currently it's listed on 
the LanguageManual page, but that's easy to change -- we can even list it in 
both places.
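For reference, switching the temporary stats store to the filesystem implementation discussed above is a single hive-site.xml setting; a minimal sketch (the property name and the "fs" value are the ones documented here, the comment wording is mine):

{code:xml}
<!-- hive-site.xml: collect temporary statistics via the filesystem ("fs")
     rather than counters or a JDBC-backed store; releases 0.7-0.12 defaulted
     this to jdbc:derby. -->
<property>
  <name>hive.stats.dbclass</name>
  <value>fs</value>
</property>
{code}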

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Labels: TODOC14
 Fix For: 0.13.0

 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632].  Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values 
 implies increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
 Hive not make use of the framework's counter features, so that we can 
 evolve this feature without relying on support from the framework.  Filesystem 
 based stats collection is a step in that direction.





Review Request 26379: Disable cbo for tablesample

2014-10-06 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26379/
---

Review request for hive and John Pullokkaran.


Bugs: HIVE-8366
https://issues.apache.org/jira/browse/HIVE-8366


Repository: hive-git


Description
---

Disable cbo for tablesample


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
7c2b0cd 

Diff: https://reviews.apache.org/r/26379/diff/


Testing
---

udf_substr.q


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-8120) Umbrella JIRA tracking Parquet improvements

2014-10-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160823#comment-14160823
 ] 

Brock Noland commented on HIVE-8120:


Linking to HIVE-4329

 Umbrella JIRA tracking Parquet improvements
 ---

 Key: HIVE-8120
 URL: https://issues.apache.org/jira/browse/HIVE-8120
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland







[jira] [Updated] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-6500:
-
Labels: TODOC13 TODOC14  (was: TODOC14)

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Labels: TODOC13, TODOC14
 Fix For: 0.13.0

 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632].  Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values 
 implies increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
 Hive not make use of the framework's counter features, so that we can 
 evolve this feature without relying on support from the framework.  Filesystem 
 based stats collection is a step in that direction.





[jira] [Updated] (HIVE-7800) Parquet Column Index Access Schema Size Checking

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7800:
---
   Resolution: Fixed
Fix Version/s: (was: 0.14.0)
   0.15.0
   Status: Resolved  (was: Patch Available)

Thank you so much Daniel! I have committed this to trunk.

[~vikram.dixit] could we get this into 0.14?

 Parquet Column Index Access Schema Size Checking
 

 Key: HIVE-7800
 URL: https://issues.apache.org/jira/browse/HIVE-7800
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Daniel Weeks
Assignee: Daniel Weeks
Priority: Critical
 Fix For: 0.15.0

 Attachments: HIVE-7800.1.patch, HIVE-7800.2.patch, HIVE-7800.3.patch


 In the case that a parquet formatted table has partitions where the files 
 have different size schema, using column index access can result in an index 
 out of bounds exception.





[jira] [Comment Edited] (HIVE-6500) Stats collection via filesystem

2014-10-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003107#comment-14003107
 ] 

Lefty Leverenz edited comment on HIVE-6500 at 10/6/14 8:02 PM:
---

Unfortunately my review board advice not to patch hive-default.xml.template led 
to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in 
the template file.  But it's updated in the most recent patch for HIVE-6037, so 
presumably it will be corrected by release 0.14.0.

Sorry about that.

Edit:  The updated parameter description didn't make it into the new version of 
HiveConf.java, so it needs to be fixed in another patch.  (I suggest HIVE-6586.)


was (Author: le...@hortonworks.com):
Unfortunately my review board advice not to patch hive-default.xml.template led 
to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in 
the template file.  But it's updated in the most recent patch for HIVE-6037, so 
presumably it will be corrected by release 0.14.0.

Sorry about that.

 Stats collection via filesystem
 ---

 Key: HIVE-6500
 URL: https://issues.apache.org/jira/browse/HIVE-6500
 Project: Hive
  Issue Type: New Feature
  Components: Statistics
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Labels: TODOC13, TODOC14
 Fix For: 0.13.0

 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch


 Recently, support for stats gathering via counters was [added | 
 https://issues.apache.org/jira/browse/HIVE-4632].  Although it's useful, it has 
 the following issues:
 * [Length of counter group name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340]
 * [Length of counter name is limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337]
 * [Number of distinct counter groups are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343]
 * [Number of distinct counters are limited | 
 https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334]
 Although these limits are configurable, setting them to higher values 
 implies increased memory load on the AM and the job history server.
 Whether these limits make sense is [debatable | 
 https://issues.apache.org/jira/browse/MAPREDUCE-5680], but it is desirable that 
 Hive not make use of the framework's counter features, so that we can 
 evolve this feature without relying on support from the framework.  Filesystem 
 based stats collection is a step in that direction.





[jira] [Commented] (HIVE-8361) NPE in PTFOperator when there are empty partitions

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160851#comment-14160851
 ] 

Mostafa Mokhtar commented on HIVE-8361:
---

[~rhbutani]
Validated the fix on query98 and it ran fine.

 NPE in PTFOperator when there are empty partitions
 --

 Key: HIVE-8361
 URL: https://issues.apache.org/jira/browse/HIVE-8361
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-8361.1.patch


 Here is a simple query to reproduce this:
 {code}
 select sum(p_size) over (partition by p_mfgr )
 from part where p_mfgr = 'some non existent mfgr';
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8352) Enable windowing.q for spark

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8352:
---
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

 Enable windowing.q for spark
 

 Key: HIVE-8352
 URL: https://issues.apache.org/jira/browse/HIVE-8352
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: spark-branch

 Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
 hive-8385.patch


 We should enable windowing.q for basic windowing coverage. After checking out 
 the spark branch, we would build:
 {noformat}
 $ mvn clean install -DskipTests -Phadoop-2
 $ cd itests/
 $ mvn clean install -DskipTests -Phadoop-2
 {noformat}
 Then generate the windowing.q.out file:
 {noformat}
 $ cd qtest-spark/
 $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
 -Dtest.output.overwrite=true
 {noformat}
 Compare the output against MapReduce:
 {noformat}
 $ diff -y -W 150 
 ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
 ../../ql/src/test/results/clientpositive/windowing.q.out| less
 {noformat}
 And if everything looks good, add it to {{spark.query.files}} in 
 {{./itests/src/test/resources/testconfiguration.properties}}
 then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26325: HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26325/#review55571
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 3, 2014, 7:13 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26325/
 ---
 
 (Updated Oct. 3, 2014, 7:13 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-8172
 https://issues.apache.org/jira/browse/HIVE-8172
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-8172
 
 
 Diffs
 -
 
   jdbc/src/java/org/apache/hive/jdbc/Utils.java e6b1a36 
   jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java 06795a5 
 
 Diff: https://reviews.apache.org/r/26325/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160863#comment-14160863
 ] 

Thejas M Nair commented on HIVE-8172:
-

+1

 HiveServer2 dynamic service discovery should let the JDBC client use default 
 ZooKeeper namespace
 

 Key: HIVE-8172
 URL: https://issues.apache.org/jira/browse/HIVE-8172
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Critical
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8172.1.patch


 Currently the client provides a url like:
  
 jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
  
 The zooKeeperNamespace param when not provided should use the default value.
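 A minimal sketch of the intended fallback (an illustrative helper, not the actual 
 Utils.java code; the default value here mirrors the default of 
 hive.server2.zookeeper.namespace):

```java
import java.util.HashMap;
import java.util.Map;

public class ZkNamespaceDefault {
    static final String DEFAULT_NAMESPACE = "hiveserver2";

    /** Fall back to the default namespace when the URL param is absent or empty. */
    static String resolveNamespace(Map<String, String> sessionVars) {
        String ns = sessionVars.get("zooKeeperNamespace");
        return (ns == null || ns.isEmpty()) ? DEFAULT_NAMESPACE : ns;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("serviceDiscoveryMode", "zooKeeper");   // namespace omitted by client
        System.out.println(resolveNamespace(vars));      // falls back to the default
    }
}
```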



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8335) TestHCatLoader/TestHCatStorer failures on pre-commit tests

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere resolved HIVE-8335.
--
   Resolution: Fixed
Fix Version/s: 0.14.0
 Assignee: Gopal V

Issue was resolved by Gopal reverting HIVE-8271.

 TestHCatLoader/TestHCatStorer failures on pre-commit tests
 --

 Key: HIVE-8335
 URL: https://issues.apache.org/jira/browse/HIVE-8335
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Tests
Reporter: Jason Dere
Assignee: Gopal V
 Fix For: 0.14.0


 Looks like a number of Hive pre-commit tests have been failing with the 
 following failures:
 {noformat}
 org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes[5]
 org.apache.hive.hcatalog.pig.TestHCatStorer.testNoAlias[5]
 org.apache.hive.hcatalog.pig.TestHCatStorer.testEmptyStore[5]
 org.apache.hive.hcatalog.pig.TestHCatStorer.testDynamicPartitioningMultiPartColsNoDataInDataNoSpec[5]
 org.apache.hive.hcatalog.pig.TestHCatStorer.testPartitionPublish[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic[5]
 org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic[5]
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26277: Shim KerberosName (causes build failure on hadoop-1)

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26277/#review55572
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 3, 2014, 6:39 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26277/
 ---
 
 (Updated Oct. 3, 2014, 6:39 p.m.)
 
 
 Review request for hive, dilli dorai, Szehon Ho, and Thejas Nair.
 
 
 Bugs: HIVE-8324
 https://issues.apache.org/jira/browse/HIVE-8324
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-8324
 
 
 Diffs
 -
 
   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 83dd2e6 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 
 312d05e 
   shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 
 a353a46 
   shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
 030cb75 
   shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
 0731108 
   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 
 4fcaa1e 
 
 Diff: https://reviews.apache.org/r/26277/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8321:
-
Attachment: HIVE-8321.3.patch

Looks like HCat tests were failing due to HIVE-8335.  Re-attaching same patch.

 Fix serialization of TypeInfo for qualified types
 -

 Key: HIVE-8321
 URL: https://issues.apache.org/jira/browse/HIVE-8321
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch


 TypeInfos for decimal/char/varchar don't appear to be serializing properly 
 with javaXML.
 Decimal needed proper getters/setters for precision/scale.
 Also disabling setTypeInfo since for decimal/char/varchar the proper type 
 name should already be set by the constructor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8352) Enable windowing.q for spark [Spark Branch]

2014-10-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8352:
---
Summary: Enable windowing.q for spark [Spark Branch]  (was: Enable 
windowing.q for spark)

 Enable windowing.q for spark [Spark Branch]
 ---

 Key: HIVE-8352
 URL: https://issues.apache.org/jira/browse/HIVE-8352
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Brock Noland
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: spark-branch

 Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, 
 hive-8385.patch


 We should enable windowing.q for basic windowing coverage. After checking out 
 the spark branch, we would build:
 {noformat}
 $ mvn clean install -DskipTests -Phadoop-2
 $ cd itests/
 $ mvn clean install -DskipTests -Phadoop-2
 {noformat}
 Then generate the windowing.q.out file:
 {noformat}
 $ cd qtest-spark/
 $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 
 -Dtest.output.overwrite=true
 {noformat}
 Compare the output against MapReduce:
 {noformat}
 $ diff -y -W 150 
 ../../ql/src/test/results/clientpositive/spark/windowing.q.out 
 ../../ql/src/test/results/clientpositive/windowing.q.out| less
 {noformat}
 And if everything looks good, add it to {{spark.query.files}} in 
 {{./itests/src/test/resources/testconfiguration.properties}}
 then submit the patch including the .q file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types

2014-10-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-8321:
-
Status: Patch Available  (was: Open)

 Fix serialization of TypeInfo for qualified types
 -

 Key: HIVE-8321
 URL: https://issues.apache.org/jira/browse/HIVE-8321
 Project: Hive
  Issue Type: Bug
  Components: Types
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch


 TypeInfos for decimal/char/varchar don't appear to be serializing properly 
 with javaXML.
 Decimal needed proper getters/setters for precision/scale.
 Also disabling setTypeInfo since for decimal/char/varchar the proper type 
 name should already be set by the constructor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8358) Constant folding should happen before predicate pushdown

2014-10-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160884#comment-14160884
 ] 

Hive QA commented on HIVE-8358:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12673119/HIVE-8358.patch

{color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 6525 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch_threshold
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform_ppr2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1133/


[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler

2014-10-06 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160899#comment-14160899
 ] 

Szehon Ho commented on HIVE-7068:
-

This breaks hadoop-1 compilation; [~elserj], would you have a chance to look at 
this? Filed HIVE-8363: a reference to a StringUtils method whose signature changed.

 Integrate AccumuloStorageHandler
 

 Key: HIVE-7068
 URL: https://issues.apache.org/jira/browse/HIVE-7068
 Project: Hive
  Issue Type: New Feature
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 0.14.0

 Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, 
 HIVE-7068.4.patch


 [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to 
 HBase. Some [initial 
 work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done 
 to support querying an Accumulo table using Hive already. It is not a 
 complete solution as, most notably, the current implementation presently 
 lacks support for INSERTs.
 I would like to polish up the AccumuloStorageHandler (presently based on 
 0.10), implement missing basic functionality and compare it to the 
 HBaseStorageHandler (to ensure that we follow the same general usage 
 patterns).
 I've also been in communication with [~bfem] (the initial author) who 
 expressed interest in working on this again. I hope to coordinate efforts 
 with him.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1

2014-10-06 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser reassigned HIVE-8363:


Assignee: Josh Elser

 AccumuloStorageHandler compile failure hadoop-1
 ---

 Key: HIVE-8363
 URL: https://issues.apache.org/jira/browse/HIVE-8363
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Affects Versions: 0.14.0
Reporter: Szehon Ho
Assignee: Josh Elser
Priority: Blocker

 There's an error about AccumuloStorageHandler compiling on hadoop-1.  It 
 seems the signature of split() is not the same.  Looks like we should use 
 another utility to fix this.
 {code}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) 
 on project hive-accumulo-handler: Compilation failure
 [ERROR] 
 /data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52]
  no suitable method found for split(java.lang.String,char)
 [ERROR] method 
 org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not 
 applicable
 {code}
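 One portable option is a small JDK-only helper instead of the two-argument 
 StringUtils.split; a hedged sketch (the actual patch may instead call the 
 three-argument split(escapeChar, str, separator) available on both Hadoop lines):

```java
import java.util.ArrayList;
import java.util.List;

public class PortableSplit {
    /** JDK-only stand-in for the two-argument StringUtils.split(String, char). */
    static String[] split(String str, char separator) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == separator) {
                parts.add(str.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(str.substring(start));   // trailing segment after the last separator
        return parts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Split an Accumulo column mapping the way ColumnMapper needs to.
        for (String s : split(":rowID,cf:name,cf:age", ',')) {
            System.out.println(s);
        }
    }
}
```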



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler

2014-10-06 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160903#comment-14160903
 ] 

Josh Elser commented on HIVE-7068:
--

[~szehon], yeah, I can get a patch up there today.

 Integrate AccumuloStorageHandler
 

 Key: HIVE-7068
 URL: https://issues.apache.org/jira/browse/HIVE-7068
 Project: Hive
  Issue Type: New Feature
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 0.14.0

 Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, 
 HIVE-7068.4.patch


 [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to 
 HBase. Some [initial 
 work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done 
 to support querying an Accumulo table using Hive already. It is not a 
 complete solution as, most notably, the current implementation presently 
 lacks support for INSERTs.
 I would like to polish up the AccumuloStorageHandler (presently based on 
 0.10), implement missing basic functionality and compare it to the 
 HBaseStorageHandler (to ensure that we follow the same general usage 
 patterns).
 I've also been in communication with [~bfem] (the initial author) who 
 expressed interest in working on this again. I hope to coordinate efforts 
 with him.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26379: Disable cbo for tablesample

2014-10-06 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26379/#review55577
---

Ship it!


Ship It!

- John Pullokkaran


On Oct. 6, 2014, 7:50 p.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26379/
 ---
 
 (Updated Oct. 6, 2014, 7:50 p.m.)
 
 
 Review request for hive and John Pullokkaran.
 
 
 Bugs: HIVE-8366
 https://issues.apache.org/jira/browse/HIVE-8366
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Disable cbo for tablesample
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 
 7c2b0cd 
 
 Diff: https://reviews.apache.org/r/26379/diff/
 
 
 Testing
 ---
 
 udf_substr.q
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Created] (HIVE-8367) delete writes records in wrong order in some cases

2014-10-06 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8367:


 Summary: delete writes records in wrong order in some cases
 Key: HIVE-8367
 URL: https://issues.apache.org/jira/browse/HIVE-8367
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Blocker
 Fix For: 0.14.0


I have found one query with 10k records where you do:
create table
insert into table -- 10k records
delete from table -- just some records

The records in the delete delta are not ordered properly by rowid.

I assume this applies to updates as well, but I haven't tested it yet.
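For reference, ACID readers merge events by (originalTransactionId, bucket, rowId), 
so the delete delta must be written sorted on that key; a sketch of the required 
ordering, with an illustrative RecordId type rather than the actual ACID classes:

```java
import java.util.Arrays;
import java.util.Comparator;

public class DeleteDeltaOrder {
    static class RecordId {
        final long origTxnId; final int bucket; final long rowId;
        RecordId(long t, int b, long r) { origTxnId = t; bucket = b; rowId = r; }
        public String toString() { return origTxnId + "/" + bucket + "/" + rowId; }
    }

    // The sort key ACID readers expect when merging base and delta files.
    static final Comparator<RecordId> ACID_ORDER =
        Comparator.comparingLong((RecordId r) -> r.origTxnId)
                  .thenComparingInt(r -> r.bucket)
                  .thenComparingLong(r -> r.rowId);

    public static void main(String[] args) {
        RecordId[] deletes = {
            new RecordId(5, 0, 9), new RecordId(5, 0, 2), new RecordId(4, 1, 7) };
        Arrays.sort(deletes, ACID_ORDER);        // the writer must emit in this order
        System.out.println(Arrays.toString(deletes));
    }
}
```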



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 26282: Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26282/#review55581
---

Ship it!


Ship It!

- Thejas Nair


On Oct. 2, 2014, 9 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/26282/
 ---
 
 (Updated Oct. 2, 2014, 9 p.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-8193
 https://issues.apache.org/jira/browse/HIVE-8193
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 https://issues.apache.org/jira/browse/HIVE-8193
 
 
 Diffs
 -
 
   service/src/java/org/apache/hive/service/cli/CLIService.java b46c5b4 
   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
 ecc9b96 
   
 service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java
  9ee9785 
   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
 4a1e004 
   service/src/java/org/apache/hive/service/server/HiveServer2.java c667533 
   service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java 
 fb784aa 
   
 service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java
  47d3a56 
 
 Diff: https://reviews.apache.org/r/26282/diff/
 
 
 Testing
 ---
 
 Manually with ZooKeeper.
 
 
 Thanks,
 
 Vaibhav Gumashta
 




[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160935#comment-14160935
 ] 

Thejas M Nair commented on HIVE-8193:
-

+1

 Hook HiveServer2 dynamic service discovery with session time out
 

 Key: HIVE-8193
 URL: https://issues.apache.org/jira/browse/HIVE-8193
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8193.1.patch


 For dynamic service discovery, if the HiveServer2 instance is removed from 
 ZooKeeper, currently, on the last client close, the server shuts down. 
 However, we need to ensure that this also happens when a session is closed on 
 timeout and no sessions remain open on this instance of HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8368) compactor is improperly writing delete records in base file

2014-10-06 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8368:


 Summary: compactor is improperly writing delete records in base 
file
 Key: HIVE-8368
 URL: https://issues.apache.org/jira/browse/HIVE-8368
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0


When the compactor reads records from the base and deltas, it is not properly 
dropping delete records.  This leads to oversized base files, and possibly to 
wrong query results.
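A sketch of the invariant the compactor should maintain (types here are 
hypothetical, not the actual compactor classes): a row whose latest merged event is 
a DELETE must be filtered out when the new base is written:

```java
import java.util.ArrayList;
import java.util.List;

public class CompactorSketch {
    enum Op { INSERT, UPDATE, DELETE }

    static class Event {
        final long rowId; final Op op;
        Event(long rowId, Op op) { this.rowId = rowId; this.op = op; }
    }

    /** Keep only rows whose merged, latest event is not a delete. */
    static List<Long> writeBase(Event... mergedLatestEvents) {
        List<Long> base = new ArrayList<>();
        for (Event e : mergedLatestEvents) {
            if (e.op != Op.DELETE) {        // the filter the bug report says is missing
                base.add(e.rowId);
            }
        }
        return base;
    }

    public static void main(String[] args) {
        // Row 2 was deleted in a delta; it must not reach the compacted base.
        System.out.println(writeBase(
            new Event(1, Op.INSERT), new Event(2, Op.DELETE), new Event(3, Op.UPDATE)));
    }
}
```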



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8360) Add cross cluster support for webhcat E2E tests

2014-10-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160957#comment-14160957
 ] 

Thejas M Nair commented on HIVE-8360:
-

+1

 Add cross cluster support for webhcat E2E tests
 ---

 Key: HIVE-8360
 URL: https://issues.apache.org/jira/browse/HIVE-8360
 Project: Hive
  Issue Type: Test
  Components: Tests, WebHCat
 Environment: Secure cluster
Reporter: Aswathy Chellammal Sreekumar
 Attachments: AD-MIT.patch


 In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail 
 since the realm name for user principals is not included in the kinit 
 command. This patch concatenates the realm name to the user principal, thereby 
 resulting in a successful kinit.
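 The qualification step can be sketched as follows (the method name and the sample 
 principal/realm values are illustrative, not taken from the patch):

```java
public class PrincipalQualifier {
    /** Append the realm to a bare user principal so kinit gets user@REALM. */
    static String qualify(String userPrincipal, String realm) {
        // Leave already-qualified principals untouched.
        if (userPrincipal.contains("@")) {
            return userPrincipal;
        }
        return userPrincipal + "@" + realm;
    }

    public static void main(String[] args) {
        // kinit then receives e.g. "hrt_qa@EXAMPLE.COM" instead of a bare name.
        System.out.println(qualify("hrt_qa", "EXAMPLE.COM"));
        System.out.println(qualify("hrt_qa@OTHER.REALM", "EXAMPLE.COM"));
    }
}
```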



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]

2014-10-06 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160959#comment-14160959
 ] 

Chao commented on HIVE-8362:


Ran it several times - sometimes I got this diff:

{noformat}
--- a/ql/src/test/results/clientpositive/spark/parallel.q.out
+++ b/ql/src/test/results/clientpositive/spark/parallel.q.out
@@ -149,6 +149,7 @@ POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src
 POSTHOOK: Output: default@src_a
 POSTHOOK: Output: default@src_b
+POSTHOOK: Lineage: src_a.key SIMPLE [(src)src.FieldSchema(name:key, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_a.value SIMPLE [(src)src.FieldSchema(name:value, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_b.key SIMPLE [(src)src.FieldSchema(name:key, 
type:string, comment:default), ]
 POSTHOOK: Lineage: src_b.value SIMPLE [(src)src.FieldSchema(name:value, 
type:string, comment:default), ]
{noformat}

 Investigate flaky test parallel.q [Spark Branch]
 

 Key: HIVE-8362
 URL: https://issues.apache.org/jira/browse/HIVE-8362
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
  Labels: spark

 Test parallel.q is flaky. It fails sometimes with error like:
 {noformat}
 Failed tests: 
   TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected 
 exception junit.framework.AssertionFailedError: Client Execution results 
 failed with error code = 1
 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
 or check ./ql/target/surefire-reports or 
 ./itests/qtest/target/surefire-reports/ for specific test cases logs.
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-2828) make timestamp accessible in the hbase KeyValue

2014-10-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160981#comment-14160981
 ] 

Sushanth Sowmyan commented on HIVE-2828:


Sure, I'll try to look into this tonight.

 make timestamp accessible in the hbase KeyValue 
 

 Key: HIVE-2828
 URL: https://issues.apache.org/jira/browse/HIVE-2828
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.5.patch, HIVE-2828.6.patch.txt, 
 HIVE-2828.7.patch.txt, HIVE-2828.8.patch.txt


 Originated from HIVE-2781 and not accepted there, but I think this could be helpful 
 to someone.
 By using the special column notation ':timestamp' in HBASE_COLUMNS_MAPPING, a user 
 can access the timestamp value in the HBase KeyValue.
 {code}
 CREATE TABLE hbase_table (key int, value string, time timestamp)
   STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:string,:timestamp")
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8261) CBO : Predicate pushdown is removed by Optiq

2014-10-06 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160999#comment-14160999
 ] 

Harish Butani commented on HIVE-8261:
-

[~vikram.dixit] can we add this to the 0.14 branch?

 CBO : Predicate pushdown is removed by Optiq 
 -

 Key: HIVE-8261
 URL: https://issues.apache.org/jira/browse/HIVE-8261
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0, 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Harish Butani
 Fix For: 0.14.0

 Attachments: HIVE-8261.1.patch


 The plan for TPC-DS Q64 wasn't optimal; upon looking at the logical plan I 
 realized that predicate pushdown is not applied on date_dim d1.
 Interestingly, before Optiq we have the predicate pushed:
 {code}
 HiveFilterRel(condition=[=($5, $1)])
 HiveJoinRel(condition=[=($3, $6)], joinType=[inner])
   HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col2=[$3], 
 _o__col3=[$1])
 HiveFilterRel(condition=[=($0, 2000)])
   HiveAggregateRel(group=[{0, 1}], agg#0=[count()], agg#1=[sum($2)])
 HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2])
   HiveJoinRel(condition=[=($1, $8)], joinType=[inner])
 HiveJoinRel(condition=[=($1, $5)], joinType=[inner])
   HiveJoinRel(condition=[=($0, $3)], joinType=[inner])
 HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], 
 ss_wholesale_cost=[$11])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]])
 HiveProjectRel(d_date_sk=[$0], d_year=[$6])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]])
   HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 
 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), 
 between(false, $1, +(35, 1), +(35, 15)))])
 HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], 
 i_color=[$17])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]])
 HiveProjectRel(_o__col0=[$0])
   HiveAggregateRel(group=[{0}])
 HiveProjectRel($f0=[$0])
   HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], 
 joinType=[inner])
 HiveProjectRel(cs_item_sk=[$15], 
 cs_order_number=[$17])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]])
 HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]])
   HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col3=[$1])
 HiveFilterRel(condition=[=($0, +(2000, 1))])
   HiveAggregateRel(group=[{0, 1}], agg#0=[count()])
 HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2])
   HiveJoinRel(condition=[=($1, $8)], joinType=[inner])
 HiveJoinRel(condition=[=($1, $5)], joinType=[inner])
   HiveJoinRel(condition=[=($0, $3)], joinType=[inner])
 HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], 
 ss_wholesale_cost=[$11])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]])
 HiveProjectRel(d_date_sk=[$0], d_year=[$6])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]])
   HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 
 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), 
 between(false, $1, +(35, 1), +(35, 15)))])
 HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], 
 i_color=[$17])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]])
 HiveProjectRel(_o__col0=[$0])
   HiveAggregateRel(group=[{0}])
 HiveProjectRel($f0=[$0])
   HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], 
 joinType=[inner])
 HiveProjectRel(cs_item_sk=[$15], 
 cs_order_number=[$17])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]])
 HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16])
   
 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]])
 {code}
 After Optiq, by contrast, the filter on date_dim gets pulled up in the plan:
 {code}
   HiveFilterRel(condition=[=($5, $1)]): rowcount = 1.0, cumulative cost = 
 {5.50188454E8 rows, 0.0 cpu, 0.0 io}, id = 6895
 HiveProjectRel(_o__col0=[$0], _o__col1=[$1], _o__col2=[$2], 
 _o__col3=[$3], _o__col00=[$4], _o__col10=[$5], _o__col30=[$6]): 
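The pushdown behavior described in this issue can be sketched in miniature. The following is a hypothetical Python illustration, not Hive/Optiq code: the class and function names (Scan, Join, Filter, push_down) are invented for the example. It pushes a filter below a join whenever one join input supplies every column the predicate references, mirroring how the pre-Optiq plan keeps the d_year filter on the date_dim side instead of above the join.

```python
class Scan:
    """Leaf node: a table scan exposing a set of column names."""
    def __init__(self, table, columns):
        self.table, self.columns = table, columns

class Join:
    """Inner join; exposes the union of its inputs' columns."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    @property
    def columns(self):
        return self.left.columns | self.right.columns

class Filter:
    """Filter with the set of columns its predicate references."""
    def __init__(self, child, pred_columns, pred):
        self.child, self.pred_columns, self.pred = child, pred_columns, pred
    @property
    def columns(self):
        return self.child.columns

def push_down(node):
    """Push a Filter below a Join when one join side covers its columns."""
    if isinstance(node, Filter) and isinstance(node.child, Join):
        join = node.child
        if node.pred_columns <= join.left.columns:
            return Join(push_down(Filter(join.left, node.pred_columns,
                                         node.pred)), join.right)
        if node.pred_columns <= join.right.columns:
            return Join(join.left, push_down(Filter(join.right,
                                                    node.pred_columns,
                                                    node.pred)))
    return node

# Filter(d_year = 2000) above Join(store_sales, date_dim): the predicate
# only touches date_dim columns, so it lands on the date_dim scan.
plan = push_down(Filter(Join(Scan("store_sales", {"ss_sold_date_sk"}),
                             Scan("date_dim", {"d_date_sk", "d_year"})),
                        {"d_year"}, "d_year = 2000"))
```

A CBO that fails to fire this rule leaves the filter above the join, which is exactly the regression the plan comparison above shows for date_dim d1.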

[jira] [Commented] (HIVE-7914) Simplify join predicates for CBO to avoid cross products

2014-10-06 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14161000#comment-14161000
 ] 

Mostafa Mokhtar commented on HIVE-7914:
---

The issue still exists:
{code}
hive explain select avg(ss_quantity) ,avg(ss_ext_sales_price) 
,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store 
,customer_demographics ,household_demographics ,customer_address ,date_dim 
where store.s_store_sk = store_sales.ss_store_sk and 
store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 2001 
and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'M' and 
customer_demographics.cd_education_status = '4 yr Degree' and 
store_sales.ss_sales_price between 100.00 and 150.00 and 
household_demographics.hd_dep_count = 3 )or 
(store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'D' and 
customer_demographics.cd_education_status = 'Primary' and 
store_sales.ss_sales_price between 50.00 and 100.00 and 
household_demographics.hd_dep_count = 1 ) or 
(store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and 
customer_demographics.cd_demo_sk = ss_cdemo_sk and 
customer_demographics.cd_marital_status = 'U' and 
customer_demographics.cd_education_status = 'Advanced Degree' and 
store_sales.ss_sales_price between 150.00 and 200.00 and 
household_demographics.hd_dep_count = 1 )) and((store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('KY', 'GA', 'NM') and 
store_sales.ss_net_profit between 100 and 200 ) or (store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('MT', 'OR', 'IN') and 
store_sales.ss_net_profit between 150 and 300 ) or (store_sales.ss_addr_sk = 
customer_address.ca_address_sk and customer_address.ca_country = 'United 
States' and customer_address.ca_state in ('WI', 'MO', 'WV') and 
store_sales.ss_net_profit between 50 and 250 )) ;
Warning: Map Join MAPJOIN[49][bigTable=?] in task 'Map 4' is a cross product
Warning: Map Join MAPJOIN[48][bigTable=?] in task 'Map 4' is a cross product
Warning: Map Join MAPJOIN[47][bigTable=?] in task 'Map 4' is a cross product
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Tez
  Edges:
Map 4 - Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 
(BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE)
Reducer 5 - Map 4 (SIMPLE_EDGE)
  DagName: mmokhtar_20141006173232_992a372b-cc0e-40d5-b51f-7098561df464:3
  Vertices:
Map 1
Map Operator Tree:
TableScan
  alias: household_demographics
  Statistics: Num rows: 7200 Data size: 770400 Basic stats: 
COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 7200 Data size: 770400 Basic stats: 
COMPLETE Column stats: NONE
value expressions: hd_demo_sk (type: int), hd_dep_count 
(type: int)
Execution mode: vectorized
Map 2
Map Operator Tree:
TableScan
  alias: store
  filterExpr: s_store_sk is not null (type: boolean)
  Statistics: Num rows: 212 Data size: 405680 Basic stats: 
COMPLETE Column stats: NONE
  Filter Operator
predicate: s_store_sk is not null (type: boolean)
Statistics: Num rows: 106 Data size: 202840 Basic stats: 
COMPLETE Column stats: NONE
Reduce Output Operator
  key expressions: s_store_sk (type: int)
  sort order: +
  Map-reduce partition columns: s_store_sk (type: int)
  Statistics: Num rows: 106 Data size: 202840 Basic stats: 
COMPLETE Column stats: NONE
Execution mode: vectorized
Map 3
Map Operator Tree:
TableScan
  alias: customer_address
  Statistics: Num rows: 80 Data size: 811903688 Basic 
stats: COMPLETE Column stats: NONE
  Reduce Output Operator
sort order:
Statistics: Num rows: 80 Data size: 811903688 Basic 
stats: COMPLETE Column stats: NONE
value expressions: ca_address_sk (type: int), ca_state 
(type: string), ca_country (type: string)
Execution mode: vectorized
Map 4
Map Operator Tree:
TableScan
  alias: 
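The cross-product warnings above arise because the equi-join predicate ss_hdemo_sk = hd_demo_sk is buried inside every disjunct of the OR, so the planner never sees it as a join condition. A hypothetical Python sketch of the factoring this issue asks for; the function name and the predicate-as-string modeling are invented for illustration, and real simplification would operate on expression trees, not strings:

```python
def factor_common_conjuncts(disjuncts):
    """Rewrite (A and X1) or (A and X2) or ... as A and (X1 or X2 or ...).

    Each disjunct is modeled as a frozenset of conjunct strings; the
    returned pair is (common conjuncts A, residual disjuncts Xi).
    """
    common = frozenset.intersection(*disjuncts)
    residual = [d - common for d in disjuncts]
    return common, residual

# Abbreviated version of the predicate in the query above: every disjunct
# repeats the same join conjunct alongside its own selection conjuncts.
disjuncts = [
    frozenset({"ss_hdemo_sk = hd_demo_sk",
               "cd_marital_status = 'M'", "hd_dep_count = 3"}),
    frozenset({"ss_hdemo_sk = hd_demo_sk",
               "cd_marital_status = 'D'", "hd_dep_count = 1"}),
    frozenset({"ss_hdemo_sk = hd_demo_sk",
               "cd_marital_status = 'U'", "hd_dep_count = 1"}),
]
common, residual = factor_common_conjuncts(disjuncts)
# common now holds the join predicate, which the planner can use as an
# equi-join key instead of falling back to a cross product plus filter.
```

Once the common conjunct is hoisted, the join on household_demographics has a usable key and the broadcast cross products in Map 4 become ordinary map joins.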
