[jira] [Commented] (HIVE-6241) Remove direct reference of Hadoop23Shims in QTestUtil

2014-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942844#comment-13942844
 ] 

Ashutosh Chauhan commented on HIVE-6241:


+1

 Remove direct reference of Hadoop23Shims in QTestUtil
 

 Key: HIVE-6241
 URL: https://issues.apache.org/jira/browse/HIVE-6241
 Project: Hive
  Issue Type: Wish
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6241.1.patch.txt, HIVE-6241.2.patch.txt


 {code}
 if (clusterType == MiniClusterType.tez) {
   if (!(shims instanceof Hadoop23Shims)) {
     throw new Exception("Cannot run tez on hadoop-1, Version: " + this.hadoopVer);
   }
   mr = ((Hadoop23Shims) shims).getMiniTezCluster(conf, 4,
       getHdfsUriString(fs.getUri().toString()), 1);
 } else {
   mr = shims.getMiniMrCluster(conf, 4,
       getHdfsUriString(fs.getUri().toString()), 1);
 }
 {code}
 Not important, but a little annoying when the shim is not in the classpath. And I 
 think hadoop24shims or later might support tez.
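One hedged way to remove the direct cast is to put the mini-Tez factory method on the shim interface itself, so callers never name Hadoop23Shims. This is a simplified sketch of the idea, not the committed patch; the interface and class names here are stand-ins for Hive's real shims API, and the sketch uses a Java 8 default method where the real code predates Java 8:

```java
// Simplified stand-ins for Hive's shims API (hypothetical names).
interface MiniCluster { }

interface HadoopShims {
    MiniCluster getMiniMrCluster(String uri, int taskTrackers);

    // Default implementation refuses; only Tez-capable shims override it,
    // so QTestUtil needs no instanceof check or Hadoop23Shims cast.
    default MiniCluster getMiniTezCluster(String uri, int taskTrackers) {
        throw new UnsupportedOperationException(
            "Tez mini cluster not supported by this shim");
    }
}

class Hadoop23ShimsSketch implements HadoopShims {
    public MiniCluster getMiniMrCluster(String uri, int taskTrackers) {
        return new MiniCluster() { };  // placeholder cluster
    }

    @Override
    public MiniCluster getMiniTezCluster(String uri, int taskTrackers) {
        return new MiniCluster() { };  // placeholder cluster
    }
}
```

With this shape, the test harness can call `shims.getMiniTezCluster(...)` unconditionally and let unsupported shims fail with a clear message.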



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6681) Describe table sometimes shows "from deserializer" for column comments

2014-03-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942849#comment-13942849
 ] 

Lefty Leverenz commented on HIVE-6681:
--

This adds *hive.serdes.using.metastore.for.schema* to HiveConf.java, but it 
needs a description.  (Perhaps it's self-evident.)  How about a release note?

 Describe table sometimes shows "from deserializer" for column comments
 --

 Key: HIVE-6681
 URL: https://issues.apache.org/jira/browse/HIVE-6681
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, 
 HIVE-6681.5.patch, HIVE-6681.6.patch, HIVE-6681.7.patch, HIVE-6681.8.patch, 
 HIVE-6681.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6580) Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6580:
---

   Resolution: Fixed
Fix Version/s: (was: 0.13.0)
   0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk & 0.13. Thanks, Vaibhav!

 Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.
 ---

 Key: HIVE-6580
 URL: https://issues.apache.org/jira/browse/HIVE-6580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.14.0

 Attachments: HIVE-6580.1.patch


 Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6580) Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6580:
---

Fix Version/s: (was: 0.14.0)
   0.13.0

 Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.
 ---

 Key: HIVE-6580
 URL: https://issues.apache.org/jira/browse/HIVE-6580
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6580.1.patch


 Refactor ThriftBinaryCLIService and ThriftHttpCLIService tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6241) Remove direct reference of Hadoop23Shims in QTestUtil

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6241:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Remove direct reference of Hadoop23Shims in QTestUtil
 

 Key: HIVE-6241
 URL: https://issues.apache.org/jira/browse/HIVE-6241
 Project: Hive
  Issue Type: Wish
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-6241.1.patch.txt, HIVE-6241.2.patch.txt


 {code}
 if (clusterType == MiniClusterType.tez) {
   if (!(shims instanceof Hadoop23Shims)) {
     throw new Exception("Cannot run tez on hadoop-1, Version: " + this.hadoopVer);
   }
   mr = ((Hadoop23Shims) shims).getMiniTezCluster(conf, 4,
       getHdfsUriString(fs.getUri().toString()), 1);
 } else {
   mr = shims.getMiniMrCluster(conf, 4,
       getHdfsUriString(fs.getUri().toString()), 1);
 }
 {code}
 Not important, but a little annoying when the shim is not in the classpath. And I 
 think hadoop24shims or later might support tez.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942920#comment-13942920
 ] 

Hive QA commented on HIVE-6682:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635706/HIVE-6682.02.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5431 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_truncate_column_buckets
org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatementAsync
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1876/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1876/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635706

 nonstaged mapjoin table memory check may be broken
 --

 Key: HIVE-6682
 URL: https://issues.apache.org/jira/browse/HIVE-6682
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6682.01.patch, HIVE-6682.02.patch, HIVE-6682.patch


 We are getting the below error from the task while the staged load works 
 correctly. 
 We don't set the memory threshold that low, so it seems the settings are just 
 not handled correctly. This seems to always trigger on the first check. Given 
 that the map task might have a bunch more stuff, not just the hashmap, we may 
 also need to adjust the memory check (e.g. have separate configs).
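The kind of heap-percentage check described above can be sketched as follows. This is a simplified illustration of the idea behind a check like `MapJoinMemoryExhaustionHandler.checkMemoryStatus`, not Hive's actual code; the class name and threshold parameter are hypothetical:

```java
class MemoryCheckSketch {
    private final double maxFraction;  // e.g. 0.90, nominally from configuration

    MemoryCheckSketch(double maxFraction) {
        this.maxFraction = maxFraction;
    }

    /** Throws when used heap exceeds the configured fraction of max heap. */
    void checkMemoryStatus(long rowsProcessed) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        double fraction = (double) used / rt.maxMemory();
        if (fraction > maxFraction) {
            throw new RuntimeException("Hashtable memory exhausted after "
                + rowsProcessed + " rows; heap fraction " + fraction);
        }
    }
}
```

The bug report is consistent with the threshold being compared against the wrong value: the quoted log shows the check firing at a heap fraction of 0.197, far below any sensible configured limit.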
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21 Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21 Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21   Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
   at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375)
   at 
 

[jira] [Updated] (HIVE-6331) HIVE-5279 deprecated UDAF class without explanation/documentation/alternative

2014-03-21 Thread Lars Francke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Francke updated HIVE-6331:
---

Attachment: HIVE-6331.3.patch

Incorporated latest comments

 HIVE-5279 deprecated UDAF class without explanation/documentation/alternative
 -

 Key: HIVE-6331
 URL: https://issues.apache.org/jira/browse/HIVE-6331
 Project: Hive
  Issue Type: Bug
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-5279.1.patch, HIVE-6331.2.patch, HIVE-6331.3.patch


 HIVE-5279 added a @Deprecated annotation to the {{UDAF}} class. The comment 
 in that class says {quote}UDAF classes are REQUIRED to inherit from this 
 class.{quote}
 One of these two needs to be updated. Either remove the annotation or 
 document why it was deprecated and what to use instead.
 Unfortunately [~navis] did not leave any documentation about his intentions.
 I'm happy to provide a patch once I know the intentions.
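What the reporter is asking for can be illustrated with a hedged sketch of a well-documented deprecation. The class name and the suggested alternative wording below are mine, not from any Hive patch:

```java
/**
 * Base class for old-style aggregate functions.
 *
 * @deprecated Old-style UDAFs rely on reflection at query time; prefer the
 *             GenericUDAF interfaces instead. Kept only for backward
 *             compatibility with existing third-party UDAFs.
 */
@Deprecated
class UdafBaseSketch {
    // The legacy reflection-based contract would live here.
}
```

The point is that the `@Deprecated` annotation and the Javadoc `@deprecated` tag travel together: the annotation triggers compiler warnings, while the tag tells users why and what to use instead, which is exactly what the ticket says is missing.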



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5652) Improve JavaDoc of UDF class

2014-03-21 Thread Lars Francke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Francke updated HIVE-5652:
---

Attachment: HIVE-5652.3.patch

Incorporated comments

 Improve JavaDoc of UDF class
 

 Key: HIVE-5652
 URL: https://issues.apache.org/jira/browse/HIVE-5652
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Trivial
 Attachments: HIVE-5652.1.patch, HIVE-5652.2.patch, HIVE-5652.3.patch


 I think the JavaDoc for the UDF class can be improved. I'll attach a patch 
 shortly.
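For context, an old-style Hive UDF follows the reflection-based `evaluate` convention that the class Javadoc should explain. A minimal hedged example; the class name and doc text are mine, and the superclass reference is commented out to keep the sketch self-contained:

```java
/**
 * Example UDF that doubles its numeric argument.
 *
 * Old-style UDFs expose one or more public evaluate() methods that Hive
 * discovers by reflection; the argument and return types determine which
 * overload runs for a given call site.
 */
class UDFDoubleIt /* extends org.apache.hadoop.hive.ql.udf.UDF */ {
    public double evaluate(double value) {
        return value * 2.0;
    }
}
```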



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 19525: Clean up math based UDFs

2014-03-21 Thread Lars Francke

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19525/
---

Review request for hive.


Bugs: HIVE-6510
https://issues.apache.org/jira/browse/HIVE-6510


Repository: hive-git


Description
---

HIVE-6327, HIVE-6246 and HIVE-6385 touched a lot of the math based UDFs. There 
are some code inconsistencies and warnings left. This cleans up all the 
problems I could find.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 18c79a7 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java cfd5d38 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java 641bba2 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java bfa95ee 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDegrees.java bc5e1e2 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java cf6f53e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java eb5f646 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java 7a4d8a7 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 00dc319 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java 9202258 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java c1981af 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java fd1f0e3 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 6e4bee0 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java 8f757f2 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 17094c9 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java c286619 

Diff: https://reviews.apache.org/r/19525/diff/


Testing
---


Thanks,

Lars Francke



[jira] [Commented] (HIVE-6510) Clean up math based UDFs

2014-03-21 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942984#comment-13942984
 ] 

Lars Francke commented on HIVE-6510:


Sure: https://reviews.apache.org/r/19525/

Thanks!

 Clean up math based UDFs
 

 Key: HIVE-6510
 URL: https://issues.apache.org/jira/browse/HIVE-6510
 Project: Hive
  Issue Type: Improvement
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-6510.1.patch


 HIVE-6327, HIVE-6246 and HIVE-6385 touched a lot of the math based UDFs. 
 There are some code inconsistencies and warnings left. This cleans up all the 
 problems I could find.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13942991#comment-13942991
 ] 

Hive QA commented on HIVE-6685:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635452/HIVE-6685.2.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1877/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1877/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1877/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
service/target contrib/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1579927.

At revision 1579927.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635452

 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
 --

 Key: HIVE-6685
 URL: https://issues.apache.org/jira/browse/HIVE-6685
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.12.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6685.2.patch, HIVE-6685.patch


 Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched 
 arguments in the beeline prompt. It would be nice to clean it up.
 Example:
 {noformat}
 beeline -u szehon -p
 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
   at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
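The underlying bug pattern is reading `args[i + 1]` without checking that a value actually follows the flag (here, `-p` is the last argument). A hedged sketch of the defensive fix; the class and method names are illustrative, not Beeline's actual code:

```java
class ArgParserSketch {
    /**
     * Returns the value following the given flag, or null (with an error
     * message) instead of an ArrayIndexOutOfBoundsException when the flag
     * is the last argument.
     */
    static String valueOf(String[] args, String flag) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals(flag)) {
                if (i + 1 >= args.length) {
                    System.err.println("Missing value for option " + flag);
                    return null;
                }
                return args[i + 1];
            }
        }
        return null;  // flag not present
    }
}
```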



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-21 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Status: Open  (was: Patch Available)

Need to rebase

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.
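The pass-through heuristic described above can be sketched as: track the ratio of distinct keys to input rows on the map side, and abandon hash aggregation once the ratio shows the hashtable is not reducing data. This is an illustration of the idea only; the thresholds and names are hypothetical, not Hive's actual operator code:

```java
class GroupByFlushSketch {
    private long rowsIn = 0;
    private long distinctKeys = 0;
    private final long minRows;        // don't decide on too small a sample
    private final double maxKeyRatio;  // e.g. 0.9: nearly every row is a new key

    GroupByFlushSketch(long minRows, double maxKeyRatio) {
        this.minRows = minRows;
        this.maxKeyRatio = maxKeyRatio;
    }

    void onRow(boolean createdNewKey) {
        rowsIn++;
        if (createdNewKey) {
            distinctKeys++;
        }
    }

    /**
     * True when map-side aggregation should be abandoned and rows streamed
     * straight to the shuffle+reduce side, as the ticket describes.
     */
    boolean shouldPassThrough() {
        return rowsIn >= minRows
            && (double) distinctKeys / rowsIn > maxKeyRatio;
    }
}
```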



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-21 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Attachment: HIVE-6222.5.patch

Resolved conflict with HIVE-6664

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch, HIVE-6222.5.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-21 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Status: Patch Available  (was: Open)

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch, HIVE-6222.5.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6432) Remove deprecated methods in HCatalog

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943056#comment-13943056
 ] 

Hive QA commented on HIVE-6432:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635189/HIVE-6432.2.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1881/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1881/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1881/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'hcatalog/webhcat/svr/src/main/bin/webhcat_server.sh'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
service/target contrib/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1579940.

At revision 1579940.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635189

 Remove deprecated methods in HCatalog
 -

 Key: HIVE-6432
 URL: https://issues.apache.org/jira/browse/HIVE-6432
 Project: Hive
  Issue Type: Task
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 0.14.0

 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, 
 HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, 
 hcat.6432.test.out


 There are a lot of methods in HCatalog that have been deprecated in HCatalog 
 0.5, and some that were recently deprecated in Hive 0.11 (joint release with 
 HCatalog).
 The goal for HCatalog deprecation is that in general, after something has 
 been deprecated, it is expected to stay around for 2 releases, which means 
 hive-0.13 will be the last release to ship with all the methods that were 
 deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be 
 removed afterwards), and it is also good for us to clean out and nuke all 
 other older deprecated methods.
 We should take this on early in a dev/release cycle to allow us time to 
 resolve all fallout, so I propose that we remove all HCatalog deprecated 
 methods after we branch out 0.13 and 0.14 becomes trunk.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943054#comment-13943054
 ] 

Hive QA commented on HIVE-6686:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635168/HIVE-6686.patch

{color:green}SUCCESS:{color} +1 5436 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1878/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1878/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635168

 webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of 
 log4j.properties file on local filesystem.
 --

 Key: HIVE-6686
 URL: https://issues.apache.org/jira/browse/HIVE-6686
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6686.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19525: Clean up math based UDFs

2014-03-21 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19525/#review38095
---


Looks very good. Just one minor suggestion for your consideration.


ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java
https://reviews.apache.org/r/19525/#comment70072

Could you make result final, to be consistent with other changes?
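The suggestion refers to the pattern of caching the operator's output object in a `final` field so it is allocated once per operator instance rather than once per row. A hedged sketch of the pattern; it uses a minimal stand-in holder instead of Hadoop's `DoubleWritable`, and the class name is illustrative:

```java
class UDFRadiansSketch {
    // Minimal stand-in for org.apache.hadoop.io.DoubleWritable.
    static final class DoubleHolder {
        double value;
    }

    // Final, reused result holder: one allocation per operator instance,
    // overwritten on every call instead of allocated per row.
    private final DoubleHolder result = new DoubleHolder();

    DoubleHolder evaluate(double degrees) {
        result.value = Math.toRadians(degrees);
        return result;
    }
}
```

Making the field `final` both documents that the holder is never replaced and lets the compiler enforce it, which is presumably why the review asks for consistency across the math UDFs.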


- Xuefu Zhang


On March 21, 2014, 11:31 a.m., Lars Francke wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19525/
 ---
 
 (Updated March 21, 2014, 11:31 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6510
 https://issues.apache.org/jira/browse/HIVE-6510
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-6327, HIVE-6246 and HIVE-6385 touched a lot of the math based UDFs. 
 There are some code inconsistencies and warnings left. This cleans up all the 
 problems I could find.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 18c79a7 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java cfd5d38 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java 641bba2 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java bfa95ee 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFDegrees.java bc5e1e2 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java cf6f53e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java eb5f646 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java 7a4d8a7 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 00dc319 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java 9202258 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java c1981af 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java fd1f0e3 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 6e4bee0 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java 8f757f2 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 17094c9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java c286619 
 
 Diff: https://reviews.apache.org/r/19525/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Lars Francke
 




[jira] [Commented] (HIVE-6689) Provide an option to not display partition columns separately in describe table output

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943161#comment-13943161
 ] 

Hive QA commented on HIVE-6689:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635956/HIVE-6689.2.patch

{color:green}SUCCESS:{color} +1 5437 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1882/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1882/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635956

 Provide an option to not display partition columns separately in describe 
 table output 
 ---

 Key: HIVE-6689
 URL: https://issues.apache.org/jira/browse/HIVE-6689
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6689.1.patch, HIVE-6689.2.patch, HIVE-6689.patch


 In ancient Hive, partition columns were not displayed differently; in newer 
 versions they are. This has resulted in a backward-incompatible change for 
 upgrade scenarios. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6689) Provide an option to not display partition columns separately in describe table output

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6689:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk & 0.13

 Provide an option to not display partition columns separately in describe 
 table output 
 ---

 Key: HIVE-6689
 URL: https://issues.apache.org/jira/browse/HIVE-6689
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6689.1.patch, HIVE-6689.2.patch, HIVE-6689.patch


 In older versions of Hive, partition columns were not displayed separately; in 
 newer versions they are. This is a backward-incompatible change for upgrade 
 scenarios. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6203) Privileges of role granted indirectly to user are not applied

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6203:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Fixed via HIVE-5954

 Privileges of role granted indirectly to user are not applied
 

 Key: HIVE-6203
 URL: https://issues.apache.org/jira/browse/HIVE-6203
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Navis
Assignee: Navis
 Fix For: 0.13.0

 Attachments: HIVE-6203.1.patch.txt, HIVE-6203.2.patch.txt, 
 HIVE-6203.3.patch.txt, HIVE-6203.4.patch.txt


 For example, 
 {noformat}
 create role r1;
 create role r2;
 grant select on table eq to role r1;
 grant role r1 to role r2;
 grant role r2 to user admin;
 select * from eq limit 5;
 {noformat}
 admin -> r2 -> r1 -> SEL on table eq
 but user admin fails to access table eq



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data

2014-03-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943226#comment-13943226
 ] 

Szehon Ho commented on HIVE-6131:
-

We saw this issue earlier while working on schema evolution for Parquet and 
other serdes, but had thought it was expected behavior (that different partitions 
keep the old column schema after HIVE-3833).  This will be a good fix to have.

 New columns after table alter result in null values despite data
 

 Key: HIVE-6131
 URL: https://issues.apache.org/jira/browse/HIVE-6131
 Project: Hive
  Issue Type: Bug
Reporter: James Vaughan
Priority: Minor

 Hi folks,
 I found and verified a bug on our CDH 4.0.3 install of Hive when adding 
 columns to tables with Partitions using 'REPLACE COLUMNS'.  I dug through the 
 Jira a little bit and didn't see anything for it so hopefully this isn't just 
 noise on the radar.
 Basically, when you alter a table with partitions and then reupload data to 
 that partition, it doesn't seem to recognize the extra data that actually 
 exists in HDFS- as in, returns NULL values on the new column despite having 
 the data and recognizing the new column in the metadata.
 Here are the steps to reproduce using a basic table:
 1.  Run this hive command:  CREATE TABLE jvaughan_test (col1 string) 
 partitioned by (day string);
 2.  Create a simple file on the system with a couple of entries, something 
 like "hi" and "hi2" separated by newlines.
 3.  Run this hive command, pointing it at the file:  LOAD DATA LOCAL INPATH 
 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
 4.  Confirm the data with:  SELECT * FROM jvaughan_test WHERE day = 
 '2014-01-02';
 5.  Alter the column definitions:  ALTER TABLE jvaughan_test REPLACE COLUMNS 
 (col1 string, col2 string);
 6.  Edit your file and add a second column using the default separator 
 (ctrl+v, then ctrl+a in Vim) and add two more entries, such as hi3 on the 
 first row and hi4 on the second
 7.  Run step 3 again
 8.  Check the data again like in step 4
 For me, these are the results that get returned:
 hive> select * from jvaughan_test where day = '2014-01-02';
 OK
 hi    NULL    2014-01-02
 hi2   NULL    2014-01-02
 This is despite the fact that there is data in the file stored by the 
 partition in HDFS.
 Let me know if you need any other information.  The only workaround for me 
 currently is to drop partitions for any I'm replacing data in and THEN 
 reupload the new data file.
 Thanks,
 -James



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments

2014-03-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943231#comment-13943231
 ] 

Szehon Ho commented on HIVE-6685:
-

As this change became a refactoring, it will need a rebase since the code has 
changed.  Will take a look a bit later.

 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
 --

 Key: HIVE-6685
 URL: https://issues.apache.org/jira/browse/HIVE-6685
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.12.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6685.2.patch, HIVE-6685.patch


 Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched 
 arguments in beeline prompt.  It would be nice to cleanup.
 Example:
 {noformat}
 beeline -u szehon -p
 Exception in thread main java.lang.ArrayIndexOutOfBoundsException: 3
   at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
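The trace above comes from reading one slot past the end of the argument array when a flag such as -p arrives without a value. A minimal sketch of the bounds check that avoids it (illustrative only; BeeLine's actual initArgs parsing differs, and `ArgParser`/`valueOf` are hypothetical names):

```java
class ArgParser {
    // Return the value following args[i], failing with a readable error
    // instead of an ArrayIndexOutOfBoundsException when the value is missing.
    static String valueOf(String[] args, int i) {
        if (i + 1 >= args.length) {
            throw new IllegalArgumentException("missing value for option " + args[i]);
        }
        return args[i + 1];
    }
}
```

For the example above, `valueOf(args, 2)` on `{"-u", "szehon", "-p"}` reports a missing value for -p rather than crashing.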



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5652) Improve JavaDoc of UDF class

2014-03-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943268#comment-13943268
 ] 

Lefty Leverenz commented on HIVE-5652:
--

Except the period.  (I hate to be a stickler, though.  So if you really want it 
there, I'll let it pass).  [sic]

 Improve JavaDoc of UDF class
 

 Key: HIVE-5652
 URL: https://issues.apache.org/jira/browse/HIVE-5652
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Trivial
 Attachments: HIVE-5652.1.patch, HIVE-5652.2.patch, HIVE-5652.3.patch


 I think the JavaDoc for the UDF class can be improved. I'll attach a patch 
 shortly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6625) HiveServer2 running in http mode should support trusted proxy access

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943274#comment-13943274
 ] 

Hive QA commented on HIVE-6625:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635194/HIVE-6625.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5436 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.mapreduce.TestHCatMutablePartitioned.testHCatPartitionedTable
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1884/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1884/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635194

 HiveServer2 running in http mode should support trusted proxy access
 

 Key: HIVE-6625
 URL: https://issues.apache.org/jira/browse/HIVE-6625
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6625.1.patch, HIVE-6625.2.patch


 HIVE-5155 adds trusted proxy access to HiveServer2. This patch is a minor change 
 to have it used when running HiveServer2 in http mode. The patch is to be applied 
 on top of HIVE-4764 & HIVE-5155.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-03-21 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar reassigned HIVE-6715:
-

Assignee: Prasad Mujumdar

 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar

 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.
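A hedged sketch of the fix this describes: copy both session variables into the openSession configuration rather than only the proxy user. The variable names mirror those mentioned above; the real HiveConnection.openSession() code differs, and `OpenSessionConf` is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.Map;

class OpenSessionConf {
    // Copy the session variables the server needs from sessVars into the
    // openSession configuration. For non-SASL connections there is no SASL
    // layer to carry the username, so HIVE_AUTH_USER must travel here too.
    static Map<String, String> build(Map<String, String> sessVars) {
        Map<String, String> conf = new HashMap<>();
        for (String key : new String[] {"HS2_PROXY_USER", "HIVE_AUTH_USER"}) {
            if (sessVars.containsKey(key)) {
                conf.put(key, sessVars.get(key));
            }
        }
        return conf;
    }
}
```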



--
This message was sent by Atlassian JIRA
(v6.2#6252)


What would Hive need for a release 1.0?

2014-03-21 Thread Lefty Leverenz
Are there clear standards for bumping the release number up to 1.0?  Does
it depend more on new features, jira backlog, stability, or something else?

Perhaps the only thing holding Hive back is its documentation.  (That's a
tech writer's joke.)

Would a 1.0 release matter?

Just curious.

-- Lefty


[jira] [Created] (HIVE-6719) No results from getTables() and getColumns with null tableNamePattern

2014-03-21 Thread Jonathan Seidman (JIRA)
Jonathan Seidman created HIVE-6719:
--

 Summary: No results from getTables() and getColumns with null 
tableNamePattern
 Key: HIVE-6719
 URL: https://issues.apache.org/jira/browse/HIVE-6719
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman


Calling DatabaseMetaData.getTables() or getColumns() with a null 
tableNamePattern argument returns 0 results, counter to the JDBC spec.

For example, the following will return no results:

meta.getTables(null, schema, null, tableTypes); 
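Per the JDBC spec, a null search pattern means the criterion is ignored, i.e. it should match everything. Until the driver is fixed, a caller-side workaround is to normalize null to the "%" wildcard before invoking the metadata methods (a sketch, not Hive driver code; `JdbcPatternUtil` is a hypothetical helper):

```java
class JdbcPatternUtil {
    // JDBC treats a null search pattern as "match all"; Hive 0.12's driver
    // instead returns zero rows. Map null to the SQL wildcard before calling
    // DatabaseMetaData.getTables()/getColumns().
    static String normalize(String pattern) {
        return (pattern == null) ? "%" : pattern;
    }
}
```

With this, `meta.getTables(null, schema, JdbcPatternUtil.normalize(null), tableTypes)` behaves as a compliant driver would for a null pattern.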



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19503: JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19503/#review38131
---


The patch looks fine. Certainly useful functionality!

I am a bit concerned about the behavior change introduced here. Existing 
client applications that are not expecting fully qualified column names would 
break after this patch; it will require an application code change to work with 
the new driver.
IMO we should have a config property to restore the old behavior when needed. The 
default should be the new behavior.

- Prasad Mujumdar


On March 20, 2014, 10:24 p.m., Harish Butani wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19503/
 ---
 
 (Updated March 20, 2014, 10:24 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: hive-6687
 https://issues.apache.org/jira/browse/hive-6687
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 JDBC ResultSet fails to get value by qualified projection name
 
 
 Diffs
 -
 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java
  dac62d5 
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
 c91df83 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java e1e427f 
 
 Diff: https://reviews.apache.org/r/19503/diff/
 
 
 Testing
 ---
 
   
 
 
 Thanks,
 
 Harish Butani
 




[jira] [Created] (HIVE-6720) Implement getURL()

2014-03-21 Thread Jonathan Seidman (JIRA)
Jonathan Seidman created HIVE-6720:
--

 Summary: Implement getURL() 
 Key: HIVE-6720
 URL: https://issues.apache.org/jira/browse/HIVE-6720
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman
Priority: Minor


DatabaseMetaData.getURL() throws an unsupported exception. This should be 
modified to return a valid value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6720) Implement getURL()

2014-03-21 Thread Jonathan Seidman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943322#comment-13943322
 ] 

Jonathan Seidman commented on HIVE-6720:


Note that the API docs call for returning null if the URL cannot be determined, and 
throwing an exception only on a database error.

 Implement getURL() 
 ---

 Key: HIVE-6720
 URL: https://issues.apache.org/jira/browse/HIVE-6720
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman
Priority: Minor

 DatabaseMetaData.getURL() throws an unsupported exception. This should be 
 modified to return a valid value.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-6701:
--

Assignee: Sergey Shelukhin  (was: Jitendra Nath Pandey)

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Sergey Shelukhin
 Attachments: HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943341#comment-13943341
 ] 

Jitendra Nath Pandey commented on HIVE-6060:


OrcInputFormat#getRecordReader must check for vectorized mode before returning 
any reader. It seems this patch has moved the check down, which introduces a 
scenario where a non-vectorized record reader will be returned in vectorized 
mode, causing the query to fail.

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, 
 HIVE-6060.patch, HIVE-6060.patch, HIVE-6060.patch, acid-io.patch, 
 h-5317.patch, h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6616) Document ORC file format to enable development of external converters to/from ORC/text files

2014-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943345#comment-13943345
 ] 

Alan Gates commented on HIVE-6616:
--

This is a dangerous approach, as we make no guarantees to keep Orc's structure 
the same across releases.  For example, the HIVE-5317 work includes significant 
changes to Orc's file structure.

It would be better to file a jira (or re-purpose this jira) to provide 
simple classes to read and write Orc without forcing you to pull all of Hive's 
jars into your code.  This is much safer from a forward and backward 
compatibility viewpoint and something we need to do anyway.  If you want 
to run outside the JVM, this obviously does not help you, though.

 Document ORC file format to enable development of external converters to/from 
 ORC/text files
 

 Key: HIVE-6616
 URL: https://issues.apache.org/jira/browse/HIVE-6616
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0, 0.12.0
Reporter: Michael

 Please document the structure of the ORC file format in a way that allows 
 external software to write and read such files. I would like to be able to 
 create ORC files myself without Hive's help.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6721) Streaming ingest needs to be able to send many heartbeats together

2014-03-21 Thread Alan Gates (JIRA)
Alan Gates created HIVE-6721:


 Summary: Streaming ingest needs to be able to send many heartbeats 
together
 Key: HIVE-6721
 URL: https://issues.apache.org/jira/browse/HIVE-6721
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0


The heartbeat method added to HiveMetaStoreClient is intended for SQL 
operations where the user will have one transaction and a handful of locks.  
But in the streaming ingest case the client opens a batch of transactions 
together.  In this case we need a way for the client to send a heartbeat for 
this batch of transactions rather than being forced to send the heartbeats one 
at a time.
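The shape such a batch API could take can be sketched as follows (names and signatures are assumptions, not the committed HIVE-6721 interface): one call covers a contiguous range of transaction ids, instead of one heartbeat call per transaction.

```java
import java.util.ArrayList;
import java.util.List;

class HeartbeatBatcher {
    private final List<Long> sent = new ArrayList<>();

    // Single-transaction heartbeat, as in the existing client API.
    void heartbeat(long txnId) {
        sent.add(txnId);
    }

    // Batch form for streaming ingest: heartbeat every transaction in the
    // range the client opened as a batch, from one call at the caller's side.
    void heartbeatRange(long minTxnId, long maxTxnId) {
        for (long t = minTxnId; t <= maxTxnId; t++) {
            heartbeat(t);
        }
    }

    List<Long> sentTxns() {
        return sent;
    }
}
```

Streaming ingest opens transactions as a contiguous batch, so a (min, max) range is a natural way to address them all at once.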



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6721) Streaming ingest needs to be able to send many heartbeats together

2014-03-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943352#comment-13943352
 ] 

Alan Gates commented on HIVE-6721:
--

[~rhbutani] This should go into 0.13 as well, as HIVE-5687 will depend on it.

 Streaming ingest needs to be able to send many heartbeats together
 --

 Key: HIVE-6721
 URL: https://issues.apache.org/jira/browse/HIVE-6721
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.13.0


 The heartbeat method added to HiveMetaStoreClient is intended for SQL 
 operations where the user will have one transaction and a handful of locks. 
  But in the streaming ingest case the client opens a batch of transactions 
 together.  In this case we need a way for the client to send a heartbeat for 
 this batch of transactions rather than being forced to send the heartbeats 
 one at a time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HIVE-2269) Hive --auxpath option can't handle multiple colon separated values

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-2269.


   Resolution: Fixed
Fix Version/s: 0.12.0

After various attempts, including HIVE-3978, HIVE-5363, and HIVE-5410, this has 
been resolved; it was further improved in HIVE-6328.

 Hive --auxpath option can't handle multiple colon separated values
 --

 Key: HIVE-2269
 URL: https://issues.apache.org/jira/browse/HIVE-2269
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.7.0, 0.7.1, 0.10.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.12.0

 Attachments: HIVE-2269-auxpath.1.patch.txt






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6719) No results from getTables() and getColumns with null tableNamePattern

2014-03-21 Thread Rick Spickelmier (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943374#comment-13943374
 ] 

Rick Spickelmier commented on HIVE-6719:


For getColumns, the issue is with the columnNamePattern.  Specifying "%" in 
both methods produces the expected results, but specifying null returns a 
result set with 0 records.  The specification says that if null is used, the 
parameter will not be used in the search (which should be the equivalent of 
"%"), which is what most other JDBC drivers do.

SQuirreL does not show the catalog, and I assume it is due to this.

 No results from getTables() and getColumns with null tableNamePattern
 -

 Key: HIVE-6719
 URL: https://issues.apache.org/jira/browse/HIVE-6719
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman

 Calling DatabaseMetaData.getTables() or getColumns() with a null 
 tableNamePattern argument returns 0 results, counter to the JDBC spec.
 For example, the following will return no results:
 meta.getTables( null, schema, null , tableTypes); 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Status: Open  (was: Patch Available)

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the 
 SQL specifies it.
 The fix is to correct the result set schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6616) Document ORC file format to enable development of external converters to/from ORC/text files

2014-03-21 Thread Michael (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943377#comment-13943377
 ] 

Michael commented on HIVE-6616:
---

You are correct about creating a new issue for simple classes. 
Still, I believe that maintaining clear documentation for everything (including 
ORC format) is the correct way.


 Document ORC file format to enable development of external converters to/from 
 ORC/text files
 

 Key: HIVE-6616
 URL: https://issues.apache.org/jira/browse/HIVE-6616
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.11.0, 0.12.0
Reporter: Michael

 Please document the structure of the ORC file format in a way that allows 
 external software to write and read such files. I would like to be able to 
 create ORC files myself without Hive's help.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Attachment: (was: HIVE-6687.patch)

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the 
 SQL specifies it.
 The fix is to correct the result set schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Attachment: HIVE-6687.2.patch

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.2.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the 
 SQL specifies it.
 The fix is to correct the result set schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6722) Provide simple classes to read and write Orc files

2014-03-21 Thread Michael (JIRA)
Michael created HIVE-6722:
-

 Summary: Provide simple classes to read and write Orc files
 Key: HIVE-6722
 URL: https://issues.apache.org/jira/browse/HIVE-6722
 Project: Hive
  Issue Type: Improvement
  Components: File Formats, Import/Export
Affects Versions: 0.12.0, 0.11.0
Reporter: Michael


Please provide simple classes to read and write files in the ORC format that 
do not require importing all of Hive's jars into the code.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6723) Tez golden files need to be updated

2014-03-21 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-6723:
--

 Summary: Tez golden files need to be updated
 Key: HIVE-6723
 URL: https://issues.apache.org/jira/browse/HIVE-6723
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


Golden files are out of date.
NO PRECOMMIT TESTS

since these are purely .q.out changes



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6723) Tez golden files need to be updated

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6723:
---

Status: Patch Available  (was: Open)

 Tez golden files need to be updated
 ---

 Key: HIVE-6723
 URL: https://issues.apache.org/jira/browse/HIVE-6723
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6723.patch


 Golden files are out of date.
 NO PRECOMMIT TESTS
 since these are purely .q.out changes



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6723) Tez golden files need to be updated

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6723:
---

Attachment: HIVE-6723.patch

 Tez golden files need to be updated
 ---

 Key: HIVE-6723
 URL: https://issues.apache.org/jira/browse/HIVE-6723
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6723.patch


 Golden files are out of date.
 NO PRECOMMIT TESTS
 since these are purely .q.out changes



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943398#comment-13943398
 ] 

Laljo John Pullokkaran commented on HIVE-6687:
--

Apparently the view schema also uses the same result set schema.
Modified the patch to:
1. Separate out the View Schema vs the Result Set Schema.
2. The View Schema won't use qualified table names; it also ensures that 
column names are unique.
3. The ResultSet schema by default uses table aliases if provided (select *, 
or user-provided qualified projections such as select r1.x...).
4. To get the old behavior for the result set schema, introduced a config param 
hive.resultset.use.unique.column.names; this is set to true by default. Users 
will have to set it to false for the old behavior.
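The effect of hive.resultset.use.unique.column.names can be sketched like this (illustrative only; the real logic lives in the semantic analyzer, and `ColumnLabeler` is a hypothetical name): with unique names on, each projection keeps its table-alias prefix, so duplicate column names from a join stay distinguishable.

```java
import java.util.ArrayList;
import java.util.List;

class ColumnLabeler {
    // cols holds {tableAlias, columnName} pairs, one per projection.
    static List<String> labels(List<String[]> cols, boolean useUniqueNames) {
        List<String> out = new ArrayList<>();
        for (String[] c : cols) {
            // Qualified label ("r1.x") vs bare label ("x"); the bare form
            // makes the two x columns of a join indistinguishable by name.
            out.add(useUniqueNames ? c[0] + "." + c[1] : c[1]);
        }
        return out;
    }
}
```

For `select r1.x, r2.x`, unique naming yields labels r1.x and r2.x, so ResultSet.getInt("r1.x") can resolve the column; with the config off, both columns are labeled x.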

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.2.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the 
 SQL specifies it.
 The fix is to correct the result set schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943399#comment-13943399
 ] 

Laljo John Pullokkaran commented on HIVE-6687:
--

Vaibhav, I modified the test cases that seemed like they could get affected. If we 
are not using JDBC1, then it's a no-op.

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.2.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the 
 SQL specifies it.
 The fix is to correct the result set schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 19539: HIVE-6715: Hive JDBC should include username into open session request for non-sasl connection

2014-03-21 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19539/
---

Review request for hive and Xuefu Zhang.


Bugs: HIVE-6715
https://issues.apache.org/jira/browse/HIVE-6715


Repository: hive-git


Description
---

In SASL auth cases (Kerberos and Plain), the username is read from the 
underlying SASL layer. For the non-SASL case, the server doesn't get the user 
name. The fix is to send the username as part of the openSession request for 
the non-SASL case.


Diffs
-

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestNoSaslAuth.java 
PRE-CREATION 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 47a1061 

Diff: https://reviews.apache.org/r/19539/diff/


Testing
---

Added a new test to verify the username.


Thanks,

Prasad Mujumdar



[jira] [Commented] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-03-21 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943411#comment-13943411
 ] 

Vaibhav Gumashta commented on HIVE-6715:


[~prasadm] The change looks good. Is it possible to stick the tests in one of 
the existing test case classes, though? 


 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-03-21 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6715:
--

Attachment: HIVE-6715.1.patch

 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6625) HiveServer2 running in http mode should support trusted proxy access

2014-03-21 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6625:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk and 0.13 branch.
Thanks [~vaibhavgumashta]!

 HiveServer2 running in http mode should support trusted proxy access
 

 Key: HIVE-6625
 URL: https://issues.apache.org/jira/browse/HIVE-6625
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6625.1.patch, HIVE-6625.2.patch


 HIVE-5155 adds trusted proxy access to HiveServer2. This patch is a minor change 
 to have it used when running HiveServer2 in http mode. Patch to be applied on 
 top of HIVE-4764 & HIVE-5155.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6723) Tez golden files need to be updated

2014-03-21 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943417#comment-13943417
 ] 

Vikram Dixit K commented on HIVE-6723:
--

+1 LGTM.

 Tez golden files need to be updated
 ---

 Key: HIVE-6723
 URL: https://issues.apache.org/jira/browse/HIVE-6723
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6723.patch


 Golden files are out of date.
 NO PRECOMMIT TESTS
 since these are purely .q.out changes



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6328) Hive script should not overwrite AUX_CLASSPATH with HIVE_AUX_JARS_PATH if the latter is set

2014-03-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943406#comment-13943406
 ] 

Lefty Leverenz commented on HIVE-6328:
--

Has anyone updated the wiki with information about this jira & related jiras 
(HIVE-2269, HIVE-3978, HIVE-5363, HIVE-5410)?

 Hive script should not overwrite AUX_CLASSPATH with HIVE_AUX_JARS_PATH if the 
 latter is set
 ---

 Key: HIVE-6328
 URL: https://issues.apache.org/jira/browse/HIVE-6328
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.13.0

 Attachments: HIVE-6328.patch


 The Hive script (bin/hive) replaces the value of AUX_CLASSPATH with the value of 
 HIVE_AUX_JARS_PATH if HIVE_AUX_JARS_PATH is defined. This is not desirable 
 because users use the former to include additional classes when starting 
 Hive, while using the latter to specify additional jars that are needed to 
 run MR jobs. The problem can be demonstrated with this script snippet:
 {code}
 elif [ "${HIVE_AUX_JARS_PATH}" != "" ]; then
   HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
   if $cygwin; then
     HIVE_AUX_JARS_PATH=`cygpath -p -w "$HIVE_AUX_JARS_PATH"`
     HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/;/,/g'`
   fi
   AUX_CLASSPATH=${HIVE_AUX_JARS_PATH}
   AUX_PARAM=file://$(echo ${HIVE_AUX_JARS_PATH} | sed 's/:/,file:\/\//g')
 fi
 {code}
 AUX_CLASSPATH should be respected regardless of whether HIVE_AUX_JARS_PATH is 
 defined.
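For illustration, here is a minimal Java rendering (hypothetical helper names; the real logic lives in the bin/hive shell script above) of the two transformations the snippet performs on HIVE_AUX_JARS_PATH: a comma list becomes a colon-separated classpath, and that classpath becomes a comma-separated list of file:// URIs for MR jobs.

```java
// Java rendering, for illustration only, of what bin/hive computes from
// HIVE_AUX_JARS_PATH on a non-cygwin system.
public class AuxPathDemo {
    // sed 's/,/:/g' -- jar list to classpath
    static String toClasspath(String auxJars) {
        return auxJars.replace(',', ':');
    }

    // "file://" prefix plus sed 's/:/,file:\/\//g' -- classpath to URI list
    static String toAuxParam(String auxJars) {
        return "file://" + toClasspath(auxJars).replace(":", ",file://");
    }
}
```

The point of the bug report is that the shell script then assigns this result over AUX_CLASSPATH instead of appending to it.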



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection

2014-03-21 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943425#comment-13943425
 ] 

Prasad Mujumdar commented on HIVE-6715:
---

[~vaibhavgumashta] Thanks for taking a look.
The problem is the nosasl transport mode. It needs to be set via a system 
property (till HIVE-6665 is committed) and the server restarted. That's why I 
had to add it as a separate test.


 Hive JDBC should include username into open session request for non-sasl 
 connection
 ---

 Key: HIVE-6715
 URL: https://issues.apache.org/jira/browse/HIVE-6715
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Srinath
Assignee: Prasad Mujumdar
 Attachments: HIVE-6715.1.patch


 The only parameter from sessVars that's being set in 
 HiveConnection.openSession() is HS2_PROXY_USER. 
 HIVE_AUTH_USER must also be set.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943428#comment-13943428
 ] 

Hive QA commented on HIVE-6657:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635828/HIVE-6657.5.patch

{color:green}SUCCESS:{color} +1 5437 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1889/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1889/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635828

 Add test coverage for Kerberos authentication implementation using Hadoop's 
 miniKdc
 ---

 Key: HIVE-6657
 URL: https://issues.apache.org/jira/browse/HIVE-6657
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch, HIVE-6657.4.patch, 
 HIVE-6657.4.patch, HIVE-6657.5.patch, HIVE-6657.5.patch


 Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by 
 downstream projects to implement unit tests for Kerberos authentication code.
 Hive has a lot of code related to Kerberos and delegation tokens for 
 authentication, as well as for accessing secure Hadoop resources. This has 
 pretty much no coverage in the unit tests. We need to add unit tests using the 
 miniKdc module.
 Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is 
 available, we can at least test authentication for components like 
 HiveServer2, Metastore and WebHCat.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6677) HBaseSerDe needs to be refactored

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943432#comment-13943432
 ] 

Hive QA commented on HIVE-6677:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635229/HIVE-6677.3.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1890/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1890/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1890/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'pom.xml'
Reverted 
'itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/AbstractHiveService.java'
Reverted 
'itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2.java'
Reverted 
'itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java'
Reverted 'itests/hive-unit/pom.xml'
Reverted 'itests/pom.xml'
Reverted 'service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java'
Reverted 
'service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java'
Reverted 
'service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hive-minikdc itests/hcatalog-unit/target 
itests/test-serde/target itests/qtest/target itests/hive-unit/target 
itests/hive-unit/src/main itests/custom-serde/target itests/util/target 
hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
contrib/target service/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update
U    service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java

Fetching external item into 'hcatalog/src/test/e2e/harness'
Updated external to revision 1580023.

Updated to revision 1580023.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635229

 HBaseSerDe needs to be refactored
 -

 Key: HIVE-6677
 URL: https://issues.apache.org/jira/browse/HIVE-6677
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.14.0

 Attachments: HIVE-6677.1.patch, HIVE-6677.2.patch, HIVE-6677.3.patch, 
 HIVE-6677.patch


 The code in HBaseSerDe seems very complex and hard to extend to support 
 new features such as adding a generic compound key (HIVE-6411) and a compound key 
 filter (HIVE-6290), especially when handling key/field serialization. Hopefully 
 this task will clean up the code a bit and make it ready for new extensions. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6625) HiveServer2 running in http mode should support trusted proxy access

2014-03-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943433#comment-13943433
 ] 

Lefty Leverenz commented on HIVE-6625:
--

Does this need any userdoc?

 HiveServer2 running in http mode should support trusted proxy access
 

 Key: HIVE-6625
 URL: https://issues.apache.org/jira/browse/HIVE-6625
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6625.1.patch, HIVE-6625.2.patch


 HIVE-5155 adds trusted proxy access to HiveServer2. This patch is a minor change 
 to have it used when running HiveServer2 in http mode. Patch to be applied on 
 top of HIVE-4764 & HIVE-5155.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6724) HCatStorer throws ClassCastException while storing tinyint/smallint data

2014-03-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6724:
-

Description: 
given Hive tables:
1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
2) create table all100k (si smallint, ti tinyint) STORED ;

the following sequence of steps (assuming there is data in all100k)

{noformat}
a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
b = foreach a generate si;
store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();
{noformat}
produces 
{noformat}
org.apache.hadoop.mapred.YarnChild: Exception running child : 
java.lang.ClassCastException: java.lang.Short cannot be cast to 
java.lang.Integer
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at 
org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
{noformat}



  was:
given Hive tables:
1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
2) create table all100k (si smallint, ti tinyint) STORED ;

the following sequence of steps (assuming there is data in all100k)

a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
b = foreach a generate si;
store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();

produces 
{noformat}
org.apache.hadoop.mapred.YarnChild: Exception running child : 
java.lang.ClassCastException: java.lang.Short cannot be cast to 
java.lang.Integer
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at 
org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at 

[jira] [Created] (HIVE-6724) HCatStorer throws ClassCastException while storing tinyint/smallint data

2014-03-21 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-6724:


 Summary: HCatStorer throws ClassCastException while storing 
tinyint/smallint data
 Key: HIVE-6724
 URL: https://issues.apache.org/jira/browse/HIVE-6724
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


given Hive tables:
1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
2) create table all100k (si smallint, ti tinyint) STORED ;

the following sequence of steps (assuming there is data in all100k)

a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
b = foreach a generate si;
store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();

produces 
{noformat}
org.apache.hadoop.mapred.YarnChild: Exception running child : 
java.lang.ClassCastException: java.lang.Short cannot be cast to 
java.lang.Integer
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
at 
org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at 
org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at 
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
{noformat}
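A minimal sketch of the kind of widening fix implied by the stack trace above (the helper name is hypothetical; the real change would land in HCatBaseStorer.getJavaObj): Pig hands back a java.lang.Short for smallint, so a direct cast to Integer throws ClassCastException, while going through Number widens safely.

```java
// Hypothetical sketch, not the actual HCatalog patch.
public class WidenDemo {
    static Integer toInteger(Object pigValue) {
        // ((Integer) pigValue) throws ClassCastException for Short/Byte inputs;
        // Number.intValue() widens any integral boxed type safely.
        return ((Number) pigValue).intValue();
    }
}
```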





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6708:


Attachment: HIVE-6708.2.patch

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector, which might be shared across multiple columns.
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 
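The aliasing problem in point 1 can be shown with a self-contained sketch (illustrative only, not Hive's actual ConstantVectorExpression code): storing the reference lets a later mutation of the shared constant leak into the column, while storing a copy does not.

```java
import java.util.Arrays;

// Illustrative sketch of why a constant byte[] must be copied into the
// column vector rather than referenced: the same array object may back
// several columns.
public class DefensiveCopyDemo {
    static byte[] setRef(byte[] constant) {   // buggy: shares the reference
        return constant;
    }

    static byte[] setVal(byte[] constant) {   // fixed: stores a private copy
        return Arrays.copyOf(constant, constant.length);
    }
}
```

Mutating the original constant after setRef changes what the "constant" column sees; after setVal it does not.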



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Attachment: (was: HIVE-6687.2.patch)

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1


 Getting a value from the result set using a fully qualified name throws an 
 exception. The only workaround today is to use the position of the column 
 instead of the column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the SQL 
 specifies it.
 The fix is to correct the ResultSet schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6708:


Status: Patch Available  (was: Open)

Updated patch as per [~jnp]'s comments.

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector, which might be shared across multiple columns.
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

2014-03-21 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943438#comment-13943438
 ] 

Prasad Mujumdar commented on HIVE-6657:
---

[~brocknoland] The patch was rebased after the initial review due to conflicts 
on trunk. I would appreciate if you could take another quick look. Thanks!

 Add test coverage for Kerberos authentication implementation using Hadoop's 
 miniKdc
 ---

 Key: HIVE-6657
 URL: https://issues.apache.org/jira/browse/HIVE-6657
 Project: Hive
  Issue Type: Improvement
  Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch, HIVE-6657.4.patch, 
 HIVE-6657.4.patch, HIVE-6657.5.patch, HIVE-6657.5.patch


 Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by 
 downstream projects to implement unit tests for Kerberos authentication code.
 Hive has a lot of code related to Kerberos and delegation tokens for 
 authentication, as well as for accessing secure Hadoop resources. This has 
 pretty much no coverage in the unit tests. We need to add unit tests using the 
 miniKdc module.
 Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is 
 available, we can at least test authentication for components like 
 HiveServer2, Metastore and WebHCat.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6725) getTables() only returns partial table descriptions

2014-03-21 Thread Jonathan Seidman (JIRA)
Jonathan Seidman created HIVE-6725:
--

 Summary: getTables() only returns partial table descriptions 
 Key: HIVE-6725
 URL: https://issues.apache.org/jira/browse/HIVE-6725
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman
Priority: Minor


The ResultSet from calling DatabaseMetaData.getTables() only returns 5 columns, 
as opposed to the 10 columns called for in the JDBC spec:

TABLE_CAT String => table catalog (may be null)
TABLE_SCHEM String => table schema (may be null)
TABLE_NAME String => table name
TABLE_TYPE String => table type. Typical types are "TABLE", "VIEW", "SYSTEM 
TABLE", "GLOBAL TEMPORARY", "LOCAL TEMPORARY", "ALIAS", "SYNONYM".
REMARKS String => explanatory comment on the table
TYPE_CAT String => the types catalog (may be null)
TYPE_SCHEM String => the types schema (may be null)
TYPE_NAME String => type name (may be null)
SELF_REFERENCING_COL_NAME String => name of the designated "identifier" column 
of a typed table (may be null)
REF_GENERATION String => specifies how values in SELF_REFERENCING_COL_NAME are 
created. Values are "SYSTEM", "USER", "DERIVED". (may be null)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Attachment: HIVE-6687.3.patch

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting a value from the result set using a fully qualified name throws an 
 exception. The only workaround today is to use the position of the column 
 instead of the column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the SQL 
 specifies it.
 The fix is to correct the ResultSet schema in the semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6708:


Attachment: (was: HIVE-6708.2.patch)

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector, which might be shared across multiple columns.
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6708:


Attachment: HIVE-6708.2.patch

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector which might be shared across multiple columns
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 





[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Status: Patch Available  (was: Open)

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting a value from the result set using a fully qualified column name throws 
 an exception. The only workaround today is to use the position of the column 
 instead of the column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") throws an "unknown column" exception even though the SQL 
 specifies that column.
 The fix is to correct the result-set schema in the semantic analyzer.





[jira] [Created] (HIVE-6726) Hcat cli does not close SessionState

2014-03-21 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6726:
--

 Summary: Hcat cli does not close SessionState
 Key: HIVE-6726
 URL: https://issues.apache.org/jira/browse/HIVE-6726
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


When running HCat E2E tests, it was observed that the hcat cli left Tez sessions on 
the RM which ultimately died upon timeout. The expected behavior is to clean up the 
Tez sessions immediately upon exit. This causes slowness in system tests as, over 
time, a lot of orphaned Tez sessions hang around.

On looking through the code, the cause seems obvious in retrospect: HCatCli starts 
a SessionState but never explicitly calls close on it, exiting the jvm 
through System.exit instead. This needs to be changed to explicitly call 
SessionState.close() before exiting.
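A minimal sketch of the close-before-exit shape the fix calls for, using a hypothetical CliSession class in place of Hive's SessionState:

```java
// Sketch: guarantee close() runs before the JVM exits, instead of calling
// System.exit() with the session still open. CliSession is a hypothetical
// stand-in for SessionState; its close() is where Tez sessions would be
// torn down.
public class SessionCloseDemo {
    static class CliSession implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    // Returns the exit code instead of calling System.exit directly, so the
    // close-before-exit ordering stays testable.
    static int runCli(CliSession session) {
        try {
            return 0; // process commands here
        } finally {
            session.close(); // always runs before the caller exits the JVM
        }
    }

    public static void main(String[] args) {
        CliSession session = new CliSession();
        int code = runCli(session);
        System.out.println(session.closed); // true
        // A real CLI would now call System.exit(code).
    }
}
```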





[jira] [Updated] (HIVE-6726) Hcat cli does not close SessionState

2014-03-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6726:
---

Attachment: HIVE-6726.patch

Attaching patch.

 Hcat cli does not close SessionState
 

 Key: HIVE-6726
 URL: https://issues.apache.org/jira/browse/HIVE-6726
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6726.patch


 When running HCat E2E tests, it was observed that hcat cli left Tez sessions 
 on the RM which ultimately die upon timeout. Expected behavior is to clean 
 the Tez sessions immediately upon exit. This is causing slowness in system 
 tests as over time lot of orphan Tez sessions hang around.
 On looking through code, it seems obvious in retrospect because HCatCli 
 starts a SessionState, but does not explicitly call close on them, exiting 
 the jvm through System.exit instead. This needs to be changed to explicitly 
 call SessionState.close() before exiting.





[jira] [Commented] (HIVE-6726) Hcat cli does not close SessionState

2014-03-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943501#comment-13943501
 ] 

Sushanth Sowmyan commented on HIVE-6726:


[~hagleitn], could I bug you to have a quick look at this patch to see if this 
is sufficient?

 Hcat cli does not close SessionState
 

 Key: HIVE-6726
 URL: https://issues.apache.org/jira/browse/HIVE-6726
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6726.patch


 When running HCat E2E tests, it was observed that hcat cli left Tez sessions 
 on the RM which ultimately die upon timeout. Expected behavior is to clean 
 the Tez sessions immediately upon exit. This is causing slowness in system 
 tests as over time lot of orphan Tez sessions hang around.
 On looking through code, it seems obvious in retrospect because HCatCli 
 starts a SessionState, but does not explicitly call close on them, exiting 
 the jvm through System.exit instead. This needs to be changed to explicitly 
 call SessionState.close() before exiting.





[jira] [Updated] (HIVE-6726) Hcat cli does not close SessionState

2014-03-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6726:
---

Status: Patch Available  (was: Open)

 Hcat cli does not close SessionState
 

 Key: HIVE-6726
 URL: https://issues.apache.org/jira/browse/HIVE-6726
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6726.patch


 When running HCat E2E tests, it was observed that hcat cli left Tez sessions 
 on the RM which ultimately die upon timeout. Expected behavior is to clean 
 the Tez sessions immediately upon exit. This is causing slowness in system 
 tests as over time lot of orphan Tez sessions hang around.
 On looking through code, it seems obvious in retrospect because HCatCli 
 starts a SessionState, but does not explicitly call close on them, exiting 
 the jvm through System.exit instead. This needs to be changed to explicitly 
 call SessionState.close() before exiting.





[jira] [Updated] (HIVE-6698) hcat.py script does not correctly load the hbase storage handler jars

2014-03-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6698:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 hcat.py script does not correctly load the hbase storage handler jars
 -

 Key: HIVE-6698
 URL: https://issues.apache.org/jira/browse/HIVE-6698
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6698.patch


 Currently, queries using the HBaseHCatStorageHandler fail when run using hcat.py. 
 Example query:
 {code}
 create table pig_hbase_1(key string, age string, gpa string)
 STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
 TBLPROPERTIES ('hbase.columns.mapping'=':key,info:age,info:gpa');
 {code}
 Following error is seen in the hcat logs:
 {noformat}
 2014-03-18 08:25:49,437 ERROR ql.Driver (SessionState.java:printError(541)) - 
 FAILED: SemanticException java.io.IOException: Error in loading storage 
 handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
 org.apache.hadoop.hive.ql.parse.SemanticException: java.io.IOException: Error 
 in loading storage handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.CreateTableHook.postAnalyze(CreateTableHook.java:208)
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzer.postAnalyze(HCatSemanticAnalyzer.java:242)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:295)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:949)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:997)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:885)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:875)
   at org.apache.hive.hcatalog.cli.HCatDriver.run(HCatDriver.java:43)
   at org.apache.hive.hcatalog.cli.HCatCli.processCmd(HCatCli.java:259)
   at org.apache.hive.hcatalog.cli.HCatCli.processLine(HCatCli.java:213)
   at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:172)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.io.IOException: Error in loading storage 
 handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at 
 org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:432)
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.CreateTableHook.postAnalyze(CreateTableHook.java:199)
   ... 16 more
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at java.net.URLClassLoader$1.run(Unknown Source)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Unknown Source)
   at 
 org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:426)
   ... 17 more
 {noformat}
 The problem is that the hbaseStorageJar path became incorrect with the merging of 
 hcat into hive. Also, as per HIVE-6695, we should add HBASE_LIB to the classpath.
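The failure mechanics in the stack trace above can be reproduced in miniature: loading a handler class that is missing from the classpath fails with ClassNotFoundException, which getStorageHandler-style code wraps in an IOException. The demo class below is a hypothetical sketch, not Hive's implementation:

```java
import java.io.IOException;

// Sketch of how a bad classpath surfaces as the error in the logs above:
// Class.forName fails with ClassNotFoundException, which is wrapped in an
// IOException, which the semantic analyzer then reports.
public class StorageHandlerLoadDemo {
    static Object loadHandler(String className) throws IOException {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            throw new IOException("Error in loading storage handler." + className, e);
        } catch (ReflectiveOperationException e) {
            throw new IOException("Error instantiating storage handler " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            loadHandler("org.apache.hcatalog.hbase.HBaseHCatStorageHandler");
        } catch (IOException e) {
            // Without the hbase storage-handler jar on the classpath, this is
            // exactly the failure mode shown in the hcat logs.
            System.out.println(e.getCause() instanceof ClassNotFoundException); // true
        }
    }
}
```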





[jira] [Commented] (HIVE-6698) hcat.py script does not correctly load the hbase storage handler jars

2014-03-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943510#comment-13943510
 ] 

Sushanth Sowmyan commented on HIVE-6698:


Committed. Thanks, Deepesh!

 hcat.py script does not correctly load the hbase storage handler jars
 -

 Key: HIVE-6698
 URL: https://issues.apache.org/jira/browse/HIVE-6698
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6698.patch


 Currently queries using the HBaseHCatStorageHandler when run using hcat.py 
 fail. Example query
 {code}
 create table pig_hbase_1(key string, age string, gpa string)
 STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
 TBLPROPERTIES ('hbase.columns.mapping'=':key,info:age,info:gpa');
 {code}
 Following error is seen in the hcat logs:
 {noformat}
 2014-03-18 08:25:49,437 ERROR ql.Driver (SessionState.java:printError(541)) - 
 FAILED: SemanticException java.io.IOException: Error in loading storage 
 handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
 org.apache.hadoop.hive.ql.parse.SemanticException: java.io.IOException: Error 
 in loading storage handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.CreateTableHook.postAnalyze(CreateTableHook.java:208)
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzer.postAnalyze(HCatSemanticAnalyzer.java:242)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:295)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:949)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:997)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:885)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:875)
   at org.apache.hive.hcatalog.cli.HCatDriver.run(HCatDriver.java:43)
   at org.apache.hive.hcatalog.cli.HCatCli.processCmd(HCatCli.java:259)
   at org.apache.hive.hcatalog.cli.HCatCli.processLine(HCatCli.java:213)
   at org.apache.hive.hcatalog.cli.HCatCli.main(HCatCli.java:172)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.io.IOException: Error in loading storage 
 handler.org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at 
 org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:432)
   at 
 org.apache.hive.hcatalog.cli.SemanticAnalysis.CreateTableHook.postAnalyze(CreateTableHook.java:199)
   ... 16 more
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hcatalog.hbase.HBaseHCatStorageHandler
   at java.net.URLClassLoader$1.run(Unknown Source)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at java.lang.ClassLoader.loadClass(Unknown Source)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Unknown Source)
   at 
 org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:426)
   ... 17 more
 {noformat}
 The problem is that the hbaseStorageJar is incorrect with the merging of hcat 
 into hive. Also as per HIVE-6695 we should add the HBASE_LIB in the classpath.





[jira] [Commented] (HIVE-6724) HCatStorer throws ClassCastException while storing tinyint/smallint data

2014-03-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943524#comment-13943524
 ] 

Eugene Koifman commented on HIVE-6724:
--

The use case is Hive-Pig-Hive.
HCatLoader automatically sets hcat.data.tiny.small.int.promotion=true, as that 
is required by Pig. HCatStorer did the opposite for no good reason. Since Pig 
doesn't evaluate each statement separately, the Storer action clobbered the Loader 
action (the context that contains the configuration is shared). I changed the 
Storer not to do that.
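The clobbering can be sketched with a plain map standing in for the shared configuration; the setup method names below are hypothetical, only the property name comes from the comment above:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: Loader and Storer share one configuration, so a Storer that
// force-sets the promotion flag to false undoes what the Loader set.
// The fix is for the Storer to leave the flag alone.
public class SharedConfDemo {
    static final String PROMOTION = "hcat.data.tiny.small.int.promotion";

    static void loaderSetup(Map<String, String> conf) {
        conf.put(PROMOTION, "true"); // required by Pig, set by HCatLoader
    }

    static void buggyStorerSetup(Map<String, String> conf) {
        conf.put(PROMOTION, "false"); // clobbers the loader's setting
    }

    static void fixedStorerSetup(Map<String, String> conf) {
        // after the fix: do not touch the promotion flag
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        loaderSetup(conf);
        fixedStorerSetup(conf);
        System.out.println(conf.get(PROMOTION)); // true - loader setting survives
    }
}
```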

 HCatStorer throws ClassCastException while storing tinyint/smallint data
 

 Key: HIVE-6724
 URL: https://issues.apache.org/jira/browse/HIVE-6724
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 given Hive tables:
 1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
 2) create table all100k (si smallint, ti tinyint) STORED ;
 the following sequence of steps (assuming there is data in all100k)
 {noformat}
 a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
 b = foreach a generate si;
 store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();
 {noformat}
 produces 
 {noformat}
 org.apache.hadoop.mapred.YarnChild: Exception running child : 
 java.lang.ClassCastException: java.lang.Short cannot be cast to 
 java.lang.Integer
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
   at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
   at 
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
   at 
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 {noformat}





[jira] [Commented] (HIVE-6708) ConstantVectorExpression should create copies of data objects rather than referencing them

2014-03-21 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943523#comment-13943523
 ] 

Jitendra Nath Pandey commented on HIVE-6708:


+1

 ConstantVectorExpression should create copies of data objects rather than 
 referencing them
 --

 Key: HIVE-6708
 URL: https://issues.apache.org/jira/browse/HIVE-6708
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6708-1.patch, HIVE-6708.2.patch


 1. ConstantVectorExpression vector should be updated for bytecolumnvectors 
 and decimalColumnVectors. The current code changes the reference to the 
 vector which might be shared across multiple columns
 2. VectorizationContext.foldConstantsForUnaryExpression(ExprNodeDesc 
 exprDesc) has a minor bug as to when to constant fold the expression.
 The following code should replace the corresponding piece of code in the 
 trunk.
 ..
 GenericUDF gudf = ((ExprNodeGenericFuncDesc) exprDesc).getGenericUDF();
 if (gudf instanceof GenericUDFOPNegative || gudf instanceof 
 GenericUDFOPPositive
 || castExpressionUdfs.contains(gudf.getClass())
 ... 





[jira] [Updated] (HIVE-6395) multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6395:


Attachment: HIVE-6395.patch

 multi-table insert from select transform fails if optimize.ppd enabled
 --

 Key: HIVE-6395
 URL: https://issues.apache.org/jira/browse/HIVE-6395
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6395.patch, test.py


 {noformat}
 set hive.optimize.ppd=true;
 add file ./test.py;
 from (select transform(test.*) using 'python ./test.py'
 as id,name,state from test) t0
 insert overwrite table test2 select * where state=1
 insert overwrite table test3 select * where state=2;
 {noformat}
 In the above example, the select transform returns an extra column, and that 
 column is used in the where clause of the multi-insert selects.  However, if 
 optimization is on, the query plan is wrong:
 filter (state=1 and state=2) //impossible
 -- select, insert into test1
 -- select, insert into test2
 The correct query plan for hive.optimize.ppd=false is:
 filter (state=1)
 -- select, insert into test1
 filter (state=2)
 -- select, insert into test2
 For reference
 {noformat}
 create table test (id int, name string)
 create table test2(id int, name string, state int)
 create table test3(id int, name string, state int)
 {noformat}





[jira] [Updated] (HIVE-6724) HCatStorer throws ClassCastException while storing tinyint/smallint data

2014-03-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6724:
-

Attachment: HIVE-6724.patch

 HCatStorer throws ClassCastException while storing tinyint/smallint data
 

 Key: HIVE-6724
 URL: https://issues.apache.org/jira/browse/HIVE-6724
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6724.patch


 given Hive tables:
 1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
 2) create table all100k (si smallint, ti tinyint) STORED ;
 the following sequence of steps (assuming there is data in all100k)
 {noformat}
 a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
 b = foreach a generate si;
 store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();
 {noformat}
 produces 
 {noformat}
 org.apache.hadoop.mapred.YarnChild: Exception running child : 
 java.lang.ClassCastException: java.lang.Short cannot be cast to 
 java.lang.Integer
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
   at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
   at 
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
   at 
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 {noformat}





[jira] [Updated] (HIVE-6724) HCatStorer throws ClassCastException while storing tinyint/smallint data

2014-03-21 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6724:
-

Status: Patch Available  (was: Open)

 HCatStorer throws ClassCastException while storing tinyint/smallint data
 

 Key: HIVE-6724
 URL: https://issues.apache.org/jira/browse/HIVE-6724
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-6724.patch


 given Hive tables:
 1) create table pig_hcatalog_1 (si smallint)  STORED AS TEXTFILE;
 2) create table all100k (si smallint, ti tinyint) STORED ;
 the following sequence of steps (assuming there is data in all100k)
 {noformat}
 a=load 'all100k' using org.apache.hive.hcatalog.pig.HCatLoader();
 b = foreach a generate si;
 store b into 'pig_hcatalog_1' using org.apache.hive.hcatalog.pig.HCatStorer();
 {noformat}
 produces 
 {noformat}
 org.apache.hadoop.mapred.YarnChild: Exception running child : 
 java.lang.ClassCastException: java.lang.Short cannot be cast to 
 java.lang.Integer
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.getJavaObj(HCatBaseStorer.java:372)
   at 
 org.apache.hive.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:306)
   at org.apache.hive.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:61)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
   at 
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
   at 
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 {noformat}





Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/
---

Review request for hive.


Repository: hive-git


Description
---

In this scenario, PPD on the script (transform) operator did the following 
wrong predicate pushdown:

script -- filter (state=1)
   -- select, insert into test1
   --filter (state=2)
   -- select, insert into test2

into:

script -- filter (state=1 and state=2)   //not possible.
 -- select, insert into test1
 -- select, insert into test2


The bug was a combination of two things: first, that these filters got chosen by 
FilterPPD as pushdown candidates, and second, that ScriptPPD called the sequence 
mergeWithChildrenPred/createFilters(pred), which did the above transformation.  
ScriptPPD was one of the few simple operators that did this; I tried some other 
combinations, like extract (see my added test in transform_ppr2.q), and also just 
a select operator.

The fix is to skip marking a predicate as a 'candidate' for pushdown if its 
operator is a sibling of another filter.  We still want to push down children of 
select transform with grandchildren, etc.
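A toy model of the sibling check described above; Operator and isPushdownCandidate are hypothetical stand-ins for Hive's operator-tree types, not the actual OpProcFactory code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a filter's predicate may be marked a pushdown candidate only when
// no sibling of its operator is also a filter. Otherwise merging candidates
// up through the shared parent would AND mutually exclusive predicates
// (state=1 AND state=2), producing the impossible plan above.
public class SiblingFilterCheck {
    static class Operator {
        final String name;
        final boolean isFilter;
        Operator parent;
        final List<Operator> children = new ArrayList<>();

        Operator(String name, boolean isFilter) {
            this.name = name;
            this.isFilter = isFilter;
        }

        void addChild(Operator child) {
            child.parent = this;
            children.add(child);
        }
    }

    static boolean isPushdownCandidate(Operator filter) {
        if (filter.parent == null) {
            return true;
        }
        for (Operator sibling : filter.parent.children) {
            if (sibling != filter && sibling.isFilter) {
                return false; // sibling filter: do not mark as candidate
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Operator script = new Operator("script", false);
        Operator f1 = new Operator("filter state=1", true);
        Operator f2 = new Operator("filter state=2", true);
        script.addChild(f1);
        script.addChild(f2);
        System.out.println(isPushdownCandidate(f1)); // false - has a filter sibling
    }
}
```

A lone filter under the script operator would still be a candidate, preserving pushdown in the single-insert case.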


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
  ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
  ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
  ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 

Diff: https://reviews.apache.org/r/19549/diff/


Testing
---

Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
issue with an extract (cluster) operator in transform_ppr2.q.  Ran the other 
transform_ppd and general ppd tests to ensure no regression.


Thanks,

Szehon Ho



Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/
---

(Updated March 21, 2014, 9:01 p.m.)


Review request for hive.


Changes
---

Cleaned up comments.


Repository: hive-git


Description (updated)
---

In this scenario, PPD on the script (transform) operator did the following 
wrong predicate pushdown:

script -- filter (state=1)
   -- select, insert into test1
   --filter (state=2)
   -- select, insert into test2

into:

script -- filter (state=1 and state=2)   //not possible.
 -- select, insert into test1
 -- select, insert into test2


The bug was a combination of two things: first, that these filters got chosen by 
FilterPPD as 'candidate' pushdown predicates, and second, that ScriptPPD called 
mergeWithChildrenPred + createFilters, which did the above transformation due 
to them being marked.  

ScriptPPD was one of the few simple operators that did this; I tried some 
other combinations, like extract (see my added test in transform_ppr2.q), and also 
just a select operator, and could not reproduce the issue with those.

The fix is to skip marking a predicate as a 'candidate' for pushdown if its 
operator is a sibling of another filter.  We still want to push down children of 
select transform with grandchildren, etc.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
  ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
  ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
  ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 

Diff: https://reviews.apache.org/r/19549/diff/


Testing
---

Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
issue with an extract (cluster) operator in transform_ppr2.q.  Ran the other 
transform_ppd and general ppd tests to ensure no regression.


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-6395) multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6395:


Affects Version/s: 0.13.0
   Status: Patch Available  (was: Open)

 multi-table insert from select transform fails if optimize.ppd enabled
 --

 Key: HIVE-6395
 URL: https://issues.apache.org/jira/browse/HIVE-6395
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6395.patch, test.py


 {noformat}
 set hive.optimize.ppd=true;
 add file ./test.py;
 from (select transform(test.*) using 'python ./test.py'
 as id,name,state from test) t0
 insert overwrite table test2 select * where state=1
 insert overwrite table test3 select * where state=2;
 {noformat}
 In the above example, the select transform returns an extra column, and that 
 column is used in where clause of the multi-insert selects.  However, if 
 optimize is on, the query plan is wrong:
 filter (state=1 and state=2) //impossible
 -- select, insert into test1
 -- select, insert into test2
 The correct query plan for hive.optimize.ppd=false is:
 filter (state=1)
 -- select, insert into test1
 filter (state=2)
 -- select, insert into test2
 For reference
 {noformat}
 create table test (id int, name string)
 create table test2(id int, name string, state int)
 create table test3(id int, name string, state int)
 {noformat}
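 The attached test.py is not reproduced in this thread; a hypothetical stand-in 
 for it, following Hive's TRANSFORM protocol (tab-separated rows on stdin, 
 tab-separated rows on stdout), might look like this -- the alternating state 
 values are only there to exercise both insert branches:

```python
# Hypothetical stand-in for the attached test.py (the real attachment is
# not shown in this thread): a Hive TRANSFORM script consumes tab-separated
# (id, name) rows and must emit id, name, state tab-separated.
def transform(lines):
    """Append a fabricated 'state' column to each (id, name) input row."""
    out = []
    for i, line in enumerate(lines):
        id_, name = line.rstrip("\n").split("\t")
        state = 1 if i % 2 == 0 else 2   # alternate states for the repro
        out.append(f"{id_}\t{name}\t{state}")
    return out

# In the real script the loop would read sys.stdin and print each row.
print(transform(["1\ta\n", "2\tb\n"]))
```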



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/
---

(Updated March 21, 2014, 9:05 p.m.)


Review request for hive.


Repository: hive-git


Description (updated)
---

In this scenario, PPD on the script (transform) operator did the following 
wrong predicate pushdown:

script -- filter (state=1)
   -- select, insert into test1
   --filter (state=2)
   -- select, insert into test2

into:

script -- filter (state=1 and state=2)   //not possible.
 -- select, insert into test1
 -- select, insert into test2


The bug was a combination of two things: first, these filters got chosen by 
FilterPPD as 'candidate' pushdown predicates; second, ScriptPPD called 
mergeWithChildrenPred + createFilters, which did the above transformation 
because they were marked.

ScriptPPD was one of the few simple operators that did this; I tried some 
other parent operators like extract (see my added test in transform_ppr2.q) and 
also just a select operator, and could not reproduce the issue with those.

The fix is to skip marking a predicate as a pushdown 'candidate' if it 
is a sibling of another filter.  We still want to push down children of the 
transform operator with grandchildren, etc.
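The merge described above can be sketched outside Hive entirely; this is a 
plain-Python model with made-up rows, not Hive code, showing why AND-ing the 
two sibling predicates yields an unsatisfiable filter:

```python
# Toy model of the PPD bug: one script operator feeding two insert
# branches, each guarded by its own sibling filter.
rows = [
    {"id": 1, "name": "a", "state": 1},
    {"id": 2, "name": "b", "state": 2},
]

branch_preds = [
    lambda r: r["state"] == 1,   # filter feeding test2
    lambda r: r["state"] == 2,   # filter feeding test3
]

# Correct plan: each branch applies only its own predicate.
per_branch = [[r for r in rows if p(r)] for p in branch_preds]

# Buggy plan: sibling predicates merged with AND and pushed above the
# branches -- (state=1 AND state=2) can never hold, so nothing survives.
merged = [r for r in rows if all(p(r) for p in branch_preds)]

print(len(per_branch[0]), len(per_branch[1]), len(merged))  # 1 1 0
```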


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
  ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
  ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
  ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 

Diff: https://reviews.apache.org/r/19549/diff/


Testing
---

Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
issue with an extract (cluster) operator in transform_ppr2.q.  Ran other 
transform_ppd and general ppd tests to ensure no regression.


Thanks,

Szehon Ho



[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943552#comment-13943552
 ] 

Ashutosh Chauhan commented on HIVE-6687:


[~jpullokkaran] Can you also update RB with latest patch?

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
  Labels: documentation
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 Fix is to fix resultsetschema in semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943568#comment-13943568
 ] 

Lefty Leverenz commented on HIVE-6687:
--

The documentation of *hive.resultset.use.unique.column.names* looks good to me. 
 When the time comes I'll add it to the wiki.

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
  Labels: documentation
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 Fix is to fix resultsetschema in semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6692) Location for new table or partition should be a write entity

2014-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943569#comment-13943569
 ] 

Hive QA commented on HIVE-6692:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635255/HIVE-6692.1.patch.txt

{color:red}ERROR:{color} -1 due to 60 failed/errored test(s), 5437 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_if_with_path_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_concatenate_inherit_table_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_like
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_database_removes_partition_dirs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_index_removes_partition_dirs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_table_removes_partition_dirs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_00_nonpart_empty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_01_nonpart
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_02_00_part_empty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_02_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_03_nonpart_over_compat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_04_all_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_04_evolved_parts
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_05_some_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_06_one_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_07_all_part_over_nonoverlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_08_nonpart_rename
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_09_part_spec_nonoverlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_10_external_managed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_11_managed_external
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_12_external_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_13_managed_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_14_managed_location_over_existing
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_15_external_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_16_part_external
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_17_part_managed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_18_part_external
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_19_00_part_external_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_19_part_external_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_20_part_managed_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_22_import_exist_authsuccess
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_23_import_part_authsuccess
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_24_import_nonexist_authsuccess
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_exim_hidden_files
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_hook_context_cs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_overwrite_local_directory_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insertexternal1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_fs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_fs_overwrite
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_external_partition_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_table_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_delimited
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_file_with_header_footer
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_import_exported_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority

Re: Review Request 19503: JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread John Pullokkaran


 On March 20, 2014, 10:32 p.m., Harish Butani wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 9425
  https://reviews.apache.org/r/19503/diff/1/?file=530722#file530722line9425
 
  Can you verify that there is no issue if table/column names have the 
  '.' character. Sounds like jdbc treats column names as a string, so this 
  should be ok.

1. JDBC ResultSet treats label as string.
2. I don't think the qualifier would get anything more than the table name, and 
hence there should only be one '.'.
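
The label behavior in point 1 can be sketched with a toy model (plain Python, 
not HiveServer2 or JDBC driver code; the names here are made up): the qualified 
projection name is kept as a single opaque string key, so a '.' inside the 
label never triggers a table-then-column lookup.

```python
# Toy model of label-based retrieval after the fix: the result-set
# schema maps the full qualified label to a column position.
row = (10, 20)
labels = {"r1.x": 0, "r2.x": 1}   # qualified label -> zero-based position

def get_int(label):
    # The label is treated as one plain string; "r1.x" is a single key,
    # never split into a table part and a column part.
    return row[labels[label]]

print(get_int("r1.x"))   # retrieval by qualified label
print(row[0])            # the positional workaround available before the fix
```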


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19503/#review37998
---


On March 20, 2014, 10:24 p.m., Harish Butani wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19503/
 ---
 
 (Updated March 20, 2014, 10:24 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: hive-6687
 https://issues.apache.org/jira/browse/hive-6687
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 JDBC ResultSet fails to get value by qualified projection name
 
 
 Diffs
 -
 
   
 itests/hive-unit/src/test/java/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java
  dac62d5 
   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
 c91df83 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java e1e427f 
 
 Diff: https://reviews.apache.org/r/19503/diff/
 
 
 Testing
 ---
 
   
 
 
 Thanks,
 
 Harish Butani
 




[jira] [Commented] (HIVE-1362) Optimizer statistics on columns in tables and partitions

2014-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943576#comment-13943576
 ] 

Sergey Shelukhin commented on HIVE-1362:


This jira adds, but doesn't use, decimal fields in the schema... I am going to 
reuse them for HIVE-6701. We probably cannot use a decimal column due to Derby 
limitations (31 max precision, while Hive allows 38), so between string and 
binary there might not be a difference that matters.
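
The precision mismatch can be checked with Python's decimal module (an 
illustration only, using the 31 vs. 38 limits quoted above): a 38-digit value 
cannot round-trip through a 31-precision decimal, while a string 
representation is lossless.

```python
# Illustration of the quoted limits: Derby DECIMAL caps precision at 31,
# Hive decimal allows 38 digits, so a string column is the safe store.
from decimal import Decimal, getcontext

getcontext().prec = 38
v = Decimal("9" * 38)            # 38 significant digits, max for Hive

as_string = str(v)               # lossless round-trip via string
assert Decimal(as_string) == v

getcontext().prec = 31
truncated = +v                   # unary plus rounds to current precision
print(truncated == v)            # the low-order digits were lost
```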

 Optimizer statistics on columns in tables and partitions
 

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Fix For: 0.10.0

 Attachments: HIVE-1362-gen_thrift.1.patch.txt, 
 HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
 HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
 HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362.1.patch.txt, 
 HIVE-1362.10.patch.txt, HIVE-1362.11.patch.txt, HIVE-1362.2.patch.txt, 
 HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
 HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
 HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, 
 HIVE-1362_gen-thrift.10.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
 HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6723) Tez golden files need to be updated

2014-03-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6723:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to 0.13  trunk.

 Tez golden files need to be updated
 ---

 Key: HIVE-6723
 URL: https://issues.apache.org/jira/browse/HIVE-6723
 Project: Hive
  Issue Type: Task
  Components: Tests, Tez
Affects Versions: 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.13.0

 Attachments: HIVE-6723.patch


 Golden files are out of date.
 NO PRECOMMIT TESTS
 since these are purely .q.out changes



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943587#comment-13943587
 ] 

Laljo John Pullokkaran commented on HIVE-6687:
--

Review Board: https://reviews.apache.org/r/19551/

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
  Labels: documentation
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 Fix is to fix resultsetschema in semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6701:
---

Status: Patch Available  (was: Open)

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Sergey Shelukhin
 Attachments: HIVE-6701.02.patch, HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-21 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6701:
---

Attachment: HIVE-6701.02.patch

the metastore work and the q file.
Note that fields for decimal (varchar) already existed in the schema since 
HIVE-1362, they just weren't used. So upgrade scripts are not necessary.
Most of the patch is generated code...

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Sergey Shelukhin
 Attachments: HIVE-6701.02.patch, HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 19552: HIVE-6701 Analyze table compute statistics for decimal columns.

2014-03-21 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19552/
---

Review request for hive and Jitendra Pandey.


Repository: hive-git


Description
---

See JIRA


Diffs
-

  data/files/decimal.txt PRE-CREATION 
  metastore/if/hive_metastore.thrift b3f01d6 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h d0998e0 
  metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 59ac959 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ColumnStatisticsData.java
 848188a 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Decimal.java
 PRE-CREATION 
  
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DecimalColumnStatsData.java
 PRE-CREATION 
  metastore/src/gen/thrift/gen-php/metastore/Types.php 39062f9 
  metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 2e9f238 
  metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb b768b7f 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
325aa8b 
  metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 
af54095 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
 eb23cf9 
  
metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
 c7ac9b9 
  metastore/src/model/package.jdo 158fdcd 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java 99b062f 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/DecimalNumDistinctValueEstimator.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 
7348478 
  ql/src/test/queries/clientpositive/compute_stats_decimal.q PRE-CREATION 
  ql/src/test/results/clientpositive/compute_stats_decimal.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/19552/diff/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Commented] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943597#comment-13943597
 ] 

Sergey Shelukhin commented on HIVE-6701:


https://reviews.apache.org/r/19552

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Sergey Shelukhin
 Attachments: HIVE-6701.02.patch, HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/#review38215
---



ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
https://reviews.apache.org/r/19549/#comment70234

Just for my understanding, for the given example, what's the filterOp, 
what's the parent, and what are the siblings?


- Xuefu Zhang


On March 21, 2014, 9:05 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19549/
 ---
 
 (Updated March 21, 2014, 9:05 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 In this scenario, PPD on the script (transform) operator did the following 
 wrong predicate pushdown:
 
 script -- filter (state=1)
-- select, insert into test1
--filter (state=2)
-- select, insert into test2
 
 into:
 
 script -- filter (state=1 and state=2)   //not possible.
  -- select, insert into test1
  -- select, insert into test2
 
 
 The bug was a combination of two things, first that these filters got chosen 
 by FilterPPD as 'candidate' pushdown predicates, and that the ScriptPPD 
 called  mergeWithChildrenPred + createFilters which did the above 
 transformation due to them being marked.  
 
 ScriptPPD was one of the few simple operators that did this; I tried some 
 other parent operators like extract (see my added test in transform_ppr2.q) 
 and also just a select operator, and could not reproduce the issue with those.
 
 The fix is to skip marking a predicate as a 'candidate' for the pushdown if 
 it is a sibling of another filter.  We still want to pushdown children of 
 transform-operator with grandchildren, etc.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
   ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
   ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
   ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 
 
 Diff: https://reviews.apache.org/r/19549/diff/
 
 
 Testing
 ---
 
 Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
 issue with an extract (cluster) operator in transform_ppr2.q.  Ran other 
 transform_ppd and general ppd tests to ensure no regression.
 
 
 Thanks,
 
 Szehon Ho
 




Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho


 On March 21, 2014, 9:50 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 180
  https://reviews.apache.org/r/19549/diff/1/?file=531817#file531817line180
 
  Just for my understanding, for the given example, what's the filterOp, 
  what's the parent, and what are the siblings?

Hi Xuefu, thanks for looking.  Like in my ascii diagram above, filter op is the 
(Filter).  The parent is the script operator.


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/#review38215
---


On March 21, 2014, 9:05 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19549/
 ---
 
 (Updated March 21, 2014, 9:05 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 In this scenario, PPD on the script (transform) operator did the following 
 wrong predicate pushdown:
 
 script -- filter (state=1)
-- select, insert into test1
--filter (state=2)
-- select, insert into test2
 
 into:
 
 script -- filter (state=1 and state=2)   //not possible.
  -- select, insert into test1
  -- select, insert into test2
 
 
 The bug was a combination of two things, first that these filters got chosen 
 by FilterPPD as 'candidate' pushdown predicates, and that the ScriptPPD 
 called  mergeWithChildrenPred + createFilters which did the above 
 transformation due to them being marked.  
 
 ScriptPPD was one of the few simple operators that did this; I tried some 
 other parent operators like extract (see my added test in transform_ppr2.q) 
 and also just a select operator, and could not reproduce the issue with those.
 
 The fix is to skip marking a predicate as a 'candidate' for the pushdown if 
 it is a sibling of another filter.  We still want to pushdown children of 
 transform-operator with grandchildren, etc.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
   ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
   ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
   ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 
 
 Diff: https://reviews.apache.org/r/19549/diff/
 
 
 Testing
 ---
 
 Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
 issue with an extract (cluster) operator in transform_ppr2.q.  Ran other 
 transform_ppd and general ppd tests to ensure no regression.
 
 
 Thanks,
 
 Szehon Ho
 




[jira] [Commented] (HIVE-6395) multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943607#comment-13943607
 ] 

Szehon Ho commented on HIVE-6395:
-

Actually, I just saw your fix for HIVE-4293; is it a more complete fix for the 
same situation?

 multi-table insert from select transform fails if optimize.ppd enabled
 --

 Key: HIVE-6395
 URL: https://issues.apache.org/jira/browse/HIVE-6395
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6395.patch, test.py


 {noformat}
 set hive.optimize.ppd=true;
 add file ./test.py;
 from (select transform(test.*) using 'python ./test.py'
 as id,name,state from test) t0
 insert overwrite table test2 select * where state=1
 insert overwrite table test3 select * where state=2;
 {noformat}
 In the above example, the select transform returns an extra column, and that 
 column is used in where clause of the multi-insert selects.  However, if 
 optimize is on, the query plan is wrong:
 filter (state=1 and state=2) //impossible
 -- select, insert into test1
 -- select, insert into test2
 The correct query plan for hive.optimize.ppd=false is:
 filter (state=1)
 -- select, insert into test1
 filter (state=2)
 -- select, insert into test2
 For reference
 {noformat}
 create table test (id int, name string)
 create table test2(id int, name string, state int)
 create table test3(id int, name string, state int)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-21 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943611#comment-13943611
 ] 

Ashutosh Chauhan commented on HIVE-6687:


+1 LGTM

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
  Labels: documentation
 Fix For: 0.12.1

 Attachments: HIVE-6687.3.patch


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 Fix is to fix resultsetschema in semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled

2014-03-21 Thread Xuefu Zhang


 On March 21, 2014, 9:50 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 180
  https://reviews.apache.org/r/19549/diff/1/?file=531817#file531817line180
 
  Just for my understanding, for the given example, what's the filterOp, 
  what's the parent, and what are the siblings?
 
 Szehon Ho wrote:
 Hi Xuefu, thanks for looking.  Like in my ascii diagram above, filter op 
 is the (Filter).  The parent is the script operator.

I guess script is the parent, based on your comments.


- Xuefu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/#review38215
---


On March 21, 2014, 9:05 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19549/
 ---
 
 (Updated March 21, 2014, 9:05 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 In this scenario, PPD on the script (transform) operator did the following 
 wrong predicate pushdown:
 
 script -- filter (state=1)
-- select, insert into test1
--filter (state=2)
-- select, insert into test2
 
 into:
 
 script -- filter (state=1 and state=2)   //not possible.
  -- select, insert into test1
  -- select, insert into test2
 
 
 The bug was a combination of two things, first that these filters got chosen 
 by FilterPPD as 'candidate' pushdown predicates, and that the ScriptPPD 
 called  mergeWithChildrenPred + createFilters which did the above 
 transformation due to them being marked.  
 
 ScriptPPD was one of the few simple operators that did this; I tried some 
 other parent operators like extract (see my added test in transform_ppr2.q) 
 and also just a select operator, and could not reproduce the issue with those.
 
 The fix is to skip marking a predicate as a 'candidate' for the pushdown if 
 it is a sibling of another filter.  We still want to pushdown children of 
 transform-operator with grandchildren, etc.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
   ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
   ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
   ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
   ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 
 
 Diff: https://reviews.apache.org/r/19549/diff/
 
 
 Testing
 ---
 
 Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar 
 issue with an extract (cluster) operator in transform_ppr2.q.  Ran other 
 transform_ppd and general ppd tests to ensure no regression.
 
 
 Thanks,
 
 Szehon Ho
 




[jira] [Commented] (HIVE-6701) Analyze table compute statistics for decimal columns.

2014-03-21 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13943626#comment-13943626
 ] 

Shreepadma Venugopalan commented on HIVE-6701:
--

The extra unused fields were added in HIVE-1362 precisely to avoid upgrading the 
schema.

 Analyze table compute statistics for decimal columns.
 -

 Key: HIVE-6701
 URL: https://issues.apache.org/jira/browse/HIVE-6701
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Sergey Shelukhin
 Attachments: HIVE-6701.02.patch, HIVE-6701.1.patch


 Analyze table should compute statistics for decimal columns as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

