date:20121002


[ 
https://issues.apache.org/jira/browse/HIVE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467523#comment-13467523
 ] 

Carl Steinbach commented on HIVE-3501:
--

+1. Will commit if tests pass.

 Track table and keys used in joins and group bys for logging
 

 Key: HIVE-3501
 URL: https://issues.apache.org/jira/browse/HIVE-3501
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Sambavi Muthukrishnan
Assignee: Sambavi Muthukrishnan
Priority: Minor
 Attachments: table_access_keys.1.patch, table_access_keys.2.patch, 
 table_access_keys.3.patch, table_access_keys.4.patch, 
 table_access_keys.5.patch

   Original Estimate: 96h
  Remaining Estimate: 96h

 For all operators that could benefit from bucketing, it is useful to track 
 and log the table names and key column names needed to convert the operator 
 to its bucketed version. This task tracks that information for joins and 
 group bys when the keys can be mapped directly back to table scans and to 
 columns on those tables. The information will be tracked on the QueryPlan 
 object so that it is available to any pre/post execution hooks for logging.
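
A minimal sketch of the kind of bookkeeping described above, assuming a plain map from operator to table to key columns. The class name TableAccessSketch, its methods, and the operator labels (JOIN_1, GBY_2) are hypothetical; none of them are taken from the attached patches or from Hive's QueryPlan API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for "which operator used which table keys".
class TableAccessSketch {
    // operator label -> (table name -> key columns used by that operator)
    private final Map<String, Map<String, List<String>>> access = new LinkedHashMap<>();

    // Record that an operator uses the given key columns of a table.
    void record(String operator, String table, List<String> keys) {
        access.computeIfAbsent(operator, k -> new LinkedHashMap<>())
              .put(table, new ArrayList<>(keys));
    }

    // Key columns recorded for an operator/table pair, or an empty list.
    List<String> keysFor(String operator, String table) {
        Map<String, List<String>> byTable = access.get(operator);
        if (byTable == null || !byTable.containsKey(table)) {
            return Collections.emptyList();
        }
        return byTable.get(table);
    }

    // One-line rendering that a pre/post execution hook could write to a log.
    String toLogLine() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Map<String, List<String>>> op : access.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(op.getKey()).append(" -> ");
            boolean first = true;
            for (Map.Entry<String, List<String>> t : op.getValue().entrySet()) {
                if (!first) {
                    sb.append(", ");
                }
                sb.append(t.getKey()).append(t.getValue());
                first = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TableAccessSketch info = new TableAccessSketch();
        info.record("JOIN_1", "src", Arrays.asList("key"));
        info.record("JOIN_1", "src1", Arrays.asList("key"));
        info.record("GBY_2", "src", Arrays.asList("key", "value"));
        System.out.println(info.toLogLine());
    }
}
```

Insertion-ordered maps are used here only so that the logged line is stable from run to run.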

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HIVE-2821) union with two mapjoin will throw NPE

2012-10-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-2821.


Resolution: Cannot Reproduce

No longer reproducible on trunk.

 union with two mapjoin will throw NPE
 ---

 Key: HIVE-2821
 URL: https://issues.apache.org/jira/browse/HIVE-2821
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
 Environment: Linux zongren-VirtualBox 3.0.0-14-generic #23-Ubuntu SMP 
 Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
 java version 1.6.0_25
 hadoop-0.20.2-cdh3u0
 hive-0.7.0-cdh3u0
Reporter: caofangkun
Priority: Critical
  Labels: optimizer, ql, union

 create table src (key string, value string);
 create table src1 (key string, value string);
 select count(*) from (
 select /*+ mapjoin(b) */ a.*
 from src a
 join 
 src1 b
 on a.key=b.key
 where a.key=48
 union all
 select /*+ mapjoin(bb) */ aa.*
 from src aa
 join 
 src1 bb
 on aa.key=bb.key
 where aa.key=100
 ) t;
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:156)
 at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:553)
 at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:514)
 at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan(GenMapRedUtils.java:125)
 at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process(GenMRRedSink1.java:76)
 at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink3.process(GenMRRedSink3.java:64)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:55)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:6946)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7247)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:240)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:904)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:279)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:228)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:417)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:350)
 at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:451)
 at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:461)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:585)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)


[jira] [Updated] (HIVE-3495) For UDAFs, when generating a plan without map-side-aggregation, constant agg parameters will be replaced by ExprNodeColumnDesc

2012-10-02 Thread Namit Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3495:
-

   Resolution: Fixed
Fix Version/s: 0.10.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Yin

 For UDAFs, when generating a plan without map-side-aggregation, constant agg 
 parameters will be replaced by ExprNodeColumnDesc
 --

 Key: HIVE-3495
 URL: https://issues.apache.org/jira/browse/HIVE-3495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3495.1.patch.txt, HIVE-3495.2.patch.txt, 
 HIVE-3495.3.patch.txt, HIVE-3495.4.patch.txt


 For UDAFs, when generating a plan without map-side-aggregation, constant agg 
 parameters having ConstantObjectInspectors will be replaced by 
 ExprNodeColumnDescs. A UDFArgumentTypeException will be thrown if a UDAF needs 
 to check its parameters' types. 
 An example used to replay the error is 
 {code:sql}
 set hive.map.aggr=false;
 SELECT percentile_approx(cast(substr(src.value,5) AS double), 0.5) FROM src;
 {code}
 Here is the log 
 {code}
 2012-09-20 12:36:06,947 DEBUG exec.FunctionRegistry (FunctionRegistry.java:getGenericUDAFResolver(849)) - Looking up GenericUDAF: percentile_approx
 2012-09-20 12:36:06,952 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: UDFArgumentTypeException The second argument must be a constant, but double was passed instead.
 org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: The second argument must be a constant, but double was passed instead.
   at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileApprox.getEvaluator(GenericUDAFPercentileApprox.java:149)
   at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:774)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:2389)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanGroupByOperator(SemanticAnalyzer.java:2561)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlan1MR(SemanticAnalyzer.java:3341)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:6140)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6903)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7484)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:903)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:713)
   at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_replay(TestCliDriver.java:125)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at junit.framework.TestSuite.runTest(TestSuite.java:232)
   at junit.framework.TestSuite.run(TestSuite.java:227)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:520)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1060)
   at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:911)
 {code}
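
The failure mode reported in this issue can be pictured with a toy model: a UDAF that insists on a constant argument looks at how the argument is described, and rejects it once the constant has been swapped for a column reference. The types below (ArgInspector, ConstantArg, ColumnArg, ConstantCheckSketch) are hypothetical stand-ins and are not Hive's ObjectInspector or ExprNodeDesc classes.

```java
// Hypothetical stand-in for "how an argument is described to the UDAF".
interface ArgInspector {}

// A literal value known at plan time, e.g. the 0.5 in percentile_approx(x, 0.5).
final class ConstantArg implements ArgInspector {
    final double value;

    ConstantArg(double value) {
        this.value = value;
    }
}

// A reference to a column; the value is only known per row at run time.
final class ColumnArg implements ArgInspector {
    final String column;

    ColumnArg(String column) {
        this.column = column;
    }
}

class ConstantCheckSketch {
    // Mimics a UDAF that requires its second argument to be a constant.
    static double requireConstant(ArgInspector second) {
        if (!(second instanceof ConstantArg)) {
            throw new IllegalArgumentException("The second argument must be a constant");
        }
        return ((ConstantArg) second).value;
    }

    public static void main(String[] args) {
        // The constant is still a constant: accepted.
        System.out.println(requireConstant(new ConstantArg(0.5)));
        // The constant was replaced by a column reference: rejected.
        try {
            requireConstant(new ColumnArg("_col1"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```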


[jira] [Updated] (HIVE-2935) Implement HiveServer2


 [ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2935:
-

Attachment: HIVE-2935.1.notest.patch.txt

HIVE-2935.1.notest.patch.txt: patch w/o new qfile test outputs.



 Implement HiveServer2
 -

 Key: HIVE-2935
 URL: https://issues.apache.org/jira/browse/HIVE-2935
 Project: Hive
  Issue Type: New Feature
  Components: Server Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
  Labels: HiveServer2
 Attachments: HIVE-2935.1.notest.patch.txt





[jira] [Updated] (HIVE-2935) Implement HiveServer2


 [ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2935:
-

Attachment: HIVE-2935.2.notest.patch.txt

Second patch excludes some test outputs missed in the first patch.

 Implement HiveServer2
 -

 Key: HIVE-2935
 URL: https://issues.apache.org/jira/browse/HIVE-2935
 Project: Hive
  Issue Type: New Feature
  Components: Server Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
  Labels: HiveServer2
 Attachments: HIVE-2935.1.notest.patch.txt, 
 HIVE-2935.2.notest.patch.txt





[jira] [Commented] (HIVE-3484) RetryingRawStore logic needs to be significantly reworked to support retries within transactions


[ 
https://issues.apache.org/jira/browse/HIVE-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467578#comment-13467578
 ] 

Hudson commented on HIVE-3484:
--

Integrated in Hive-trunk-h0.21 #1714 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1714/])
HIVE-3484. RetryingRawStore logic needs to be significantly reworked to 
support retries within transactions (Jean Xu via kevinwilfong) (Revision 
1392491)

 Result = FAILURE
kevinwilfong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1392491
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IHMSHandler.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreInit.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java


 RetryingRawStore logic needs to be significantly reworked to support retries 
 within transactions
 

 Key: HIVE-3484
 URL: https://issues.apache.org/jira/browse/HIVE-3484
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jean Xu
Assignee: Jean Xu
 Fix For: 0.10.0


 The logic for retrying has been moved from RetryingRawStore to 
 RetryingHMSHandler. 


Hive-trunk-h0.21 - Build # 1714 - Still Failing

Changes for Build #1708

Changes for Build #1709
[namit] HIVE-3515 metadata_export_drop.q causes failure of other tests
(Ivan Gorbachev via namit)


Changes for Build #1710

Changes for Build #1711
[heyongqiang] HIVE-2206:add a new optimizer for query correlation discovery and 
optimization (Yin Huai via He Yongqiang)

[namit] HIVE-1367 cluster by multiple columns does not work if parenthesis is 
present
(Zhenxiao Luo via namit)


Changes for Build #1712
[cws] add instrumentation to capture if there is skew in reducers (Arun Dobriya 
via cws)

[namit] HIVE-3493 aggName of SemanticAnalyzer.getGenericUDAFEvaluator is 
generated in two
different ways (Yin Huai via namit)

[heyongqiang] revert r1392105 due to bylaw requirement mentioned by Carl 
Steinbach


Changes for Build #1713

Changes for Build #1714
[kevinwilfong] HIVE-3484. RetryingRawStore logic needs to be significantly 
reworked to support retries within transactions (Jean Xu via kevinwilfong)




1 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... 
-Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
at junit.framework.Assert.fail(Assert.java:47)
at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:11512)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1714)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1714/ to 
view the results.

[jira] [Commented] (HIVE-3495) For UDAFs, when generating a plan without map-side-aggregation, constant agg parameters will be replaced by ExprNodeColumnDesc


[ 
https://issues.apache.org/jira/browse/HIVE-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467763#comment-13467763
 ] 

Hudson commented on HIVE-3495:
--

Integrated in Hive-trunk-h0.21 #1715 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1715/])
HIVE-3495 For UDAFs, when generating a plan without map-side-aggregation, 
constant 
agg parameters will be replaced by ExprNodeColumnDesc (Yin Huai via namit) 
(Revision 1392761)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1392761
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx.q
* /hive/trunk/ql/src/test/results/clientpositive/count.q.out
* /hive/trunk/ql/src/test/results/clientpositive/nullgroup.q.out
* /hive/trunk/ql/src/test/results/clientpositive/nullgroup2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/nullgroup4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/nullgroup4_multi_distinct.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx.q.out


 For UDAFs, when generating a plan without map-side-aggregation, constant agg 
 parameters will be replaced by ExprNodeColumnDesc
 --

 Key: HIVE-3495
 URL: https://issues.apache.org/jira/browse/HIVE-3495
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3495.1.patch.txt, HIVE-3495.2.patch.txt, 
 HIVE-3495.3.patch.txt, HIVE-3495.4.patch.txt



Hive-trunk-h0.21 - Build # 1715 - Still Failing

Changes for Build #1708

Changes for Build #1709
[namit] HIVE-3515 metadata_export_drop.q causes failure of other tests
(Ivan Gorbachev via namit)


Changes for Build #1710

Changes for Build #1711
[heyongqiang] HIVE-2206:add a new optimizer for query correlation discovery and 
optimization (Yin Huai via He Yongqiang)

[namit] HIVE-1367 cluster by multiple columns does not work if parenthesis is 
present
(Zhenxiao Luo via namit)


Changes for Build #1712
[cws] add instrumentation to capture if there is skew in reducers (Arun Dobriya 
via cws)

[namit] HIVE-3493 aggName of SemanticAnalyzer.getGenericUDAFEvaluator is 
generated in two
different ways (Yin Huai via namit)

[heyongqiang] revert r1392105 due to bylaw requirement mentioned by Carl 
Steinbach


Changes for Build #1713

Changes for Build #1714
[kevinwilfong] HIVE-3484. RetryingRawStore logic needs to be significantly 
reworked to support retries within transactions (Jean Xu via kevinwilfong)


Changes for Build #1715
[namit] HIVE-3495 For UDAFs, when generating a plan without 
map-side-aggregation, constant 
agg parameters will be replaced by ExprNodeColumnDesc (Yin Huai via namit)




1 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... 
-Dtest.silent=false to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
at junit.framework.Assert.fail(Assert.java:47)
at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:11512)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1715)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1715/ to 
view the results.

Re: Review Request: HIVE-2206: add a new optimizer for query correlation discovery and optimization

2012-10-02 Thread Yin Huai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
---

(Updated Oct. 2, 2012, 3:43 p.m.)


Review request for hive.


Changes
---

remove the first phase of the optimizer


Description
---

This optimizer exploits intra-query correlations and merges multiple correlated 
MapReduce jobs into one job. I opened a new request because I have been working 
on hive-git.
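
As a toy illustration of the idea, and not of the optimizer in this patch, the sketch below groups jobs that read the same table and shuffle on the same key, and reports each such group as a candidate for merging into one job. JobSketch, CorrelationSketch and the notion of a single "shuffle key" per job are simplifying assumptions made for this example; the actual CorrelationOptimizer works on Hive operator trees.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical job descriptor: which table a job reads and which key it shuffles on.
class JobSketch {
    final String name;
    final String inputTable;
    final String shuffleKey;

    JobSketch(String name, String inputTable, String shuffleKey) {
        this.name = name;
        this.inputTable = inputTable;
        this.shuffleKey = shuffleKey;
    }
}

class CorrelationSketch {
    // Returns groups of job names that share both input table and shuffle key.
    static List<List<String>> mergeCandidates(List<JobSketch> jobs) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (JobSketch job : jobs) {
            String signature = job.inputTable + "|" + job.shuffleKey;
            groups.computeIfAbsent(signature, k -> new ArrayList<>()).add(job.name);
        }
        List<List<String>> candidates = new ArrayList<>();
        for (List<String> group : groups.values()) {
            // A group of one has nothing to be merged with.
            if (group.size() > 1) {
                candidates.add(group);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<JobSketch> jobs = Arrays.asList(
                new JobSketch("join1", "src", "key"),
                new JobSketch("agg1", "src", "key"),
                new JobSketch("agg2", "src1", "value"));
        System.out.println(mergeCandidates(jobs));
    }
}
```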


This addresses bug HIVE-2206.
https://issues.apache.org/jira/browse/HIVE-2206


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8064c73
  conf/hive-default.xml.template 23762af
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 8c9bd26
  ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 652d81c
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 8a5df6f
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 919a140
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1469325
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 01b0728
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 40cc7ed
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java bce2a06
  ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 322f20b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125 
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040 
  ql/src/test/queries/clientpositive/correlationoptimizer1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/correlationoptimizer5.q PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer1.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer2.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer4.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/correlationoptimizer5.q.out PRE-CREATION 
  ql/src/test/results/compiler/plan/groupby1.q.xml 4382252 
  ql/src/test/results/compiler/plan/groupby2.q.xml eef669c 
  ql/src/test/results/compiler/plan/groupby3.q.xml 9743480 
  ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860 

Diff: https://reviews.apache.org/r/7126/diff/


Testing
---

All tests pass.


Thanks,

Yin Huai

[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2012-10-02 Thread Yin Huai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-2206:
---

Attachment: HIVE-2206.15-r1392491.patch.txt

 add a new optimizer for query correlation discovery and optimization
 

 Key: HIVE-2206
 URL: https://issues.apache.org/jira/browse/HIVE-2206
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: He Yongqiang
Assignee: Yin Huai
 Attachments: HIVE-2206.10-r1384442.patch.txt, 
 HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
 HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
 HIVE-2206.15-r1392491.patch.txt, HIVE-2206.1.patch.txt, 
 HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, 
 HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, 
 HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, 
 HIVE-2206.8-r1237253.patch.txt, testQueries.2.q, YSmartPatchForHive.patch


 reference:
 http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf


[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2012-10-02 Thread Yin Huai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-2206:
---

Status: Patch Available  (was: Reopened)

@Namit:
You can review the latest patch. I removed the first phase and other 
unnecessary contents. 

 add a new optimizer for query correlation discovery and optimization
 

 Key: HIVE-2206
 URL: https://issues.apache.org/jira/browse/HIVE-2206
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: He Yongqiang
Assignee: Yin Huai
 Attachments: HIVE-2206.10-r1384442.patch.txt, 
 HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
 HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
 HIVE-2206.15-r1392491.patch.txt, HIVE-2206.1.patch.txt, 
 HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, 
 HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, 
 HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, 
 HIVE-2206.8-r1237253.patch.txt, testQueries.2.q, YSmartPatchForHive.patch


 reference:
 http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf


Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #156

See 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/

--
[...truncated 10125 lines...]
 [echo] Project: odbc
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/odbc/src/conf
 does not exist.

ivy-resolve-test:
 [echo] Project: odbc

ivy-retrieve-test:
 [echo] Project: odbc

compile-test:
 [echo] Project: odbc

create-dirs:
 [echo] Project: serde
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/serde/src/test/resources
 does not exist.

init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:
 [echo] Project: serde
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-serde-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/report/org.apache.hive-hive-serde-default.html

ivy-retrieve:
 [echo] Project: serde

dynamic-serde:

compile:
 [echo] Project: serde

ivy-resolve-test:
 [echo] Project: serde

ivy-retrieve-test:
 [echo] Project: serde

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/service/src/test/resources
 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/service/test/classes

test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/156/artifact/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling

[jira] [Updated] (HIVE-3481) Resource leak: Hiveserver is not closing the existing driver handle before executing the next command. It results in to file handle leaks.

2012-10-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3481:
---

   Resolution: Fixed
Fix Version/s: (was: 0.9.1)
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Kanna!

 Resource leak: Hiveserver is not closing the existing driver handle before 
 executing the next command. It results in to file handle leaks.
 

 Key: HIVE-3481
 URL: https://issues.apache.org/jira/browse/HIVE-3481
 Project: Hive
  Issue Type: Bug
  Components: JDBC, ODBC, Server Infrastructure
Affects Versions: 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 0.10.0

 Attachments: HIVE-3481.1.patch.txt


 Close the driver object if it exists before creating another driver object. 
 A number of HiveServer and JDBC-related unit tests are failing because of 
 these file handle leaks.
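
 A minimal sketch of the fix described above, written in Python for brevity
 rather than Hive's Java. Driver, Session, and open_handles are hypothetical
 stand-ins and not HiveServer classes; the only point illustrated is that the
 previous driver is closed before the next one is created.

```python
class Driver:
    """Hypothetical stand-in for a per-command driver that holds a file handle."""

    open_handles = 0  # class-wide count of handles that are still open

    def __init__(self):
        Driver.open_handles += 1
        self.closed = False

    def close(self):
        # Idempotent close: release the handle only once.
        if not self.closed:
            Driver.open_handles -= 1
            self.closed = True


class Session:
    """Hypothetical server session that runs one command at a time."""

    def __init__(self):
        self.driver = None

    def execute(self, command):
        # Close the existing driver, if any, before creating another one.
        if self.driver is not None:
            self.driver.close()
        self.driver = Driver()
        return command


session = Session()
for cmd in ["show tables", "select 1", "describe t"]:
    session.execute(cmd)
```

 With the close call in place, only the most recent driver stays open no
 matter how many commands run; without it, each command would leave one more
 handle behind.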


[jira] [Updated] (HIVE-374) code cleanup for Driver


 [ 
https://issues.apache.org/jira/browse/HIVE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghotham Murthy updated HIVE-374:
--

Assignee: (was: Raghotham Murthy)

 code cleanup for Driver
 ---

 Key: HIVE-374
 URL: https://issues.apache.org/jira/browse/HIVE-374
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Namit Jain




[jira] [Work started] (HIVE-3522) Make separator for Entity name configurable


 [ 
https://issues.apache.org/jira/browse/HIVE-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3522 started by Raghotham Murthy.

 Make separator for Entity name configurable
 ---

 Key: HIVE-3522
 URL: https://issues.apache.org/jira/browse/HIVE-3522
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Raghotham Murthy
Assignee: Raghotham Murthy
Priority: Trivial

 Right now it's hard-coded to '@'


[jira] [Created] (HIVE-3522) Make separator for Entity name configurable

Raghotham Murthy created HIVE-3522:
--

 Summary: Make separator for Entity name configurable
 Key: HIVE-3522
 URL: https://issues.apache.org/jira/browse/HIVE-3522
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Raghotham Murthy
Assignee: Raghotham Murthy
Priority: Trivial


Right now it's hard-coded to '@'


[jira] [Commented] (HIVE-3501) Track table and keys used in joins and group bys for logging


[ 
https://issues.apache.org/jira/browse/HIVE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467931#comment-13467931
 ] 

Carl Steinbach commented on HIVE-3501:
--

@Sambavi: Please change the status of this ticket to patch submitted. Thanks.



[jira] [Updated] (HIVE-3501) Track table and keys used in joins and group bys for logging

2012-10-02 Thread Sambavi Muthukrishnan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sambavi Muthukrishnan updated HIVE-3501:


Status: Patch Available  (was: Open)


[jira] [Commented] (HIVE-3501) Track table and keys used in joins and group bys for logging

2012-10-02 Thread Sambavi Muthukrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467943#comment-13467943
 ] 

Sambavi Muthukrishnan commented on HIVE-3501:
-

Oops sorry. Just hit submit patch.


[jira] [Commented] (HIVE-3522) Make separator for Entity name configurable


[ 
https://issues.apache.org/jira/browse/HIVE-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467973#comment-13467973
 ] 

Raghotham Murthy commented on HIVE-3522:


patch here: https://reviews.facebook.net/D5793


[jira] [Commented] (HIVE-3458) Parallel test script doesnt run all tests


[ 
https://issues.apache.org/jira/browse/HIVE-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468050#comment-13468050
 ] 

Kevin Wilfong commented on HIVE-3458:
-

HIVE-3515 (https://issues.apache.org/jira/browse/HIVE-3515) seems to have fixed it.

 Parallel test script doesnt run all tests
 -

 Key: HIVE-3458
 URL: https://issues.apache.org/jira/browse/HIVE-3458
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9.0
Reporter: Sambavi Muthukrishnan
Assignee: Ivan Gorbachev
 Fix For: 0.10.0

 Attachments: ptest.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 When run on a cluster of machines in multi-threaded mode, the parallel test 
 script doesn't report back on all tests in the suite. 


[jira] [Resolved] (HIVE-3458) Parallel test script doesnt run all tests


 [ 
https://issues.apache.org/jira/browse/HIVE-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong resolved HIVE-3458.
-

Resolution: Fixed

Committed, thanks Ivan.


[jira] [Updated] (HIVE-3455) ANSI CORR(X,Y) is incorrect

2012-10-02 Thread Anonymous (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated HIVE-3455:


  Labels: patch  (was: )
Release Note: 
the patch for the
src/ql/src/java/org/apache/hadoop/hive/ql/udf/generic
  Status: Patch Available  (was: Reopened)

 ANSI CORR(X,Y) is incorrect
 ---

 Key: HIVE-3455
 URL: https://issues.apache.org/jira/browse/HIVE-3455
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.9.0, 0.8.1, 0.8.0, 0.7.1, 0.10.0, 0.9.1
Reporter: Maxim Bolotin
  Labels: patch
 Attachments: my.patch


 A simple test with 2 collinear vectors returns a wrong result.
 The problem is the merge of variances, file:
 http://svn.apache.org/viewvc/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java?revision=1157222&view=markup
 lines:
 347: myagg.xvar += xvarB + (xavgA - xavgB) * (xavgA - xavgB) * myagg.count;
 348: myagg.yvar += yvarB + (yavgA - yavgB) * (yavgA - yavgB) * myagg.count;
 the correct merge should be like this:
 347: myagg.xvar += xvarB + (xavgA - xavgB) * (xavgA - xavgB) / myagg.count * nA * nB;
 348: myagg.yvar += yvarB + (yavgA - yavgB) * (yavgA - yavgB) / myagg.count * nA * nB;
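
 The proposed correction has the same form as the standard pairwise update for
 sums of squared deviations. The Python sketch below checks it numerically. It
 assumes that xvar holds the running sum of squared deviations (not yet divided
 by the count) and that myagg.count already equals nA + nB when the merge line
 runs; both are assumptions about the Hive code, and the function names are
 made up for illustration.

```python
def sum_sq_dev(values):
    """Sum of squared deviations from the mean, computed directly."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def merge_fixed(n_a, avg_a, var_a, n_b, avg_b, var_b):
    """Merge two partial sums of squared deviations as the report proposes."""
    delta = avg_a - avg_b
    count = n_a + n_b  # assumed value of myagg.count at merge time
    return var_a + var_b + delta * delta / count * n_a * n_b


def merge_reported(n_a, avg_a, var_a, n_b, avg_b, var_b):
    """Merge as in the lines quoted from revision 1157222 (multiplies by count)."""
    delta = avg_a - avg_b
    count = n_a + n_b
    return var_a + var_b + delta * delta * count


part_a = [1.0, 2.0, 3.0]
part_b = [4.0, 5.0, 6.0, 7.0]
args = (
    len(part_a), sum(part_a) / len(part_a), sum_sq_dev(part_a),
    len(part_b), sum(part_b) / len(part_b), sum_sq_dev(part_b),
)
direct = sum_sq_dev(part_a + part_b)
fixed = merge_fixed(*args)
reported = merge_reported(*args)
```

 On this split the corrected merge agrees with the value computed directly over
 all seven numbers, while the quoted merge overshoots it.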


[jira] [Updated] (HIVE-2935) Implement HiveServer2


 [ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2935:
-

Attachment: beelinepositive.tar.gz

Uploading the new test outputs separately since the combined patch exceeds the 
10MB size limit. Untar this file in the ql/src/test/results directory.

 Implement HiveServer2
 -

 Key: HIVE-2935
 URL: https://issues.apache.org/jira/browse/HIVE-2935
 Project: Hive
  Issue Type: New Feature
  Components: Server Infrastructure
Reporter: Carl Steinbach
Assignee: Carl Steinbach
  Labels: HiveServer2
 Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, 
 HIVE-2935.2.notest.patch.txt





Hive-trunk-h0.21 - Build # 1716 - Still Failing

Changes for Build #1708

Changes for Build #1709
[namit] HIVE-3515 metadata_export_drop.q causes failure of other tests
(Ivan Gorbachev via namit)


Changes for Build #1710

Changes for Build #1711
[heyongqiang] HIVE-2206:add a new optimizer for query correlation discovery and 
optimization (Yin Huai via He Yongqiang)

[namit] HIVE-1367 cluster by multiple columns does not work if parenthesis is 
present
(Zhenxiao Luo via namit)


Changes for Build #1712
[cws] add instrumentation to capture if there is skew in reducers (Arun Dobriya 
via cws)

[namit] HIVE-3493 aggName of SemanticAnalyzer.getGenericUDAFEvaluator is 
generated in two
different ways (Yin Huai via namit)

[heyongqiang] revert r1392105 due to bylaw requirement mentioned by Carl 
Steinbach


Changes for Build #1713

Changes for Build #1714
[kevinwilfong] HIVE-3484. RetryingRawStore logic needs to be significantly 
reworked to support retries within transactions (Jean Xu via kevinwilfong)


Changes for Build #1715
[namit] HIVE-3495 For UDAFs, when generating a plan without 
map-side-aggregation, constant 
agg parameters will be replaced by ExprNodeColumnDesc (Yin Huai via namit)


Changes for Build #1716



1 tests failed.
REGRESSION: org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener

Error Message:
java.net.SocketTimeoutException: Read timed out

Stack Trace:
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:378)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:365)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:705)
at org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener(TestMetaStoreEventListener.java:190)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 24 more




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1716)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1716/ to 
view the results.

[jira] [Commented] (HIVE-3498) hivetest.py fails with --revision option

2012-10-02 Thread Ivan Gorbachev (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468135#comment-13468135
]

Ivan Gorbachev commented on HIVE-3498:
--

https://reviews.facebook.net/D5799

hivetest.py fails with --revision option

Key: HIVE-3498
URL: https://issues.apache.org/jira/browse/HIVE-3498
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Ivan Gorbachev
Assignee: Ivan Gorbachev
Labels: testing
Attachments: jira-3498.0.patch

How to reproduce outside hivetest.py:
1. Clone git://git.apache.org/hive.git
2. Run ant arc-setup
3. Run arc patch rev
Output:
{quote}
This diff is against commit
https://svn.apache.org/repos/asf/hive/trunk@1382631, but the commit is
nowhere in the working copy. Try to apply it against the current working
copy state? (d5f66df1edfff2645f225298e225dbccc70d97ff) [Y/n]
{quote}
If you choose 'Y', it prompts you to complete the 'merge-message' and then prints:
{quote}
Select a Default Commit Range
You're running a command which operates on a range of revisions (usually,
from some revision to HEAD) but have not specified the revision that should
determine the start of the range.
Previously, arc assumed you meant 'HEAD^' when you did not specify a start
revision, but this behavior does not make much sense in most workflows
outside of Facebook's historic git-svn workflow.
arc no longer assumes 'HEAD^'. You must specify a relative commit explicitly
when you invoke a command (e.g., `arc diff HEAD^`, not just `arc diff`) or
select a default for this working copy.
In most cases, the best default is 'origin/master'. You can also select
'HEAD^' to preserve the old behavior, or some other remote or branch. But you
almost certainly want to select 'origin/master'.
(Technically: the merge-base of the selected revision and HEAD is used to
determine the start of the commit range.)
What default do you want to use? [origin/master]
{quote}
An svn checkout does not show this behavior.


[jira] [Updated] (HIVE-3498) hivetest.py fails with --revision option

2012-10-02 Thread Ivan Gorbachev (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ivan Gorbachev updated HIVE-3498:
-

Status: Patch Available (was: Open)

hivetest.py fails with --revision option

[jira] [Updated] (HIVE-3498) hivetest.py fails with --revision option

2012-10-02 Thread Ivan Gorbachev (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ivan Gorbachev updated HIVE-3498:
-

Attachment: jira-3498.0.patch

hivetest.py fails with --revision option

[jira] [Commented] (HIVE-3501) Track table and keys used in joins and group bys for logging

[
https://issues.apache.org/jira/browse/HIVE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468145#comment-13468145
]

Carl Steinbach commented on HIVE-3501:
--

@Sambavi: I'm a little behind schedule on this. Just kicked off the test run a
minute ago. If everything goes ok I should have this committed later tonight.
Sorry for the delay.


[jira] [Commented] (HIVE-3522) Make separator for Entity name configurable


[ 
https://issues.apache.org/jira/browse/HIVE-3522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468166#comment-13468166
 ] 

Kevin Wilfong commented on HIVE-3522:
-

Comments on Phabricator.


[jira] [Commented] (HIVE-3498) hivetest.py fails with --revision option

[
https://issues.apache.org/jira/browse/HIVE-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468196#comment-13468196
]

Kevin Wilfong commented on HIVE-3498:
-

hivetest.py fails with --revision option

[jira] [Created] (HIVE-3523) Hive info logging is broken

Shreepadma Venugopalan created HIVE-3523:


 Summary: Hive info logging is broken
 Key: HIVE-3523
 URL: https://issues.apache.org/jira/browse/HIVE-3523
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan


Hive Info logging is broken on trunk. hive -hiveconf 
hive.root.logger=INFO,console doesn't print the output of LOG.info statements 
to the console. 


[jira] [Updated] (HIVE-3523) Hive info logging is broken


 [ 
https://issues.apache.org/jira/browse/HIVE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3523:
-

Component/s: Logging

 Hive info logging is broken
 ---

 Key: HIVE-3523
 URL: https://issues.apache.org/jira/browse/HIVE-3523
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Carl Steinbach

 Hive Info logging is broken on trunk. hive -hiveconf 
 hive.root.logger=INFO,console doesn't print the output of LOG.info statements 
 to the console. 


[jira] [Assigned] (HIVE-3523) Hive info logging is broken


 [ 
https://issues.apache.org/jira/browse/HIVE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-3523:


Assignee: Carl Steinbach


[jira] [Commented] (HIVE-3523) Hive info logging is broken

[
https://issues.apache.org/jira/browse/HIVE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468241#comment-13468241
]

Carl Steinbach commented on HIVE-3523:
--

This bug is caused by HIVE-3505 which modified the hive-log4j.properties file.

Before HIVE-3505 the hive-log4j.properties file looked like this:

{noformat}
hive.root.logger=WARN,DRFA
hive.log.dir=/tmp/${user.name}

...

# Logging Threshold
log4j.threshhold=WARN
{noformat}

And after HIVE-3505 it looks like this:

{noformat}
hive.log.threshold=WARN
hive.root.logger=${hive.log.threshold},DRFA

...

# Logging Threshold
log4j.threshold=${hive.log.threshold}
{noformat}

One not-so-obvious change is that we corrected a spelling mistake, changing
log4j.threshhold to log4j.threshold. Because the key was previously
misspelled, log4j ignored it and used the default threshold value ALL, which
is equivalent to no threshold at all. HIVE-3505 fixed the spelling mistake, so
log4j now applies the threshold value WARN. That is why INFO level messages
are filtered out even when hive.root.logger is set to INFO,console.

It's possible to work around this problem right now by setting both
hive.log.threshold and hive.root.logger. For example:

hive -hiveconf hive.log.threshold=INFO -hiveconf hive.root.logger=INFO,console
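
The effect described above can be reproduced with a toy model. This Python
sketch is not log4j's implementation: the level numbers are illustrative, and
effective_threshold and is_logged are hypothetical helpers. It relies only on
the behavior stated in this comment, namely that a misspelled threshold key is
ignored and the default ALL applies.

```python
# Illustrative ordering only; these are not log4j's real level values.
LEVELS = {"ALL": 0, "DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}


def effective_threshold(props):
    """Return the repository-wide threshold, defaulting to ALL.

    Only the correctly spelled key is consulted, so a misspelled key such as
    log4j.threshhold has no effect.
    """
    return props.get("log4j.threshold", "ALL")


def is_logged(props, message_level, logger_level):
    """A message must pass both the logger level and the global threshold."""
    threshold = effective_threshold(props)
    floor = max(LEVELS[threshold], LEVELS[logger_level])
    return LEVELS[message_level] >= floor


before_3505 = {"log4j.threshhold": "WARN"}  # misspelled key, ignored
after_3505 = {"log4j.threshold": "WARN"}    # correct key, now enforced
workaround = {"log4j.threshold": "INFO"}    # hive.log.threshold=INFO

# The logger level is INFO in all three cases (hive.root.logger=INFO,console).
info_before = is_logged(before_3505, "INFO", "INFO")
info_after = is_logged(after_3505, "INFO", "INFO")
info_workaround = is_logged(workaround, "INFO", "INFO")
```

In this model INFO messages pass before HIVE-3505, are dropped after it, and
pass again once the threshold is lowered to INFO, which mirrors the workaround
command above.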


[jira] [Updated] (HIVE-3523) Hive info logging is broken


 [ 
https://issues.apache.org/jira/browse/HIVE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3523:
-

Attachment: HIVE-3523.1.patch.txt

 Hive info logging is broken
 ---

 Key: HIVE-3523
 URL: https://issues.apache.org/jira/browse/HIVE-3523
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Carl Steinbach
 Attachments: HIVE-3523.1.patch.txt


 Hive Info logging is broken on trunk. hive -hiveconf 
 hive.root.logger=INFO,console doesn't print the output of LOG.info statements 
 to the console. 


[jira] [Updated] (HIVE-3523) Hive info logging is broken

2012-10-02 Thread Phabricator (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3523:
--

Attachment: HIVE-3523.D5811.1.patch

cwsteinbach requested code review of HIVE-3523 [jira] Hive info logging is 
broken.
Reviewers: JIRA, ashutoshc, njain

  HIVE-3523. Hive INFO level logging is broken

  Hive Info logging is broken on trunk. hive -hiveconf 
hive.root.logger=INFO,console doesn't print the output of LOG.info statements 
to the console.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D5811

AFFECTED FILES
  common/src/java/conf/hive-log4j.properties
  ql/src/java/conf/hive-exec-log4j.properties

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/13737/

To: JIRA, ashutoshc, njain, cwsteinbach


 Hive info logging is broken
 ---

 Key: HIVE-3523
 URL: https://issues.apache.org/jira/browse/HIVE-3523
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Carl Steinbach
 Attachments: HIVE-3523.1.patch.txt, HIVE-3523.D5811.1.patch


 Hive Info logging is broken on trunk. hive -hiveconf 
 hive.root.logger=INFO,console doesn't print the output of LOG.info statements 
 to the console. 


[jira] [Updated] (HIVE-1362) column level statistics


 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreepadma Venugopalan updated HIVE-1362:
-

Status: Patch Available  (was: Open)

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, 
 HIVE-1362-gen_thrift.2.patch.txt





[jira] [Updated] (HIVE-1362) column level statistics


 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreepadma Venugopalan updated HIVE-1362:
-

Attachment: HIVE-1362-gen_thrift.2.patch.txt

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, 
 HIVE-1362-gen_thrift.2.patch.txt





[jira] [Commented] (HIVE-1362) column level statistics


[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468270#comment-13468270
 ] 

Shreepadma Venugopalan commented on HIVE-1362:
--

I've provide a review board link shortly. 

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, 
 HIVE-1362-gen_thrift.2.patch.txt





[jira] [Commented] (HIVE-1362) column level statistics


[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468271#comment-13468271
 ] 

Shreepadma Venugopalan commented on HIVE-1362:
--

Meant to say I'll provide a review board link shortly.

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, 
 HIVE-1362-gen_thrift.2.patch.txt





Re: Review Request: HIVE-1362: Support for column statistics in Hive

2012-10-02 Thread Shreepadma Venugopalan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6878/
---

(Updated Oct. 3, 2012, 3:10 a.m.)


Review request for hive and Carl Steinbach.


Changes
---

This patch addresses the comments from revision # 1 and makes the following 
changes:

* Splits the ColumnStatistics thrift structure in such a way that the thrift 
API is locked down.
* Splits the writeColumnStatistics API to updateTable.. and updatePartition... 
to separate out partition and table level updates
* Adds comments to the Thrift RPC calls
* Logs the record that is being written in 
update[Table/Partition]ColumnStatistics to identify the bad record in case of a 
failed update
* Uses a consistent naming convention for the Thrift APIs
* Incorporates the rest of the misc. review comments from revision # 1.


Description
---

This patch implements version 1 of the column statistics project in Hive. It 
adds support for computing and persisting statistical summaries of column 
values in Hive tables and partitions. To support column statistics in Hive, 
this patch does the following:

* Adds a new compute stats UDAF to compute scalar statistics for all primitive 
Hive data types. Version 1 of the project supports the following scalar 
statistics on primitive types: an estimate of the number of distinct values, 
the number of null values, the number of trues/falses for boolean typed 
columns, max and avg length for string and binary typed columns, and max and 
min value for long and double typed columns. Note that version 1 of the column 
stats project includes support for column statistics at both the table and 
partition level.

* Adds Metastore schema tables to persist the newly added statistics both at 
table and partition level.
* Adds Metastore Thrift API to persist, retrieve and delete column statistics 
at both table and partition level. 
Please refer to the following wiki link for the details of the schema and the 
Thrift API changes - 
https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive

* Extends the analyze table compute statistics statement to trigger statistics 
computation and persistence for one or more columns. Please note that 
statistics for multiple columns are computed through a single scan of the table 
data. Please refer to the following wiki link for the syntax changes - 
https://cwiki.apache.org/confluence/display/Hive/Column+Statistics+in+Hive
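
The single-scan idea can be sketched as follows. This is an illustration only,
not the UDAF from the patch: the function name, the row representation, and the
output field names are assumptions, and an exact set stands in for the
distinct-value estimator.

```python
def compute_column_stats(rows, columns):
    """Compute per-column statistics for several columns in one pass.

    rows: iterable of dicts mapping column name to value (None means NULL).
    """
    stats = {c: {"num_nulls": 0, "distinct": set(), "min": None, "max": None}
             for c in columns}
    for row in rows:  # the data is scanned exactly once
        for c in columns:
            value = row.get(c)
            s = stats[c]
            if value is None:
                s["num_nulls"] += 1
                continue
            s["distinct"].add(value)
            s["min"] = value if s["min"] is None else min(s["min"], value)
            s["max"] = value if s["max"] is None else max(s["max"], value)
    # An exact count stands in for the estimated number of distinct values.
    return {c: {"num_nulls": s["num_nulls"],
                "num_distinct": len(s["distinct"]),
                "min": s["min"],
                "max": s["max"]}
            for c, s in stats.items()}


rows = [{"id": 1, "score": 2.5},
        {"id": 2, "score": None},
        {"id": 2, "score": 0.5}]
result = compute_column_stats(rows, ["id", "score"])
print(result["id"])
print(result["score"])
```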

One thing missing from the patch at this point is the metastore upgrade scripts 
for MySQL/Derby/Postgres/Oracle. I'm waiting for the review to finalize the 
metastore schema changes before I go ahead and add the upgrade scripts.

In a follow-on patch, as part of version 2 of the column statistics project, we 
will add support for computing, persisting and retrieving histograms on long 
and double typed column values.

Generated Thrift files have been removed for viewing pleasure. JIRA page has 
the patch with the generated Thrift files.


This addresses bug HIVE-1362.
https://issues.apache.org/jira/browse/HIVE-1362


Diffs (updated)
-

  data/files/UserVisits.dat PRE-CREATION 
  data/files/binary.txt PRE-CREATION 
  data/files/bool.txt PRE-CREATION 
  data/files/double.txt PRE-CREATION 
  data/files/employee.dat PRE-CREATION 
  data/files/employee2.dat PRE-CREATION 
  data/files/int.txt PRE-CREATION 
  ivy/libraries.properties 7ac6778 
  metastore/if/hive_metastore.thrift d4fad72 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 8fec13d 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 17b986c 
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 3883b5b 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java eff44b1 
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java bf5ae3a 
  metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 77d1caa 
  metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java PRE-CREATION 
  metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java PRE-CREATION 
  metastore/src/model/package.jdo 38ce6d5 
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 528a100 
  metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 925938d 
  ql/build.xml 5de3f78 
  ql/if/queryplan.thrift 05fbf58 
  ql/ivy.xml aa3b8ce 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 425900d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java 4c8831f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 4446952 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 79b87f1 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java

[jira] [Updated] (HIVE-1362) column level statistics


 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shreepadma Venugopalan updated HIVE-1362:
-

Attachment: HIVE-1362.2.patch.txt

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
 HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt





[jira] [Commented] (HIVE-1362) column level statistics


[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468293#comment-13468293
 ] 

Shreepadma Venugopalan commented on HIVE-1362:
--

Patch is available for review at : https://reviews.apache.org/r/6878/

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
 HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt





Re: Review Request: HIVE-1362: Support for column statistics in Hive

2012-10-02 Thread Shreepadma Venugopalan



 On Sept. 13, 2012, 9:58 p.m., Carl Steinbach wrote:
  Made it most of the way through the first page of the review. More comments 
  to follow.

Since Hive doesn't interpret binary data, I was using string data for testing. 
However, as you suggest in the comment below, I'll use the table that you 
have been using for HS2.


 On Sept. 13, 2012, 9:58 p.m., Carl Steinbach wrote:
  data/files/binary.txt, line 1
  https://reviews.apache.org/r/6878/diff/2/?file=148716#file148716line1
 
  This doesn't look like binary data to me?

Since Hive doesn't interpret binary data, I was using string data for testing. 
However, as you suggest in the comment below, I'll use the table that you 
have been using for HS2.


 On Sept. 13, 2012, 9:58 p.m., Carl Steinbach wrote:
  metastore/if/hive_metastore.thrift, line 213
  https://reviews.apache.org/r/6878/diff/2/?file=148722#file148722line213
 
  isTblLevel looks redundant. If partName is not set, isn't it implicit 
  that isTblLevel==true?

I prefer to add explicit flags to distinguish between different states rather 
than overload the existing fields. For instance if the write_column_statistics 
is passed an object with a null partName we would end up interpreting it as 
column statistics for the table. However this can also occur in the case of an 
error when a bad column statistics struct is passed. In the absence of a flag 
there is no way to tell the difference between the two states.
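
The ambiguity described above can be sketched in a few lines. This is only an
illustration: ColumnStatsDesc and both classify functions are hypothetical
stand-ins, not the Thrift struct or the metastore code under review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStatsDesc:
    """Hypothetical stand-in for the column statistics struct."""
    table_name: str
    part_name: Optional[str] = None
    is_tbl_level: Optional[bool] = None  # the explicit flag


def classify_implicit(desc):
    # Without a flag, a missing part_name can only be read as "table level",
    # so a malformed partition-level struct is silently misclassified.
    return "table" if desc.part_name is None else "partition"


def classify_explicit(desc):
    # With the flag, an inconsistent struct can be detected and rejected.
    if desc.is_tbl_level is None:
        raise ValueError("is_tbl_level is not set")
    if desc.is_tbl_level:
        if desc.part_name is not None:
            raise ValueError("table-level stats must not carry part_name")
        return "table"
    if desc.part_name is None:
        raise ValueError("partition-level stats require part_name")
    return "partition"


# A bad struct: meant to be partition level, but part_name was never filled in.
bad = ColumnStatsDesc("t", part_name=None, is_tbl_level=False)
print(classify_implicit(bad))  # table
```

Calling classify_explicit(bad) raises ValueError instead of treating the record
as table-level statistics.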


 On Sept. 13, 2012, 9:58 p.m., Carl Steinbach wrote:
  metastore/if/hive_metastore.thrift, line 509
  https://reviews.apache.org/r/6878/diff/2/?file=148722#file148722line509
 
  Please add some comments explaining how each of these RPCs behaves, 
  e.g. what happens if I call delete_column_statistics_by_table on a 
  partition table that contains partition-level column statistics?

Will add comments.


 On Sept. 13, 2012, 9:58 p.m., Carl Steinbach wrote:
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, 
  line 2614
  https://reviews.apache.org/r/6878/diff/2/?file=148723#file148723line2614
 
  In my opinion the problem with leaving lots of blank lines in the 
  bodies of methods is that it makes it harder to spot the beginning and end 
  of functions which are typically signalled by two lines of whitespace. The 
  convention in this file is to not leave lots of blank lines in method 
  bodies, and it's kind of distracting to see code that doesn't follow that 
  convention.

I'll remove some of the line spaces in this file. I personally prefer a line 
space between blocks of code in a method - I think that makes the code much 
more readable in an editor. For instance, I prefer a line space between 
variable declarations and a try-catch-finally block. Talking of conventions, 
there seems to be no single convention in Hive. I try to remain consistent with 
the file I'm modifying as much as possible, but it's hard to follow a 
convention when it's not very clear what it is.


- Shreepadma


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6878/#review11502
---



[jira] [Commented] (HIVE-1362) column level statistics


[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468295#comment-13468295
 ] 

Shreepadma Venugopalan commented on HIVE-1362:
--

Latest revision is here: https://reviews.apache.org/r/6878/diff/#index_header

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan
 Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
 HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt





[jira] [Commented] (HIVE-3458) Parallel test script doesnt run all tests


[ 
https://issues.apache.org/jira/browse/HIVE-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468312#comment-13468312
 ] 

Hudson commented on HIVE-3458:
--

Integrated in Hive-trunk-h0.21 #1717 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1717/])
HIVE-3458. Parallel test script doesnt run all tests. (Ivan Gorbachev via 
kevinwilfong) (Revision 1393169)

 Result = FAILURE
kevinwilfong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1393169
Files : 
* /hive/trunk/build.xml
* /hive/trunk/testutils/ptest/Ssh.py
* /hive/trunk/testutils/ptest/hivetest.py


 Parallel test script doesnt run all tests
 -

 Key: HIVE-3458
 URL: https://issues.apache.org/jira/browse/HIVE-3458
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9.0
Reporter: Sambavi Muthukrishnan
Assignee: Ivan Gorbachev
 Fix For: 0.10.0

 Attachments: ptest.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 The parallel test script, when run on a cluster of machines in multi-threaded 
 mode, doesn't report back on all tests in the suite. 


[jira] [Commented] (HIVE-3481) Resource leak: Hiveserver is not closing the existing driver handle before executing the next command. It results in to file handle leaks.