[jira] [Updated] (HIVE-2597) Repeated key in GROUP BY is erroneously displayed when using DISTINCT
[ https://issues.apache.org/jira/browse/HIVE-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2597: -- Attachment: HIVE-2597.D8967.1.patch navis requested code review of HIVE-2597 [jira] Repeated key in GROUP BY is erroneously displayed when using DISTINCT. Reviewers: JIRA HIVE-2597 Repeated key in GROUP BY is erroneously displayed when using DISTINCT The following query was simplified for illustration purposes. This works correctly: select client_tid, '' as myvalue1, '' as myvalue2 from clients cluster by client_tid The intent here is to produce two empty columns in between the data. The following query does not work: select distinct client_tid, '' as myvalue1, '' as myvalue2 from clients cluster by client_tid FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY The key is not repeated, since aliases were given. It seems Hive ignores the aliases when the DISTINCT keyword is specified. TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D8967 AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/groupby_constant.q ql/src/test/results/clientpositive/groupby_constant.q.out MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/21711/ To: JIRA, navis Repeated key in GROUP BY is erroneously displayed when using DISTINCT - Key: HIVE-2597 URL: https://issues.apache.org/jira/browse/HIVE-2597 Project: Hive Issue Type: Bug Reporter: Alex Rovner Assignee: Navis Attachments: HIVE-2597.D8967.1.patch The following query was simplified for illustration purposes. This works correctly: select client_tid, '' as myvalue1, '' as myvalue2 from clients cluster by client_tid The intent here is to produce two empty columns in between the data.
The following query does not work: select distinct client_tid, '' as myvalue1, '' as myvalue2 from clients cluster by client_tid FAILED: Error in semantic analysis: Line 1:44 Repeated key in GROUP BY The key is not repeated, since aliases were given. It seems Hive ignores the aliases when the DISTINCT keyword is specified. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
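The collision can be sketched abstractly. Assuming (hypothetically, the class and method below are illustrative, not Hive's SemanticAnalyzer) that SELECT DISTINCT is rewritten into a GROUP BY over the select expressions and that keys are compared by expression text rather than by alias, two identical constant columns collide:

```java
import java.util.HashSet;
import java.util.List;

public class RepeatedKeyCheck {
    // Hypothetical sketch of the check described above: if DISTINCT keys are
    // compared by their expression text, aliases cannot distinguish two
    // identical constants, which reproduces the "Repeated key in GROUP BY"
    // error even though myvalue1 and myvalue2 are distinct aliases.
    static boolean hasRepeatedKey(List<String> groupByExprs) {
        return new HashSet<>(groupByExprs).size() != groupByExprs.size();
    }

    public static void main(String[] args) {
        // '' as myvalue1 and '' as myvalue2 both contribute the expression ''.
        System.out.println(hasRepeatedKey(List.of("client_tid", "''", "''")));
    }
}
```

This also suggests why the non-DISTINCT query works: without the rewrite into GROUP BY, no key comparison takes place.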
[jira] [Commented] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588116#comment-13588116 ] Ashutosh Chauhan commented on HIVE-4080: bq. Support for this feature will probably be removed. Causes ambiguities when Query contains different partition clauses. Do you mean the feature this patch is introducing (the ability to have a lead function independent of UDAFs in a select expr) will be removed? Consider the following query: {noformat} select p_mfgr, p_retailprice, lead(p_retailprice,1) as l1 over (partition by p_mfgr order by p_name), lead(p_retailprice,1, p_retailprice) as l2 over (partition by p_size order by p_name), p_retailprice - lead(p_retailprice,1) from part; {noformat} My guess is that the ambiguity you are referring to is this: once we start supporting different partitionings in the same query, the last lead() in the above query becomes ambiguous as to which partitioning it refers to. But my understanding is that the SQL standard says lead and lag functions must always be associated with an over clause. So the above query is illegal in standard SQL. It must be written as: {noformat} select p_mfgr, p_retailprice, lead(p_retailprice,1) as l1 over (partition by p_mfgr order by p_name), lead(p_retailprice,1, p_retailprice) as l2 over (partition by p_size order by p_name), p_retailprice - lead(p_retailprice,1) as l3 over (partition by p_size order by p_name) from part; {noformat} Now we have this concept of default partitioning, which would have made the first query legal if the partitioning scheme were identical for l1 and l2. I think long term: * We should keep the functionality introduced in this patch to stay compliant. * Associate default partitioning with a windowing function only if there is no ambiguity (i.e., there is only one partitioning clause in the query). * Raise an error if the user doesn't specify partitioning and there is more than one partitioning scheme to choose from.
The same argument stands for when lead/lag functions are used as arguments to UDAFs. Makes sense? Further, I think this concept of default partitioning is only an extra convenience we are offering to Hive users, and it is non-standard. If it turns out it's burdensome to support, I am fine with removing it and always requiring the user to specify an over clause. Add Lead Lag UDAFs Key: HIVE-4080 URL: https://issues.apache.org/jira/browse/HIVE-4080 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch Currently we support Lead/Lag as navigation UDFs usable with Windowing. To be standard compliant we need to support Lead Lag UDAFs. Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. Currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed. Causes ambiguities when Query contains different partition clauses. Will provide more details with associated Jira to remove this feature. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
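The disambiguation rule proposed in the comment above (inherit the default partitioning only when it is unambiguous, otherwise raise an error) can be sketched as follows; the helper and its types are hypothetical, not Hive code:

```java
import java.util.Set;

public class DefaultPartitioning {
    // Sketch of the proposed rule: a windowing function written without an
    // OVER clause inherits the query's partitioning only when the query
    // contains exactly one partitioning scheme; with two or more, the user
    // must specify an OVER clause explicitly.
    static String resolveDefaultPartition(Set<String> partitionSpecsInQuery) {
        if (partitionSpecsInQuery.size() == 1) {
            return partitionSpecsInQuery.iterator().next();
        }
        throw new IllegalArgumentException(
            "Ambiguous default partitioning: an explicit OVER clause is required");
    }
}
```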
[jira] [Created] (HIVE-4085) Incorrectly pruning columns for PTFOperator
Ashutosh Chauhan created HIVE-4085: -- Summary: Incorrectly pruning columns for PTFOperator Key: HIVE-4085 URL: https://issues.apache.org/jira/browse/HIVE-4085 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan The following simple query used to work before HIVE-4035: {code} select s, sum(b) over (distribute by i sort by si rows between unbounded preceding and current row) from over100k; {code} but now it fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4085) Incorrectly pruning columns for PTFOperator
[ https://issues.apache.org/jira/browse/HIVE-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588131#comment-13588131 ] Ashutosh Chauhan commented on HIVE-4085: After HIVE-4035, it is failing with the following stack trace: {code} Caused by: java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:160) ... 14 more Caused by: java.lang.RuntimeException: cannot find field _col2 from [0:_col3, 1:_col7] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57) at org.apache.hadoop.hive.ql.exec.PTFOperator.setupKeysWrapper(PTFOperator.java:193) at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:100) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409) at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:152) ... 14 more {code} Prajakta / Harish, do you guys already know about this failure?
Incorrectly pruning columns for PTFOperator --- Key: HIVE-4085 URL: https://issues.apache.org/jira/browse/HIVE-4085 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan The following simple query used to work before HIVE-4035: {code} select s, sum(b) over (distribute by i sort by si rows between unbounded preceding and current row) from over100k; {code} but now it fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4085) Incorrectly pruning columns for PTFOperator
[ https://issues.apache.org/jira/browse/HIVE-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588137#comment-13588137 ] Ashutosh Chauhan commented on HIVE-4085: Does this have the same root cause as HIVE-4083? If so, feel free to mark it as a duplicate. Incorrectly pruning columns for PTFOperator --- Key: HIVE-4085 URL: https://issues.apache.org/jira/browse/HIVE-4085 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan The following simple query used to work before HIVE-4035: {code} select s, sum(b) over (distribute by i sort by si rows between unbounded preceding and current row) from over100k; {code} but now it fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-1990) Logging fails due to moved EventCounter class in Hadoop 0.20.100
[ https://issues.apache.org/jira/browse/HIVE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-1990. Resolution: Duplicate Logging fails due to moved EventCounter class in Hadoop 0.20.100 Key: HIVE-1990 URL: https://issues.apache.org/jira/browse/HIVE-1990 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 0.6.0 Environment: Red Hat 2.6.18 Reporter: Joep Rottinghuis Fix For: 0.11.0 Attachments: hive-1990.patch When compiling Hive against Hadoop 0.20.100, logging on the command line and in unit tests fails due to the EventCounter class having been moved from o.a.h.metrics.jvm.EventCounter to o.a.h.log.EventCounter. {code} [junit] Running org.apache.hadoop.hive.serde2.TestTCTLSeparatedProtocol [junit] log4j:ERROR Could not instantiate class [org.apache.hadoop.metrics.jvm.EventCounter]. [junit] java.lang.ClassNotFoundException: org.apache.hadoop.metrics.jvm.EventCounter [junit] at java.net.URLClassLoader$1.run(URLClassLoader.java:200) [junit] at java.security.AccessController.doPrivileged(Native Method) [junit] at java.net.URLClassLoader.findClass(URLClassLoader.java:188) [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:307) [junit] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:252) {code} As a note: in order to reproduce, I first applied the patch as per HIVE-1264 to the 0.6 branch to resolve jar naming issues in the build. Then I locally modified build.properties to point at my locally built Hadoop 0.20.100: {code} hadoop.security.url=file:.../hadoop/core/hadoop-${hadoop.version} hadoop.security.version=${hadoop.version} {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HIVE-3886) WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated
[ https://issues.apache.org/jira/browse/HIVE-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reopened HIVE-3886: WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated - Key: HIVE-3886 URL: https://issues.apache.org/jira/browse/HIVE-3886 Project: Hive Issue Type: Bug Components: Configuration Affects Versions: 0.9.0, 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3886.1.patch WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-3886) WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated
[ https://issues.apache.org/jira/browse/HIVE-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-3886. Resolution: Duplicate WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated - Key: HIVE-3886 URL: https://issues.apache.org/jira/browse/HIVE-3886 Project: Hive Issue Type: Bug Components: Configuration Affects Versions: 0.9.0, 0.10.0, 0.11.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Priority: Minor Fix For: 0.11.0 Attachments: HIVE-3886.1.patch WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
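For reference, the fix the warning itself suggests is a one-line change in each log4j.properties file; the appender name below is illustrative, while the class names are taken from the warning text:

```properties
# Deprecated location (pre-move), which the warning flags:
#log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
# Replacement suggested by the warning:
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
```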
[jira] [Commented] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588233#comment-13588233 ] Amareshwari Sriramadasu commented on HIVE-3952: --- Tried out the patch. When we run a query like the following: INSERT OVERWRITE DIRECTORY '/dir' SELECT ... it fails with an exception: {noformat} java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.MoveTask cannot be cast to org.apache.hadoop.hive.ql.exec.MapRedTask at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.mayBeMergeMapJoinTaskWithMapReduceTask(CommonJoinResolver.java:291) at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.processCurrentTask(CommonJoinResolver.java:535) at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver$CommonJoinTaskDispatcher.dispatch(CommonJoinResolver.java:701) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:194) at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:139) at org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:113) at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:79) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:8138) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8470) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:259) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:197) {noformat} merge map-job followed by map-reduce job Key: HIVE-3952 URL: https://issues.apache.org/jira/browse/HIVE-3952 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Vinod Kumar Vavilapalli Attachments: HIVE-3952-20130226.txt Consider a query like: select count(*) FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and this leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
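The trace shows mayBeMergeMapJoinTaskWithMapReduceTask casting a child task without checking its type; for INSERT OVERWRITE DIRECTORY the follow-up task is a MoveTask, not a MapRedTask. The defensive check implied by the failure can be sketched with stand-in classes (hypothetical, not Hive's real task hierarchy):

```java
public class TaskCastDemo {
    // Minimal stand-ins for the task hierarchy; the real classes live in
    // org.apache.hadoop.hive.ql.exec.
    static class Task {}
    static class MapRedTask extends Task {}
    static class MoveTask extends Task {}

    // Merge logic should only proceed when the child really is a map-reduce
    // task; the unconditional cast is what raised the ClassCastException above.
    static MapRedTask asMapRedTaskOrNull(Task child) {
        return (child instanceof MapRedTask) ? (MapRedTask) child : null;
    }

    public static void main(String[] args) {
        System.out.println(asMapRedTaskOrNull(new MoveTask()) == null);   // true
        System.out.println(asMapRedTaskOrNull(new MapRedTask()) != null); // true
    }
}
```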
[jira] [Commented] (HIVE-3428) Fix log4j configuration errors when running hive on hadoop23
[ https://issues.apache.org/jira/browse/HIVE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588405#comment-13588405 ] Hudson commented on HIVE-3428: -- Integrated in Hive-trunk-h0.21 #1989 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1989/]) HIVE-3428 : Fix log4j configuration errors when running hive on hadoop23 (Gunther Hagleitner via Ashutosh Chauhan) (Revision 1450645) Result = SUCCESS hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450645 Files : * /hive/trunk/common/src/java/conf/hive-log4j.properties * /hive/trunk/data/conf/hive-log4j.properties * /hive/trunk/pdk/scripts/conf/log4j.properties * /hive/trunk/ql/src/java/conf/hive-exec-log4j.properties * /hive/trunk/shims/ivy.xml * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HiveEventCounter.java * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java Fix log4j configuration errors when running hive on hadoop23 Key: HIVE-3428 URL: https://issues.apache.org/jira/browse/HIVE-3428 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Zhenxiao Luo Assignee: Gunther Hagleitner Fix For: 0.11.0 Attachments: HIVE-3428.1.D8805.patch, HIVE-3428.1.patch.txt, HIVE-3428.2.patch.txt, HIVE-3428.3.patch.txt, HIVE-3428.4.patch.txt, HIVE-3428.5.patch.txt, HIVE-3428.6.patch.txt, HIVE-3428_SHIM_EVENT_COUNTER.patch There are log4j configuration errors when running hive on hadoop23, some of which may fail test cases, since the following log4j error messages could be printed to the console, or to the output file, which diffs from the expected output: [junit] log4j:ERROR Could not find value for key log4j.appender.NullAppender [junit] log4j:ERROR Could not instantiate appender named NullAppender. [junit] 12/09/04 11:34:42 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Issue involving non-sun java build
Hello, I've been working with Hive for a couple of months now, and although it is a pretty strong framework, I can't help noticing that some test cases fail on non-Sun Java. Some examples are: TestCliDriver, TestParse and TestJdbcDriver. The issues involve HashMap iteration order, where Sun Java has a different output order than non-Sun Java. It is a silly problem, but it does cause failures. I've been working on fixes for these problems, and was planning to contribute them. Is this something the Hive community would be interested in? What are your thoughts about that? Thanks, Renata.
[jira] [Updated] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries are being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-2264: --- Attachment: HIVE-2264-2.patch Hi, I rebased this patch on trunk (attached) and removed the commented-out System.exit(). Brock Hive server is SHUTTING DOWN when invalid queries are being executed. -- Key: HIVE-2264 URL: https://issues.apache.org/jira/browse/HIVE-2264 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: SuSE-Linux-11 Reporter: rohithsharma Assignee: Navis Priority: Critical Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch When an invalid query is executed, the Hive server shuts down. {noformat} CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040' ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse' {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries are being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588484#comment-13588484 ] Brock Noland commented on HIVE-2264: Navis, you can use my rebased patch to update the review board; or, if you don't have interest in this any longer, no worries, I'd be willing to take it up. Hive server is SHUTTING DOWN when invalid queries are being executed. -- Key: HIVE-2264 URL: https://issues.apache.org/jira/browse/HIVE-2264 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: SuSE-Linux-11 Reporter: rohithsharma Assignee: Navis Priority: Critical Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch When an invalid query is executed, the Hive server shuts down. {noformat} CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040' ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse' {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588503#comment-13588503 ] Arun A K commented on HIVE-3850: Hello [~analog.sony], the test cases are ok. I think this needs to be considered for commit. [~ajeshpg] had given me the link to raise the review request. I tried raising the review, but I am getting an error message: "The selected file does not appear to be a diff." If possible, could you create a new patch with the name HIVE-3850.1.patch? Either [~710154] or yourself can do that, so that we can edit the current review request (https://reviews.apache.org/r/9171/) or discard this and create a new one. hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850.patch, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. In the documentation I couldn't find any information on this overload of evaluate(). I spent quite some time finding out why my statement didn't return a 24-hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
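The mismatch is easy to reproduce outside Hive. A minimal sketch (the class and helper names are hypothetical) showing that java.util.Calendar's HOUR field is a 12-hour-clock value while HOUR_OF_DAY is the 24-hour one:

```java
import java.util.Calendar;

public class HourFieldDemo {
    // Hypothetical helper, not Hive code: returns {Calendar.HOUR,
    // Calendar.HOUR_OF_DAY} for a fixed date at the given 24-hour-clock hour.
    static int[] hourFields(int hourOfDay) {
        Calendar cal = Calendar.getInstance();
        cal.set(2013, Calendar.FEBRUARY, 27, hourOfDay, 30, 0);
        return new int[] { cal.get(Calendar.HOUR), cal.get(Calendar.HOUR_OF_DAY) };
    }

    public static void main(String[] args) {
        int[] at3pm = hourFields(15);
        // Calendar.HOUR is the 12-hour-clock value (3); HOUR_OF_DAY is the
        // 24-hour-clock value (15). An overload returning HOUR instead of
        // HOUR_OF_DAY would explain the inconsistent results described above.
        System.out.println(at3pm[0] + " / " + at3pm[1]);
    }
}
```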
[jira] [Commented] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588515#comment-13588515 ] Phabricator commented on HIVE-4080: --- ashutoshc has requested changes to the revision HIVE-4080 [jira] Add Lead Lag UDAFs. Some comments. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:533 Use LEAD_FUNC_NAME and LAG_FUNC_NAME here to be consistent. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:871 This null check should be done outside of the nesting if() block. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:875 I don't see a case where fInfo is != null but udafResolver could be null. If so, then this null check is redundant and should be removed. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:890 We already know that we are dealing with a lead/lag UDAF, so it must be of type GenericUDAFResolver2, no? ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:866 I am wondering if this function can be rewritten as follows: WindowFunctionInfo finfo = windowFunctions.get(name.toLowerCase()); if (finfo == null) { return null; } if ( !name.toLowerCase().equals(LEAD_FUNC_NAME) && !name.toLowerCase().equals(LAG_FUNC_NAME) ) { return getGenericUDAFEvaluator(name, argumentOIs, isDistinct, isAllColumns); } // this must be lead/lag UDAF GenericUDAFResolver udafResolver = finfo.getfInfo().getGenericUDAFResolver(); GenericUDAFParameterInfo paramInfo = new SimpleGenericUDAFParameterInfo( argumentOIs.toArray(), isDistinct, isAllColumns); return ((GenericUDAFResolver2) udafResolver).getEvaluator(paramInfo); If not, then I have specific questions. See next few comments. ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:1467 It will be good to add a comment for this new boolean registerAsUDAF.
Something like the following: There are certain UDAFs like lead/lag which we want as windowing functions, but don't want them to appear in mFunctions. Why? Because ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:29 This and GenericUDAFLead share lots of common code. It might be good to have an abstract class for these two, just the way you have it in GenericUDFLeadLag. ql/src/test/queries/clientpositive/leadlag_queries.q:20 I think currently we don't support an over clause on expressions. Once we do, it will be good to add a test like: select p_retailprice - lead (p_retail) over (partition by p_mfgr) from part; ql/src/test/queries/clientpositive/leadlag_queries.q:35 It will be good to add a test which has both lead and lag in the same query. REVISION DETAIL https://reviews.facebook.net/D8961 BRANCH HIVE-4080 ARCANIST PROJECT hive To: JIRA, ashutoshc, hbutani Add Lead Lag UDAFs Key: HIVE-4080 URL: https://issues.apache.org/jira/browse/HIVE-4080 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch Currently we support Lead/Lag as navigation UDFs usable with Windowing. To be standard compliant we need to support Lead Lag UDAFs. Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. Currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed. Causes ambiguities when Query contains different partition clauses. Will provide more details with associated Jira to remove this feature. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Issue involving non-sun java build
Hi, I think we'd want Hive to work on as many JVMs as feasible. With that said, since it's tested mostly on the Sun JVM, it's possible we'll introduce new issues in the future, so you'll need to keep testing. Here is a guide on how to contribute: https://cwiki.apache.org/confluence/display/Hive/HowToContribute Glad to have you interested! Brock On Wed, Feb 27, 2013 at 10:06 AM, Renata Ghisloti Duarte de Souza rgdua...@linux.vnet.ibm.com wrote: Hello, I've been working with Hive for a couple of months now, and although it is a pretty strong framework, I can't help noticing that some test cases fail on non-Sun Java. Some examples are: TestCliDriver, TestParse and TestJdbcDriver. The issues involve HashMap iteration order, where Sun Java has a different output order than non-Sun Java. It is a silly problem, but it does cause failures. I've been working on fixes for these problems, and was planning to contribute them. Is this something the Hive community would be interested in? What are your thoughts about that? Thanks, Renata. -- Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
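The class of failure being discussed can be sketched directly. HashMap iteration order is unspecified and may differ between JVM vendors, so golden-file tests that print map contents verbatim are fragile; sorting the keys before emitting them (a hypothetical helper, not a specific Hive fix) makes the output stable on any JVM:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeterministicOutput {
    // Returns the map's keys in sorted order, independent of the HashMap's
    // internal (vendor-specific) iteration order.
    static List<String> sortedKeys(Map<String, ?> m) {
        List<String> keys = new ArrayList<>(m.keySet());
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("b", 2);
        counts.put("a", 1);
        counts.put("c", 3);
        // Sorted output is identical on Sun and non-Sun JVMs.
        System.out.println(sortedKeys(counts)); // [a, b, c]
    }
}
```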
[jira] [Commented] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588560#comment-13588560 ] Harish Butani commented on HIVE-4080: - Yes, there are several related issues: 1. Lead/Lag as UDAFs - this jira only addresses this - will work on your comments. 2. Support expressions with an over clause - filed JIRA 4081 for this - will work on this next. 3. Support for lead/lag UDFs. Based on our offline conversation, and as you point out here, the options are: - should we continue to support it as-is? - should we completely remove support? - support lead/lag as UDFs, but only within argument expressions of other UDAFs. The consensus seems to be that option 3 is nice to have and option 1 is problematic. Will address this in a separate JIRA. 4. The notion of default partitions - you have given more proof why supporting lead/lag as UDFs generally (option 1) is problematic. In general, should we continue to support this? Your approach makes sense. Will address this in a separate JIRA. Does this breakdown of issues make sense? Will address the first 3 asap, and then work on supporting multiple partitions (4041). The 4th one will have to wait a bit. Add Lead Lag UDAFs Key: HIVE-4080 URL: https://issues.apache.org/jira/browse/HIVE-4080 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch Currently we support Lead/Lag as navigation UDFs usable with Windowing. To be standard compliant we need to support Lead Lag UDAFs. Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. Currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed. Causes ambiguities when Query contains different partition clauses. Will provide more details with associated Jira to remove this feature.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
Todd Wilson created HIVE-4086: - Summary: Cross Database Support for Indexes and Partitions (or all DDL statements) Key: HIVE-4086 URL: https://issues.apache.org/jira/browse/HIVE-4086 Project: Hive Issue Type: Improvement Components: Database/Schema, ODBC, SQL Affects Versions: 0.9.0 Environment: Writing a query tool in .NET connecting with ODBC to Hadoop on Linux. Reporter: Todd Wilson I'd like to see more cross-database support. I'm using a Cloudera implementation of Hive 0.9. Currently, you can create new databases, and you can create tables and views in those databases, but you cannot create indexes or partitions on those tables. Likewise, commands like show partitions or show indexes will only work on tables in the default database. This would probably also affect statements like Alter Table and Recover Partitions, and probably also something like Create Function, but if you want to keep all functions being created in the default database, that would work. I would be most interested in full cross-database support for tables and views to start; functions, for example, could all be created in default. Thank you. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588612#comment-13588612 ] Phabricator commented on HIVE-4080:

hbutani has commented on the revision HIVE-4080 [jira] Add Lead Lag UDAFs.

INLINE COMMENTS
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:866 Yes, I wanted to introduce a new function that takes a GenericUDAFResolver:
public static GenericUDAFEvaluator getGenericUDAFEvaluator(GenericUDAFResolver resolver, List<ObjectInspector> argumentOIs, boolean isDistinct, boolean isAllColumns)
and have the current getGenericUDAFEvaluator and getGenericWindowingEvaluator call it. But I backed out, because I was not comfortable making this change and submitting the patch without running the entire test suite, and ended up just doing a cut and paste. Your solution is much better.
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java:1467 Yes, meant to do this. Somehow forgot, sorry.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:29 Yes, I was rushing this... should refactor it.
ql/src/test/queries/clientpositive/leadlag_queries.q:20 Yes, exactly.
ql/src/test/queries/clientpositive/leadlag_queries.q:35 Will add.

REVISION DETAIL https://reviews.facebook.net/D8961 BRANCH HIVE-4080 ARCANIST PROJECT hive To: JIRA, ashutoshc, hbutani
[jira] [Commented] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
[ https://issues.apache.org/jira/browse/HIVE-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588626#comment-13588626 ] Jarek Jarcec Cecho commented on HIVE-4086: -- Hi Todd, thank you very much for reporting this issue. We already have JIRA HIVE-4064 to track this requirement, so I'm closing this one as a duplicate to keep all the information in one place. To immediately unblock you, did you consider using the SQL statement {{USE dbname}} to change the working database from {{default}} to {{dbname}}? Jarcec
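The {{USE dbname}} workaround mentioned above can be sketched as follows. This is an illustrative example only: the database, table, and column names (sales_db, events, dt) are made up, and the CREATE INDEX syntax shown is the Hive 0.9-era form, so treat it as a sketch rather than a guaranteed recipe.

```sql
-- Switch the working database so unqualified statements resolve against it.
-- sales_db, events, and dt are hypothetical names, not from the report.
USE sales_db;

SHOW PARTITIONS events;   -- now inspects sales_db.events rather than default

-- Index DDL that rejects a db-qualified name may work once USE is in effect
-- (Hive 0.9-era CREATE INDEX form; assumed, not verified against this setup):
CREATE INDEX idx_events_dt
ON TABLE events (dt)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;

USE default;              -- switch back when finished
```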
[jira] [Resolved] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
[ https://issues.apache.org/jira/browse/HIVE-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jarek Jarcec Cecho resolved HIVE-4086. -- Resolution: Duplicate
[jira] [Commented] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588639#comment-13588639 ] Shreepadma Venugopalan commented on HIVE-4064: -- I believe there is a problem with a number of DDL statements, including ALTER TABLE and CREATE INDEX.

Handle db qualified names consistently across all HiveQL statements
Key: HIVE-4064 URL: https://issues.apache.org/jira/browse/HIVE-4064 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan

Hive doesn't consistently handle db-qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support db-qualified names, others such as CREATE INDEX don't.
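The inconsistency described in HIVE-4064 can be illustrated with a short sketch; the object names (sales_db, events, dt) are hypothetical, and the exact set of statements that reject qualified names varies by Hive version:

```sql
-- Db-qualified names are accepted in queries:
SELECT * FROM sales_db.events LIMIT 10;            -- supported

-- ...but DDL of this era often resolves only against the current database,
-- so qualified forms like these would be rejected unless USE sales_db runs first:
SHOW PARTITIONS sales_db.events;                   -- not supported
CREATE INDEX idx_dt ON TABLE sales_db.events (dt)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;                             -- not supported
```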
Re: Issue involving non-sun java build
Hi Renata, I'm glad to see you interested in contributing to the Hive community! Please don't hesitate to follow the link provided by Brock in the previous email if you're interested. I just wanted to add that Hive is a Hadoop SQL engine and thus we have major dependencies on Hadoop. While extending Hive's ability to work on other JDKs is definitely a great thing to do, I feel the need to warn that you might run into issues, as Hadoop itself might not work on those JDKs. I know about the following wiki page [1] that describes the support of various JDKs in Hadoop, but it no longer seems to be maintained. Jarcec Links: 1: http://wiki.apache.org/hadoop/HadoopJavaVersions

On Wed, Feb 27, 2013 at 11:01:58AM -0600, Brock Noland wrote: Hi, I think we'd want Hive to work on as many JVMs as feasible. That said, since it's tested mostly on the Sun JVM, it's possible we'll introduce new issues in the future, so you'll need to keep testing. Here is a guide on how to contribute: https://cwiki.apache.org/confluence/display/Hive/HowToContribute Glad to have you interested! Brock

On Wed, Feb 27, 2013 at 10:06 AM, Renata Ghisloti Duarte de Souza rgdua...@linux.vnet.ibm.com wrote: Hello, I've been working with Hive for a couple of months now, and although it is a pretty strong framework, I can't help noticing that some test cases fail on non-Sun Java. Some examples are: TestCliDriver, TestParse and TestJdbcDriver. The issues involve HashMap ordering, where Sun Java has a different output order than non-Sun Java. It is a silly problem, but it does cause failures. I've been working on fixes for these problems and was planning to contribute them. Is this something the Hive community would be interested in? What are your thoughts? Thanks, Renata.
[jira] [Commented] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
[ https://issues.apache.org/jira/browse/HIVE-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588655#comment-13588655 ] Todd Wilson commented on HIVE-4086: --- Hello Jarcec: Thank you for the reply. I appreciate this. I figured you might have something like this, but I couldn't find it. It was my first time entering an issue, so I was assuming I'd do something wrong! As far as the USE command goes, I'll give that a try. I actually didn't realize this command was supported; that would help a lot in what I'm trying to do. I'm switching back and forth between a lot of data sources like Teradata, ParAccel, and Kognitio, so sometimes my brain gets scrambled. :p Thank you again. Best Regards, Todd Wilson Senior Technical Consultant Coffing Data Warehousing (513) 292-3158 www.CoffingDW.com The information contained in this communication is confidential, private, proprietary, or otherwise privileged and is intended only for the use of the addressee. Unauthorized use, disclosure, distribution or copying is strictly prohibited and may be unlawful. If you have received this communication in error, please notify the sender immediately at gene...@coffingdw.com.
[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated HIVE-3952: -- Attachment: HIVE-3952-20130227.1.txt Thanks for trying this, Amareshwari! I've added your INSERT OVERWRITE DIRECTORY /dir Select case to the test. Here's an updated patch that should work for you; can you please try again? Tx.

merge map-job followed by map-reduce job
Key: HIVE-3952 URL: https://issues.apache.org/jira/browse/HIVE-3952 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Vinod Kumar Vavilapalli Attachments: HIVE-3952-20130226.txt, HIVE-3952-20130227.1.txt

Consider a query like:
{noformat} select count(*) FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); {noformat}
where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and this leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job.
[jira] [Commented] (HIVE-4079) Altering a view partition fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588685#comment-13588685 ] Kevin Wilfong commented on HIVE-4079: - Tests pass.

Altering a view partition fails with NPE
Key: HIVE-4079 URL: https://issues.apache.org/jira/browse/HIVE-4079 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4079.1.patch.txt

Altering a view partition, e.g. to add partition parameters, fails with a null pointer exception in the ObjectStore class. Currently this is only possible using the metastore Thrift API, and there are no test cases for it.
[jira] [Created] (HIVE-4087) Annotate public interfaces (UD*F, storage handler, SerDe)
Gunther Hagleitner created HIVE-4087: Summary: Annotate public interfaces (UD*F, storage handler, SerDe) Key: HIVE-4087 URL: https://issues.apache.org/jira/browse/HIVE-4087 Project: Hive Issue Type: Sub-task Reporter: Gunther Hagleitner Going forward it would be nice to clearly annotate public interfaces in the Hive codebase. The javadocs would be more useful that way. It might even make sense to produce documentation for just those interfaces.
[jira] [Created] (HIVE-4088) Landing page for previous versions of documentation
Gunther Hagleitner created HIVE-4088: Summary: Landing page for previous versions of documentation Key: HIVE-4088 URL: https://issues.apache.org/jira/browse/HIVE-4088 Project: Hive Issue Type: Improvement Reporter: Gunther Hagleitner If you go to http://hive.apache.org/releases.html and navigate to documentation for previous releases, you end up on a page like this: http://hive.apache.org/docs/r0.7.1/ It would be great to have an actual page there instead of a directory listing.
Re: jdo2-api dependency
https://issues.apache.org/jira/browse/HIVE-4089 On Feb 22, 2013, at 3:15 PM, Jarek Jarcec Cecho jar...@apache.org wrote: Hi Nitay, would you mind opening a JIRA for that? Jarcec On Fri, Feb 22, 2013 at 01:03:15PM -0500, Nitay Joffe wrote: Hey guys, The latest open source hive release (0.10.0) depends on javax.jdo artifact jdo2-api version 2.3-ec. This version is not actually in maven central, which means everyone who uses hive requires custom maven repository definitions which is discouraged by maven folks. I pinged the javax.jdo guys about it and they recommended we upgrade to 3.0. See http://mail-archives.apache.org/mod_mbox/db-jdo-dev/201302.mbox/%3CCAGZB7RguuEJnpVbtaqOgYEbsUNzP3aMSmM8SM8aOxcb-hLWwjg%40mail.gmail.com%3E for the conversation. Can you guys fix this? Thanks, - Nitay
[jira] [Created] (HIVE-4089) javax.jdo : jdo2-api dependency not in Maven Central
Nitay Joffe created HIVE-4089: - Summary: javax.jdo : jdo2-api dependency not in Maven Central Key: HIVE-4089 URL: https://issues.apache.org/jira/browse/HIVE-4089 Project: Hive Issue Type: Bug Reporter: Nitay Joffe Assignee: Jarek Jarcec Cecho The latest open source hive release (0.10.0) depends on javax.jdo artifact jdo2-api version 2.3-ec. This version is not actually in Maven Central, which means everyone who uses hive requires custom maven repository definitions, which is discouraged by maven folks. I pinged the javax.jdo guys about it and they recommended we upgrade to 3.0. See goo.gl/fAoRn for the conversation.
[jira] [Commented] (HIVE-4089) javax.jdo : jdo2-api dependency not in Maven Central
[ https://issues.apache.org/jira/browse/HIVE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588728#comment-13588728 ] Nitay Joffe commented on HIVE-4089: --- Link: goo.gl/fAoRn
[jira] [Commented] (HIVE-4089) javax.jdo : jdo2-api dependency not in Maven Central
[ https://issues.apache.org/jira/browse/HIVE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588729#comment-13588729 ] Nitay Joffe commented on HIVE-4089: --- http://mail-archives.apache.org/mod_mbox/db-jdo-dev/201302.mbox/%3ccagzb7rguuejnpvbtaqogyebsunzp3amsmm8sm8aoxcb-hlw...@mail.gmail.com%3E
[jira] [Updated] (HIVE-4089) javax.jdo : jdo2-api dependency not in Maven Central
[ https://issues.apache.org/jira/browse/HIVE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nitay Joffe updated HIVE-4089: -- Description: The latest open source hive release (0.10.0) depends on javax.jdo artifact jdo2-api version 2.3-ec. This version is not actually in maven central, which means everyone who uses hive requires custom maven repository definitions which is discouraged by maven folks. I pinged the javax.jdo guys about it and they recommended we upgrade to 3.0. See http://goo.gl/fAoRn for the conversation. (was: The latest open source hive release (0.10.0) depends on javax.jdo artifact jdo2-api version 2.3-ec. This version is not actually in maven central, which means everyone who uses hive requires custom maven repository definitions which is discouraged by maven folks. I pinged the javax.jdo guys about it and they recommended we upgrade to 3.0. See goo.gl/fAoRn for the conversation.)
[jira] [Updated] (HIVE-4078) Remove the serialize-deserialize pair in CommonJoinResolver
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-4078: -- Status: Open (was: Patch Available) Updating patch to match review comments.

Remove the serialize-deserialize pair in CommonJoinResolver
Key: HIVE-4078 URL: https://issues.apache.org/jira/browse/HIVE-4078 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-4078.patch

CommonJoinProcessor tries to clone a MapredWork while attempting a conversion to a map-join:
{code}
// deep copy a new mapred work from xml
InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
MapredWork newWork = Utilities.deserializeMapRedWork(in, physicalContext.getConf());
{code}
which is a very heavy operation both memory-wise and CPU-wise. Instead of cloning via XMLEncoder, it is faster to use BeanUtils.cloneBean(), which follows the same data paths (get/set bean methods).
[jira] [Updated] (HIVE-4078) Remove the serialize-deserialize pair in CommonJoinResolver
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-4078: -- Attachment: HIVE-4078-20130227.patch Updated to throw a SemanticException wrapping the IllegalAccess or Invocation exceptions that are possible (but unlikely) from the cloner.
[jira] [Updated] (HIVE-4078) Remove the serialize-deserialize pair in CommonJoinResolver
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-4078: -- Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-3775) Unit test failures due to unspecified order of results in show grant command
[ https://issues.apache.org/jira/browse/HIVE-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588811#comment-13588811 ] Gunther Hagleitner commented on HIVE-3775: -- Updated: https://reviews.facebook.net/D8811

Unit test failures due to unspecified order of results in show grant command
Key: HIVE-3775 URL: https://issues.apache.org/jira/browse/HIVE-3775 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-3775.1-r1417768.patch, HIVE-3775.2.patch

A number of unit tests using show grant (sometimes) fail when run on Windows, or when previous failures leave the database in an unexpected state. The reason is that the output of show grant is not specified to be in any particular order, but the golden files expect it to be. The unit test framework should be extended to handle cases like that.
[jira] [Commented] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
[ https://issues.apache.org/jira/browse/HIVE-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588824#comment-13588824 ] Todd Wilson commented on HIVE-4086: --- Hello Jarcec: Your suggestion for the USE command works when querying Hive directly, but I'm using a couple of ODBC drivers (MapR and HortonWorks) and it looks like this command doesn't work through them (which made me think the command wasn't working/supported in Hive :/). Anyway, I think this is an ODBC issue. Thanks again for your help. Best Regards, Todd Wilson Senior Technical Consultant Coffing Data Warehousing (513) 292-3158 www.CoffingDW.com
[jira] [Updated] (HIVE-4034) Should be able to specify windowing spec without needing Between
[ https://issues.apache.org/jira/browse/HIVE-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4034: --- Assignee: Ashutosh Chauhan

Should be able to specify windowing spec without needing Between
Key: HIVE-4034 URL: https://issues.apache.org/jira/browse/HIVE-4034 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan

Currently users need to do the following:
{noformat} select s, sum(b) over (distribute by i sort by si rows between unbounded preceding and current row) from over100k; {noformat}
but the SQL spec allows the following as well:
{noformat} select s, sum(b) over (distribute by i sort by si rows unbounded preceding) from over100k; {noformat}
In such cases {{current row}} should be assumed implicitly.
[jira] [Commented] (HIVE-4086) Cross Database Support for Indexes and Partitions (or all DDL statements)
[ https://issues.apache.org/jira/browse/HIVE-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588849#comment-13588849 ] Jarek Jarcec Cecho commented on HIVE-4086: -- The {{USE db}} statement definitely works in the Hive shell and the JDBC interface. I can't speak for the ODBC drivers, unfortunately.
[jira] [Updated] (HIVE-4034) Should be able to specify windowing spec without needing Between
[ https://issues.apache.org/jira/browse/HIVE-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4034: --- Attachment: HIVE-4034.patch Patch which fixes the grammar to handle these cases. I found a bug in range handling, so most changes are related to that. See the new positive test cases related to range in window specification.
Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #78
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/78/ -- [...truncated 41971 lines...] [junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0 [junit] 2013-02-27 14:51:02,720 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] Execution completed successfully [junit] Mapred Local Task Succeeded . Convert the Join into MapJoin [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-02-27_14-50-59_570_5853974869671614312/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201302271451_661347323.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] Table default.testhivedrivertable stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0] [junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-02-27_14-51-04_036_7445463875010800454/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-02-27_14-51-04_036_7445463875010800454/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201302271451_879632411.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable
[jira] [Updated] (HIVE-4078) Remove the serialize-deserialize pair in CommonJoinResolver
[ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-4078: -- Status: Open (was: Patch Available) cloneBean() only clones part of the data; it does not do a true deep copy. Remove the serialize-deserialize pair in CommonJoinResolver --- Key: HIVE-4078 URL: https://issues.apache.org/jira/browse/HIVE-4078 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-4078-20130227.patch, HIVE-4078.patch CommonJoinProcessor tries to clone a MapredWork while attempting a conversion to a map-join {code} // deep copy a new mapred work from xml InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8")); MapredWork newWork = Utilities.deserializeMapRedWork(in, physicalContext.getConf()); {code} which is a very heavy operation, both memory-wise and CPU-wise. Instead of cloning via XMLEncoder, it is faster to use BeanUtils.cloneBean(), which follows the same data paths (get/set bean methods) instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
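Gopal's reason for reverting — cloneBean() clones only part of the data — comes down to shallow versus deep copying. A minimal Python sketch (a hypothetical stand-in class, not the actual MapredWork) of why a shallow clone is unsafe for a plan object with nested mutable state:

```python
import copy

# Hypothetical stand-in for a plan object with nested mutable state;
# MapredWork itself is far more complex.
class Work:
    def __init__(self):
        self.aliases = {"a": ["t1"]}

original = Work()

shallow = copy.copy(original)      # analogous to BeanUtils.cloneBean()
deep = copy.deepcopy(original)     # analogous to the XML round-trip

# Mutating the shallow clone's nested map leaks into the original...
shallow.aliases["a"].append("t2")
assert original.aliases["a"] == ["t1", "t2"]

# ...while the deep clone stays independent.
assert deep.aliases["a"] == ["t1"]
```

This is why the slow serialize-deserialize pair gives a true deep copy while the bean-property copy does not: both objects keep references to the same nested collections.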
[jira] [Commented] (HIVE-4044) Add URL type
[ https://issues.apache.org/jira/browse/HIVE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1357#comment-1357 ] Samuel Yuan commented on HIVE-4044: --- You're right, the idea is that it will enable better encoding of URLs. Kevin found that breaking up the URL into its components and storing them as separate columns results in significant space savings. The original plan was to implement this idea with RCFile, but with the new ORC file format I decided to wait for that instead, and to submit this part separately. However, it looks like the improvements of the ORC file have erased any gains we would have gotten by breaking up URLs into the individual components, so this won't be needed any more. Add URL type Key: HIVE-4044 URL: https://issues.apache.org/jira/browse/HIVE-4044 Project: Hive Issue Type: Improvement Reporter: Samuel Yuan Assignee: Samuel Yuan Attachments: HIVE-4044.HIVE-4044.HIVE-4044.D8799.1.patch Having a separate type for URLs would enable improvements in storage efficiency based on breaking up a URL into its components. The new type will be named URL and made a non-reserved keyword (see HIVE-701). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
Kevin Wilfong created HIVE-4090: --- Summary: Use of hive.exec.script.allow.partial.consumption can produce partial results Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for the resulting exception, the setDone method is called, marking the operator as done. However, there's a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
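The failure mode described above can be sketched without real threads (illustrative Python with hypothetical names, not Hive's ScriptOperator): once setDone() marks the operator done, any rows the stdout-processing thread still has in flight are silently dropped.

```python
# Minimal sketch of the reported bug: rows forwarded after setDone()
# never reach the children, yielding partial results.

class Operator:
    def __init__(self):
        self.done = False
        self.forwarded = []

    def set_done(self):
        self.done = True

    def forward(self, row):
        # Mirrors the behavior described above: children never see rows
        # forwarded once the operator is marked done.
        if not self.done:
            self.forwarded.append(row)

op = Operator()
op.forward("row1")
op.set_done()          # script closed its input pipe early
op.forward("row2")     # stdout thread still had this row in flight

assert op.forwarded == ["row1"]   # partial results: row2 was lost
```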
[jira] [Commented] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
[ https://issues.apache.org/jira/browse/HIVE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588891#comment-13588891 ] Kevin Wilfong commented on HIVE-4090: - https://reviews.facebook.net/D8979 Use of hive.exec.script.allow.partial.consumption can produce partial results - Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for the resulting exception, the setDone method is called, marking the operator as done. However, there's a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
[ https://issues.apache.org/jira/browse/HIVE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4090: Attachment: HIVE-4090.1.patch.txt Use of hive.exec.script.allow.partial.consumption can produce partial results - Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Attachments: HIVE-4090.1.patch.txt When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for the resulting exception, the setDone method is called, marking the operator as done. However, there's a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4044) Add URL type
[ https://issues.apache.org/jira/browse/HIVE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4044: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) Per [~sxyuan] this is not needed anymore. Resolving. Add URL type Key: HIVE-4044 URL: https://issues.apache.org/jira/browse/HIVE-4044 Project: Hive Issue Type: Improvement Reporter: Samuel Yuan Assignee: Samuel Yuan Attachments: HIVE-4044.HIVE-4044.HIVE-4044.D8799.1.patch Having a separate type for URLs would enable improvements in storage efficiency based on breaking up a URL into its components. The new type will be named URL and made a non-reserved keyword (see HIVE-701). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4080: -- Attachment: HIVE-4080.D8961.2.patch hbutani updated the revision HIVE-4080 [jira] Add Lead Lag UDAFs. - add Lead and Lag UDAFs, fix issues specified in review Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D8961 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D8961?vs=28749id=28803#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLead.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLeadLag.java ql/src/test/queries/clientpositive/leadlag_queries.q ql/src/test/results/clientpositive/leadlag_queries.q.out To: JIRA, ashutoshc, hbutani Add Lead Lag UDAFs Key: HIVE-4080 URL: https://issues.apache.org/jira/browse/HIVE-4080 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch, HIVE-4080.D8961.2.patch Currently we support Lead/Lag as navigation UDFs usable with Windowing. To be standard-compliant we need to support Lead/Lag UDAFs. We will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. We currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed, as it causes ambiguities when the query contains different partition clauses. More details will be provided with the associated JIRA to remove this feature. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive
[ https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13588986#comment-13588986 ] Kevin Wilfong commented on HIVE-3874: - K, let me know when it's ready for review again. Create a new Optimized Row Columnar file format for Hive Key: HIVE-3874 URL: https://issues.apache.org/jira/browse/HIVE-3874 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: hive.3874.2.patch, HIVE-3874.D8529.1.patch, HIVE-3874.D8529.2.patch, HIVE-3874.D8529.3.patch, HIVE-3874.D8529.4.patch, HIVE-3874.D8871.1.patch, OrcFileIntro.pptx, orc.tgz There are several limitations of the current RC File format that I'd like to address by creating a new format: * each column value is stored as a binary blob, which means: ** the entire column value must be read, decompressed, and deserialized ** the file format can't use smarter type-specific compression ** push down filters can't be evaluated * the start of each row group needs to be found by scanning * user metadata can only be added to the file when the file is created * the file doesn't store the number of rows per file or row group * there is no mechanism for seeking to a particular row number, which is required for external indexes. * there is no mechanism for storing lightweight indexes within the file to enable push-down filters to skip entire row groups. * the types of the rows aren't stored in the file -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
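One of the gaps listed above — lightweight in-file indexes for push-down filters — can be illustrated with a short sketch (illustrative Python, not ORC's actual index layout): per-row-group min/max statistics let a reader skip every group whose value range cannot match the predicate.

```python
# Illustrative sketch of how per-row-group min/max statistics enable
# push-down filters to skip entire row groups without reading them.

def build_stats(row_groups):
    """Record the min and max of each row group's values."""
    return [(min(g), max(g)) for g in row_groups]

def groups_to_read(stats, predicate_value):
    """Keep only groups whose [min, max] range could contain the value."""
    return [i for i, (lo, hi) in enumerate(stats)
            if lo <= predicate_value <= hi]

row_groups = [[1, 4, 7], [10, 12, 15], [20, 25, 30]]
stats = build_stats(row_groups)

# A filter like "col = 12" only needs the middle group.
assert groups_to_read(stats, 12) == [1]
```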
[jira] [Updated] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive
[ https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3874: Status: Open (was: Patch Available) Create a new Optimized Row Columnar file format for Hive Key: HIVE-3874 URL: https://issues.apache.org/jira/browse/HIVE-3874 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: hive.3874.2.patch, HIVE-3874.D8529.1.patch, HIVE-3874.D8529.2.patch, HIVE-3874.D8529.3.patch, HIVE-3874.D8529.4.patch, HIVE-3874.D8871.1.patch, OrcFileIntro.pptx, orc.tgz There are several limitations of the current RC File format that I'd like to address by creating a new format: * each column value is stored as a binary blob, which means: ** the entire column value must be read, decompressed, and deserialized ** the file format can't use smarter type-specific compression ** push down filters can't be evaluated * the start of each row group needs to be found by scanning * user metadata can only be added to the file when the file is created * the file doesn't store the number of rows per file or row group * there is no mechanism for seeking to a particular row number, which is required for external indexes. * there is no mechanism for storing lightweight indexes within the file to enable push-down filters to skip entire row groups. * the types of the rows aren't stored in the file -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3849) Aliased column in where clause for multi-groupby single reducer cannot be resolved
[ https://issues.apache.org/jira/browse/HIVE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3849: Status: Patch Available (was: Open) Aliased column in where clause for multi-groupby single reducer cannot be resolved -- Key: HIVE-3849 URL: https://issues.apache.org/jira/browse/HIVE-3849 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3849.D7713.1.patch, HIVE-3849.D7713.2.patch, HIVE-3849.D7713.3.patch, HIVE-3849.D7713.4.patch, HIVE-3849.D7713.5.patch, HIVE-3849.D7713.6.patch, HIVE-3849.D7713.7.patch While verifying HIVE-3847, I found that an exception is thrown before reaching the error situation described there. Something like: FAILED: SemanticException [Error 10025]: Line 40:6 Expression not in GROUP BY key 'crit5' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3849) Aliased column in where clause for multi-groupby single reducer cannot be resolved
[ https://issues.apache.org/jira/browse/HIVE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589009#comment-13589009 ] Phabricator commented on HIVE-3849: --- navis has commented on the revision HIVE-3849 [jira] Columns are not extracted for multi-groupby single reducer case sometimes. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3537 done. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3548 ok. ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:3569 done. ql/src/test/queries/clientpositive/groupby_multi_single_reducer3.q:8 ah, got it. REVISION DETAIL https://reviews.facebook.net/D7713 To: JIRA, navis Cc: njain Aliased column in where clause for multi-groupby single reducer cannot be resolved -- Key: HIVE-3849 URL: https://issues.apache.org/jira/browse/HIVE-3849 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3849.D7713.1.patch, HIVE-3849.D7713.2.patch, HIVE-3849.D7713.3.patch, HIVE-3849.D7713.4.patch, HIVE-3849.D7713.5.patch, HIVE-3849.D7713.6.patch, HIVE-3849.D7713.7.patch, HIVE-3849.D7713.8.patch While verifying HIVE-3847, I found that an exception is thrown before reaching the error situation described there. Something like: FAILED: SemanticException [Error 10025]: Line 40:6 Expression not in GROUP BY key 'crit5' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3849) Aliased column in where clause for multi-groupby single reducer cannot be resolved
[ https://issues.apache.org/jira/browse/HIVE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3849: -- Attachment: HIVE-3849.D7713.8.patch navis updated the revision HIVE-3849 [jira] Columns are not extracted for multi-groupby single reducer case sometimes. Addressed comments and rebased to trunk Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D7713 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D7713?vs=27243id=28815#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java ql/src/test/queries/clientpositive/groupby_multi_insert_common_distinct.q ql/src/test/queries/clientpositive/groupby_mutli_insert_common_distinct.q ql/src/test/queries/clientpositive/groupby_multi_single_reducer3.q ql/src/test/results/clientpositive/groupby_multi_insert_common_distinct.q.out ql/src/test/results/clientpositive/groupby_mutli_insert_common_distinct.q.out ql/src/test/results/clientpositive/groupby_multi_single_reducer3.q.out To: JIRA, navis Cc: njain Aliased column in where clause for multi-groupby single reducer cannot be resolved -- Key: HIVE-3849 URL: https://issues.apache.org/jira/browse/HIVE-3849 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-3849.D7713.1.patch, HIVE-3849.D7713.2.patch, HIVE-3849.D7713.3.patch, HIVE-3849.D7713.4.patch, HIVE-3849.D7713.5.patch, HIVE-3849.D7713.6.patch, HIVE-3849.D7713.7.patch, HIVE-3849.D7713.8.patch While verifying HIVE-3847, I found that an exception is thrown before reaching the error situation described there. Something like: FAILED: SemanticException [Error 10025]: Line 40:6 Expression not in GROUP BY key 'crit5' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3904) Replace hashmaps in JoinOperators to array
[ https://issues.apache.org/jira/browse/HIVE-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3904: Status: Patch Available (was: Open) Replace hashmaps in JoinOperators to array -- Key: HIVE-3904 URL: https://issues.apache.org/jira/browse/HIVE-3904 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3904.D7959.1.patch The join operators have many HashMaps that map a tag to some internal value (ExprEvals, OIs, etc.), and these are accessed 5 or more times per object, which seems to be unnecessary overhead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
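The optimization can be sketched in a few lines (illustrative Python with hypothetical names, not the actual JoinOperator fields): join tags are small dense integers, so a tag-indexed array can replace the per-row hash lookups the description complains about.

```python
# Sketch of the change described above: because tags are small dense
# integers, a list indexed by tag avoids hashing the key on every access.

NUM_TAGS = 2

# Map-based layout: every access hashes and compares the tag key.
evals_by_map = {0: "eval-left", 1: "eval-right"}

# Array-based layout: the tag itself is the index, so each access is a
# plain O(1) array read with no hashing.
evals_by_array = [None] * NUM_TAGS
for tag, ev in evals_by_map.items():
    evals_by_array[tag] = ev

# Both layouts yield the same values for every tag.
for tag in range(NUM_TAGS):
    assert evals_by_array[tag] == evals_by_map[tag]
```

In Java the win is larger still, since the HashMap path also boxes each int tag into an Integer key.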
[jira] [Updated] (HIVE-3904) Replace hashmaps in JoinOperators to array
[ https://issues.apache.org/jira/browse/HIVE-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3904: -- Attachment: HIVE-3904.D7959.2.patch navis updated the revision HIVE-3904 [jira] Replace hashmaps in JoinOperators to array. Rebased to trunk Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D7959 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D7959?vs=25563id=28821#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java To: JIRA, navis Cc: njain Replace hashmaps in JoinOperators to array -- Key: HIVE-3904 URL: https://issues.apache.org/jira/browse/HIVE-3904 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3904.D7959.1.patch, HIVE-3904.D7959.2.patch The join operators have many HashMaps that map a tag to some internal value (ExprEvals, OIs, etc.), and these are accessed 5 or more times per object, which seems to be unnecessary overhead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4091) [REGRESSION] HIVE-3571 does not run all tests sometimes
Navis created HIVE-4091: --- Summary: [REGRESSION] HIVE-3571 does not run all tests sometimes Key: HIVE-4091 URL: https://issues.apache.org/jira/browse/HIVE-4091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Navis Assignee: Navis ant test sometimes does not run the whole test suite but only the tests in ql (the time difference is about 30 min) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries are being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589058#comment-13589058 ] Navis commented on HIVE-2264: - I'm applying this to all internal hive releases and would like it to be reviewed/applied to apache hive. But sadly, no committer has seemed interested in it. It's already in patch-available status. Do you have more ideas to be merged with this? Then I'll happily assign it to you. Hive server is SHUTTING DOWN when invalid queries are being executed. -- Key: HIVE-2264 URL: https://issues.apache.org/jira/browse/HIVE-2264 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: SuSE-Linux-11 Reporter: rohithsharma Assignee: Navis Priority: Critical Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch When an invalid query is being executed, the Hive server shuts down. {noformat} CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040' ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse' {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries are being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589065#comment-13589065 ] Brock Noland commented on HIVE-2264: [~navis] You have been running with this patch for quite a long time? In regard to getting it merged, I think the best we can do is update the review board item with the rebased patch. Another item that may bring it up in terms of visibility is linking it to HIVE-2935, as it's quite important for HS2. Hive server is SHUTTING DOWN when invalid queries are being executed. -- Key: HIVE-2264 URL: https://issues.apache.org/jira/browse/HIVE-2264 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: SuSE-Linux-11 Reporter: rohithsharma Assignee: Navis Priority: Critical Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch When an invalid query is being executed, the Hive server shuts down. {noformat} CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040' ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse' {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-h0.21 - Build # 1990 - Failure
Changes for Build #1990 1 tests failed. REGRESSION: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1 Error Message: Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. Stack Trace: junit.framework.AssertionFailedError: Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. at junit.framework.Assert.fail(Assert.java:47) at org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:2381) at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:1867) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:232) at junit.framework.TestSuite.run(TestSuite.java:227) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906) The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1990) Status: Failure Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1990/ to view the results.
[jira] [Commented] (HIVE-4080) Add Lead Lag UDAFs
[ https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589076#comment-13589076 ] Phabricator commented on HIVE-4080: --- ashutoshc has requested changes to the revision HIVE-4080 [jira] Add Lead Lag UDAFs. Mostly looks good. * Missing apache headers. * Request for a couple more tests. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLead.java:1 Apache headers are missing. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLeadLag.java:1 Apache headers missing. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLag.java:1 Apache headers missing. ql/src/test/queries/clientpositive/leadlag_queries.q:19 So, lead/lag can now take three params. The first one is the column, the second one is the offset, and the third is the default value. Correct? It will be good to have a test case with a constant default like lag (price,5,10) and one with implicit params like lag (price), which implies an offset of 1 and NULL as the default. REVISION DETAIL https://reviews.facebook.net/D8961 BRANCH HIVE-4080 ARCANIST PROJECT hive To: JIRA, ashutoshc, hbutani Add Lead Lag UDAFs Key: HIVE-4080 URL: https://issues.apache.org/jira/browse/HIVE-4080 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch, HIVE-4080.D8961.2.patch Currently we support Lead/Lag as navigation UDFs usable with Windowing. To be standard-compliant we need to support Lead/Lag UDAFs. We will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing is in play. We currently allow Lead/Lag expressions to appear in SelectLists even when they are not arguments to UDAFs. Support for this feature will probably be removed, as it causes ambiguities when the query contains different partition clauses. More details will be provided with the associated JIRA to remove this feature. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
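To make the three-parameter lead/lag semantics discussed in the review concrete, here is a hypothetical Python sketch of lead/lag over one ordered partition (an illustration only, not Hive's UDAF implementation; the function names and data are invented):

```python
# Toy models of lag(col, offset, default) / lead(col, offset, default):
# the value `offset` rows before (after) the current row within the
# partition, or `default` (NULL when omitted) past the partition edge.
def lag(rows, offset=1, default=None):
    return [rows[i - offset] if i - offset >= 0 else default
            for i in range(len(rows))]

def lead(rows, offset=1, default=None):
    return [rows[i + offset] if i + offset < len(rows) else default
            for i in range(len(rows))]

prices = [10, 20, 30, 40]
print(lag(prices))        # [None, 10, 20, 30] -- implicit offset=1, NULL default
print(lag(prices, 2, 0))  # [0, 0, 10, 20]     -- constant default, like lag(price, 2, 0)
print(lead(prices))       # [20, 30, 40, None]
```

This is exactly the pair of cases the reviewer asks to cover: a constant default and fully implicit parameters.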
[jira] [Commented] (HIVE-4091) [REGRESSION] HIVE-3571 does not run all tests sometimes
[ https://issues.apache.org/jira/browse/HIVE-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589079#comment-13589079 ] Ashutosh Chauhan commented on HIVE-4091: One thing I noticed is that tests in the shims/ dir are not run. Are there others as well? [REGRESSION] HIVE-3571 does not run all tests sometimes --- Key: HIVE-4091 URL: https://issues.apache.org/jira/browse/HIVE-4091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Navis Assignee: Navis ant test sometimes does not run the whole test suite but only the tests in ql (the time difference is about 30 min) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4073) Make partition by optional in over clause
[ https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589129#comment-13589129 ] Brock Noland commented on HIVE-4073: [~ashutoshc] When OVER() with no partition column is specified, we will partition by some constant. In that case, there is only a need for a single reducer. Should this change force the single reducer? Not sure how I would do that yet. Make partition by optional in over clause - Key: HIVE-4073 URL: https://issues.apache.org/jira/browse/HIVE-4073 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Brock Noland select s, sum( i ) over() from tt; should work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3785) Core hive changes for HiveServer2 implementation
[ https://issues.apache.org/jira/browse/HIVE-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589135#comment-13589135 ] Thejas M Nair commented on HIVE-3785: - Hi [~namit] The rebased patch has been uploaded by Prasad in HIVE-2935. Can you please review it using the phabricator link there? The phabricator upload has only the files that have changed. Core hive changes for HiveServer2 implementation Key: HIVE-3785 URL: https://issues.apache.org/jira/browse/HIVE-3785 Project: Hive Issue Type: Sub-task Components: Authentication, Build Infrastructure, Configuration, Thrift API Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HS2-changed-files-only.patch The subtask to track changes in the core hive components for the HiveServer2 implementation -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated HIVE-3850: Attachment: (was: hive-3850.patch) hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object as a parameter. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
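The root cause is that Calendar.HOUR_OF_DAY is a 24-hour value (0-23) while Calendar.HOUR is a 12-hour value (0-11), so the two overloads disagree for any afternoon timestamp. A minimal Python sketch of the two behaviours (an illustration of the Calendar field semantics, not the Hive code; the function names are hypothetical):

```python
from datetime import datetime

def hour_of_day(ts):
    # Analogue of Calendar.HOUR_OF_DAY: 24-hour clock value, 0-23.
    return ts.hour

def hour_12(ts):
    # Analogue of Calendar.HOUR: 12-hour clock value, 0-11.
    return ts.hour % 12

ts = datetime(2013, 2, 27, 16, 30)  # 4:30 PM
print(hour_of_day(ts))  # 16 -- what the Text overload effectively returns
print(hour_12(ts))      # 4  -- what the TimeStampWritable overload returns
```

A fix along the lines of the attached patch would presumably make the TimeStampWritable overload use the HOUR_OF_DAY-style value so both overloads agree.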
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated HIVE-3850: Attachment: hive-3850.patch Re-attaching the patch hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850.patch, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object as a parameter. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated HIVE-3850: Attachment: (was: hive-3850.patch) hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850.patch_1.txt, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object as a parameter. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated HIVE-3850: Attachment: hive-3850.patch_1.txt hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850.patch_1.txt, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object as a parameter. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4073) Make partition by optional in over clause
[ https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-4073: --- Attachment: HIVE-4073-0.patch The attached patch is not for commit, but it does work. I'll add some tests tomorrow and then think about forcing one reducer. When I manually set more than one reducer the patch still worked, since the partition key was the same for all records. Make partition by optional in over clause - Key: HIVE-4073 URL: https://issues.apache.org/jira/browse/HIVE-4073 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Ashutosh Chauhan Assignee: Brock Noland Attachments: HIVE-4073-0.patch select s, sum( i ) over() from tt; should work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
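The observation that the patch still works with several reducers follows from how hash partitioning behaves with a constant key: every row hashes to the same reducer, so extra reducers simply sit idle. A toy Python sketch of that reasoning (not Hive's actual partitioner; the names and constant are invented):

```python
def reducer_for(key, num_reducers):
    # Toy MapReduce-style partitioner: reducer index = hash(key) mod #reducers.
    return hash(key) % num_reducers

rows = [("a", 1), ("b", 2), ("c", 3)]
CONSTANT_KEY = 0  # invented stand-in for the constant used when OVER() has no PARTITION BY
targets = {reducer_for(CONSTANT_KEY, 4) for _ in rows}
print(len(targets))  # 1 -- every row lands on the same reducer
```

This is why forcing a single reducer would be an optimization (avoiding idle reducers) rather than a correctness requirement.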
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anandha L Ranganathan updated HIVE-3850: Attachment: (was: hive-3850.patch_1.txt) hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850_1.patch, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimeStampWritable object as a parameter. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: Request to review the change
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9673/ --- Review request for hive. Description --- Patch for issue https://issues.apache.org/jira/browse/HIVE-3850. Please review. This addresses bug https://issues.apache.org/jira/browse/HIVE-3850. Diffs - /trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFHour.java 115 /trunk/ql/src/test/queries/clientpositive/udf_hour.q 115 /trunk/ql/src/test/results/clientpositive/udf_hour.q.out 115 Diff: https://reviews.apache.org/r/9673/diff/ Testing --- Attached test case with results. Includes .q and .q.out Thanks, Anandha L Ranganathan
[jira] [Updated] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
[ https://issues.apache.org/jira/browse/HIVE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4090: Assignee: Kevin Wilfong Use of hive.exec.script.allow.partial.consumption can produce partial results - Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4090.1.patch.txt When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for this exception, the setDone method is called, marking the operator as done. However, there is a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
[ https://issues.apache.org/jira/browse/HIVE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4090: Status: Patch Available (was: Open) Use of hive.exec.script.allow.partial.consumption can produce partial results - Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Attachments: HIVE-4090.1.patch.txt When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for this exception, the setDone method is called, marking the operator as done. However, there is a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
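The dropped-rows scenario described in HIVE-4090 can be sketched deterministically in Python (a toy model, not Hive's ScriptOperator; the class and method names are invented to mirror setDone/forward):

```python
class ToyOperator:
    """Toy stand-in for an operator whose children ignore rows once it is done."""
    def __init__(self):
        self.done = False
        self.forwarded = []

    def set_done(self):
        # Analogue of setDone(): called from the catch block when the
        # script closes its input pipe early.
        self.done = True

    def forward(self, row):
        # Rows forwarded after set_done() never reach the children.
        if not self.done:
            self.forwarded.append(row)

op = ToyOperator()
for row in range(5):
    op.forward(row)        # stdout-processing thread forwards the first rows
op.set_done()              # operator marked done while rows are still pending
for row in range(5, 10):
    op.forward(row)        # remaining rows are silently dropped
print(op.forwarded)        # [0, 1, 2, 3, 4] -- partial results
```

In Hive the two halves interleave on separate threads, so a fix must enforce that the stdout-processing thread finishes before the operator is marked done, rather than leaving the ordering to chance.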
[jira] [Commented] (HIVE-4014) Hive+RCFile is not doing column pruning and reading much more data than necessary
[ https://issues.apache.org/jira/browse/HIVE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589226#comment-13589226 ] Lianhui Wang commented on HIVE-4014: Hi Tamas, thank you very much, you are right. I also think the RCFile reader is not very efficient: the column ids to be read are transferred to rcfile.reader. Hive+RCFile is not doing column pruning and reading much more data than necessary - Key: HIVE-4014 URL: https://issues.apache.org/jira/browse/HIVE-4014 Project: Hive Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli With even simple projection queries, I see that the HDFS bytes read counter doesn't show any reduction in the amount of data read. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3980) Cleanup after HIVE-3403
[ https://issues.apache.org/jira/browse/HIVE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589241#comment-13589241 ] Namit Jain commented on HIVE-3980: -- [~ashutoshc], ping Cleanup after HIVE-3403 --- Key: HIVE-3980 URL: https://issues.apache.org/jira/browse/HIVE-3980 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3980.1.patch, hive.3980.2.patch There have been a lot of comments on HIVE-3403, which involve changing variable names/function names/adding more comments/general cleanup etc. Since HIVE-3403 involves a lot of refactoring, it was fairly difficult to address the comments there, since refreshing becomes impossible. This jira is to track those cleanups. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3891) physical optimizer changes for auto sort-merge join
[ https://issues.apache.org/jira/browse/HIVE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3891: - Attachment: hive.3891.7.patch physical optimizer changes for auto sort-merge join --- Key: HIVE-3891 URL: https://issues.apache.org/jira/browse/HIVE-3891 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3891.1.patch, hive.3891.2.patch, hive.3891.3.patch, hive.3891.4.patch, hive.3891.5.patch, hive.3891.6.patch, hive.3891.7.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4007) Create abstract classes for serializer and deserializer
[ https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4007: - Attachment: hive.4007.4.patch Create abstract classes for serializer and deserializer --- Key: HIVE-4007 URL: https://issues.apache.org/jira/browse/HIVE-4007 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, hive.4007.4.patch Currently, it is very difficult to change the Serializer/Deserializer interface, since all the SerDes directly implement the interface. Instead, we should have abstract classes for implementing these interfaces. In case of an interface change, only the abstract class and the relevant serde need to change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4042) ignore mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4042: - Attachment: hive.4042.7.patch ignore mapjoin hint --- Key: HIVE-4042 URL: https://issues.apache.org/jira/browse/HIVE-4042 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4042.1.patch, hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, hive.4042.6.patch, hive.4042.7.patch After HIVE-3784, in a production environment, it can become difficult to deploy since a lot of production queries can break. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3891) physical optimizer changes for auto sort-merge join
[ https://issues.apache.org/jira/browse/HIVE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589253#comment-13589253 ] Namit Jain commented on HIVE-3891: -- [~vikram.dixit], I am confused. Look at line 965 of auto_sortmerge_join_1.q.out, it is an SMB join. Going into more detail: (line 486-492) Stage-5 is a root stage , consists of Stage-6, Stage-7, Stage-1 Stage-6 has a backup stage: Stage-1 Stage-3 depends on stages: Stage-6 Stage-7 has a backup stage: Stage-1 Stage-4 depends on stages: Stage-7 Stage-1 Stage-0 is a root stage Stages 6 and 7 are mapjoin jobs, whereas Stage-1 is an SMB join. This is the purpose of this jira: if a mapjoin can be performed, it gets priority over the SMB join. physical optimizer changes for auto sort-merge join --- Key: HIVE-3891 URL: https://issues.apache.org/jira/browse/HIVE-3891 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3891.1.patch, hive.3891.2.patch, hive.3891.3.patch, hive.3891.4.patch, hive.3891.5.patch, hive.3891.6.patch, hive.3891.7.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4090) Use of hive.exec.script.allow.partial.consumption can produce partial results
[ https://issues.apache.org/jira/browse/HIVE-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589254#comment-13589254 ] Namit Jain commented on HIVE-4090: -- +1 Use of hive.exec.script.allow.partial.consumption can produce partial results - Key: HIVE-4090 URL: https://issues.apache.org/jira/browse/HIVE-4090 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4090.1.patch.txt When users execute a transform script with the config hive.exec.script.allow.partial.consumption set to true, it may produce partial results. When this config is set, the script may close its input pipe before its parent operator has finished passing it rows. In the catch block for this exception, the setDone method is called, marking the operator as done. However, there is a separate thread running to process rows passed from the script back to Hive via stdout. If this thread is not done processing rows, any rows it forwards after the setDone method is called will not be passed to its children. This leads to partial results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3952: - Status: Open (was: Patch Available) Can you create a phabricator entry? merge map-job followed by map-reduce job Key: HIVE-3952 URL: https://issues.apache.org/jira/browse/HIVE-3952 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Vinod Kumar Vavilapalli Attachments: HIVE-3952-20130226.txt, HIVE-3952-20130227.1.txt Consider a query like: select count(*) FROM ( select idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo); where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-684) add UDF make_set
[ https://issues.apache.org/jira/browse/HIVE-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-684: Status: Open (was: Patch Available) Have you run all the tests? Some test outputs need to be updated. Also, can you create a phabricator entry? add UDF make_set Key: HIVE-684 URL: https://issues.apache.org/jira/browse/HIVE-684 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: PRETTY SITHARA Attachments: HIVE-684.1.patch.txt, HIVE-684.2.patch.txt, input.txt.txt, make_set.q, make_set.q.out Add UDF make_set. Look at http://dev.mysql.com/doc/refman/5.0/en/func-op-summary-ref.html for details -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4077) alterPartition and alterPartitions methods in ObjectStore swallow exceptions
[ https://issues.apache.org/jira/browse/HIVE-4077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4077: - Status: Open (was: Patch Available) comments alterPartition and alterPartitions methods in ObjectStore swallow exceptions Key: HIVE-4077 URL: https://issues.apache.org/jira/browse/HIVE-4077 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4077.1.patch.txt, HIVE-4077.2.patch.txt The alterPartition and alterPartitions methods in the ObjectStore class throw a MetaException in the case of a failure but do not include the cause, meaning that information is lost. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4056) Extend rcfilecat to support (un)compressed size and no. of row
[ https://issues.apache.org/jira/browse/HIVE-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4056: - Resolution: Fixed Fix Version/s: 0.11.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Tim Extend rcfilecat to support (un)compressed size and no. of row -- Key: HIVE-4056 URL: https://issues.apache.org/jira/browse/HIVE-4056 Project: Hive Issue Type: Bug Components: Statistics Reporter: Gang Tim Liu Assignee: Gang Tim Liu Fix For: 0.11.0 Attachments: HIVE-4056.patch.1 rcfilecat supports data and metadata: https://cwiki.apache.org/Hive/rcfilecat.html In metadata, it supports column statistics. It will be natural to extend metadata support to 1. no. of rows 2. uncompressed size for the file 3. compressed size for the file -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4007) Create abstract classes for serializer and deserializer
[ https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589316#comment-13589316 ] Namit Jain commented on HIVE-4007: -- Refreshed, tests passed Create abstract classes for serializer and deserializer --- Key: HIVE-4007 URL: https://issues.apache.org/jira/browse/HIVE-4007 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, hive.4007.4.patch Currently, it is very difficult to change the Serializer/Deserializer interface, since all the SerDes directly implement the interface. Instead, we should have abstract classes for implementing these interfaces. In case of an interface change, only the abstract class and the relevant serde need to change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira