[jira] [Commented] (HIVE-4131) Fix eclipse template classpath to include new packages added by ORC file patch

2013-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597881#comment-13597881
 ] 

Hudson commented on HIVE-4131:
--

Integrated in Hive-trunk-h0.21 #2007 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2007/])
HIVE-4131. Fix eclipse template classpath to include new packages added by 
ORC file patch. (Prasad Mujumdar via kevinwilfong) (Revision 1454496)

 Result = SUCCESS
kevinwilfong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1454496
Files : 
* /hive/trunk/eclipse-templates/.classpath


 Fix eclipse template classpath to include new packages added by ORC file patch
 --

 Key: HIVE-4131
 URL: https://issues.apache.org/jira/browse/HIVE-4131
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.11.0

 Attachments: HIVE-4131-1.patch


 The ORC file feature (HIVE-3874) added the protobuf and snappy libraries, 
 as well as generated protobuf code. All of these need to be included in the 
 Eclipse classpath template. The Eclipse project generated on the latest trunk 
 has build errors due to the missing jars/classes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4097) ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids

2013-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597880#comment-13597880
 ] 

Hudson commented on HIVE-4097:
--

Integrated in Hive-trunk-h0.21 #2007 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2007/])
HIVE-4097 : ORC file doesnt properly interpret empty 
hive.io.file.readcolumn.ids (Owen Omalley via Ashutosh Chauhan) (Revision 
1454453)

 Result = SUCCESS
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1454453
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java


 ORC file doesn't properly interpret empty hive.io.file.readcolumn.ids
 -

 Key: HIVE-4097
 URL: https://issues.apache.org/jira/browse/HIVE-4097
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4097.D9015.1.patch


 Hive assumes that an empty string in hive.io.file.readcolumn.ids means all 
 columns. The ORC reader currently assumes it means no columns.
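
The mismatch described above can be illustrated with a minimal Java sketch. It assumes the property value is a comma-separated list of zero-based column ids. The class and method names (ReadColumnIds, includedColumns) are hypothetical and are not taken from the actual OrcInputFormat patch; the only point is that an empty value selects every column rather than none.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not the actual OrcInputFormat code. */
class ReadColumnIds {
    /**
     * Returns one flag per column, true if the column should be read.
     * A null or empty id string is treated as "all columns".
     */
    static boolean[] includedColumns(String ids, int numColumns) {
        boolean[] include = new boolean[numColumns];
        if (ids == null || ids.trim().isEmpty()) {
            Arrays.fill(include, true); // empty means everything, not nothing
            return include;
        }
        for (String token : ids.split(",")) {
            int id = Integer.parseInt(token.trim());
            if (id >= 0 && id < numColumns) {
                include[id] = true; // ids outside the schema are ignored in this sketch
            }
        }
        return include;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(includedColumns("", 3)));
        System.out.println(Arrays.toString(includedColumns("0,2", 3)));
    }
}
```

With three columns, the empty string selects all three, while "0,2" selects only the first and third.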



[jira] [Commented] (HIVE-4098) OrcInputFormat assumes Hive always calls createValue

2013-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597882#comment-13597882
 ] 

Hudson commented on HIVE-4098:
--

Integrated in Hive-trunk-h0.21 #2007 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2007/])
HIVE-4098 : OrcInputFormat assumes Hive always calls createValue (Owen 
Omalley via Ashutosh Chauhan) (Revision 1454454)

 Result = SUCCESS
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1454454
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java


 OrcInputFormat assumes Hive always calls createValue
 

 Key: HIVE-4098
 URL: https://issues.apache.org/jira/browse/HIVE-4098
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4098.D9021.1.patch


 Hive's HiveContextAwareRecordReader doesn't create a new value for each 
 InputFormat and instead reuses the same row between input formats. That 
 causes the first record of the second (and third, etc.) partition to be 
 dropped and replaced with the last row of the previous partition.
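
One way such a symptom can arise is sketched below as a toy in Java. Everything here is hypothetical (Row, PrefetchingReader, RowReuseDemo and the method names) and it is not the actual OrcInputFormat code: it assumes a reader that pre-reads its first record into its own row object and expects the caller to have obtained that object through createValue().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Mutable row holder, standing in for a reusable value object. */
class Row {
    String data;
}

/** Toy reader (hypothetical) that pre-reads its first record into its own Row. */
class PrefetchingReader {
    private final Iterator<String> records;
    private final Row first = new Row();
    private boolean firstPending;

    PrefetchingReader(List<String> partition) {
        records = partition.iterator();
        firstPending = records.hasNext();
        if (firstPending) {
            first.data = records.next();
        }
    }

    Row createValue() {
        return first;
    }

    /** Fragile: assumes value is the object handed out by createValue(). */
    boolean nextAssumingOwnValue(Row value) {
        if (firstPending) {
            firstPending = false;
            return true; // nothing is copied into value
        }
        return advance(value);
    }

    /** Safer: always fills whatever object the caller passed in. */
    boolean nextIntoCallerValue(Row value) {
        if (firstPending) {
            firstPending = false;
            value.data = first.data;
            return true;
        }
        return advance(value);
    }

    private boolean advance(Row value) {
        if (!records.hasNext()) {
            return false;
        }
        value.data = records.next();
        return true;
    }
}

class RowReuseDemo {
    /** Reads all partitions while reusing one Row across readers. */
    static List<String> readAll(List<List<String>> partitions, boolean safe) {
        List<String> out = new ArrayList<>();
        Row shared = null;
        for (List<String> partition : partitions) {
            PrefetchingReader reader = new PrefetchingReader(partition);
            if (shared == null) {
                shared = reader.createValue(); // created once, then reused
            }
            while (safe ? reader.nextIntoCallerValue(shared)
                        : reader.nextAssumingOwnValue(shared)) {
                out.add(shared.data);
            }
        }
        return out;
    }
}
```

For the partitions [a, b] and [c, d], the fragile variant yields a, b, b, d: the first record of the second partition is lost and the last row of the first partition appears in its place. The variant that always writes into the caller's object yields a, b, c, d.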



Hive-trunk-h0.21 - Build # 2007 - Fixed

2013-03-09 Thread Apache Jenkins Server
Changes for Build #2005

Changes for Build #2006

Changes for Build #2007
[kevinwilfong] HIVE-4131. Fix eclipse template classpath to include new 
packages added by ORC file patch. (Prasad Mujumdar via kevinwilfong)

[hashutosh] HIVE-4098 : OrcInputFormat assumes Hive always calls createValue 
(Owen Omalley via Ashutosh Chauhan)

[hashutosh] HIVE-4097 : ORC file doesnt properly interpret empty 
hive.io.file.readcolumn.ids (Owen Omalley via Ashutosh Chauhan)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2007)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2007/ to 
view the results.

[jira] [Updated] (HIVE-4096) problem in hive.map.groupby.sorted with distincts

2013-03-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4096:
-

Status: Patch Available  (was: Open)

Tests passed

 problem in hive.map.groupby.sorted with distincts
 -

 Key: HIVE-4096
 URL: https://issues.apache.org/jira/browse/HIVE-4096
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4096.1.patch


 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 set hive.exec.reducers.max = 10;
 set hive.map.groupby.sorted=true;
 CREATE TABLE T1(key STRING, val STRING) PARTITIONED BY (ds string)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1 PARTITION 
 (ds='1');
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE T1 PARTITION (ds='1') select key, val from T1 where ds 
 = '1';
 CREATE TABLE outputTbl1(cnt INT);
 -- The plan should be converted to a map-side group by, since the
 -- sorting columns and grouping columns match, and all the bucketing columns
 -- are part of sorting columns
 EXPLAIN
 select count(distinct key) from T1;
 select count(distinct key) from T1;
 explain
 INSERT OVERWRITE TABLE outputTbl1
 select count(distinct key) from T1;
 INSERT OVERWRITE TABLE outputTbl1
 select count(distinct key) from T1;
 SELECT * FROM outputTbl1;
 DROP TABLE T1;
 The above queries give wrong results.



[jira] [Updated] (HIVE-4127) Testing with Hadoop 2.x causes test failure for ORC's TestFileDump

2013-03-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4127:
---

   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Owen!

 Testing with Hadoop 2.x causes test failure for ORC's TestFileDump
 --

 Key: HIVE-4127
 URL: https://issues.apache.org/jira/browse/HIVE-4127
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4127.D9111.1.patch


 Hadoop 2 ships a newer version of JUnit, which changes the behavior of 
 TestFileDump.



[jira] [Updated] (HIVE-4067) Followup to HIVE-701: reduce ambiguity in grammar

2013-03-09 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4067:
---

Status: Open  (was: Patch Available)

I started getting the following errors while compiling after HIVE-701. This 
patch doesn't address them.
{noformat}
 [java] error(111): /home/ashutosh/hive/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:303:10: reference to attribute outside of a rule: KEY
 [java] error(111): /home/ashutosh/hive/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:303:10: reference to attribute outside of a rule: VALUE
 [java] error(111): /home/ashutosh/hive/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:303:10: reference to attribute outside of a rule: ELEM
 [java] error(146): /home/ashutosh/hive/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:303:10: invalid StringTemplate % shorthand syntax: '%)'
{noformat}

I also suspect it is because of this that we later get 
{noformat}
 [java] Java Result: 1
{noformat}
at the conclusion of grammar compilation. We compile the grammar by calling 
out from ant to a java process that runs org.antlr.Tool. This java process 
used to exit with return code 0 before HIVE-701 but now exits with return 
code 1, which implies some error condition. My suspicion is that the errors I 
pointed out above are causing this. We should fix both of these problems. 

 Followup to HIVE-701: reduce ambiguity in grammar
 -

 Key: HIVE-4067
 URL: https://issues.apache.org/jira/browse/HIVE-4067
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Samuel Yuan
Assignee: Samuel Yuan
Priority: Minor
 Attachments: HIVE-4067.D8883.1.patch


 After HIVE-701 the grammar has become much more ambiguous, and the 
 compilation generates a large number of warnings. Making FROM, DISTINCT, 
 PRESERVE, COLUMN, ALL, AND, OR, and NOT reserved keywords again reduces the 
 number of warnings to 134, up from the original 81 warnings but down from the 
 565 after HIVE-701. Most of the remaining ambiguity is trivial, an example 
 being KW_ELEM_TYPE | KW_KEY_TYPE | KW_VALUE_TYPE | identifier, and they are 
 all correctly handled by ANTLR.



[jira] [Updated] (HIVE-4042) ignore mapjoin hint

2013-03-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-4042:
-

Attachment: hive.4042.10.patch

 ignore mapjoin hint
 ---

 Key: HIVE-4042
 URL: https://issues.apache.org/jira/browse/HIVE-4042
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4042.10.patch, hive.4042.1.patch, 
 hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, 
 hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch, hive.4042.9.patch


 After HIVE-3784, in a production environment, it can become difficult to
 deploy since a lot of production queries can break.



[jira] [Commented] (HIVE-4141) InspectorFactories contains static HashMaps which can cause infinite loop

2013-03-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597991#comment-13597991
 ] 

Jarek Jarcec Cecho commented on HIVE-4141:
--

+1 (non-binding)

 InspectorFactories contains static HashMaps which can cause infinite loop
 -

 Key: HIVE-4141
 URL: https://issues.apache.org/jira/browse/HIVE-4141
 Project: Hive
  Issue Type: Sub-task
  Components: Server Infrastructure
Reporter: Brock Noland
 Attachments: HIVE-4141-1.patch


 When many clients hit hs2, hs2 can get stuck in an infinite loop due to 
 concurrent modification of the static maps here:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/LazyObjectInspectorFactory.java
 and in other ObjectFactories. 
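
An unsynchronized java.util.HashMap is not safe under concurrent modification. A common remedy is a concurrent map with an atomic get-or-create, sketched below in Java. This is an assumption about one possible fix, not a description of the attached patch, and the names InspectorCache and Inspector are hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical cache; names are illustrative, not Hive's actual factory code. */
class InspectorCache {
    /** Stand-in for a cached object inspector. */
    static final class Inspector {
        final String typeName;

        Inspector(String typeName) {
            this.typeName = typeName;
        }
    }

    // A plain static HashMap here would be unsafe under concurrent writes.
    private static final ConcurrentMap<String, Inspector> CACHE = new ConcurrentHashMap<>();

    /** Returns the single cached inspector for typeName, creating it atomically if absent. */
    static Inspector get(String typeName) {
        return CACHE.computeIfAbsent(typeName, Inspector::new);
    }
}
```

Every caller asking for the same type name receives the same instance, and no external locking is needed around the lookup.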



[jira] [Created] (HIVE-4143) Incorrect column mappings with over clause

2013-03-09 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-4143:
--

 Summary: Incorrect column mappings with over clause
 Key: HIVE-4143
 URL: https://issues.apache.org/jira/browse/HIVE-4143
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan






[jira] [Commented] (HIVE-4143) Incorrect column mappings with over clause

2013-03-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597996#comment-13597996
 ] 

Ashutosh Chauhan commented on HIVE-4143:


Stack trace of failed 3rd MR job:
{code}
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
    at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:162)
    ... 14 more
Caused by: java.lang.RuntimeException: cannot find field _wcol0 from [0:_col0, 1:_col1, 2:_col2]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:995)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1021)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:417)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeOp(Operator.java:402)
    at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:102)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:417)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:78)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:417)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:154)
{code}

If I remove the last where clause {{where rnk = 3}}, the query succeeds.

 Incorrect column mappings with over clause
 --

 Key: HIVE-4143
 URL: https://issues.apache.org/jira/browse/HIVE-4143
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan





[jira] [Updated] (HIVE-3466) create a new type of tables: dependent tables

2013-03-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3466:
-

Assignee: Namit Jain
 Summary: create a new type of tables: dependent tables  (was: maintain 
dependency between external table partitions and managed table partitions )

 create a new type of tables: dependent tables
 -

 Key: HIVE-3466
 URL: https://issues.apache.org/jira/browse/HIVE-3466
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Commented] (HIVE-3466) create a new type of tables: dependent tables

2013-03-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598009#comment-13598009
 ] 

Namit Jain commented on HIVE-3466:
--

Instead of changing the behavior of external tables, it would be easier to add 
a new type of tables for storing that dependency, dependent tables.

 create a new type of tables: dependent tables
 -

 Key: HIVE-3466
 URL: https://issues.apache.org/jira/browse/HIVE-3466
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Comment Edited] (HIVE-3466) create a new type of tables: dependent tables

2013-03-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455930#comment-13455930
 ] 

Namit Jain edited comment on HIVE-3466 at 3/9/13 6:56 PM:
--

https://cwiki.apache.org/confluence/display/Hive/Dependent+Tables

  was (Author: namit):
https://cwiki.apache.org/confluence/display/Hive/External+Tables
  
 create a new type of tables: dependent tables
 -

 Key: HIVE-3466
 URL: https://issues.apache.org/jira/browse/HIVE-3466
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Commented] (HIVE-4042) ignore mapjoin hint

2013-03-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598038#comment-13598038
 ] 

Namit Jain commented on HIVE-4042:
--

Too many test updates; changed the parameter to false for tests and true otherwise.

 ignore mapjoin hint
 ---

 Key: HIVE-4042
 URL: https://issues.apache.org/jira/browse/HIVE-4042
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4042.10.patch, hive.4042.1.patch, 
 hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, 
 hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch, hive.4042.9.patch


 After HIVE-3784, in a production environment, it can become difficult to
 deploy since a lot of production queries can break.



[jira] [Commented] (HIVE-2055) Hive HBase Integration issue

2013-03-09 Thread Michael Naumov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598041#comment-13598041
 ] 

Michael Naumov commented on HIVE-2055:
--

Somehow CM 4.5 is ignoring Hive Client Configuration Safety Values for 
hive-site.xml and throws the same error in Hive:
java.io.IOException: Cannot create an instance of InputSplit class = 
org.apache.hadoop.hive.hbase.HBaseSplit:Class 
org.apache.hadoop.hive.hbase.HBaseSplit not found

 Hive HBase Integration issue
 

 Key: HIVE-2055
 URL: https://issues.apache.org/jira/browse/HIVE-2055
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: sajith v

 Created an external table in Hive that points to an HBase table. Querying a 
 column by its name in the select clause throws the following exception: 
 ( java.lang.ClassNotFoundException: 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, 
 SQLState:42000)



[jira] [Commented] (HIVE-4042) ignore mapjoin hint

2013-03-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598042#comment-13598042
 ] 

Namit Jain commented on HIVE-4042:
--

tests pass

 ignore mapjoin hint
 ---

 Key: HIVE-4042
 URL: https://issues.apache.org/jira/browse/HIVE-4042
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.4042.10.patch, hive.4042.1.patch, 
 hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, 
 hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch, hive.4042.9.patch


 After HIVE-3784, in a production environment, it can become difficult to
 deploy since a lot of production queries can break.



[jira] [Commented] (HIVE-2379) Hive/HBase integration could be improved

2013-03-09 Thread Michael Naumov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598044#comment-13598044
 ] 

Michael Naumov commented on HIVE-2379:
--

Somehow CM 4.5 is ignoring Hive Client Configuration Safety Values for 
hive-site.xml


 Hive/HBase integration could be improved
 

 Key: HIVE-2379
 URL: https://issues.apache.org/jira/browse/HIVE-2379
 Project: Hive
  Issue Type: Bug
  Components: CLI, Clients, HBase Handler
Affects Versions: 0.7.1, 0.8.0, 0.9.0
Reporter: Roman Shaposhnik
Assignee: Navis
Priority: Critical
 Attachments: HIVE-2379.D7347.1.patch


 For now, any Hive/HBase query requires the following jars to be 
 explicitly added via Hive's add jar command:
 add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
 add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
 add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
 add jar /usr/lib/hive/lib/guava-r06.jar;
 The longer-term solution, perhaps, should be to have the code at submit time 
 call HBase's 
 TableMapReduceUtil.addDependencyJar(job, HBaseStorageHandler.class) to ship 
 the jar in the distributed cache.



[jira] [Created] (HIVE-4144) Add select database() command to show the current database

2013-03-09 Thread Mark Grover (JIRA)
Mark Grover created HIVE-4144:
-

 Summary: Add select database() command to show the current 
database
 Key: HIVE-4144
 URL: https://issues.apache.org/jira/browse/HIVE-4144
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Mark Grover


A recent hive-user mailing list conversation asked about having a command to 
show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.



Re: Review Request: Request to review the change

2013-03-09 Thread Mark Grover

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9673/#review17649
---

Ship it!


Ship It!

- Mark Grover


On Feb. 28, 2013, 4:05 a.m., Anandha L Ranganahan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/9673/
 ---
 
 (Updated Feb. 28, 2013, 4:05 a.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 Patch for issue https://issues.apache.org/jira/browse/HIVE-3850. Please 
 review.
 
 
 This addresses bug https://issues.apache.org/jira/browse/HIVE-3850.
 
 
 Diffs
 -
 
   /trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFHour.java 115 
   /trunk/ql/src/test/queries/clientpositive/udf_hour.q 115 
   /trunk/ql/src/test/results/clientpositive/udf_hour.q.out 115 
 
 Diff: https://reviews.apache.org/r/9673/diff/
 
 
 Testing
 ---
 
 Attached test case with results. Includes .q and .q.out
 
 
 Thanks,
 
 Anandha L Ranganahan
 




[jira] [Commented] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype

2013-03-09 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598049#comment-13598049
 ] 

Mark Grover commented on HIVE-3850:
---

+1 (non-committer)

 hour() function returns 12 hour clock value when using timestamp datatype
 -

 Key: HIVE-3850
 URL: https://issues.apache.org/jira/browse/HIVE-3850
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.9.0, 0.10.0
Reporter: Pieterjan Vriends
 Fix For: 0.11.0

 Attachments: hive-3850_1.patch, HIVE-3850.patch.txt


 Apparently UDFHour.java has two evaluate() functions: one accepts a Text 
 object as its parameter and the other a TimestampWritable object. The first 
 returns the value of Calendar.HOUR_OF_DAY and the second that of 
 Calendar.HOUR. I couldn't find any information in the documentation on the 
 overloads of the evaluate function, and I spent quite some time finding out 
 why my statement didn't return a 24-hour clock value.
 Shouldn't both functions return the same value?
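
The difference between the two Calendar fields can be seen in a short, standalone Java sketch. It uses only java.util.Calendar and does not reproduce UDFHour itself; the class name HourFields and its helper methods are hypothetical.

```java
import java.util.Calendar;
import java.util.TimeZone;

class HourFields {
    /** Builds a calendar for 2013-03-09 at the given hour of day, in UTC. */
    static Calendar at(int hourOfDay) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        c.clear();
        c.set(2013, Calendar.MARCH, 9, hourOfDay, 56, 0);
        return c;
    }

    /** 12-hour clock field, range 0-11. */
    static int hour12(Calendar c) {
        return c.get(Calendar.HOUR);
    }

    /** 24-hour clock field, range 0-23. */
    static int hour24(Calendar c) {
        return c.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        Calendar evening = at(18);
        System.out.println(hour12(evening)); // prints 6
        System.out.println(hour24(evening)); // prints 18
    }
}
```

The two fields agree only for morning hours, which is why the discrepancy is easy to miss with morning test data.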



Re: Hive-3963

2013-03-09 Thread Mark Grover
Maxime,
I posted a comment on the JIRA. Thanks!

On Thu, Mar 7, 2013 at 5:57 AM, mlanciau mlanc...@gmail.com wrote:
 Hello !

 I am working on https://issues.apache.org/jira/browse/HIVE-3963 to allow
 Hive users to pull data from databases so that a big Hadoop/Hive table can
 be joined with a small reference table.

 So I have coded a UDTF, LoadFromJDBC, and it is working well. But I am sure
 it can be improved a lot!

 I am looking for any comments, advice, or help!

 Thanks.

 --
 Maxime LANCIAUX
 http://maximelanciauxbi.blogspot.fr/


[jira] [Commented] (HIVE-3963) Allow Hive to connect to RDBMS

2013-03-09 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598050#comment-13598050
 ] 

Mark Grover commented on HIVE-3963:
---

Maxime, can you upload a patch and post it for review?

Also, out of curiosity, when would a user use this instead of something 
like Apache Sqoop?

 Allow Hive to connect to RDBMS
 --

 Key: HIVE-3963
 URL: https://issues.apache.org/jira/browse/HIVE-3963
 Project: Hive
  Issue Type: New Feature
  Components: Import/Export, JDBC, SQL, StorageHandler
Affects Versions: 0.9.0, 0.10.0, 0.9.1, 0.11.0
Reporter: Maxime LANCIAUX

 I am thinking about something like :
 SELECT jdbcload('driver','url','user','password','sql') FROM dual;
 There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for 
 JDBCStorageHandler



Re: Merging HCatalog into Hive

2013-03-09 Thread Alan Gates
Alright, I've gotten some feedback from Brock around the JIRA stuff and Carl in 
a live conversation expressed his desire to move hcat into the Hive namespace 
sooner rather than later.  So the proposal is that we'd move the code to 
org.apache.hive.hcatalog, though we would create shell classes and interfaces 
in org.apache.hcatalog for all public classes and interfaces so that it will be 
backward compatible.  I'm fine with doing this now.
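
The shell-class idea can be approximated in a single Java file, with nested classes standing in for the two packages. This is a sketch only: the names ShellClassDemo, RelocatedSchema and LegacySchema are hypothetical, and marking the shell as deprecated is an assumption rather than part of the proposal.

```java
/** Single-file approximation: nested classes stand in for the two packages. */
class ShellClassDemo {
    /** Stand-in for a public class after the move to org.apache.hive.hcatalog. */
    static class RelocatedSchema {
        String describe() {
            return "schema";
        }
    }

    /** Stand-in for the shell left in org.apache.hcatalog: it adds no logic. */
    @Deprecated
    static class LegacySchema extends RelocatedSchema {
    }

    public static void main(String[] args) {
        // Client code written against the old name keeps compiling and
        // gets the relocated behavior through inheritance.
        RelocatedSchema viaOldName = new LegacySchema();
        System.out.println(viaOldName.describe());
    }
}
```

A subclass shell makes objects created under the old name usable wherever the new type is expected, but not the reverse, so APIs that return the new type to old callers would need separate handling.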

So, let's get started.  Carl, could you create an hcatalog directory under 
trunk/hive and grant the listed hcat committers karma on it?  Then I'll get 
started on moving the actual code.

Alan.

On Feb 24, 2013, at 12:22 PM, Brock Noland wrote:

 Looks good from my perspective and I am glad to see this moving forward.
 
 Regarding #4 (JIRA)
 
 I don't know if there's a way to upload existing JIRAs into Hive's JIRA,
 but I think it would be better to leave them where they are.
 
 JIRA has a bulk move feature, but I am curious as to why we would leave them
 under the old project. There might be good reason to orphan them, but my
 first thought is that it would be nice to have them under the HIVE project
 simply for search purposes.
 
 Brock
 
 
 
 
 On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates ga...@hortonworks.com wrote:
 
 Alright, our vote has passed, it's time to get on with merging HCatalog
 into Hive.  Here are the things I can think of that we need to deal with.
 Please add any additional issues I've missed:
 
 1) Moving the code
 2) Dealing with domain names in the code
 3) The mailing lists
 4) The JIRA
 5) The website
 6) Committer rights
 7) Make a proposal for how HCat is released going forward
 8) Publish an FAQ
 
 Proposals for how we handle these:
 Below I propose an approach for how to handle each of these.  Feedback
 welcome.
 
 1) Moving the code
 I propose that HCat move into a subdirectory of Hive.  This fits nicely
 into Hive's structure since it already has metastore, ql, etc.  We'd just
 add 'hcatalog' as a new directory.  This directory would contain hcatalog
 as it is today.  It does not follow Hive's standard build model so we'd
 need to do some work to make it so that building Hive also builds HCat, but
 this should be minimal.
 
 2) Dealing with domain names
 HCat code currently is under org.apache.hcatalog.  Do we want to change
 it?  In time we probably should change it to match the rest of Hive
 (org.apache.hadoop.hive.hcatalog).  We need to do this in a backward
 compatible way.  I propose we leave it as is for now and if we decide to in
 the future we can move the actual code to org.apache.hadoop.hive.hcatalog
 and create shell classes under org.apache.hcatalog.
 
 3) The mailing lists
 Given that our goal is to merge the projects and not create a subproject
 we should merge the mailing lists rather than keep hcat specific lists.  We
 can ask infra to remove hcatalog-*@incubator.apache.org and forward any
 new mail to the appropriate Hive lists.  We need to find out if they can
 auto-subscribe people from the hcat lists to the hive lists.  Given that
 traffic on the Hive lists is an order of magnitude higher we should warn
 people before we auto-subscribe them and allow them a chance to get off.
 
 4) JIRA
 We can create an hcatalog component in Hive's JIRA.  All new HCat issues
 could be filed there.  I don't know if there's a way to upload existing
 JIRAs into Hive's JIRA, but I think it would be better to leave them where
 they are.  We should see if infra can turn off the ability to create new
 JIRAs in hcatalog.
 
 5) Website
 We will need to integrate HCatalog's website with Hive's.  This should be
 easy except for the documentation.  HCat uses forrest for docs, Hive uses
 wiki.  We will need to put links under 'Documentation' for older versions
 of HCat docs so users can find them.  As far as how docs are handled for
 the next version of HCatalog, I think that depends on the answer to
 question 7 (next release of HCat), but I propose that HCat needs to conform
 to the way Hive does docs on wiki.  Though I would strongly encourage the
 HCat docs to be version specific (that is, have a set of wiki pages for
 each version).  incubator.apache.org/hcatalog should be changed to
 forward to hive.apache.org.
 
 6) Committer rights
 Carl will need to set up committer rights for all the new HCat committers.
 Based on our discussion of making active HCat committers Hive submodule
 committers this would add the following set:  Alan, Sushanth, Francis,
 Daniel, Vandana, Travis, and Mithun.  Ashutosh and Paul are already Hive
 committers, and neither Devaraj nor Mac have been active in HCat in over a
 year.
 
 7) Future releases
 We need to discuss how future releases will happen, as I think this will
 help developers and users know how to respond to the merge.  I propose that
 HCat will simply become part of future Hive releases.  Thus Hive 0.11 (or
 whatever the next major release is) will include HCatalog.  If there are
 issues found we may need to make 

[jira] [Updated] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-03-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4078:
--

Summary: Delay the serialize-deserialize pair in CommonJoinTaskDispatcher  
(was: Delay the serialize-deserialize pair in CommonJoinResolver)

 Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
 

 Key: HIVE-4078
 URL: https://issues.apache.org/jira/browse/HIVE-4078
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-4078-20130305.2.patch, HIVE-4078-20130305.patch, 
 HIVE-4078-20130406.patch


 CommonJoinProcessor tries to clone a MapredWork while attempting a conversion 
 to a map-join
 {code}
   // deep copy a new mapred work from xml
   InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
   MapredWork newWork = Utilities.deserializeMapRedWork(in, 
 physicalContext.getConf());
 {code}
 which is a very heavy operation, both memory-wise and CPU-wise.
 It would be better to do this only if a conditional task is required, 
 resulting in a copy of the task.
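
 The idea of deferring the copy can be sketched in plain Java. This is only an
 illustration under stated assumptions: LazyCopyDemo, deepCopy and plan are
 hypothetical names, a String stands in for MapredWork, and deepCopy stands in
 for the XML serialize-deserialize round trip. None of this is Hive code.

```java
import java.util.function.Supplier;

// Hypothetical illustration only; these names are not part of Hive.
final class LazyCopyDemo {

    // Counts deep copies so the saving is observable.
    static int copies = 0;

    // Stand-in for the expensive serialize-deserialize round trip.
    static String deepCopy(String work) {
        copies++;
        return new String(work);
    }

    // Returns the original work when no conditional task is needed,
    // and a fresh copy only when one is.
    static String plan(String work, boolean needsConditionalTask) {
        // Building the supplier is cheap; the copy happens only on get().
        Supplier<String> copy = () -> deepCopy(work);
        if (!needsConditionalTask) {
            return work;
        }
        return copy.get();
    }

    public static void main(String[] args) {
        plan("join-work", false);
        plan("join-work", true);
        System.out.println("copies made: " + copies);
    }
}
```

 The copy is wrapped in a Supplier so that the cost is paid only on the branch
 that actually needs a second, independent instance.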



[jira] [Updated] (HIVE-4078) Delay the serialize-deserialize pair in CommonJoinTaskDispatcher

2013-03-09 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-4078:
--

  Labels: client perfomance  (was: )
Release Note: Create a copy of the MapredWork only if a conditional task is 
involved and avoid the copy if the task is non-conditional.  (was: Create a 
copy of the MapredWork only if a conditional task is involved and avoid the 
copy if the task is non-conditional)
  Status: Patch Available  (was: Open)

 Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
 

 Key: HIVE-4078
 URL: https://issues.apache.org/jira/browse/HIVE-4078
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Gopal V
Assignee: Gopal V
  Labels: perfomance, client
 Attachments: HIVE-4078-20130305.2.patch, HIVE-4078-20130305.patch, 
 HIVE-4078-20130406.patch


 CommonJoinProcessor tries to clone a MapredWork while attempting a conversion 
 to a map-join
 {code}
   // deep copy a new mapred work from xml
   InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
   MapredWork newWork = Utilities.deserializeMapRedWork(in, 
 physicalContext.getConf());
 {code}
 which is a very heavy operation, both memory-wise and CPU-wise.
 It would be better to do this only if a conditional task is required, 
 resulting in a copy of the task.



[jira] [Updated] (HIVE-4139) MiniDFS shim does not work for hadoop 2

2013-03-09 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-4139:
-

Attachment: HIVE-4139.2.patch

Had to add shim for MiniMRCluster as well. Ran tests on 1 line, 2 line and .20. 
All seem to check out fine.

 MiniDFS shim does not work for hadoop 2
 ---

 Key: HIVE-4139
 URL: https://issues.apache.org/jira/browse/HIVE-4139
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4139.1.patch, HIVE-4139.2.patch


 There's an incompatibility between hadoop 1 and 2 wrt the MiniDfsCluster 
 class. That causes the hadoop 2 line Minimr tests to fail with a 
 MethodNotFound exception.



[jira] [Commented] (HIVE-4143) Incorrect column mappings with over clause

2013-03-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598131#comment-13598131
 ] 

Ashutosh Chauhan commented on HIVE-4143:


A simpler query which demonstrates the problem is:
{noformat}
select ts, dec, rnk
from
  (select ts, dec, rank() over (partition by ts)  as rnk
  from (select other.ts, other.dec
 from over10k other join over10k on (other.b = over10k.b)
) item_sales 
  ) item_rank
where rnk =  3;
{noformat}

Workaround is to {{set hive.ppd.remove.duplicatefilters=false;}} 
It seems that Hive is too aggressive in pushing filters. In this particular case 
filter (rnk = 3) gets pushed over PTFOperator and then this filter tries to 
reference column (rnk) which in fact is generated by PTFOperator. We need to 
make sure this filter cannot be pushed over PTFOperator.
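
The rule implied here can be sketched in plain Java. This is only an
illustration, not Hive's optimizer code: PushDownCheck and canPushBelow are
hypothetical names, and columns are modelled as plain strings. The check is a
necessary condition for pushing a filter below an operator, not a sufficient
one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; these names are not part of Hive.
final class PushDownCheck {

    // A filter can move below an operator only if every column it reads
    // is already present in that operator's input. A column that the
    // operator itself generates (such as rnk) fails this check.
    static boolean canPushBelow(Set<String> filterColumns, Set<String> operatorInputColumns) {
        return operatorInputColumns.containsAll(filterColumns);
    }

    public static void main(String[] args) {
        Set<String> ptfInput = new HashSet<>(Arrays.asList("ts", "dec"));
        Set<String> rankFilter = new HashSet<>(Arrays.asList("rnk"));
        Set<String> tsFilter = new HashSet<>(Arrays.asList("ts"));
        System.out.println("filter on rnk: " + canPushBelow(rankFilter, ptfInput));
        System.out.println("filter on ts:  " + canPushBelow(tsFilter, ptfInput));
    }
}
```

In the query above, the input to the windowing step has only ts and dec, so a
filter on rnk has nothing to bind to if it is evaluated before that step.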

 Incorrect column mappings with over clause
 --

 Key: HIVE-4143
 URL: https://issues.apache.org/jira/browse/HIVE-4143
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan





[jira] [Commented] (HIVE-4127) Testing with Hadoop 2.x causes test failure for ORC's TestFileDump

2013-03-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598136#comment-13598136
 ] 

Hudson commented on HIVE-4127:
--

Integrated in Hive-trunk-h0.21 #2008 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2008/])
HIVE-4127 : Testing with Hadoop 2.x causes test failure for ORC 
TestFileDump (Owen Omalley via Ashutosh Chauhan) (Revision 1454736)

 Result = SUCCESS
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1454736
Files : 
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestFileDump.java
* /hive/trunk/ql/src/test/resources/orc-file-dump.out


 Testing with Hadoop 2.x causes test failure for ORC's TestFileDump
 --

 Key: HIVE-4127
 URL: https://issues.apache.org/jira/browse/HIVE-4127
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.11.0

 Attachments: HIVE-4127.D9111.1.patch


 Hadoop 2's junit is a newer version, which causes differences in the behavior 
 of TestFileDump. 



[jira] [Commented] (HIVE-2055) Hive HBase Integration issue

2013-03-09 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598145#comment-13598145
 ] 

Brock Noland commented on HIVE-2055:


[~michaeln] CM should be discussed here 
https://groups.google.com/a/cloudera.org/forum/?fromgroups#!forum/scm-users

 Hive HBase Integration issue
 

 Key: HIVE-2055
 URL: https://issues.apache.org/jira/browse/HIVE-2055
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: sajith v

 Created an external table in Hive, which points to the HBase table. When I 
 tried to query a column using the column name in the select clause, I got the 
 following exception: ( java.lang.ClassNotFoundException: 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, 
 SQLState:42000)
