[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x

2012-08-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435843#comment-13435843
 ] 

Namit Jain commented on HIVE-3029:
--

@Carl, can you take care of this?

If you are busy, I can wrap it up - let me know.

 Update ShimLoader to work with Hadoop 2.x
 -

 Key: HIVE-3029
 URL: https://issues.apache.org/jira/browse/HIVE-3029
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3380) As a follow up for HIVE-3276, optimize union for dynamic partition queries

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3380:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Will do as part of HIVE-3276

 As a follow up for HIVE-3276, optimize union for dynamic partition queries
 --

 Key: HIVE-3380
 URL: https://issues.apache.org/jira/browse/HIVE-3380
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain







Hive-trunk-h0.21 - Build # 1610 - Still Failing

2012-08-16 Thread Apache Jenkins Server
Changes for Build #1606
[cws] HIVE-2804. Task log retrieval fails on Hadoop 0.23 (Zhenxiao Luo via cws)


Changes for Build #1607
[cws] HIVE-3337. Create Table Like should copy configured Table Parameters 
(Bhushan Mandhani via cws)


Changes for Build #1608

Changes for Build #1609
[hashutosh] HIVE-3385 : fixing 0.23 test build (Sushanth Sowmyan via Ashutosh 
Chauhan)


Changes for Build #1610



No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1610)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1610/ to 
view the results.

Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #107

2012-08-16 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/107/

--
[...truncated 10116 lines...]
 [echo] Project: odbc
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/odbc/src/conf
 does not exist.

ivy-resolve-test:
 [echo] Project: odbc

ivy-retrieve-test:
 [echo] Project: odbc

compile-test:
 [echo] Project: odbc

create-dirs:
 [echo] Project: serde
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/serde/src/test/resources
 does not exist.

init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:
 [echo] Project: serde
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-serde-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-serde-default.html

ivy-retrieve:
 [echo] Project: serde

dynamic-serde:

compile:
 [echo] Project: serde

ivy-resolve-test:
 [echo] Project: serde

ivy-retrieve-test:
 [echo] Project: serde

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/service/src/test/resources
 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/service/test/classes

test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common/java;/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 

Re: Problem with Hive Indexing

2012-08-16 Thread Mahsa Mofidpoor
Hi,

The table size must be at least 5GB for the index to be used for
filter pushdown. Otherwise you have to comment out the checkQuerySize method.
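
For reference, 5GB is 5 * 1024^3 bytes. The following is a minimal sketch, not a confirmed fix for this thread: it assumes the size check reads the hive.optimize.index.filter.compact.minsize property (an assumption not stated in this message, so verify it against your Hive version) and lowers it for one session instead of editing the source. It only prints the command it would run.

```shell
# 5GB expressed in bytes, the threshold the reply above refers to.
min_bytes=$((5 * 1024 * 1024 * 1024))
echo "threshold in bytes: $min_bytes"

# Assumption: the size check reads hive.optimize.index.filter.compact.minsize.
# Setting it to 0 for a single session would let a small table such as pokes qualify.
settings="set hive.optimize.index.filter=true; set hive.optimize.index.filter.compact.minsize=0;"
query="select foo from pokes WHERE foo=498;"

# Print the command rather than running it, so the sketch works without a Hive install.
echo "hive -e \"$settings $query\""
```

If the property is honoured in your build, this avoids patching checkQuerySize just to exercise a custom index handler on a small test table.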

Cheers,
Mahsa

On Mon, Jul 30, 2012 at 11:12 AM, Ablimit Aji abli...@gmail.com wrote:

 I have written a custom index handler and wanted to test it. However, hive
 is not using it.
 So I tested with the simple table (pokes (int foo, string bar)) which comes with
 the hive distribution for testing purposes.
 Then I created a compact index and ran set
 hive.optimize.index.filter=true;
 However, upon checking the log info, it seems hive is still not using the
 index.
 So, what is the problem?
 The query I issued is as follows:  select foo from pokes WHERE foo=498 ;

 Below is the log info I got after issuing the query.

 12/07/26 12:25:17 INFO index.IndexWhereProcessor: Processing predicate for
 index optimization
 12/07/26 12:25:17 INFO index.IndexWhereProcessor: (foo = 498)
 12/07/26 12:25:17 INFO metastore.HiveMetaStore: 0: get_table : db=default
 tbl=pokes_idx
 12/07/26 12:25:17 INFO hive.log: DDL: struct pokes_idx { i32 foo, string
 _bucketname, list _offsets}
 12/07/26 12:25:17 INFO index.IndexWhereProcessor: checking index
 staleness...
 12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
 12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
 12/07/26 12:25:17 INFO util.NativeCodeLoader: Loaded the native-hadoop
 library
 12/07/26 12:25:17 WARN snappy.LoadSnappy: Snappy native library not loaded



Re: Problem with Hive Indexing

2012-08-16 Thread Ablimit Aji
Thanks Mahsa !
I didn't know that there is such a constraint.

Best,
Ablimit

On Thu, Aug 16, 2012 at 12:32 PM, Mahsa Mofidpoor mofidp...@gmail.com wrote:

 Hi,

 The table size must be at least 5GB for the index to be used for
 filter pushdown. Otherwise you have to comment out the checkQuerySize method.

 Cheers,
 Mahsa




[jira] [Created] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle

2012-08-16 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-3390:
--

 Summary: Hive List Bucketing - DDL support - DB upgrade script for 
Derby, Postgres, and Oracle
 Key: HIVE-3390
 URL: https://issues.apache.org/jira/browse/HIVE-3390
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu


This is a follow-up for HIVE-3072.

We need upgrade scripts for Derby, Postgres, and Oracle.





[jira] [Updated] (HIVE-3268) expressions in cluster by are not working

2012-08-16 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3268:


   Resolution: Fixed
Fix Version/s: 0.10.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Namit.

 expressions in cluster by are not working
 -

 Key: HIVE-3268
 URL: https://issues.apache.org/jira/browse/HIVE-3268
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3268.1.patch, hive.3268.2.patch


 The following query fails:
 select key+key, value from src cluster by key+key, value;





[jira] [Commented] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436105#comment-13436105
 ] 

Kevin Wilfong commented on HIVE-3276:
-

Namit, is this ready for review?  You mention that more tests need to be added, 
but the JIRA is marked Patch Available.

 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch


 It might be a good idea to optimize simple union queries containing 
 map-reduce jobs in at least one of the sub-queries.
 For example, a query like:
 insert overwrite table T1 partition P1
 select * from 
 (
   subq1
 union all
   subq2
 ) u;
 today creates 3 map-reduce jobs, one for subq1, another for subq2 and 
 the final one for the union. 
 It might be a good idea to optimize this. Instead of creating the union 
 task, it might be simpler to create a move task (or something like a move
 task), where the outputs of the two sub-queries will be moved to the final 
 directory. This can easily extend to more than 2 sub-queries in the union.
 This is only useful if there is a select * followed by filesink after the
 union. This can be independently useful, and also be used to optimize the
 skewed joins https://cwiki.apache.org/Hive/skewed-join-optimization.html





[jira] [Commented] (HIVE-138) Provide option to export a HEADER

2012-08-16 Thread Andrew Perepelytsya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436115#comment-13436115
 ] 

Andrew Perepelytsya commented on HIVE-138:
--

Adam, it doesn't look like 'with header' syntax was implemented as described 
originally, but I checked the patch diff and setting this option did work for 
me in 0.7.x:
{code}set hive.cli.print.header=true;{code}
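
A minimal sketch of applying this setting non-interactively, assuming the hive CLI is on the PATH and that a table named some_table exists (the table name is borrowed from the issue description; the output file name result.tsv is hypothetical):

```shell
# Assumptions: `hive` is installed and a table named some_table exists.
query='set hive.cli.print.header=true; SELECT * FROM some_table LIMIT 3;'

if command -v hive >/dev/null 2>&1; then
  # With the setting enabled, the column names should appear as the first output row.
  hive -e "$query" > result.tsv
else
  # Fall back to showing the command so the sketch still runs without Hive.
  echo "hive CLI not found; would run: hive -e \"$query\""
fi
```

Because the setting is issued in the same hive -e invocation as the SELECT, it applies only to that session and does not change any site-wide configuration.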

 Provide option to export a HEADER
 -

 Key: HIVE-138
 URL: https://issues.apache.org/jira/browse/HIVE-138
 Project: Hive
  Issue Type: Improvement
  Components: Clients, Query Processor
Reporter: Adam Kramer
Assignee: Paul Butler
Priority: Minor
 Fix For: 0.7.0

 Attachments: HIVE-138.patch


 When writing data to directories or files for later analysis, or when 
 exploring data in the hive CLI with raw SELECT statements, it'd be great if 
 we could get a header or something so we know which columns our output 
 comes from. Any chance this is easy to add? Just print the column names (or 
 formula used to generate them) in the first row?
 SELECT foo.* WITH HEADER FROM some_table foo limit 3;
 col1   col2   col3
 1   9   6
 7   5   0
 7   5   3
 SELECT f.col1-f.col2, col3 WITH HEADER FROM some_table foo limit 3;
 f.col1-f.col2 col3
 -8 6
 2 0
 2 3
 ...etc





[jira] [Assigned] (HIVE-3388) Improve Performance of UDF PERCENTILE_APPROX()

2012-08-16 Thread Rongrong Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rongrong Zhong reassigned HIVE-3388:


Assignee: Rongrong Zhong

 Improve Performance of UDF PERCENTILE_APPROX()
 --

 Key: HIVE-3388
 URL: https://issues.apache.org/jira/browse/HIVE-3388
 Project: Hive
  Issue Type: Task
Reporter: Rongrong Zhong
Assignee: Rongrong Zhong
Priority: Minor







[jira] [Updated] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3276:
-

Status: Open  (was: Patch Available)

 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch






[jira] [Commented] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436176#comment-13436176
 ] 

Namit Jain commented on HIVE-3276:
--

I was able to run tests for hadoop 23.
I will upload the new patch soon.

 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch






Re: Possible patch to fix column comments with non-native SerDe

2012-08-16 Thread Jakob Homan
You'll need to update your serde to use the method call that takes comments.  
See 
https://github.com/jghoman/haivvreo/commit/29ead1fe101baafa8e9844eaf92022cbe4846c6f
 for an example.  




On Tuesday, August 7, 2012 at 7:21 AM, Stephen R. Scaffidi wrote:

 So, the patch does not seem to fix the problem we are having, but 
 combined with the one I sent to the list it seems to take care of it. I 
 will continue to study this issue and report back on any issues.
 
 Thanks again!
 
 On 08/06/2012 06:59 PM, Stephen Scaffid wrote:
  Thanks! I'll see how it goes!
  
  (better yet, this could be what it takes to convince the team to upgrade!)
  
  On Aug 6, 2012, at 6:47 PM, Jakob Homan wrote:
  
   This was fixed in Hive 8 
   (https://issues.apache.org/jira/browse/HIVE-2171). Can you just apply 
   that patch?
   
   On Mon, Aug 6, 2012 at 2:15 PM, Stephen R. Scaffidi 
   sscaff...@tripadvisor.com wrote:
   My team and I have been trying, with limited success, to use the COMMENT 
   feature of hive columns to maintain documentation for the tables and 
   columns in our data-warehouse built on hive. However, we use a number of 
   custom and non-native SerDes, and what happens to those tables is that 
   the comments always get overwritten with the string from deserializer.
   
   I've possibly found a way to work around this from within hive but I want 
   to get some insight from the hive-dev community to figure out whether or 
   not this is a patently bad idea and we are just setting ourselves up for 
   pain later on.
   
   I won't go into all the details but it seems to work in our (so far) 
   limited testing. However, we are using hive 0.7.1 and the patch I am 
   sending is against master/HEAD.
   
   Please let me know if this is an acceptable approach to preserving column 
   comments with non-native SerDes or not! 



[jira] [Updated] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3375:
-

Attachment: hive.3375.3.patch

 bucketed map join should check that the number of files match the number of 
 buckets
 ---

 Key: HIVE-3375
 URL: https://issues.apache.org/jira/browse/HIVE-3375
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch


 Currently, we get NPE if that is not the case





[jira] [Commented] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets

2012-08-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436190#comment-13436190
 ] 

Namit Jain commented on HIVE-3375:
--

Refreshed after a few other commits.

 bucketed map join should check that the number of files match the number of 
 buckets
 ---

 Key: HIVE-3375
 URL: https://issues.apache.org/jira/browse/HIVE-3375
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch


 Currently, we get NPE if that is not the case





[jira] [Updated] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3276:
-

Attachment: hive.3276.3.patch

 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch






[jira] [Commented] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436202#comment-13436202
 ] 

Namit Jain commented on HIVE-3276:
--

@Kevin, this is ready for review.

 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch






[jira] [Updated] (HIVE-3276) optimize union sub-queries

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3276:
-

Status: Patch Available  (was: Open)

ShimLoader changes are copied from HIVE-3029, only to run tests on hadoop 23.
Once HIVE-3029 is checked in, this file will be reverted

Also, to run only the newly added tests on hadoop 23:

ant clean package


ant test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true 
-Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha 
-Dtestcase=TestCliDriver 
-Dqfile=union_remove_1.q,union_remove_2.q,union_remove_3.q,union_remove_4.q,union_remove_5.q,union_remove_6.q,union_remove_7.q,union_remove_8.q,union_remove_9.q,union_remove_10.q,union_remove_11.q,union_remove_12.q,union_remove_13.q,union_remove_14.q,union_remove_15.q,union_remove_16.q,union_remove_17.q,union_remove_18.q
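
The long -Dqfile list above can be generated rather than typed. A small sketch, assuming the test files are numbered 1 through 18 as in the command above and that the standard seq, sed, and paste utilities are available:

```shell
# Build "union_remove_1.q,union_remove_2.q,...,union_remove_18.q" as one comma-separated string.
qfiles=$(seq 1 18 | sed 's/.*/union_remove_&.q/' | paste -sd, -)
echo "$qfiles"

# The generated list would then be passed to the same invocation quoted above
# (left commented out because it needs a Hive source checkout to run):
# ant test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true \
#   -Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha \
#   -Dtestcase=TestCliDriver -Dqfile="$qfiles"
```

Changing the upper bound passed to seq is enough to pick up additional union_remove tests if more are added later.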


 optimize union sub-queries
 --

 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch






[jira] [Resolved] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle

2012-08-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-3390.
--

Resolution: Invalid

This work needs to be done as part of HIVE-3072.

 Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, 
 and Oracle
 -

 Key: HIVE-3390
 URL: https://issues.apache.org/jira/browse/HIVE-3390
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu

 This is a follow-up for HIVE-3072.
 We need upgrade scripts for Derby, Postgres, and Oracle.





[jira] [Created] (HIVE-3391) Keep the original query in HiveDriverRunHookContextImpl

2012-08-16 Thread Dawid Dabrowski (JIRA)
Dawid Dabrowski created HIVE-3391:
-

 Summary: Keep the original query in HiveDriverRunHookContextImpl
 Key: HIVE-3391
 URL: https://issues.apache.org/jira/browse/HIVE-3391
 Project: Hive
  Issue Type: Improvement
Reporter: Dawid Dabrowski
Priority: Minor


It'd be useful to have access to the original query in hooks. The hook that's 
executed first is HiveDriverRunHook, let's add it there.





[jira] [Commented] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets

2012-08-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436227#comment-13436227
 ] 

Carl Steinbach commented on HIVE-3375:
--

+1

@Namit: Can you handle testing this and getting it committed? Thanks.

 bucketed map join should check that the number of files match the number of 
 buckets
 ---

 Key: HIVE-3375
 URL: https://issues.apache.org/jira/browse/HIVE-3375
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch


 Currently, we get NPE if that is not the case
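
The check described in this issue could look roughly like the sketch below: compare the declared bucket count with the number of files found and fail with a descriptive message rather than a NullPointerException. The class and method names are assumptions made for illustration and do not come from the attached patches.

```java
import java.util.List;

final class BucketFileCheckSketch {
    // Fails fast with a descriptive message instead of a later NullPointerException.
    static void checkBucketFiles(String tableName, int declaredBuckets, List<String> bucketFiles) {
        int actualFiles = (bucketFiles == null) ? 0 : bucketFiles.size();
        if (actualFiles != declaredBuckets) {
            throw new IllegalStateException(
                "Table " + tableName + " declares " + declaredBuckets
                + " buckets but has " + actualFiles + " files");
        }
    }
}
```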





[jira] [Commented] (HIVE-3228) unable to load null values that represent a timestamp value

2012-08-16 Thread Neha Tomar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436245#comment-13436245
 ] 

Neha Tomar commented on HIVE-3228:
--

I don't think this issue is fixed, as I am still seeing it. Can anyone please 
provide an update?

 unable to load null values that represent a timestamp value
 ---

 Key: HIVE-3228
 URL: https://issues.apache.org/jira/browse/HIVE-3228
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: N Campbell
 Attachments: CERT.TTS.txt


 Attempting to load delimited data into a table with one or more timestamp 
 columns will fail when null values are represented in the input set.
 load data local inpath 'CERT.TTS.txt'
 overwrite into table CERT.TTS_E;
 insert overwrite table CERT.TTS  select * from CERT.TTS_E;
 Error: Query returned non-zero code: 9, cause: FAILED: Execution Error, 
 return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
 SQLState:  08S01
 ErrorCode: 9
 create table if not exists CERT.TTS_E ( RNUM int , CTS timestamp)
 row format delimited
 fields terminated by '\t'
 stored as textfile;
 create table if not exists CERT.TTS ( RNUM int , CTS timestamp) 
 stored as sequencefile;
 0 
 1 1996-01-01 00:00:00.0
 2 1996-01-01 12:00:00.0
 3 1996-01-01 23:59:30.12300
 4 2000-01-01 00:00:00.0
 5 2000-01-01 12:00:00.0
 6 2000-01-01 23:59:30.12300
 7 2000-12-31 00:00:00.0
 8 2000-12-31 12:00:00.0
 9 2000-12-31 12:15:30.12300
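
 For reference, a minimal sketch of null-tolerant timestamp parsing for rows like the ones above, where row 0 has an empty timestamp field. It uses only java.sql.Timestamp.valueOf from the JDK. The class name is hypothetical, and mapping empty or malformed fields to null is an assumption about the desired behavior, not a description of how Hive handles this.

```java
import java.sql.Timestamp;

final class NullSafeTimestampSketch {
    // Returns null for missing, empty, or malformed fields instead of throwing.
    static Timestamp parseOrNull(String field) {
        if (field == null) {
            return null;
        }
        String trimmed = field.trim();
        if (trimmed.isEmpty()) {
            return null; // an empty field in the delimited file represents NULL
        }
        try {
            // Expects the JDBC escape format: yyyy-[m]m-[d]d hh:mm:ss[.f...]
            return Timestamp.valueOf(trimmed);
        } catch (IllegalArgumentException e) {
            return null; // malformed input is treated as NULL in this sketch
        }
    }
}
```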





[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle

2012-08-16 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436246#comment-13436246
 ] 

Gang Tim Liu commented on HIVE-3390:


@Carl, got it. Do you have instructions for generating the upgrade scripts for 
Derby, Postgres, and Oracle? Are you using SchemaTool? If you have instructions, 
they would be a big help. Thanks.

 Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, 
 and Oracle
 -

 Key: HIVE-3390
 URL: https://issues.apache.org/jira/browse/HIVE-3390
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu

 This is a follow-up for HIVE-3072.
 We need upgrade scripts for Derby, Postgres, and Oracle.





[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle

2012-08-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436251#comment-13436251
 ] 

Carl Steinbach commented on HIVE-3390:
--

Schema tool won't generate the upgrade script for you. These need to be written 
by hand. I recommend looking at the other upgrade script examples in the 
derby/postgres/oracle directories.

Also, it's probably worth doing this last once you're fairly certain that 
people won't request any more changes to the JDO mapping file.

 Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, 
 and Oracle
 -

 Key: HIVE-3390
 URL: https://issues.apache.org/jira/browse/HIVE-3390
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu

 This is a follow-up for HIVE-3072.
 We need upgrade scripts for Derby, Postgres, and Oracle.





[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle

2012-08-16 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436253#comment-13436253
 ] 

Gang Tim Liu commented on HIVE-3390:


I see. I will do it last.

I am making changes and aim to get you a patch to review today.

Thanks.

 Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, 
 and Oracle
 -

 Key: HIVE-3390
 URL: https://issues.apache.org/jira/browse/HIVE-3390
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu

 This is a follow-up for HIVE-3072.
 We need upgrade scripts for Derby, Postgres, and Oracle.





Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #107

2012-08-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/

--
[...truncated 36653 lines...]
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-08-16_14-18-28_732_5449941647958900016/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/artifact/hive/build/service/tmp/hive_job_log_jenkins_201208161418_190836605.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Copying file: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2012-08-16_14-18-33_058_8798019762878503929/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-08-16_14-18-33_058_8798019762878503929/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/artifact/hive/build/service/tmp/hive_job_log_jenkins_201208161418_1853619718.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/artifact/hive/build/service/tmp/hive_job_log_jenkins_201208161418_29133.txt
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/artifact/hive/build/service/tmp/hive_job_log_jenkins_201208161418_156525202.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: 

[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop

2012-08-16 Thread Andrew Chalfant (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436323#comment-13436323
 ] 

Andrew Chalfant commented on HIVE-3068:
---

pingping

 Add ability to export table metadata as JSON on table drop
 --

 Key: HIVE-3068
 URL: https://issues.apache.org/jira/browse/HIVE-3068
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Serializers/Deserializers
Reporter: Andrew Chalfant
Assignee: Andrew Chalfant
Priority: Minor
  Labels: features, newbie
 Attachments: HIVE-3068.2.patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 When a table is dropped, the contents go to the user's trash but the metadata 
 is lost. It would be super neat to be able to save the metadata as well so 
 that tables could be trivially re-instantiated via thrift.





[jira] [Commented] (HIVE-3268) expressions in cluster by are not working

2012-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436365#comment-13436365
 ] 

Hudson commented on HIVE-3268:
--

Integrated in Hive-trunk-h0.21 #1611 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1611/])
HIVE-3268. expressions in cluster by are not working. (njain via 
kevinwilfong) (Revision 1373918)

 Result = SUCCESS
kevinwilfong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1373918
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientnegative/expr_clusterby1.q
* /hive/trunk/ql/src/test/queries/clientnegative/expr_distributeby1.q
* /hive/trunk/ql/src/test/queries/clientnegative/expr_distributeby_sortby_1.q
* /hive/trunk/ql/src/test/queries/clientnegative/expr_orderby1.q
* /hive/trunk/ql/src/test/queries/clientnegative/expr_sortby1.q
* /hive/trunk/ql/src/test/results/clientnegative/expr_clusterby1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/expr_distributeby1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/expr_distributeby_sortby_1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/expr_orderby1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/expr_sortby1.q.out


 expressions in cluster by are not working
 -

 Key: HIVE-3268
 URL: https://issues.apache.org/jira/browse/HIVE-3268
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3268.1.patch, hive.3268.2.patch


 The following query fails:
 select key+key, value from src cluster by key+key, value;





[jira] [Created] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table

2012-08-16 Thread Jonathan Natkins (JIRA)
Jonathan Natkins created HIVE-3392:
--

 Summary: Hive unnecessarily validates table SerDes when dropping a 
table
 Key: HIVE-3392
 URL: https://issues.apache.org/jira/browse/HIVE-3392
 Project: Hive
  Issue Type: Bug
Reporter: Jonathan Natkins


natty@hadoop1:~$ hive
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';
OK
Time taken: 2.399 seconds

natty@hadoop1:~$ hive
hive> drop table test;
FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist))
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
    at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
    ... 20 more

hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> drop table test;
OK
Time taken: 0.658 seconds
hive> 
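
The behavior the issue title asks for can be sketched as follows: the catalog stores only the SerDe class name and resolves it on demand, so dropping a table never needs the SerDe jar. TableSketch and CatalogSketch are hypothetical names invented for this illustration; this is not how Hive's metastore or DDLSemanticAnalyzer is actually structured.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical table record: only names are stored, nothing is loaded eagerly.
final class TableSketch {
    final String name;
    final String serdeClassName;

    TableSketch(String name, String serdeClassName) {
        this.name = name;
        this.serdeClassName = serdeClassName;
    }

    // Resolves the SerDe class on demand; fails if the class is not on the classpath.
    Object loadDeserializer() {
        try {
            return Class.forName(serdeClassName).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("SerDe " + serdeClassName + " does not exist", e);
        }
    }
}

// Hypothetical in-memory catalog.
final class CatalogSketch {
    private final Map<String, TableSketch> tables = new HashMap<>();

    void create(TableSketch table) {
        tables.put(table.name, table);
    }

    // Dropping needs only the table name, so the SerDe is never touched.
    boolean drop(String tableName) {
        return tables.remove(tableName) != null;
    }
}
```

In this sketch a table whose SerDe class is missing can still be dropped; only operations that actually read rows would call loadDeserializer() and hit the error.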






[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table

2012-08-16 Thread Jonathan Natkins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Natkins updated HIVE-3392:
---

Description: 
natty@hadoop1:~$ hive
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';
OK
Time taken: 2.399 seconds

natty@hadoop1:~$ hive
hive> drop table test;
FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist))
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
    at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
    ... 20 more

hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> drop table test;
OK
Time taken: 0.658 seconds
hive> 


  was:
natty@hadoop1:~$ hive
hive> add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';
OK
Time taken: 2.399 seconds

natty@hadoop1:~$ hive
hive> drop table test;
FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist))
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
    at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
    at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
    at 

Hive-trunk-h0.21 - Build # 1611 - Fixed

2012-08-16 Thread Apache Jenkins Server
Changes for Build #1606
[cws] HIVE-2804. Task log retrieval fails on Hadoop 0.23 (Zhenxiao Luo via cws)


Changes for Build #1607
[cws] HIVE-3337. Create Table Like should copy configured Table Parameters 
(Bhushan Mandhani via cws)


Changes for Build #1608

Changes for Build #1609
[hashutosh] HIVE-3385 : fixing 0.23 test build (Sushanth Sowmyan via Ashutosh 
Chauhan)


Changes for Build #1610

Changes for Build #1611
[kevinwilfong] HIVE-3268. expressions in cluster by are not working. (njain via 
kevinwilfong)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1611)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1611/ to 
view the results.

[jira] [Updated] (HIVE-3268) expressions in cluster by are not working

2012-08-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3268:
-

Component/s: Query Processor

 expressions in cluster by are not working
 -

 Key: HIVE-3268
 URL: https://issues.apache.org/jira/browse/HIVE-3268
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.10.0

 Attachments: hive.3268.1.patch, hive.3268.2.patch


 The following query fails:
 select key+key, value from src cluster by key+key, value;





[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop

2012-08-16 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436416#comment-13436416
 ] 

Edward Capriolo commented on HIVE-3068:
---

Dude, don't do that. The average patch sits in the queue for some time, and 
many committers volunteer their time. I'll review ASAP.

 Add ability to export table metadata as JSON on table drop
 --

 Key: HIVE-3068
 URL: https://issues.apache.org/jira/browse/HIVE-3068
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Serializers/Deserializers
Reporter: Andrew Chalfant
Assignee: Andrew Chalfant
Priority: Minor
  Labels: features, newbie
 Attachments: HIVE-3068.2.patch.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 When a table is dropped, the contents go to the user's trash but the metadata 
 is lost. It would be super neat to be able to save the metadata as well so 
 that tables could be trivially re-instantiated via thrift.





[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x

2012-08-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436422#comment-13436422
 ] 

Carl Steinbach commented on HIVE-3029:
--

@Namit: Looking at this now. Will commit soon. Thanks.

 Update ShimLoader to work with Hadoop 2.x
 -

 Key: HIVE-3029
 URL: https://issues.apache.org/jira/browse/HIVE-3029
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch








[jira] [Updated] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only

2012-08-16 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2925:


Attachment: HIVE-2925.3.patch.txt

 Support non-MR fetching for simple queries with select/limit/filter 
 operations only
 ---

 Key: HIVE-2925
 URL: https://issues.apache.org/jira/browse/HIVE-2925
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-2925.1.patch.txt, HIVE-2925.2.patch.txt, 
 HIVE-2925.3.patch.txt, HIVE-2925.D2607.1.patch, HIVE-2925.D2607.2.patch, 
 HIVE-2925.D2607.3.patch, HIVE-2925.D2607.4.patch


 It's trivial but frequently requested by end users. Currently, select queries 
 with simple conditions or a limit still have to run an MR job, which takes 
 some time, especially for big tables, and irritates people.
 For such simple queries, using a fetch task would make them happy.
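
 One way to picture the decision this issue describes is a predicate that allows the fetch path only when a query consists solely of select, filter, and limit operations. The enum and class below are hypothetical and written only for illustration; they are not taken from the attached patches or from Hive's optimizer.

```java
import java.util.EnumSet;
import java.util.Set;

final class FetchTaskEligibilitySketch {
    // Hypothetical operator kinds a query plan might contain.
    enum Op { SELECT, FILTER, LIMIT, GROUP_BY, JOIN, ORDER_BY }

    // Operations that can be served by reading files directly, without MapReduce.
    private static final Set<Op> SIMPLE_OPS = EnumSet.of(Op.SELECT, Op.FILTER, Op.LIMIT);

    // True when every operation in the query is one of the simple ones.
    static boolean canUseFetchTask(Set<Op> queryOps) {
        return !queryOps.isEmpty() && SIMPLE_OPS.containsAll(queryOps);
    }
}
```

 Under these assumptions, a query with only a select and a limit qualifies, while anything with a join or aggregation falls back to the MR job.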





[jira] [Updated] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only

2012-08-16 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2925:


Status: Patch Available  (was: Open)

Rebased on trunk. Sorry for the late reply.

 Support non-MR fetching for simple queries with select/limit/filter 
 operations only
 ---

 Key: HIVE-2925
 URL: https://issues.apache.org/jira/browse/HIVE-2925
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-2925.1.patch.txt, HIVE-2925.2.patch.txt, 
 HIVE-2925.3.patch.txt, HIVE-2925.D2607.1.patch, HIVE-2925.D2607.2.patch, 
 HIVE-2925.D2607.3.patch, HIVE-2925.D2607.4.patch


 It's trivial but frequently requested by end users. Currently, select queries 
 with simple conditions or a limit still have to run an MR job, which takes 
 some time, especially for big tables, and irritates people.
 For such simple queries, using a fetch task would make them happy.





[jira] [Created] (HIVE-3393) get_json_object and json_tuple should use Jackson library

2012-08-16 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-3393:
---

 Summary: get_json_object and json_tuple should use Jackson library
 Key: HIVE-3393
 URL: https://issues.apache.org/jira/browse/HIVE-3393
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor


The Jackson library's JSON parsers have been shown to be significantly faster 
than json.org's.  The library is already included, so I can't think of a reason 
not to use it.

There's also the potential for further improvements in replacing many of the 
try catch blocks with if statements.
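
The second point, replacing try/catch blocks with if statements, can be illustrated with a generic lookup. The sketch below is not code from the UDFs; the class and method names are hypothetical, and it only contrasts the two styles on a plain list access.

```java
import java.util.List;

final class IndexLookupSketch {
    // Exception-driven variant: relies on the runtime to signal a bad index.
    static String getWithTryCatch(List<String> values, int index) {
        try {
            return values.get(index);
        } catch (IndexOutOfBoundsException e) {
            return null;
        }
    }

    // Check-first variant: an explicit bounds test avoids creating an exception.
    static String getWithBoundsCheck(List<String> values, int index) {
        if (index < 0 || index >= values.size()) {
            return null;
        }
        return values.get(index);
    }
}
```

Both variants return the same results; the check-first form avoids the cost of constructing an exception and its stack trace on the miss path, which can matter when misses are frequent.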





[jira] [Updated] (HIVE-3393) get_json_object and json_tuple should use Jackson library

2012-08-16 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3393:


Attachment: HIVE-3393.1.patch.txt

 get_json_object and json_tuple should use Jackson library
 -

 Key: HIVE-3393
 URL: https://issues.apache.org/jira/browse/HIVE-3393
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Attachments: HIVE-3393.1.patch.txt


 The Jackson library's JSON parsers have been shown to be significantly faster 
 than json.org's.  The library is already included, so I can't think of a 
 reason not to use it.
 There's also the potential for further improvements in replacing many of the 
 try catch blocks with if statements.





[jira] [Updated] (HIVE-3393) get_json_object and json_tuple should use Jackson library

2012-08-16 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3393:


Status: Patch Available  (was: Open)

 get_json_object and json_tuple should use Jackson library
 -

 Key: HIVE-3393
 URL: https://issues.apache.org/jira/browse/HIVE-3393
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Attachments: HIVE-3393.1.patch.txt


 The Jackson library's JSON parsers have been shown to be significantly faster
 than json.org's.  The library is already included, so I can't think of a
 reason not to use it.
 There's also the potential for further improvements in replacing many of the
 try/catch blocks with if statements.





[jira] [Commented] (HIVE-3393) get_json_object and json_tuple should use Jackson library

2012-08-16 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436466#comment-13436466
 ] 

Kevin Wilfong commented on HIVE-3393:
-

Uploaded a diff here https://reviews.facebook.net/D4701

 get_json_object and json_tuple should use Jackson library
 -

 Key: HIVE-3393
 URL: https://issues.apache.org/jira/browse/HIVE-3393
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Attachments: HIVE-3393.1.patch.txt


 The Jackson library's JSON parsers have been shown to be significantly faster
 than json.org's.  The library is already included, so I can't think of a
 reason not to use it.
 There's also the potential for further improvements in replacing many of the
 try/catch blocks with if statements.





[jira] [Updated] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x

2012-08-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3029:
-

Attachment: HIVE-3029.2.patch.txt

 Update ShimLoader to work with Hadoop 2.x
 -

 Key: HIVE-3029
 URL: https://issues.apache.org/jira/browse/HIVE-3029
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.10.0

 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, 
 HIVE-3029.D3255.1.patch








[jira] [Updated] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x

2012-08-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3029:
-

   Resolution: Fixed
Fix Version/s: 0.10.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 Update ShimLoader to work with Hadoop 2.x
 -

 Key: HIVE-3029
 URL: https://issues.apache.org/jira/browse/HIVE-3029
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.10.0

 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, 
 HIVE-3029.D3255.1.patch








[jira] [Commented] (HIVE-3389) running tests for hadoop 23

2012-08-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436493#comment-13436493
 ] 

Sushanth Sowmyan commented on HIVE-3389:


Namit, I hit a similar issue a while back and Ashutosh pointed me to the patch 
in HIVE-3029 - I tested with tst.q as you mentioned, and if I apply HIVE-3029, 
it isn't skipped.

 running tests for hadoop 23
 ---

 Key: HIVE-3389
 URL: https://issues.apache.org/jira/browse/HIVE-3389
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Sushanth Sowmyan







[jira] [Commented] (HIVE-3385) fixing 0.23 test build

2012-08-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436496#comment-13436496
 ] 

Sushanth Sowmyan commented on HIVE-3385:


As commented on HIVE-3389, that issue seems to be fixed by HIVE-3029.

 fixing 0.23 test build
 --

 Key: HIVE-3385
 URL: https://issues.apache.org/jira/browse/HIVE-3385
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
  Labels: build, test
 Fix For: 0.10.0

 Attachments: HIVE-3385.patch, HIVE-3385.patch.2


 Follow-up JIRA after HIVE-3341: we need to make the Hive tests work on 0.23.
 For starters, we need to add a jar that includes MiniMRCluster to
 build/ivy/lib/hadoop0.23.shim/. With 0.23, MiniMRCluster has moved to
 hadoop-mapreduce-client-jobclient-{$version}-tests.jar, and that needs to be
 included as an Ivy dependency.





[jira] [Commented] (HIVE-3226) ColumnPruner is not working on LateralView

2012-08-16 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436535#comment-13436535
 ] 

Navis commented on HIVE-3226:
-

added comments

 ColumnPruner is not working on LateralView
 --

 Key: HIVE-3226
 URL: https://issues.apache.org/jira/browse/HIVE-3226
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-3226.1.patch.txt, HIVE-3226.2.patch.txt


 Column pruning is not applied to the LVJ and SEL operators, which causes
 exceptions at various stages. For example,
 {noformat}
 drop table array_valued_src;
 create table array_valued_src (key string, value array<string>);
 insert overwrite table array_valued_src select key, array(value) from src;
 select sum(val) from (select a.key as key, b.value as array_val from src a 
 join array_valued_src b on a.key=b.key) i lateral view explode (array_val) c 
 as val;
 ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:157)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:_col5]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
   at 
 org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:62)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
 {noformat}





[jira] [Commented] (HIVE-3387) meta data file size exceeds limit

2012-08-16 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436544#comment-13436544
 ] 

Navis commented on HIVE-3387:
-

Configuration values set with the set command are not propagated to the
JobConf for the MR job. They are only used inside Hive.

In the case you mentioned above, the value of
mapreduce.jobtracker.split.metainfo.maxsize applied to Hadoop is 10M (the
default), which is 1/10 of what you expected. If you change mapred-site.xml,
the error should not occur.

I also think there should be a way to change properties of the JobConf, but
some permission issues would need to be settled before that.

 meta data file size exceeds limit
 -

 Key: HIVE-3387
 URL: https://issues.apache.org/jira/browse/HIVE-3387
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.1
Reporter: Alexander Alten-Lorenz
 Fix For: 0.9.1


 The cause is certainly that we use an array list instead of a set structure
 in the split locations API. Looks like a bug in Hive's CombineFileInputFormat.
 Reproduce:
 Set mapreduce.jobtracker.split.metainfo.maxsize=1 when submitting the
 Hive query. Run a big Hive query that writes data into a partitioned table.
 Due to the large number of splits, the job submitted to Hadoop fails with an
 exception that says:
 meta data size exceeds 1.
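
 To make the list-versus-set point concrete, here is a minimal sketch of how collecting split locations in a list can inflate the recorded metadata compared with a set. It is only an illustration under assumed inputs: the class, the method names and the host data are hypothetical, and this is not the code in Hive's CombineFileInputFormat or in Hadoop.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not Hive's or Hadoop's actual split code.
final class SplitLocationsSketch {

    // List-based collection: a host is stored once per block it serves,
    // so the same host name can be recorded many times for one split.
    static List<String> collectAsList(String[][] hostsPerBlock) {
        List<String> locations = new ArrayList<String>();
        for (String[] hosts : hostsPerBlock) {
            for (String host : hosts) {
                locations.add(host);
            }
        }
        return locations;
    }

    // Set-based collection: each host is stored once, in first-seen order.
    static Set<String> collectAsSet(String[][] hostsPerBlock) {
        Set<String> locations = new LinkedHashSet<String>();
        for (String[] hosts : hostsPerBlock) {
            for (String host : hosts) {
                locations.add(host);
            }
        }
        return locations;
    }
}
```

 With three blocks that each list two hosts drawn from the same three machines, the list version records six entries and the set version records three; the gap grows with the number of blocks combined into one split.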





[jira] [Assigned] (HIVE-3391) Keep the original query in HiveDriverRunHookContextImpl

2012-08-16 Thread Dawid Dabrowski (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Dabrowski reassigned HIVE-3391:
-

Assignee: Dawid Dabrowski

 Keep the original query in HiveDriverRunHookContextImpl
 ---

 Key: HIVE-3391
 URL: https://issues.apache.org/jira/browse/HIVE-3391
 Project: Hive
  Issue Type: Improvement
Reporter: Dawid Dabrowski
Assignee: Dawid Dabrowski
Priority: Minor
   Original Estimate: 72h
  Remaining Estimate: 72h

 It'd be useful to have access to the original query in hooks. The hook that's
 executed first is HiveDriverRunHook, so let's add it there.





[jira] [Commented] (HIVE-3226) ColumnPruner is not working on LateralView

2012-08-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436549#comment-13436549
 ] 

Namit Jain commented on HIVE-3226:
--

+1

Running tests

 ColumnPruner is not working on LateralView
 --

 Key: HIVE-3226
 URL: https://issues.apache.org/jira/browse/HIVE-3226
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-3226.1.patch.txt, HIVE-3226.2.patch.txt


 Column pruning is not applied to the LVJ and SEL operators, which causes
 exceptions at various stages. For example,
 {noformat}
 drop table array_valued_src;
 create table array_valued_src (key string, value array<string>);
 insert overwrite table array_valued_src select key, array(value) from src;
 select sum(val) from (select a.key as key, b.value as array_val from src a 
 join array_valued_src b on a.key=b.key) i lateral view explode (array_val) c 
 as val;
 ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:157)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:_col5]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
   at 
 org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:62)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at 
 org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
 {noformat}





[jira] [Resolved] (HIVE-3389) running tests for hadoop 23

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-3389.
--

Resolution: Fixed

duplicate of HIVE-3029

 running tests for hadoop 23
 ---

 Key: HIVE-3389
 URL: https://issues.apache.org/jira/browse/HIVE-3389
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Sushanth Sowmyan







[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x

2012-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436571#comment-13436571
 ] 

Hudson commented on HIVE-3029:
--

Integrated in Hive-trunk-h0.21 #1612 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1612/])
HIVE-3029. Update ShimLoader to work with Hadoop 2.x (Carl Steinbach via 
cws) (Revision 1374101)

 Result = SUCCESS
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1374101
Files : 
* /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java


 Update ShimLoader to work with Hadoop 2.x
 -

 Key: HIVE-3029
 URL: https://issues.apache.org/jira/browse/HIVE-3029
 Project: Hive
  Issue Type: Bug
  Components: Shims
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.10.0

 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, 
 HIVE-3029.D3255.1.patch








[jira] [Updated] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets

2012-08-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3375:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks, Kevin and Carl.

 bucketed map join should check that the number of files match the number of 
 buckets
 ---

 Key: HIVE-3375
 URL: https://issues.apache.org/jira/browse/HIVE-3375
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch


 Currently, we get an NPE if that is not the case.
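
 The kind of check the title asks for could look roughly like the sketch below, which fails early with a descriptive message instead of an NPE later on. It is an assumption-laden illustration, not the committed patch: the class name, method name, parameters and exception type are all hypothetical.

```java
import java.util.List;

// Hypothetical sketch; not the code committed for HIVE-3375.
final class BucketCountCheckSketch {

    // Throws if the number of data files does not match the declared number
    // of buckets, so the mismatch is reported before any join work starts.
    static void checkBucketFiles(String table, int numBuckets, List<String> files) {
        int numFiles = (files == null) ? 0 : files.size();
        if (numFiles != numBuckets) {
            throw new IllegalStateException(
                "Table " + table + " declares " + numBuckets
                + " buckets but has " + numFiles + " files");
        }
    }
}
```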
