[jira] Updated: (HIVE-2015) Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages

2011-03-01 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2015:
-

Component/s: Diagnosability

 Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
 

 Key: HIVE-2015
 URL: https://issues.apache.org/jira/browse/HIVE-2015
 Project: Hive
  Issue Type: Bug
  Components: Diagnosability, Metastore
Reporter: Carl Steinbach

 Every time I start up the Hive CLI with logging enabled I'm treated to the 
 following ERROR log messages courtesy of DataNucleus:
 {code}
 DEBUG metastore.ObjectStore: datanucleus.plugin.pluginRegistryBundleCheck = 
 LOG 
 ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires 
 org.eclipse.core.resources but it cannot be resolved. 
 ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires 
 org.eclipse.core.runtime but it cannot be resolved. 
 ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires 
 org.eclipse.text but it cannot be resolved.
 {code}
 Here's where this comes from:
 * The bin/hive scripts cause Hive to inherit Hadoop's classpath.
 * Hadoop's classpath includes $HADOOP_HOME/lib/core-3.1.1.jar, an Eclipse 
 library.
 * core-3.1.1.jar includes a plugin.xml file defining an OSGi plugin.
 * At startup, DataNucleus scans the classpath looking for OSGi plugins, and 
 will attempt to initialize any that it finds, including the Eclipse OSGi 
 plugins located in core-3.1.1.jar.
 * Initialization of the OSGi plugin in core-3.1.1.jar fails because of 
 unresolved dependencies.
 * We see an ERROR message telling us that DataNucleus failed to initialize a 
 plugin that we don't care about in the first place.
 I can think of two options for solving this problem:
 # Rewrite the scripts in $HIVE_HOME/bin so that they don't inherit ALL of 
 Hadoop's CLASSPATH.
 # Replace DataNucleus's NonManagedPluginRegistry with our own implementation 
 that does nothing.
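For illustration, the scan in the fourth bullet amounts to asking the classloader for every plugin.xml visible on the classpath; any jar that bundles one (such as core-3.1.1.jar) gets picked up. A minimal sketch in plain Java, not DataNucleus code:

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class PluginScan {
    // List every plugin.xml on the classpath, the same kind of lookup an
    // OSGi plugin registry performs at startup. Any jar that happens to
    // bundle a plugin.xml shows up here, wanted or not.
    public static List<URL> findPluginDescriptors() {
        try {
            return Collections.list(
                PluginScan.class.getClassLoader().getResources("plugin.xml"));
        } catch (IOException e) {
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        for (URL url : findPluginDescriptors()) {
            System.out.println("found plugin descriptor: " + url);
        }
    }
}
```

Run with $HADOOP_HOME/lib on the classpath and the Eclipse jar's descriptor appears in the list, which is exactly why DataNucleus tries (and fails) to initialize it.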

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (HIVE-2016) alter partition should throw exception if the specified partition does not exist.

2011-03-01 Thread Chinna Rao Lalam (JIRA)
alter partition should throw exception if the specified partition does not 
exist. 
--

 Key: HIVE-2016
 URL: https://issues.apache.org/jira/browse/HIVE-2016
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.8.0
 Environment: Hadoop 0.20.1, hive-0.8.0-SNAPSHOT and SUSE Linux 
Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam


To reproduce the issue follow the below steps

{noformat}
 set hive.exec.drop.ignorenonexistent=false;

 create table page_test(view INT, userid INT, page_url STRING) PARTITIONED 
BY(dt STRING, country STRING) STORED AS TEXTFILE;

 LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
PARTITION(dt='10-10-2010',country='US');

 LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
PARTITION(dt='10-12-2010',country='IN');
{noformat}

{noformat}
 ALTER TABLE page_test DROP PARTITION (dt='23-02-2010',country='UK');
{noformat}

 This query should throw an exception because the requested partition doesn't exist.

 This issue is related to HIVE-1535.
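The desired behavior can be sketched as follows; PartitionedTable and its methods are hypothetical stand-ins for illustration, not Hive's actual metastore API:

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionedTable {
    private final Map<String, String> partitions = new HashMap<>();

    public void addPartition(String spec, String location) {
        partitions.put(spec, location);
    }

    // Desired semantics: with ignoreNonExistent=false, dropping a partition
    // that was never created should fail loudly instead of silently succeeding.
    public void dropPartition(String spec, boolean ignoreNonExistent) {
        if (partitions.remove(spec) == null && !ignoreNonExistent) {
            throw new IllegalStateException("Partition does not exist: " + spec);
        }
    }
}
```

With hive.exec.drop.ignorenonexistent=false, dropping (dt='23-02-2010',country='UK') would take the throwing branch because that partition was never loaded.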






[jira] Updated: (HIVE-1976) Exception should be thrown when invalid jar,file,archive is given to add command

2011-03-01 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-1976:
---

Attachment: HIVE-1976.2.patch

 Exception should be thrown when invalid jar,file,archive is given to add 
 command
 

 Key: HIVE-1976
 URL: https://issues.apache.org/jira/browse/HIVE-1976
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.7.0
 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1976.2.patch, HIVE-1976.patch


 When the add command is executed with a non-existent jar, an exception 
 should be thrown through HiveStatement.
 Ex:
 {noformat}
   add jar /root/invalidpath/testjar.jar
 {noformat}
 Here testjar.jar does not exist, so an exception should be thrown.





[jira] Updated: (HIVE-1976) Exception should be thrown when invalid jar,file,archive is given to add command

2011-03-01 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-1976:
---

Status: Patch Available  (was: Open)

 Exception should be thrown when invalid jar,file,archive is given to add 
 command
 

 Key: HIVE-1976
 URL: https://issues.apache.org/jira/browse/HIVE-1976
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.7.0
 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1976.2.patch, HIVE-1976.patch


 When the add command is executed with a non-existent jar, an exception 
 should be thrown through HiveStatement.
 Ex:
 {noformat}
   add jar /root/invalidpath/testjar.jar
 {noformat}
 Here testjar.jar does not exist, so an exception should be thrown.





[jira] Updated: (HIVE-1959) Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.

2011-03-01 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-1959:
---

Affects Version/s: (was: 0.5.0)
   0.8.0
   Status: Patch Available  (was: Open)

 Potential memory leak when same connection used for long time. TaskInfo and 
 QueryInfo objects are getting accumulated on executing more queries on the 
 same connection.
 ---

 Key: HIVE-1959
 URL: https://issues.apache.org/jira/browse/HIVE-1959
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Affects Versions: 0.8.0
 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1959.patch


 *org.apache.hadoop.hive.ql.history.HiveHistory$TaskInfo* and 
 *org.apache.hadoop.hive.ql.history.HiveHistory$QueryInfo* objects 
 accumulate as more queries are executed on the same connection. These 
 objects are released only when the connection is closed.





[jira] Commented: (HIVE-1959) Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.

2011-03-01 Thread MIS (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13000970#comment-13000970
 ] 

MIS commented on HIVE-1959:
---

How about using a WeakHashMap in place of a HashMap, instead of explicitly 
removing entries from the map? A WeakHashMap could be used for both fields, 
queryInfoMap and taskInfoMap, of the HiveHistory class.
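As a minimal illustration of the suggestion (not the HiveHistory code itself): a WeakHashMap keeps an entry only while its key is strongly referenced elsewhere, so per-query info objects become collectable once their key is released, with no explicit remove():

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakMapDemo {
    // While a strong reference to the key is held, the entry must survive
    // a GC; this part is guaranteed by the WeakHashMap contract.
    public static boolean retainedWhileReferenced() {
        Map<Object, String> m = new WeakHashMap<>();
        Object key = new Object();   // strong reference held for the whole call
        m.put(key, "query state");
        System.gc();
        return m.containsKey(key);
    }

    // Put a value under a key we immediately stop referencing, then nudge
    // the GC; returns how many entries remain. Collection timing is up to
    // the JVM, so this is best-effort, not guaranteed on every run.
    public static int entriesAfterGc() {
        Map<Object, String> taskInfo = new WeakHashMap<>();
        taskInfo.put(new Object(), "task state");
        for (int i = 0; i < 50 && !taskInfo.isEmpty(); i++) {
            System.gc();
            try { Thread.sleep(10); } catch (InterruptedException e) { break; }
        }
        return taskInfo.size();
    }

    public static void main(String[] args) {
        System.out.println("retained while referenced: " + retainedWhileReferenced());
        System.out.println("entries left after GC: " + entriesAfterGc());
    }
}
```

One caveat for the real fix: the HiveHistory maps are keyed by query/task id, and whether those keys lose their last strong reference at the right time is exactly what determines if this works.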

 Potential memory leak when same connection used for long time. TaskInfo and 
 QueryInfo objects are getting accumulated on executing more queries on the 
 same connection.
 ---

 Key: HIVE-1959
 URL: https://issues.apache.org/jira/browse/HIVE-1959
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Affects Versions: 0.8.0
 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1959.patch


 *org.apache.hadoop.hive.ql.history.HiveHistory$TaskInfo* and 
 *org.apache.hadoop.hive.ql.history.HiveHistory$QueryInfo* objects 
 accumulate as more queries are executed on the same connection. These 
 objects are released only when the connection is closed.





Build failed in Hudson: Hive-trunk-h0.20 #587

2011-03-01 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/587/

--
[...truncated 27274 lines...]
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-11-07_029_7539438746644458276/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] 2011-03-01 11:11:10,082 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-11-07_029_7539438746644458276/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_20110301_261577518.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-11-11_547_8408294535313140244/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-11-11_547_8408294535313140244/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 

Build failed in Hudson: Hive-0.7.0-h0.20 #22

2011-03-01 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/22/

--
[...truncated 27337 lines...]
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201103011151_895859862.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-51-49_942_3188772478214189372/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] 2011-03-01 11:51:53,015 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-51-49_942_3188772478214189372/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201103011151_2046568759.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-51-54_638_6592658643303351152/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-03-01_11-51-54_638_6592658643303351152/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table 

[jira] Commented: (HIVE-1959) Potential memory leak when same connection used for long time. TaskInfo and QueryInfo objects are getting accumulated on executing more queries on the same connection.

2011-03-01 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001085#comment-13001085
 ] 

Ning Zhang commented on HIVE-1959:
--

+1 on WeakHashMap. HiveHistory.endQuery() is called in the finally clause of 
Driver.execute(). There are some early exits in the loop of execute() (which 
should be changed, in my opinion; I will file another JIRA to fix this) that 
bypass the finally clause. Using a WeakHashMap can prevent leaks in this case.

 Potential memory leak when same connection used for long time. TaskInfo and 
 QueryInfo objects are getting accumulated on executing more queries on the 
 same connection.
 ---

 Key: HIVE-1959
 URL: https://issues.apache.org/jira/browse/HIVE-1959
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Affects Versions: 0.8.0
 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1959.patch


 *org.apache.hadoop.hive.ql.history.HiveHistory$TaskInfo* and 
 *org.apache.hadoop.hive.ql.history.HiveHistory$QueryInfo* objects 
 accumulate as more queries are executed on the same connection. These 
 objects are released only when the connection is closed.





[jira] Commented: (HIVE-1095) Hive in Maven

2011-03-01 Thread Andreas Neumann (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001115#comment-13001115
 ] 

Andreas Neumann commented on HIVE-1095:
---

Hi, I am working on Oozie (github.com/yahoo/oozie), and there is high demand 
for Hive support in Oozie. Currently that is problematic because there is no 
Hive release in the Apache Maven repository. Because Oozie is mavenized, we 
would need Maven artifacts to be available, preferably in the Apache 
repository. Any chance this could be resolved soon? Thanks, Andreas.


 Hive in Maven
 -

 Key: HIVE-1095
 URL: https://issues.apache.org/jira/browse/HIVE-1095
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
 Attachments: HIVE-1095-trunk.patch, hiveReleasedToMaven.tar.gz


 Getting hive into maven main repositories
 Documentation on how to do this is on:
 http://maven.apache.org/guides/mini/guide-central-repository-upload.html





[jira] Commented: (HIVE-2016) alter partition should throw exception if the specified partition does not exist.

2011-03-01 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001129#comment-13001129
 ] 

Ning Zhang commented on HIVE-2016:
--

Currently the CLI prints an error message to Session.err. Throwing an 
exception may cause a backward compatibility issue. Is this issue mostly 
related to usage through the Hive Thrift server? If so, we need to figure out 
a way to pass Session.err from HiveServer to the client side.

 alter partition should throw exception if the specified partition does not 
 exist. 
 --

 Key: HIVE-2016
 URL: https://issues.apache.org/jira/browse/HIVE-2016
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.8.0
 Environment: Hadoop 0.20.1, hive-0.8.0-SNAPSHOT and SUSE Linux 
 Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam

 To reproduce the issue follow the below steps
 {noformat}
  set hive.exec.drop.ignorenonexistent=false;
  create table page_test(view INT, userid INT, page_url STRING) PARTITIONED 
 BY(dt STRING, country STRING) STORED AS TEXTFILE;
  LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
 PARTITION(dt='10-10-2010',country='US');
  LOAD DATA LOCAL INPATH '/home/test.txt' OVERWRITE INTO TABLE page_test 
 PARTITION(dt='10-12-2010',country='IN');
 {noformat}
 {noformat}
  ALTER TABLE page_test DROP PARTITION (dt='23-02-2010',country='UK');
 {noformat}
  This query should throw an exception because the requested partition 
 doesn't exist.
  This issue is related to HIVE-1535.





[jira] Created: (HIVE-2018) avoid loading Hive aux jars in CLI remote mode

2011-03-01 Thread Ning Zhang (JIRA)
avoid loading Hive aux jars in CLI remote mode
--

 Key: HIVE-2018
 URL: https://issues.apache.org/jira/browse/HIVE-2018
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang


The CLI loads a number of jars (aux jars) including serde, antlr, metastore, 
etc. These jars can be large and take time to load when they are deployed on 
heavily loaded NFS mount points. In CLI remote mode, none of these jars are 
needed on the client side.





[jira] Updated: (HIVE-2018) avoid loading Hive aux jars in CLI remote mode

2011-03-01 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2018:
-

Attachment: HIVE-2018.patch

A simple patch that moves the jar-loading code to after the check of whether 
the CLI session is in local mode.

 avoid loading Hive aux jars in CLI remote mode
 --

 Key: HIVE-2018
 URL: https://issues.apache.org/jira/browse/HIVE-2018
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2018.patch


 The CLI loads a number of jars (aux jars) including serde, antlr, metastore, 
 etc. These jars can be large and take time to load when they are deployed on 
 heavily loaded NFS mount points. In CLI remote mode, none of these jars are 
 needed on the client side.





[jira] Updated: (HIVE-2018) avoid loading Hive aux jars in CLI remote mode

2011-03-01 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2018:
-

Status: Patch Available  (was: Open)

 avoid loading Hive aux jars in CLI remote mode
 --

 Key: HIVE-2018
 URL: https://issues.apache.org/jira/browse/HIVE-2018
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2018.patch


 The CLI loads a number of jars (aux jars) including serde, antlr, metastore, 
 etc. These jars can be large and take time to load when they are deployed on 
 heavily loaded NFS mount points. In CLI remote mode, none of these jars are 
 needed on the client side.





[jira] Created: (HIVE-2019) Implement NOW() UDF

2011-03-01 Thread Carl Steinbach (JIRA)
Implement NOW() UDF
---

 Key: HIVE-2019
 URL: https://issues.apache.org/jira/browse/HIVE-2019
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach


Reference: 
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_now
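For reference, MySQL's NOW() returns the time at which the statement began executing, so every row of one query sees the same value. Ignoring Hive's UDF plumbing (a real implementation would plug into Hive's UDF classes), the core semantics might look like this sketch; the class name is illustrative:

```java
import java.sql.Timestamp;

public class NowUdfSketch {
    // Captured once when the "query" starts, so every row sees the same
    // value, mirroring MySQL's per-statement NOW() semantics.
    private final Timestamp queryStart = new Timestamp(System.currentTimeMillis());

    // Called once per row during evaluation; always returns the cached start time.
    public Timestamp evaluate() {
        return queryStart;
    }

    public static void main(String[] args) {
        NowUdfSketch now = new NowUdfSketch();
        // Two "rows" of the same query observe an identical timestamp.
        System.out.println(now.evaluate().equals(now.evaluate()));
    }
}
```

Whether Hive should return per-statement or per-row time is a design choice the ticket would need to settle; the sketch assumes the MySQL behavior.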






[jira] Commented: (HIVE-1644) use filter pushdown for automatically accessing indexes

2011-03-01 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001208#comment-13001208
 ] 

He Yongqiang commented on HIVE-1644:


Did a quick look at HIVE-1644.4.patch itself. 

Some comments:
1) Add a testcase for CombineHiveInputFormat.
2) In the new testcase, the newly added conf hive.optimize.autoindex is not 
used?
3) I think there is already an API in Hive.java for getting all indexes on a 
table, no? Please double check. If not, rename getIndexesOnTable to getIndexes.
4) In GenMRTableScan1.java, it is not good to hardcode the input format name. 
Why not just use indexClassName?
5) In ExecDriver.java, it is also not good to hardcode the conf name 
hive.index.compact.file, because a bitmap index may want to use a different 
name. So maybe this work should be passed to some index-type-specific class.
6) In generateIndexQuery, the temp directory is not random, so it could 
conflict with others (in the same query), and the dir path should not be 
generated there; it should be generated in the optimizer, which can have 
global control. Also, I think insert overwrite directory 'full_path_to_a_dir' 
select ... would fail if full_path_to_a_dir does not exist (or its parent 
does not exist). Please check here.
7) In generateIndexQuery, what is this used for?
+ParseContext indexQueryPctx = 
RewriteParseContextGenerator.generateOperatorTree(pctx.getConf(), qlCommand);


Also, today the index optimizer runs before the task tree is broken up, so 
the index scan task is generated before the task for the original table scan, 
and it is very hard to hook them together. The only way I can think of is to 
remember the op id of the original table scan and do another pass to hook 
them together after breaking up the task tree, but I think that is too hacky.

Maybe a better way is to do this in the physical optimizer. In the physical 
optimizer, Hive presents a task tree, and the optimizer can go through each 
task and do the same thing (since each task has the same operator tree). 
Managing the task dependency would be much easier, and I think most of the 
code would be the same. For complex queries this approach would also be 
cleaner.


 use filter pushdown for automatically accessing indexes
 ---

 Key: HIVE-1644
 URL: https://issues.apache.org/jira/browse/HIVE-1644
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Russell Melick
 Attachments: HIVE-1644.1.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, 
 HIVE-1644.4.patch


 HIVE-1226 provides utilities for analyzing filters which have been pushed 
 down to a table scan.  The next step is to use these for selecting available 
 indexes and generating access plans for those indexes.





[jira] Commented: (HIVE-2018) avoid loading Hive aux jars in CLI remote mode

2011-03-01 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001213#comment-13001213
 ] 

He Yongqiang commented on HIVE-2018:


+1, will commit after tests pass.

 avoid loading Hive aux jars in CLI remote mode
 --

 Key: HIVE-2018
 URL: https://issues.apache.org/jira/browse/HIVE-2018
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2018.patch


 The CLI loads a number of jars (aux jars) including serde, antlr, metastore, 
 etc. These jars can be large and take time to load when they are deployed on 
 heavily loaded NFS mount points. In CLI remote mode, none of these jars are 
 needed on the client side.





[jira] Created: (HIVE-2020) Create a separate namespace for Hive variables

2011-03-01 Thread Carl Steinbach (JIRA)
Create a separate namespace for Hive variables
--

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach


Support for variable substitution was added in HIVE-1096. However, variable 
substitution was implemented by reusing the HiveConf namespace, so there is no 
separation between Hive configuration properties and Hive variables.

This ticket encompasses the following enhancements:
* Create a separate namespace for managing Hive variables.
* Add support for setting variables on the command line via '-hivevar x=y'
* Add support for setting variables through the CLI via 'var x=y'
* Provide a means for differentiating between hiveconf, hivevar, system, and 
environment properties in the output of 'set -v'
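A separate namespace could be modeled as prefix-based lookup over distinct maps, so that hiveconf:x and hivevar:x can hold different values. A rough sketch with illustrative names and an assumed unprefixed fallback order, not the eventual implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class VariableNamespaces {
    private final Map<String, String> hiveConf = new HashMap<>();
    private final Map<String, String> hiveVar = new HashMap<>();

    public void setConf(String key, String value) { hiveConf.put(key, value); }
    public void setVar(String key, String value)  { hiveVar.put(key, value); }

    // Resolve "hiveconf:x" / "hivevar:x" against the matching namespace;
    // with no prefix, fall back to hivevar first, then hiveconf (an
    // assumption here, not a decided resolution order).
    public String lookup(String ref) {
        if (ref.startsWith("hiveconf:")) return hiveConf.get(ref.substring(9));
        if (ref.startsWith("hivevar:"))  return hiveVar.get(ref.substring(8));
        String v = hiveVar.get(ref);
        return v != null ? v : hiveConf.get(ref);
    }
}
```

The same prefixes would then let 'set -v' label each entry with the namespace it came from.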







[jira] Commented: (HIVE-1644) use filter pushdown for automatically accessing indexes

2011-03-01 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13001218#comment-13001218
 ] 

John Sichi commented on HIVE-1644:
--

Yongqiang, could you point to where exactly in the physical optimizer code 
you're thinking of?

Also, do you mean moving the entire index optimization there, or only the 
part about creating the task dependency?


 use filter pushdown for automatically accessing indexes
 ---

 Key: HIVE-1644
 URL: https://issues.apache.org/jira/browse/HIVE-1644
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Russell Melick
 Attachments: HIVE-1644.1.patch, HIVE-1644.2.patch, HIVE-1644.3.patch, 
 HIVE-1644.4.patch


 HIVE-1226 provides utilities for analyzing filters which have been pushed 
 down to a table scan.  The next step is to use these for selecting available 
 indexes and generating access plans for those indexes.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (HIVE-2021) Add a configuration property that sets the variable substitution max depth

2011-03-01 Thread Carl Steinbach (JIRA)
Add a configuration property that sets the variable substitution max depth
--

 Key: HIVE-2021
 URL: https://issues.apache.org/jira/browse/HIVE-2021
 Project: Hive
  Issue Type: Improvement
  Components: Configuration, Query Processor
Reporter: Carl Steinbach


The VariableSubstitution class contains a hardcoded MAX_SUBST=40 value which 
defines the maximum number of variable references that are allowed to appear in 
a single Hive statement. This value should be configurable via hiveconf.
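A hedged sketch of what the proposed knob could look like (the property name hive.variable.substitute.depth is hypothetical; the ticket only asks that the hardcoded limit become configurable via hiveconf):

```shell
# Hypothetical property name shown for illustration; the actual name is
# whatever the eventual patch introduces.
hive --hiveconf hive.variable.substitute.depth=100 \
     -f deeply_substituted_query.hql
```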

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1095) Hive in Maven

2011-03-01 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001262#comment-13001262
 ] 

Carl Steinbach commented on HIVE-1095:
--

Relevant ASF documentation:
http://www.apache.org/dev/repository-faq.html
http://www.apache.org/dev/publishing-maven-artifacts.html


 Hive in Maven
 -

 Key: HIVE-1095
 URL: https://issues.apache.org/jira/browse/HIVE-1095
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
 Attachments: HIVE-1095-trunk.patch, hiveReleasedToMaven.tar.gz


 Getting Hive into the main Maven repositories.
 Documentation on how to do this is at:
 http://maven.apache.org/guides/mini/guide-central-repository-upload.html

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (HIVE-2022) Making JDO thread-safe by default

2011-03-01 Thread Ning Zhang (JIRA)
Making JDO thread-safe by default
-

 Key: HIVE-2022
 URL: https://issues.apache.org/jira/browse/HIVE-2022
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang


If multiple threads access the metastore concurrently, there are cases where 
JDO throws exceptions because of concurrent access to a HashMap inside JDO. 
Setting javax.jdo.option.Multithreaded to true solves this issue. 
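Until the patch makes this the default, the setting can be applied as a configuration override; a sketch (assuming the metastore passes this standard JDO property through to DataNucleus):

```shell
# Per-invocation override of the JDO multithreading property.
hive --hiveconf javax.jdo.option.Multithreaded=true

# Or persistently, in hive-site.xml:
#   <property>
#     <name>javax.jdo.option.Multithreaded</name>
#     <value>true</value>
#   </property>
```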

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-2022) Making JDO thread-safe by default

2011-03-01 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2022:
-

Attachment: HIVE-2022.patch

 Making JDO thread-safe by default
 -

 Key: HIVE-2022
 URL: https://issues.apache.org/jira/browse/HIVE-2022
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2022.patch


 If multiple threads access the metastore concurrently, there are cases where 
 JDO throws exceptions because of concurrent access to a HashMap inside JDO. 
 Setting javax.jdo.option.Multithreaded to true solves this issue. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1941) support explicit view partitioning

2011-03-01 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001281#comment-13001281
 ] 

Paul Yang commented on HIVE-1941:
-

+1 tests passed

 support explicit view partitioning
 --

 Key: HIVE-1941
 URL: https://issues.apache.org/jira/browse/HIVE-1941
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.8.0

 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, 
 HIVE-1941.4.patch, HIVE-1941.5.patch


 Allow creation of a view with an explicit partitioning definition, and 
 support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
 For more information, see
 http://wiki.apache.org/hadoop/Hive/PartitionedViews
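Per the wiki page above, the feature is expected to work roughly like this (a sketch: the view, columns, and partition values are made up, and the final syntax is defined by the committed patch):

```shell
# Sketch of partitioned-view DDL with illustrative names and values.
hive -e "
  CREATE VIEW page_view_v PARTITIONED ON (ds)
  AS SELECT key, value, ds FROM page_view;

  -- instantiate and remove view partitions explicitly:
  ALTER VIEW page_view_v ADD PARTITION (ds='2011-03-01');
  ALTER VIEW page_view_v DROP PARTITION (ds='2011-03-01');
"
```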

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1941) support explicit view partitioning

2011-03-01 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-1941:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks John!

 support explicit view partitioning
 --

 Key: HIVE-1941
 URL: https://issues.apache.org/jira/browse/HIVE-1941
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.8.0

 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, 
 HIVE-1941.4.patch, HIVE-1941.5.patch


 Allow creation of a view with an explicit partitioning definition, and 
 support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
 For more information, see
 http://wiki.apache.org/hadoop/Hive/PartitionedViews

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-2022) Making JDO thread-safe by default

2011-03-01 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001283#comment-13001283
 ] 

Paul Yang commented on HIVE-2022:
-

+1 Will commit once tests pass.

 Making JDO thread-safe by default
 -

 Key: HIVE-2022
 URL: https://issues.apache.org/jira/browse/HIVE-2022
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2022.patch


 If multiple threads access the metastore concurrently, there are cases where 
 JDO throws exceptions because of concurrent access to a HashMap inside JDO. 
 Setting javax.jdo.option.Multithreaded to true solves this issue. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira