[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995211#comment-12995211
 ] 

Namit Jain commented on HIVE-1517:
--

Ideally, /PREFIX/db should be created as part of locking. You don't need to 
explicitly lock the db.
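A minimal sketch of what "created as part of locking" could mean (a plain-set model, not Hive's actual ZooKeeper code; the function name `acquire_lock` is hypothetical): taking a lock on /PREFIX/db/table implicitly creates any missing ancestor nodes, so no separate call is needed for the database node.

```python
# Model the lock-node store as a set of paths; in Hive these would be
# ZooKeeper znodes. Acquiring a lock creates every missing ancestor.
def acquire_lock(existing_nodes, lock_path):
    parts = lock_path.strip("/").split("/")
    for i in range(1, len(parts) + 1):
        existing_nodes.add("/" + "/".join(parts[:i]))  # ensure each ancestor
    return lock_path

nodes = set()
acquire_lock(nodes, "/PREFIX/db1/foo")
print(sorted(nodes))  # ['/PREFIX', '/PREFIX/db1', '/PREFIX/db1/foo']
```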

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
 HIVE-1517.3.patch, HIVE-1517.4.patch


 After https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For example:
 use db1;
 create table foo();
 use db2;
 select .. from db1.foo;
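The schema-qualified `db.table` naming the issue asks for can be illustrated with sqlite3 standing in for Hive (SQLite's ATTACH gives the same db-qualified reference; this is an analogy, not Hive's implementation):

```python
import sqlite3

# The in-memory connection plays the role of "db2", the current database;
# a second attached database plays "db1", which we query without switching.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS db1")
conn.execute("CREATE TABLE db1.foo (key INTEGER, value TEXT)")
conn.execute("INSERT INTO db1.foo VALUES (1, 'a'), (2, 'b')")

# The qualified name db1.foo resolves even though db1 is not current.
rows = conn.execute("SELECT key, value FROM db1.foo ORDER BY key").fetchall()
print(rows)  # [(1, 'a'), (2, 'b')]
```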

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995213#comment-12995213
 ] 

Namit Jain commented on HIVE-1517:
--

Sorry, I looked at the code again - your locking changes are good. I will review 
the remaining patch.






[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995242#comment-12995242
 ] 

Namit Jain commented on HIVE-1517:
--

The code changes look good, but I am getting a lot of errors while running the 
tests.
Can you run the tests again?






[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995245#comment-12995245
 ] 

Namit Jain commented on HIVE-1517:
--

archive.q is the first test that fails - it works fine when run standalone. 
I haven't debugged further.






Some queries re locking

2011-02-16 Thread Krishna Kumar
Hello,

While looking into some of the tangential issues encountered while doing
the export/import related work, I have some questions:

1. Should CREATE TABLE lock (shared) the database? I think so from the
discussions, but I do not think it happens now.

2. Similarly, LOAD should also lock (exclusive) the table/partition by
adding the table/partition to the outputs.

3. While trying a fix for the above, I ran into another issue. IIUC, the
Test[Negative]CliDriver templates start the zkcluster via the QTestUtil ctor,
but this is immediately shut down via a cleanup-teardown call, so most of the
create/loads in createSources run without a zookeeper server, and any attempt
to lock errors out. Is this intended?

Cheers
 Krishna



Build failed in Hudson: Hive-trunk-h0.20 #560

2011-02-16 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/560/changes

Changes:

[nzhang] HIVE-1995. Mismatched open/commit transaction calls when using 
get_partition() (Paul Yang via Ning Zhang)

--
[...truncated 25917 lines...]
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_03-42-53_387_1663455750151725481/-mr-1
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201102160342_238414354.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_03-42-56_693_5493969927981102183/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-02-16 03:42:59,413 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_03-42-56_693_5493969927981102183/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_03-42-59_579_4313374442578404480/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-02-16 03:43:02,258 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_03-42-59_579_4313374442578404480/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.6 sec
[junit] Testcase: testNonHiveCommand took 0.889 sec
[junit] Testcase: testMetastore took 0.247 sec
[junit] Testcase: testGetClusterStatus took 0.103 sec
[junit] Testcase: testFetch took 8.715 sec
[junit] Testcase: testDynamicSerde took 7.133 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ant/build.xml:40:
 

Build failed in Hudson: Hive-0.7.0-h0.20 #3

2011-02-16 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/3/

--
[...truncated 7392 lines...]

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve:
:: loading settings :: file = 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ivy/ivysettings.xml

ivy-retrieve:

compile:
 [echo] Compiling: hive
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/serde/build.xml:52:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

compile-test:
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build-common.xml:317:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 20 source files to 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/serde/src/test/org/apache/hadoop/hive/serde2/dynamic_type/TestDynamicSerDe.java
 uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build-common.xml:330:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ant/build.xml:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deploy-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ant/build.xml:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

jar:

init:

core-compile:
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/service/build.xml:59:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

compile:

compile-test:
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build-common.xml:317:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 1 source file to 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/test/classes
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build-common.xml:330:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

test:

test-shims:

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ant/build.xml:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deploy-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ant/build.xml:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

jar:

init:

compile:

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-retrieve-hadoop-source:
:: loading settings :: file = 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/ivy/ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: 
org.apache.hadoop.hive#shims;work...@vesta.apache.org
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  found hadoop#core;0.20.0 in hadoop-source
[ivy:retrieve]  found hadoop#core;0.20.3-CDH3-SNAPSHOT in hadoop-source
[ivy:retrieve] :: resolution report :: resolve 1721ms :: artifacts dl 1ms
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------

[jira] Commented: (HIVE-1817) Remove Hive dependency on unreleased commons-cli 2.0 Snapshot

2011-02-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995379#comment-12995379
 ] 

John Sichi commented on HIVE-1817:
--

Oops, you're right, my environment was messed up.  I'll do some more testing 
and then commit.

 Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
 -

 Key: HIVE-1817
 URL: https://issues.apache.org/jira/browse/HIVE-1817
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure, CLI
Reporter: Carl Steinbach
Assignee: Carl Steinbach
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1817.2.patch.txt, HIVE-1817.3.patch.txt, 
 HIVE-1817.4.patch.txt, HIVE-1817.wip.1.patch.txt


 The Hive CLI depends on commons-cli-2.0-SNAPSHOT. This branch of the 
 commons-cli project is dead.
 Hive needs to use commons-cli-1.2 instead. See MAPREDUCE-767 for more 
 information.





[jira] Updated: (HIVE-1928) GRANT/REVOKE should handle privileges as tokens, not identifiers

2011-02-16 Thread Jonathan Natkins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Natkins updated HIVE-1928:
---

Assignee: Jonathan Natkins
  Status: Patch Available  (was: Open)

 GRANT/REVOKE should handle privileges as tokens, not identifiers
 

 Key: HIVE-1928
 URL: https://issues.apache.org/jira/browse/HIVE-1928
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Security
Affects Versions: 0.7.0
Reporter: Carl Steinbach
Assignee: Jonathan Natkins
Priority: Critical
 Attachments: HIVE-1928.1.patch


 The grammar for the GRANT and REVOKE privileges statements currently handles 
 the list of privileges as a list of identifiers. Since most of the privileges 
 are also keywords in the HQL grammar, this requires users to individually 
 quote-escape each of the privileges, e.g.:
 {code}
 grant `Create` on table authorization_part to user hive_test_user;
 grant `Update` on table authorization_part to user hive_test_user;
 grant `Drop` on table authorization_part to user hive_test_user;
 grant `select` on table src to user hive_test_user;
 {code}
 Both MySQL and the SQL standard treat privileges as tokens. Hive should do 
 the same.
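The token-vs-identifier distinction can be sketched with a toy parser (hypothetical names throughout; Hive's real grammar is ANTLR-based): when privilege names are matched case-insensitively as tokens, `grant Create on table ...` parses without backquotes.

```python
import re

# A toy GRANT parser. Privilege names are recognized directly as tokens,
# so users need not quote-escape keywords like CREATE or DROP.
PRIVILEGE_TOKENS = {"ALL", "ALTER", "CREATE", "DROP", "INDEX",
                    "LOCK", "SELECT", "UPDATE"}

def parse_grant(statement):
    """Parse 'grant <priv> on table <tbl> to user <usr>;' into its parts."""
    m = re.match(r"grant\s+(\w+)\s+on\s+table\s+(\w+)\s+to\s+user\s+(\w+)\s*;?",
                 statement, re.IGNORECASE)
    if not m:
        raise ValueError("not a GRANT statement: " + statement)
    priv = m.group(1).upper()            # case-insensitive token match
    if priv not in PRIVILEGE_TOKENS:
        raise ValueError("unknown privilege: " + priv)
    return priv, m.group(2), m.group(3)

print(parse_grant("grant Create on table authorization_part to user hive_test_user;"))
```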





[jira] Updated: (HIVE-1928) GRANT/REVOKE should handle privileges as tokens, not identifiers

2011-02-16 Thread Jonathan Natkins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Natkins updated HIVE-1928:
---

Attachment: HIVE-1928.1.patch

https://reviews.apache.org/r/427/






cli dependency change

2011-02-16 Thread John Sichi
After you svn up the commit for HIVE-1817 and do ant clean package, you may get 
errors like the ones below.  If so, it means you were relying on the Hadoop 
0.20.0 tarball we use for Hive development, and you need to update your 
$HADOOP_HOME to point to

hive-trunk/build/hadoopcore/hadoop-0.20.1 instead of 0.20.0

Errors:

Cannot find hadoop installation: $HADOOP_HOME must be set or hadoop must be in 
the path

Hive requires Hadoop 0.20.x (x >= 1).
'hadoop version' returned:
Hadoop 0.20.0 Subversion 
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504 
Compiled by ndaley on Thu Apr 9 05:18:40 UTC 2009

JVS
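The version requirement above can be expressed as a small check (a sketch; `hadoop_ok` is a hypothetical name, not the actual script's logic): Hive wants Hadoop 0.20.x with x >= 1, so 0.20.0 is rejected and 0.20.1 accepted.

```python
import re

# Parse the first line of `hadoop version` output and apply the 0.20.x
# (x >= 1) requirement described in the error message.
def hadoop_ok(version_line):
    m = re.match(r"Hadoop (\d+)\.(\d+)\.(\d+)", version_line)
    if not m:
        return False
    major, minor, patch = map(int, m.groups())
    return (major, minor) == (0, 20) and patch >= 1

print(hadoop_ok("Hadoop 0.20.0 Subversion ..."))  # False
print(hadoop_ok("Hadoop 0.20.1"))                 # True
```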



[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995440#comment-12995440
 ] 

Siying Dong commented on HIVE-1517:
---

I applied that patch to a clean directory and I am running the tests. They are 
still running, but archive.q has already passed. Maybe try doing a clean build before running the tests?






[jira] Updated: (HIVE-1980) Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts

2011-02-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1980:
-

Attachment: HIVE-1980.patch

 Merging using mapreduce rather than map-only job failed in case of dynamic 
 partition inserts
 

 Key: HIVE-1980
 URL: https://issues.apache.org/jira/browse/HIVE-1980
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1980.patch


 In a dynamic partition insert, if merge is set to true and 
 hive.mergejob.maponly=false, the merge MapReduce job will fail. 





[jira] Updated: (HIVE-1980) Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts

2011-02-16 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1980:
-

Status: Patch Available  (was: Open)






Build failed in Hudson: Hive-trunk-h0.20 #561

2011-02-16 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/561/

--
[...truncated 25918 lines...]
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_11-51-45_385_2695947955312831206/-mr-1
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201102161151_1116181896.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_11-51-48_822_3415836056332837519/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-02-16 11:51:51,484 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_11-51-48_822_3415836056332837519/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_11-51-51_651_5928746621248243453/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-02-16 11:51:54,351 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_11-51-51_651_5928746621248243453/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.417 sec
[junit] Testcase: testNonHiveCommand took 0.99 sec
[junit] Testcase: testMetastore took 0.278 sec
[junit] Testcase: testGetClusterStatus took 0.105 sec
[junit] Testcase: testFetch took 9.653 sec
[junit] Testcase: testDynamicSerde took 7.561 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ant/build.xml:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deploy-ant-tasks:


[jira] Updated: (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name

2011-02-16 Thread Kirk True (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated HIVE-1996:


Description: 
Steps:

1. From the command line copy the kv2.txt data file into the current user's 
HDFS directory:

{{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}}

2. In Hive, create the table:

{{create table tst_src1 (key_ int, value_ string);}}

3. Load the data into the table from HDFS:

{{load data inpath './kv2.txt' into table tst_src1;}}

4. Repeat step 1
5. Repeat step 3

Expected:

To have kv2.txt renamed in HDFS and then copied to the destination as per 
HIVE-307.

Actual:

File is renamed, but Hive.copyFiles doesn't see the change in srcs as it 
continues to use the same array elements (with the un-renamed, old file names). 
It crashes with this error:

{noformat}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}

  was:
Steps:

1. From the command line copy the kv2.txt data file into the current user's 
HDFS directory:

{{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
kv2.txt}}

2. In Hive, create the table:

{{create table tst_src1 (key_ int, value_ string);}}

3. Load the data into the table from HDFS:

{{load data inpath './kv2.txt' into table tst_src1;}}

4. Repeat step 1
5. Repeat step 3

Expected:

To have kv2.txt renamed in HDFS and then copied to the destination as per 
HIVE-307.

Actual:

File is renamed, but Hive.copyFiles doesn't see the change in srcs as it 
continues to use the same array elements (with the un-renamed, old file names). 
It crashes with this error:

{{java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
}}
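The rename-on-collision behavior the report expects (per HIVE-307) can be sketched as follows. This is an illustrative model, not Hive's `copyFiles` code; `unique_dest` and the `_copy_N` suffix are hypothetical. The bug is that after such a rename, the old array elements still held the pre-rename names; the fix is to carry the renamed path forward.

```python
import os
import tempfile

# Pick a destination name that does not collide with an existing file,
# appending a numeric suffix when needed.
def unique_dest(dst_dir, name):
    base, ext = os.path.splitext(name)
    candidate, i = name, 0
    while os.path.exists(os.path.join(dst_dir, candidate)):
        i += 1
        candidate = "%s_copy_%d%s" % (base, i, ext)
    return os.path.join(dst_dir, candidate)

dst = tempfile.mkdtemp()
open(os.path.join(dst, "kv2.txt"), "w").close()  # first load already present
target = unique_dest(dst, "kv2.txt")             # second load gets a new name
print(os.path.basename(target))  # kv2_copy_1.txt
```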


 LOAD DATA INPATH fails when the table already contains a file of the same 
 name
 

 Key: HIVE-1996
 URL: https://issues.apache.org/jira/browse/HIVE-1996
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Kirk True
Assignee: Kirk True

 Steps:
 1. From the command line copy the kv2.txt data file into the current user's 
 HDFS directory:
 {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
 kv2.txt}}
 2. In Hive, create the table:
 {{create table tst_src1 (key_ int, value_ string);}}
 3. Load the data into the table from HDFS:
 {{load data inpath './kv2.txt' into table tst_src1;}}
 4. Repeat step 1
 5. Repeat step 3
 Expected:
 To have kv2.txt 

[jira] Commented: (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name

2011-02-16 Thread Kirk True (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995499#comment-12995499
 ] 

Kirk True commented on HIVE-1996:
-

This is very closely related to, but not the same as, HIVE-307. That bug 
specifically pertains to {{LOCAL}} files.
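The failure mode in the description below — collision files get renamed in HDFS, but copyFiles keeps iterating the pre-rename source listing — is essentially a stale-directory-listing bug. A small illustrative sketch in Python (hypothetical helper names, not actual Hive code):

```python
import os

def rename_collisions_stale(srcdir):
    """Buggy pattern: take a listing, rename each entry, but keep
    returning the old names -- analogous to reusing the same srcs
    array elements after the rename."""
    old_names = []
    for name in os.listdir(srcdir):
        os.rename(os.path.join(srcdir, name),
                  os.path.join(srcdir, name + "_copy_1"))
        old_names.append(name)        # BUG: records the pre-rename name
    return old_names                  # these paths no longer exist

def rename_collisions_fixed(srcdir):
    """Fixed pattern: track the post-rename names instead."""
    new_names = []
    for name in os.listdir(srcdir):
        new = name + "_copy_1"
        os.rename(os.path.join(srcdir, name), os.path.join(srcdir, new))
        new_names.append(new)
    return new_names
```

Downstream code that consumes the stale names then dereferences paths that no longer exist, which matches the NPE in copyFiles.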

 LOAD DATA INPATH fails when the table already contains a file of the same 
 name
 

 Key: HIVE-1996
 URL: https://issues.apache.org/jira/browse/HIVE-1996
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Kirk True
Assignee: Kirk True

 Steps:
 1. From the command line copy the kv2.txt data file into the current user's 
 HDFS directory:
 {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt 
 kv2.txt}}
 2. In Hive, create the table:
 {{create table tst_src1 (key_ int, value_ string);}}
 3. Load the data into the table from HDFS:
 {{load data inpath './kv2.txt' into table tst_src1;}}
 4. Repeat step 1
 5. Repeat step 3
 Expected:
 To have kv2.txt renamed in HDFS and then copied to the destination as per 
 HIVE-307.
 Actual:
 File is renamed, but Hive.copyFiles doesn't see the change in srcs as it 
 continues to use the same array elements (with the un-renamed, old file 
 names). It crashes with this error:
 {noformat}
 java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
 at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
 at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
 at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 {noformat}

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name

2011-02-16 Thread Kirk True (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated HIVE-1996:


Description: 
Steps:

1. From the command line copy the kv2.txt data file into the current user's 
HDFS directory:

{{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}}

2. In Hive, create the table:

{{create table tst_src1 (key_ int, value_ string);}}

3. Load the data into the table from HDFS:

{{load data inpath './kv2.txt' into table tst_src1;}}

4. Repeat step 1
5. Repeat step 3

Expected:

To have kv2.txt renamed in HDFS and then copied to the destination as per 
HIVE-307.

Actual:

File is renamed, but {{Hive.copyFiles}} doesn't see the change in {{srcs}} as 
it continues to use the same array elements (with the un-renamed, old file 
names). It crashes with this error:

{noformat}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}

  was:
Steps:

1. From the command line copy the kv2.txt data file into the current user's 
HDFS directory:

{{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}}

2. In Hive, create the table:

{{create table tst_src1 (key_ int, value_ string);}}

3. Load the data into the table from HDFS:

{{load data inpath './kv2.txt' into table tst_src1;}}

4. Repeat step 1
5. Repeat step 3

Expected:

To have kv2.txt renamed in HDFS and then copied to the destination as per 
HIVE-307.

Actual:

File is renamed, but Hive.copyFiles doesn't see the change in srcs as it 
continues to use the same array elements (with the un-renamed, old file names). 
It crashes with this error:

{noformat}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}



[jira] Resolved: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-16 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi resolved HIVE-1981.
--

Resolution: Fixed

Fixed as part of another commit (HIVE-1817).


 TestHadoop20SAuthBridge failed on current trunk
 ---

 Key: HIVE-1981
 URL: https://issues.apache.org/jira/browse/HIVE-1981
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Ning Zhang
Assignee: Carl Steinbach
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1981.1.patch.txt


 I'm on the latest trunk and ant package test failed on 
 TestHadoop20SAuthBridge.





Question about Transform and M/R scripts

2011-02-16 Thread Vijay
Hi,

I'm trying this use case: do a simple select from an existing table
and pass the results through a reduce script to do some analysis. The
table has web logs so the select uses a pseudo user ID as the key and
the rest of the data as values. My expectation is that a single reduce
script should receive all logs for a given user so that I can do some
path based analysis. Are there any issues with this idea so far?

When I try it though, hive is not doing what I'd expect. The
particular query is not generating any reduce tasks at all. Here's a
sample query:

FROM(
  SELECT userid, time, url
  FROM weblogs
) weblogs
reduce weblogs.userid, weblogs.time, weblogs.url
using 'counter.pl'
as user, count;

Thanks,
Vijay


Re: Question about Transform and M/R scripts

2011-02-16 Thread Edward Capriolo
On Wed, Feb 16, 2011 at 5:07 PM, Vijay tec...@gmail.com wrote:
 Hi,

 I'm trying this use case: do a simple select from an existing table
 and pass the results through a reduce script to do some analysis. The
 table has web logs so the select uses a pseudo user ID as the key and
 the rest of the data as values. My expectation is that a single reduce
 script should receive all logs for a given user so that I can do some
 path based analysis. Are there any issues with this idea so far?

 When I try it though, hive is not doing what I'd expect. The
 particular query is not generating any reduce tasks at all. Here's a
 sample query:

 FROM(
  SELECT userid, time, url
  FROM weblogs
 ) weblogs
 reduce weblogs.userid, weblogs.time, weblogs.url
 using 'counter.pl'
 as user, count;

 Thanks,
 Vijay


It is hard to tell without the script. Is your Perl script reading from a pipe (stdin)?

i.e.

while (<STDIN>) {
  print $_;
}
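
For what it's worth: as far as I know, Hive only launches a reduce stage for a TRANSFORM/REDUCE query when the rows are explicitly shuffled with DISTRIBUTE BY or CLUSTER BY, so adding CLUSTER BY userid to the inner query may be what is missing here. The reduce script itself just reads tab-separated rows from stdin, with all rows for a key arriving contiguously. A minimal sketch of such a per-user counter, in Python rather than Perl (illustrative only; the column layout matches the query above):

```python
def count_per_user(lines):
    """Count rows per user from tab-separated (userid, time, url) rows.

    Assumes rows for the same user arrive contiguously, which is what
    CLUSTER BY userid in the outer query would guarantee.
    """
    counts = []
    current, n = None, 0
    for line in lines:
        userid = line.rstrip("\n").split("\t")[0]
        if userid != current:
            if current is not None:
                counts.append((current, n))
            current, n = userid, 0
        n += 1
    if current is not None:
        counts.append((current, n))
    return counts

# In the real script, feed sys.stdin to count_per_user and print each
# pair as a tab-separated line for Hive to read back.
```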


[jira] Commented: (HIVE-1788) Add more calls to the metastore thrift interface

2011-02-16 Thread Ashish Thusoo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995586#comment-12995586
 ] 

Ashish Thusoo commented on HIVE-1788:
-

In my particular application I need to get the entire table object. If I get 
only the names and then call back to the metastore for one table object at a 
time, it will be very slow. I have not measured the speed of the call with many 
tables; however, the hope is that with the offset and limit fields the 
application can stream those tables across multiple calls as opposed to getting 
them in one single gigantic batch. I will do the measurement, though, to find 
out how bad this is.

The offsets would not be consistent if new tables are created in between; 
however, this level of consistency is not needed by the application. Think of 
an application that displays all the tables and their associated columns for a 
particular user, and does so in a paginated way. Whether new tables appear or 
disappear during pagination is not critical from the application's point of 
view. The only other way of ensuring that things are consistent would be to run 
the whole query without offsets and limits, or to do all this in one long 
transaction. The latter would be bad because it would hold locks on tables 
while pagination is happening, and that would be really bad for other clients. 
Thoughts?
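
Sketching the streaming loop described above: pull one page per call and stop on a short page. The get_page(offset, limit) callable below is a hypothetical stand-in for the new thrift call with offset and limit fields:

```python
def fetch_all_tables(get_page, page_size=100):
    """Stream table objects page by page instead of one gigantic batch.

    get_page(offset, limit) is a hypothetical stand-in for the metastore
    call; a page shorter than page_size signals the end. Tables created
    or dropped between pages may be missed or seen twice -- the level of
    consistency the application is willing to tolerate.
    """
    tables, offset = [], 0
    while True:
        page = get_page(offset, page_size)
        tables.extend(page)
        if len(page) < page_size:
            break
        offset += page_size
    return tables
```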

 Add more calls to the metastore thrift interface
 

 Key: HIVE-1788
 URL: https://issues.apache.org/jira/browse/HIVE-1788
 Project: Hive
  Issue Type: New Feature
Reporter: Ashish Thusoo
Assignee: Ashish Thusoo
 Attachments: HIVE-1788.txt


 For administrative purposes the following calls to the metastore thrift 
 interface would be very useful:
 1. Get the table metadata for all the tables owned by a particular user
 2. Ability to iterate over this set of tables
 3. Ability to change a particular key value property of the table





[jira] Updated: (HIVE-1211) Tapping logs from child processes

2011-02-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1211:
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-0.7. Thanks Jonathan!

 Tapping logs from child processes
 -

 Key: HIVE-1211
 URL: https://issues.apache.org/jira/browse/HIVE-1211
 Project: Hive
  Issue Type: Improvement
  Components: Logging
Reporter: bc Wong
Assignee: Jonathan Natkins
 Fix For: 0.7.0

 Attachments: HIVE-1211-2.patch, HIVE-1211.1.patch, 
 HIVE-1211.3.patch.txt, HIVE-1211.4.patch.txt, HIVE-1211.5.patch.txt, 
 HIVE-1211.6.patch.txt, HIVE-1211.7.patch.txt, HIVE-1211.8.patch.txt


 Stdout/stderr from child processes (e.g. {{MapRedTask}}) are redirected to 
 the parent's stdout/stderr. There is little one can do to sort out which 
 log is from which query.





Build failed in Hudson: Hive-0.7.0-h0.20 #4

2011-02-16 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/4/changes

Changes:

[jvs] HIVE-1817. Remove Hive dependency on unreleased commons-cli 2.0 Snapshot
(Carl Steinbach via jvs)

--
[...truncated 25788 lines...]
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201102161610_1408374625.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_16-10-30_461_5556028776539932583/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-16 16:10:33,463 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_16-10-30_461_5556028776539932583/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201102161610_1831017391.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-16_16-10-35_028_965483959839104039/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 

[jira] Commented: (HIVE-1928) GRANT/REVOKE should handle privileges as tokens, not identifiers

2011-02-16 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995600#comment-12995600
 ] 

Carl Steinbach commented on HIVE-1928:
--

+1. Will commit if tests pass.


 GRANT/REVOKE should handle privileges as tokens, not identifiers
 

 Key: HIVE-1928
 URL: https://issues.apache.org/jira/browse/HIVE-1928
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Security
Affects Versions: 0.7.0
Reporter: Carl Steinbach
Assignee: Jonathan Natkins
Priority: Critical
 Attachments: HIVE-1928.1.patch


 The grammar for the GRANT and REVOKE privileges statements currently handles 
 the list of privileges as a list of identifiers. Since most of the privileges 
 are also keywords in the HQL grammar, this requires users to individually 
 quote-escape each of the privileges, e.g.:
 {code}
 grant `Create` on table authorization_part to user hive_test_user;
 grant `Update` on table authorization_part to user hive_test_user;
 grant `Drop` on table authorization_part to user hive_test_user;
 grant `select` on table src to user hive_test_user;
 {code}
 Both MySQL and the SQL standard treat privileges as tokens. Hive should do 
 the same.





Re: Review Request: HIVE-1928. GRANT/REVOKE should handle privileges as tokens, not identifiers

2011-02-16 Thread Jonathan Natkins

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/427/
---

(Updated 2011-02-16 16:17:14.544784)


Review request for hive.


Changes
---

Updating with an additional test change


Summary
---

Review request for HIVE-1928.


This addresses bug HIVE-1928.
https://issues.apache.org/jira/browse/HIVE-1928


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 6fea990 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 9d9cea1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g c5574b0 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/DefaultHiveAuthorizationProvider.java
 89af39d 
  ql/src/java/org/apache/hadoop/hive/ql/security/authorization/Privilege.java 
1449091 
  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/PrivilegeRegistry.java
 7cfc0e5 
  ql/src/test/queries/clientnegative/authorization_fail_1.q 063ef52 
  ql/src/test/queries/clientnegative/authorization_fail_3.q 1d45ad6 
  ql/src/test/queries/clientnegative/authorization_fail_4.q b58288b 
  ql/src/test/queries/clientnegative/authorization_fail_5.q 5be2965 
  ql/src/test/queries/clientnegative/authorization_fail_7.q ef80525 
  ql/src/test/queries/clientnegative/authorization_part.q 2d5efe3 
  ql/src/test/queries/clientpositive/authorization_1.q 0a26c2e 
  ql/src/test/queries/clientpositive/authorization_2.q 733cf5e 
  ql/src/test/queries/clientpositive/authorization_3.q 577886a 
  ql/src/test/queries/clientpositive/authorization_4.q d679d78 
  ql/src/test/queries/clientpositive/authorization_5.q df809f0 
  ql/src/test/queries/clientpositive/keyword_1.q c7fc640 
  ql/src/test/results/clientnegative/authorization_fail_1.q.out ad3b956 
  ql/src/test/results/clientnegative/authorization_fail_3.q.out 37f819b 
  ql/src/test/results/clientnegative/authorization_fail_4.q.out 3136b53 
  ql/src/test/results/clientnegative/authorization_fail_5.q.out fb599de 
  ql/src/test/results/clientnegative/authorization_fail_7.q.out 898e6f5 
  ql/src/test/results/clientnegative/authorization_part.q.out 028d3e3 
  ql/src/test/results/clientpositive/authorization_1.q.out 319d7ab 
  ql/src/test/results/clientpositive/authorization_2.q.out 5e947e4 
  ql/src/test/results/clientpositive/authorization_3.q.out 1006d52 
  ql/src/test/results/clientpositive/authorization_4.q.out fa3fa9d 
  ql/src/test/results/clientpositive/authorization_5.q.out 7578981 
  ql/src/test/results/clientpositive/keyword_1.q.out 4a0a194 

Diff: https://reviews.apache.org/r/427/diff


Testing
---


Thanks,

Jonathan



[jira] Commented: (HIVE-1694) Accelerate GROUP BY execution using indexes

2011-02-16 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995613#comment-12995613
 ] 

John Sichi commented on HIVE-1694:
--

Note:  I pointed the Harvey Mudd team over to your branch, so they're copying 
bits and pieces of necessary support into their patch.  Once they're a little 
further along, we can figure out how to reconcile the two before commit.


 Accelerate GROUP BY execution using indexes
 ---

 Key: HIVE-1694
 URL: https://issues.apache.org/jira/browse/HIVE-1694
 Project: Hive
  Issue Type: New Feature
  Components: Indexing, Query Processor
Affects Versions: 0.7.0
Reporter: Nikhil Deshpande
Assignee: Nikhil Deshpande
 Attachments: HIVE-1694.1.patch.txt, HIVE-1694_2010-10-28.diff, 
 demo_q1.hql, demo_q2.hql


 The index building patch (HIVE-417) is checked into trunk; this JIRA issue 
 tracks supporting indexes in the Hive compiler & execution engine for SELECT 
 queries.
 This is in ref. to John's comment at
 https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869
 on creating a separate JIRA issue for tracking index usage in the optimizer & 
 query execution.
 The aim of this effort is to use indexes to accelerate query execution (for 
 certain class of queries). E.g.
 - Filters and range scans (already being worked on by He Yongqiang as part of 
 HIVE-417?)
 - Joins (index based joins)
 - Group By, Order By and other misc cases
 The proposal is multi-step:
 1. Building index based operators, compiler and execution engine changes
 2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose 
 between index scans, full table scans etc.)
 This JIRA initially focuses on the first step. This JIRA is expected to hold 
 the information about index-based plans & operator implementations for the 
 above-mentioned cases. 





[jira] Commented: (HIVE-1941) support explicit view partitioning

2011-02-16 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995626#comment-12995626
 ] 

He Yongqiang commented on HIVE-1941:


Sorry, I was in the middle of other things. Will go through the code again. 



 support explicit view partitioning
 --

 Key: HIVE-1941
 URL: https://issues.apache.org/jira/browse/HIVE-1941
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, 
 HIVE-1941.4.patch


 Allow creation of a view with an explicit partitioning definition, and 
 support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
 For more information, see
 http://wiki.apache.org/hadoop/Hive/PartitionedViews





[jira] Work stopped: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-1517 stopped by Siying Dong.

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
 HIVE-1517.3.patch, HIVE-1517.4.patch, HIVE-1517.5.patch


 After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For eg:
 use db1
 create table foo();
 use db2
 select .. from db1.foo.





[jira] Work started: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-1517 started by Siying Dong.





[jira] Updated: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1517:
--

Attachment: HIVE-1517.5.patch

Fixed the test outputs of two newly added tests after rebasing.





[jira] Updated: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1517:
--

Status: Patch Available  (was: Open)

Not a huge difference, though. I fixed two test outputs, but it doesn't seem to 
be related to Namit's test failures. Namit, can you do an ant clean, then ant 
package, and then run the tests again?





[jira] Updated: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1517:
--

Attachment: HIVE-1517.5.patch





Build failed in Hudson: Hive-trunk-h0.20 #563

2011-02-16 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/563/changes

Changes:

[cws] HIVE-1211 Tapping logs from child processes (Jonathan Natkins via cws)

--
[...truncated 25343 lines...]
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt'
 INTO TABLE srcbucket2
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt'
 INTO TABLE srcbucket2
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@srcbucket2
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt'
 INTO TABLE srcbucket2
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt'
 INTO TABLE srcbucket2
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@srcbucket2
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt'
 INTO TABLE srcbucket2
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt'
 INTO TABLE srcbucket2
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@srcbucket2
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 INTO TABLE src
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 INTO TABLE src
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt'
 INTO TABLE src1
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt'
 INTO TABLE src1
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq'
 INTO TABLE src_sequencefile
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq'
 INTO TABLE src_sequencefile
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/complex.seq'
 INTO TABLE src_thrift
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/complex.seq'
 INTO TABLE src_thrift
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] PREHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/json.txt'
 INTO TABLE src_json
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: query: LOAD DATA LOCAL INPATH 
'https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/json.txt'
 INTO 

Re: hadoop core 0.20.2 not found

2011-02-16 Thread abhinav narain
On Thu, Feb 10, 2011 at 2:17 PM, Carl Steinbach c...@cloudera.com wrote:

 Hi Abhinav,

 Assuming that your .ant directory already contains all of the Hive
 dependencies, then you should be able to build Hive in an offline mode
 using
 the following ant command:

 % ant -Divy.cache.name=offline

 Setting ivy.cache.name=offline tells Ivy to look for dependencies in its
 local cache (.ant/cache) before looking for things on the network.

FYI.
This did not work.


 I also found a blog post that describes how to access Ivy resources over a
 proxy via a local instance of cntlm:


 http://ramathoughts.blogspot.com/2010/04/dealing-with-pentaho-bi-server-build.html

 I have already done these things.

 In my opinion the cntlm route looks like the easiest to get working, and it
 doesn't require you to make any special settings to your Hive build
 properties.

 I did not make any changes to these properties. Just added cntlm and added
ANT_OPTS variable in .bashrc.

I don't know why it does not work and still contacts the remote server even
though offline mode is passed to *ant* as you instructed.

regards,
Abhinav


 On Tue, Feb 8, 2011 at 11:48 PM, abhinav narain
 abhinavnarai...@gmail.comwrote:

  I compiled the code on a remote machine without a proxy and it worked.
  Then, I copied the .ant folder to my computer (in the lab), along with the
  hive code.
 
  Now, I can see some ivy*.xml files and jar files in the
  org.apache.hbase/hbase folder.
 
  I built the same code using the new .ant folder in my home, and I again get
  the same error as before: hbase-0.89.0-SNAPSHOT.jar not found.
 
  Is everyone on the development of hive using public IPs, with nobody behind
  a proxy or facing similar issues?
 
  I am unable to understand why it fails to fetch the jars when they are
  already present in the cache.
 
 
  same error again :
 
 
 
 http://repo1.maven.org/maven2/org/apache/hbase/hbase/0.89.0-SNAPSHOT/hbase-0.89.0-SNAPSHOT.pom
  [ivy:resolve]   -- artifact
  org.apache.hbase#hbase;0.89.0-SNAPSHOT!hbase.jar(test-jar):
  [ivy:resolve]
 
 
 http://repo1.maven.org/maven2/org/apache/hbase/hbase/0.89.0-SNAPSHOT/hbase-0.89.0-SNAPSHOT.jar
  [ivy:resolve]   -- artifact
  org.apache.hbase#hbase;0.89.0-SNAPSHOT!hbase.jar:
  [ivy:resolve]
 
 
 http://repo1.maven.org/maven2/org/apache/hbase/hbase/0.89.0-SNAPSHOT/hbase-0.89.0-SNAPSHOT.jar
  [ivy:resolve]  datanucleus-repo: tried
  [ivy:resolve]   -- artifact
  org.apache.hbase#hbase;0.89.0-SNAPSHOT!hbase.jar:
  [ivy:resolve]
 
 
 http://www.datanucleus.org/downloads/maven2/org/apache/hbase/hbase/0.89.0-SNAPSHOT/hbase-0.89.0-SNAPSHOT.jar
  [ivy:resolve]   -- artifact
  org.apache.hbase#hbase;0.89.0-SNAPSHOT!hbase.jar(test-jar):
  [ivy:resolve]
 
 
 http://www.datanucleus.org/downloads/maven2/org/apache/hbase/hbase/0.89.0-SNAPSHOT/hbase-0.89.0-SNAPSHOT.jar
  [ivy:resolve] ::
  [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
  [ivy:resolve] ::
  [ivy:resolve] :: org.apache.hbase#hbase;0.89.0-SNAPSHOT: not
 found
 
 Abhinav Narain
 
  On Wed, Feb 9, 2011 at 2:31 AM, Carl Steinbach c...@cloudera.com
 wrote:
  
   Hi Abhinav,
  
   Please make sure the .ant cache directory contains the following
 files:
  
   .ant/cache/org.apache.hbase/hbase/jars/hbase-0.89.0-SNAPSHOT.jar
  
  
 
 .ant/cache/org.apache.hbase/hbase/test-jars/hbase-0.89.0-SNAPSHOT-tests.jar
  
   I don't have any more suggestions if this does not work. Sorry.
  
   Carl
  
  
 



[jira] Updated: (HIVE-1517) ability to select across a database

2011-02-16 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1517:
--

Attachment: HIVE-1517.6.patch

This patch fixes a couple of test outputs for TestContriCliTest. 

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
 HIVE-1517.3.patch, HIVE-1517.4.patch, HIVE-1517.5.patch, HIVE-1517.6.patch


 After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For eg:
 use db1
 create table foo();
 use db2
 select .. from db1.foo.





[jira] Commented: (HIVE-1941) support explicit view partitioning

2011-02-16 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995666#comment-12995666
 ] 

Paul Yang commented on HIVE-1941:
-

It looks like it's possible with the current thrift add_partition() method to 
create a partition for a view with a non-null SD/location. Can we put in a 
check to guard against this case?

Other than that, it looks good from the metastore/replication side.

 support explicit view partitioning
 --

 Key: HIVE-1941
 URL: https://issues.apache.org/jira/browse/HIVE-1941
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, 
 HIVE-1941.4.patch


 Allow creation of a view with an explicit partitioning definition, and 
 support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
 For more information, see
 http://wiki.apache.org/hadoop/Hive/PartitionedViews
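 A sketch of the syntax this feature enables, per the description above (view,
 column, and partition names are illustrative, not taken from the patch):

```sql
CREATE VIEW page_views PARTITIONED ON (ds)
AS SELECT userid, url, ds FROM raw_logs;

-- Partitions of a view are instantiated and dropped explicitly:
ALTER VIEW page_views ADD PARTITION (ds='2011-02-16');
ALTER VIEW page_views DROP PARTITION (ds='2011-02-16');
```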





[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995667#comment-12995667
 ] 

Namit Jain commented on HIVE-1517:
--

+1

For some reason, the tests are failing in my environment.
While I fix the problem, Yongqiang, can you commit?

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
 HIVE-1517.3.patch, HIVE-1517.4.patch, HIVE-1517.5.patch, HIVE-1517.6.patch


 After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For eg:
 use db1
 create table foo();
 use db2
 select .. from db1.foo.





[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995669#comment-12995669
 ] 

Namit Jain commented on HIVE-1918:
--

Can you upload a patch to Review Board?

 Add export/import facilities to the hive system
 ---

 Key: HIVE-1918
 URL: https://issues.apache.org/jira/browse/HIVE-1918
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Krishna Kumar
Assignee: Krishna Kumar
 Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
 HIVE-1918.patch.3.txt, HIVE-1918.patch.4.txt, HIVE-1918.patch.txt, 
 hive-metastore-er.pdf


 This is an enhancement request to add export/import features to hive.
 With this language extension, the user can export the data of a table (which 
 may be located in different hdfs locations in the case of a partitioned 
 table) as well as the metadata of the table into a specified output 
 location. This output location can then be moved over to a different 
 hadoop/hive instance and imported there.
 This should work independently of the source and target metastore dbms used; 
 for instance, between derby and mysql.
 For partitioned tables, the ability to export/import a subset of the 
 partitions must be supported.
 Howl will add more features on top of this: the ability to create/use the 
 exported data even in the absence of hive, using MR or Pig. Please see 
 http://wiki.apache.org/pig/Howl/HowlImportExport for these details.





[jira] Commented: (HIVE-1941) support explicit view partitioning

2011-02-16 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995668#comment-12995668
 ] 

Paul Yang commented on HIVE-1941:
-

Similarly, we should handle the case when calling append_partition() on a view.

 support explicit view partitioning
 --

 Key: HIVE-1941
 URL: https://issues.apache.org/jira/browse/HIVE-1941
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, 
 HIVE-1941.4.patch


 Allow creation of a view with an explicit partitioning definition, and 
 support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
 For more information, see
 http://wiki.apache.org/hadoop/Hive/PartitionedViews





[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-16 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995673#comment-12995673
 ] 

He Yongqiang commented on HIVE-1517:


Running tests; will commit after they pass.

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
Priority: Blocker
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
 HIVE-1517.3.patch, HIVE-1517.4.patch, HIVE-1517.5.patch, HIVE-1517.6.patch


 After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For eg:
 use db1
 create table foo();
 use db2
 select .. from db1.foo.





[jira] Created: (HIVE-1997) Map join followed by multi-table insert will generate duplicated result

2011-02-16 Thread Ted Xu (JIRA)
Map join followed by multi-table insert will generate duplicated result
---

 Key: HIVE-1997
 URL: https://issues.apache.org/jira/browse/HIVE-1997
 Project: Hive
  Issue Type: Bug
Reporter: Ted Xu
 Fix For: 0.7.0


Map join followed by multi-table insert will generate duplicated results if the 
insert targets contain both a direct insert (FileSinkOperator logic) and a 
group by/distribute by (ReduceSinkOperator logic).

The following query reproduces the case:
{code}
FROM
(SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, 
y.value as value2
 FROM src1 x JOIN src y ON (x.key = y.key)) subq
INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-12')
SELECT key1, value1
INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-13')
SELECT key2, value2
GROUP BY key2, value2;
{code}
In the query above, the records of table destpart (ds='2010-12-12') are 
duplicated.





[jira] Updated: (HIVE-1928) GRANT/REVOKE should handle privileges as tokens, not identifiers

2011-02-16 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1928:
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to branch-0.7 and trunk. Thanks Natty!

 GRANT/REVOKE should handle privileges as tokens, not identifiers
 

 Key: HIVE-1928
 URL: https://issues.apache.org/jira/browse/HIVE-1928
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Security
Affects Versions: 0.7.0
Reporter: Carl Steinbach
Assignee: Jonathan Natkins
Priority: Critical
 Fix For: 0.7.0

 Attachments: HIVE-1928.1.patch, HIVE-1928.2.patch


 The grammar for the GRANT and REVOKE privileges statements currently handles 
 the list of privileges as a list of identifiers. Since most of the 
 privileges are also keywords in the HQL grammar, this requires users to 
 individually quote-escape each of the privileges, e.g.:
 {code}
 grant `Create` on table authorization_part to user hive_test_user;
 grant `Update` on table authorization_part to user hive_test_user;
 grant `Drop` on table authorization_part to user hive_test_user;
 grant `select` on table src to user hive_test_user;
 {code}
 Both MySQL and the SQL standard treat privileges as tokens. Hive should do 
 the same.
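 With privileges handled as tokens, the quote-escaping shown above becomes 
 unnecessary; a sketch of the intended post-fix form (table and user names 
 taken from the example above):

```sql
GRANT CREATE ON TABLE authorization_part TO USER hive_test_user;
GRANT UPDATE ON TABLE authorization_part TO USER hive_test_user;
GRANT DROP ON TABLE authorization_part TO USER hive_test_user;
GRANT SELECT ON TABLE src TO USER hive_test_user;
```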
