[jira] [Commented] (HIVE-3221) HiveConf.getPositionFromInternalName does not support more than single digit column numbers

2012-07-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418119#comment-13418119
 ] 

Ashutosh Chauhan commented on HIVE-3221:


+1. Looks good. Will commit if tests pass.

> HiveConf.getPositionFromInternalName does not support more than single digit 
> column numbers
> --
>
> Key: HIVE-3221
> URL: https://issues.apache.org/jira/browse/HIVE-3221
> Project: Hive
>  Issue Type: Bug
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-3221.patch
>
>
> For positions above 9, HiveConf.getPositionFromInternalName only looks at the 
> last digit and thus causes collisions.
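
To make the collision concrete, here is a minimal sketch. It assumes internal column names of the form "_col<position>" (e.g. "_col7", "_col12"); the class and helper names are hypothetical and this is not the actual HiveConf code.

{code}
// Hypothetical illustration of the reported collision -- not HiveConf's code.
public final class InternalNameSketch {

  // Reported (buggy) behaviour: only the last character is read, so "_col12"
  // and "_col2" both map to position 2 and collide.
  static int positionFromLastDigit(String internalName) {
    return Character.getNumericValue(internalName.charAt(internalName.length() - 1));
  }

  // Multi-digit-safe behaviour: parse the whole trailing numeric suffix.
  static int positionFromSuffix(String internalName) {
    int i = internalName.length();
    while (i > 0 && Character.isDigit(internalName.charAt(i - 1))) {
      i--;
    }
    return Integer.parseInt(internalName.substring(i));
  }

  public static void main(String[] args) {
    System.out.println(positionFromLastDigit("_col12")); // 2 -- collides with "_col2"
    System.out.println(positionFromSuffix("_col12"));    // 12
  }
}
{code}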

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3251) Hive doesn't remove scratch directories while killing running MR job

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418109#comment-13418109
 ] 

Namit Jain commented on HIVE-3251:
--

+1

> Hive doesn't remove scratch directories while killing running MR job
> ---
>
> Key: HIVE-3251
> URL: https://issues.apache.org/jira/browse/HIVE-3251
> Project: Hive
>  Issue Type: Bug
>  Components: Server Infrastructure
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3151.patch
>
>
> When a running MR job is killed, Hive doesn't clean up the scratch directory 
> (mapred.cache.files), so afterwards the scratch directory is left behind in 
> HDFS. The HDFS namenode doesn't know about it and tries to do lease recovery. 
> As such instances accumulate, they will eventually crash the namenode.
> The fix is to leverage HDFS cleanup functionality: while creating scratch 
> dirs, Hive registers them with the HDFS cleanup hook, and when a kill happens, 
> HDFS cleans them up.
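
As an illustration of the cleanup-hook idea, here is a minimal sketch of registering a scratch directory for automatic removal. The path is made up, and this is not the attached patch, which may use a different HDFS hook.

{code}
// Minimal sketch of registering a scratch directory for automatic cleanup.
// The path below is hypothetical; this is not the HIVE-3251 patch itself.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ScratchDirCleanupSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path scratchDir = new Path("/tmp/hive-scratch/example-session");

    FileSystem fs = scratchDir.getFileSystem(conf);
    fs.mkdirs(scratchDir);

    // Ask the FileSystem to delete the directory when it is closed (or the JVM
    // exits), so an aborted or killed client does not leave the directory behind.
    fs.deleteOnExit(scratchDir);
  }
}
{code}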





[jira] [Updated] (HIVE-3267) escaped columns in cluster/distribute/order/sort by are not working

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3267:
-

Status: Patch Available  (was: Open)

> escaped columns in cluster/distribute/order/sort by are not working
> ---
>
> Key: HIVE-3267
> URL: https://issues.apache.org/jira/browse/HIVE-3267
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> The following query:
> select `key`, value from src cluster by `key`, value;
> fails





[jira] [Updated] (HIVE-3267) escaped columns in cluster/distribute/order/sort by are not working

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3267:
-

Description: 
The following query:

select `key`, value from src cluster by `key`, value;


fails

  was:

The following query:

select `key`, value from src cluster by `key`, value;


fails

Summary: escaped columns in cluster/distribute/order/sort by are not 
working  (was: escaped columns in cluster by are not working)

> escaped columns in cluster/distribute/order/sort by are not working
> ---
>
> Key: HIVE-3267
> URL: https://issues.apache.org/jira/browse/HIVE-3267
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> The following query:
> select `key`, value from src cluster by `key`, value;
> fails





[jira] [Commented] (HIVE-3246) java primitive type for binary datatype should be byte[]

2012-07-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418081#comment-13418081
 ] 

Ashutosh Chauhan commented on HIVE-3246:


+1. Looks good. Running tests.

> java primitive type for binary datatype should be byte[]
> 
>
> Key: HIVE-3246
> URL: https://issues.apache.org/jira/browse/HIVE-3246
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-3246.1.patch, HIVE-3246.2.patch
>
>
> PrimitiveObjectInspector.getPrimitiveJavaObject is supposed to return a Java 
> object. But in the case of the binary datatype, it returns ByteArrayRef (not a 
> standard Java type). The suitable Java object for it would be byte[].
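
For illustration, a minimal sketch of the caller-side contract being asked for, assuming the fix makes the binary inspector hand back byte[] directly; the helper class below is hypothetical and not part of the attached patches.

{code}
// Hypothetical caller-side sketch -- not part of the attached patches.
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;

public class BinaryColumnReaderSketch {
  // 'oi' is the object inspector for a BINARY column, 'field' the raw column value.
  public static byte[] readBinary(PrimitiveObjectInspector oi, Object field) {
    // Today the binary case returns a ByteArrayRef wrapper; with the proposed
    // change the standard Java representation, byte[], comes back instead.
    return (byte[]) oi.getPrimitiveJavaObject(field);
  }
}
{code}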





[jira] [Assigned] (HIVE-3267) escaped columns in cluster by are not working

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3267:


Assignee: Namit Jain

> escaped columns in cluster by are not working
> -
>
> Key: HIVE-3267
> URL: https://issues.apache.org/jira/browse/HIVE-3267
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> The following query:
> select `key`, value from src cluster by `key`, value;
> fails





[jira] [Updated] (HIVE-2798) add an option to change the primary region for a table

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2798:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

This is not needed

> add an option to change the primary region for a table
> --
>
> Key: HIVE-2798
> URL: https://issues.apache.org/jira/browse/HIVE-2798
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> This should error out if any of the partitions are not present in the primary 
> region





[jira] [Updated] (HIVE-2798) add an option to change the primary region for a table

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2798:
-

Status: Patch Available  (was: Open)

> add an option to change the primary region for a table
> --
>
> Key: HIVE-2798
> URL: https://issues.apache.org/jira/browse/HIVE-2798
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> This should error out if any of the partitions are not present in the primary 
> region





[jira] [Updated] (HIVE-2786) Throw an error if the user tries to insert a table into a region other than the primary region

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2786:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

This is not needed

> Throw an error if the user tries to insert a table into a region other than 
> the primary region
> --
>
> Key: HIVE-2786
> URL: https://issues.apache.org/jira/browse/HIVE-2786
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> By default, the user can only insert into the primary region.
> Add an option to insert into the secondary region also.
> The config variable is 'hive.insert.secondary.regions' - default for that 
> variable is false





[jira] [Updated] (HIVE-2786) Throw an error if the user tries to insert a table into a region other than the primary region

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2786:
-

Status: Patch Available  (was: Open)

> Throw an error if the user tries to insert a table into a region other than 
> the primary region
> --
>
> Key: HIVE-2786
> URL: https://issues.apache.org/jira/browse/HIVE-2786
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> By default, the user can only insert into the primary region.
> Add an option to insert into the secondary region also.
> The config variable is 'hive.insert.secondary.regions' - default for that 
> variable is false





[jira] [Updated] (HIVE-2785) support use region

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2785:
-

Status: Patch Available  (was: Open)

> support use region
> --
>
> Key: HIVE-2785
> URL: https://issues.apache.org/jira/browse/HIVE-2785
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> use physicalRegion;
> use physicalRegion ;
> should be supported





[jira] [Updated] (HIVE-2785) support use region

2012-07-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2785:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

This is not needed anymore

> support use region
> --
>
> Key: HIVE-2785
> URL: https://issues.apache.org/jira/browse/HIVE-2785
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Namit Jain
>
> use physicalRegion;
> use physicalRegion ;
> should be supported





[jira] [Commented] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues

2012-07-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418064#comment-13418064
 ] 

Ashutosh Chauhan commented on HIVE-3126:


Changes look good to me. Carl, do you have any more comments for it?

> Generate & build the velocity based Hive tests on windows by fixing the path 
> issues
> ---
>
> Key: HIVE-3126
> URL: https://issues.apache.org/jira/browse/HIVE-3126
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.9.0, 0.10.0, 0.9.1
>Reporter: Kanna Karanam
>Assignee: Kanna Karanam
>  Labels: Windows, test
> Fix For: 0.10.0
>
> Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, 
> HIVE-3126.3.patch.txt, HIVE-3126.4.patch.txt, HIVE-3126.5.patch.txt, 
> HIVE-3126.6.patch.txt, HIVE-3126.7.patch.txt, HIVE-3126.8.patch.txt
>
>
> 1) Escape the backslash in the canonical path if the unit test runs on Windows.
> 2) Diff comparison:
>    a. Ignore the extra spacing on Windows.
>    b. Ignore the different line endings on Windows & Unix.
>    c. Convert the file paths to be Windows-specific (handle spaces etc.).
> 3) Set the right file scheme & classpath separators while invoking the junit 
> task from 
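
For illustration, a minimal sketch of the kind of normalisation items 2a-2c describe; the helper below is hypothetical and is not the actual patch.

{code}
// Hypothetical normaliser for diff comparison across Windows and Unix.
public final class TestOutputNormalizerSketch {
  public static String normalize(String line) {
    return line
        .replace("\r\n", "\n")     // unify Windows/Unix line endings
        .replace('\\', '/')        // unify path separators
        .replaceAll("[ \t]+", " ") // ignore extra spacing
        .trim();
  }

  public static void main(String[] args) {
    System.out.println(normalize("C:\\hive\\build\\ql\t result  \r\n"));
    // prints: C:/hive/build/ql result
  }
}
{code}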





[jira] [Created] (HIVE-3276) optimize union sub-queries

2012-07-18 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3276:


 Summary: optimize union sub-queries
 Key: HIVE-3276
 URL: https://issues.apache.org/jira/browse/HIVE-3276
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Nadeem Moidu



It might be a good idea to optimize simple union queries containing map-reduce 
jobs in at least one of the sub-queries.

For example, a query like:



insert overwrite table T1 partition P1
select * from 
(
  subq1
union all
  subq2
) u;


today creates 3 map-reduce jobs, one for subq1, another for subq2 and 
the final one for the union. 

It might be a good idea to optimize this. Instead of creating the union 
task, it might be simpler to create a move task (or something like a move
task), where the outputs of the two sub-queries will be moved to the final 
directory. This can easily extend to more than 2 sub-queries in the union.

This is only useful if there is a select * followed by a file sink after the
union. This can be independently useful, and can also be used to optimize
skewed joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html
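
For illustration, a rough sketch of the move-style alternative described above; the class and method names are hypothetical and this is not Hive's actual move task.

{code}
// Hypothetical sketch of the "move instead of union job" idea. Assumes each
// sub-query has already written its output to its own temporary directory.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UnionMoveSketch {
  // Move the files produced by each sub-query into the final directory,
  // replacing the third map-reduce job that would otherwise materialize the union.
  public static void moveOutputs(Configuration conf, Path finalDir, Path... subQueryDirs)
      throws Exception {
    FileSystem fs = finalDir.getFileSystem(conf);
    fs.mkdirs(finalDir);
    int n = 0;
    for (Path dir : subQueryDirs) {
      for (FileStatus stat : fs.listStatus(dir)) {
        fs.rename(stat.getPath(), new Path(finalDir, "union_part_" + (n++)));
      }
    }
  }
}
{code}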





[jira] [Commented] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418043#comment-13418043
 ] 

Namit Jain commented on HIVE-3205:
--

I will take a look

> Bucketed mapjoin on partitioned table which has no partition throws NPE
> ---
>
> Key: HIVE-3205
> URL: https://issues.apache.org/jira/browse/HIVE-3205
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.04
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>
> {code}
> create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> set hive.optimize.bucketmapjoin = true;
> set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> explain
> SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2
> FROM hive_test_smb_bucket1 a JOIN
> hive_test_smb_bucket2 b
> ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and  b.key 
> IS NOT NULL;
> {code}
> throws NPE
> {noformat}
> 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - 
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100)
>   at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> {noformat}
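
For illustration only, a sketch of the shape of guard this kind of NPE usually calls for. It assumes the optimizer walks the table's partition list without handling the no-partition case, which the report does not spell out, and it is not the actual HIVE-3205 fix.

{code}
// Hypothetical guard, not the actual BucketMapJoinOptimizer code.
import java.util.List;

class BucketMapJoinGuardSketch {
  static boolean canUseBucketMapJoin(boolean isPartitioned, List<?> partitions) {
    if (isPartitioned && (partitions == null || partitions.isEmpty())) {
      // No partitions yet, so there is no bucketing metadata to inspect;
      // fall back to a plain join instead of dereferencing an empty list.
      return false;
    }
    return true;
  }
}
{code}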





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418029#comment-13418029
 ] 

Joydeep Sen Sarma commented on HIVE-3275:
-

That sounds like a reasonable approach. It's a Hive test, not Hadoop - so as 
long as Hive is trying to generate a non-local mode job (I am guessing that's 
what's being tested here) and that's verified against some Hadoop tree, we are 
good.

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3275.1.patch.txt
>
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417999#comment-13417999
 ] 

Zhenxiao Luo commented on HIVE-3275:


@Joydeep:

Any comments are appreciated. I'd like to know the idea from autolocal1.q's 
original author.

Thanks,
Zhenxiao

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3275.1.patch.txt
>
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417998#comment-13417998
 ] 

Zhenxiao Luo commented on HIVE-3275:


In summary:

In hadoop0.20, JobClient initialization tries to get the JobTracker's address, 
which throws the expected exception.

In hadoop0.23, JobClient initialization first decides which protocol to use.

If MR1, it behaves the same as hadoop0.20.

If MR2, it does not try to get the JobTracker's address during JobClient 
initialization, so no exception is thrown at that point. The exception only 
surfaces later, when the JobClient submits the job.

Since the expected exception on hadoop0.23 differs between MR1 and MR2, and the 
execution path has changed, my plan is to run this test only on hadoop0.20.

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3275.1.patch.txt
>
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Updated] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3275:
---

Status: Patch Available  (was: Open)

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3275.1.patch.txt
>
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Updated] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3275:
---

Attachment: HIVE-3275.1.patch.txt

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3275.1.patch.txt
>
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417990#comment-13417990
 ] 

Zhenxiao Luo commented on HIVE-3275:


review request submitted at:
https://reviews.facebook.net/D4221

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417972#comment-13417972
 ] 

Zhenxiao Luo commented on HIVE-3275:


My plan is to keep autolocal1.q running only in hadoop0.20.

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417965#comment-13417965
 ] 

Zhenxiao Luo commented on HIVE-3275:


The reason is:

1. On hadoop0.20 or Hadoop0.23 MR1,

JobClient jc = new JobClient(job);

this line throws the exception in ExecDriver.java.

It calls into JobClient.java:

  public JobClient(JobConf conf) throws IOException {
    setConf(conf);
    init(conf);
  }

  /**
   * Connect to the default {@link JobTracker}.
   * @param conf the job configuration.
   * @throws IOException
   */
  public void init(JobConf conf) throws IOException {
    String tracker = conf.get("mapred.job.tracker", "local");
    if ("local".equals(tracker)) {
      this.jobSubmitClient = new LocalJobRunner(conf);
    } else {
      this.jobSubmitClient = createRPCProxy(JobTracker.getAddress(conf), conf);
    }
  }

When createRPCProxy() is called, JobTracker.getAddress(conf) tries to resolve 
the non-existent host (abracadabra), and throws the expected exception:
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: abracadabra

Here is the log and stack trace when running on hadoop0.20 to prove the above 
observation:

2012-07-18 17:53:38,210 ERROR exec.Task (SessionState.java:printError(400)) - 
Job Submission failed with exception 'java.lang.IllegalArgumentException(Does 
not contain a valid host:port authority: abracadabra)'
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: abracadabra 
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)   
  
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)   
  
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2119) 
  
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:497)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:469)
  
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:418)   
  
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)   
  
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) 
   
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) 
  
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) 
  
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341) 
  
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:671)
  
at 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_autolocal1(TestNegativeCliDriver.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)   
   
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  
at java.lang.reflect.Method.invoke(Method.java:616)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)  
  
at junit.framework.TestResult$1.protect(TestResult.java:110)
  
at junit.framework.TestResult.runProtected(TestResult.java:128) 
  
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
  
at junit.framework.TestSuite.run(TestSuite.java:238)
  
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
 
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
   
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)

2012-07-18 17:53:38,215 ERROR ql.Driver (SessionState.java:printError(400)) - 
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.MapRedTask 

2. When running on hadoop0.23 MR2 (MapReduce2 uses the YARN framework),

JobClient jc = new JobClient(job)

this line in ExecDriver.java calls into MR2's implementation of JobClient:

public JobClient(Configuration conf

[jira] [Commented] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417952#comment-13417952
 ] 

Zhenxiao Luo commented on HIVE-3275:


After adding the following to autolocal1.q to initialize MR2 yarn framework:

[~/Code/hive]git diff ql/src/test/queries/clientnegative/autolocal1.q
diff --git a/ql/src/test/queries/clientnegative/autolocal1.q 
b/ql/src/test/queries/clientnegative/autolocal1.q
index 6bee177..8623eb5 100644
--- a/ql/src/test/queries/clientnegative/autolocal1.q
+++ b/ql/src/test/queries/clientnegative/autolocal1.q
@@ -1,3 +1,4 @@
+set mapreduce.framework.name=yarn;
 set mapred.job.tracker=abracadabra;
 set hive.exec.mode.local.auto.inputbytes.max=1;
 set hive.exec.mode.local.auto=true;

Still getting the following diffs:

diff -a 
/home/cloudera/Code/hive/build/ql/test/logs/clientnegative/autolocal1.q.out 
/home/cloudera/Code/hive/ql/src/test/results/clientnegative/autolocal1.q.out
[junit] 5c5
[junit] < Job Submission failed with exception 
'java.lang.reflect.UndeclaredThrowableException(null)'
[junit] ---
[junit] > Job Submission failed with exception 
'java.lang.IllegalArgumentException(Does not contain a valid host:port 
authority: abracadabra)'

> Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2
> --
>
> Key: HIVE-3275
> URL: https://issues.apache.org/jira/browse/HIVE-3275
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
>
> autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
> problem:
> Begin query: autolocal1.q
> diff -a 
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
>  
> /var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
> 5c5
> < Job Submission failed with exception 'java.io.IOException(Cannot initialize 
> Cluster. Please check your configuration for mapreduce.framework.name and the 
> correspond server addresses.)'
> ---
> > Job Submission failed with exception 
> > 'java.lang.IllegalArgumentException(Does not contain a valid host:port 
> > authority: abracadabra)'
> Exception: Client execution results failed with error code = 1
> See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
> more logs.
> Failed query: autolocal1.q





[jira] [Created] (HIVE-3275) Fix autolocal1.q testcase failure when building hive on hadoop0.23 MR2

2012-07-18 Thread Zhenxiao Luo (JIRA)
Zhenxiao Luo created HIVE-3275:
--

 Summary: Fix autolocal1.q testcase failure when building hive on 
hadoop0.23 MR2
 Key: HIVE-3275
 URL: https://issues.apache.org/jira/browse/HIVE-3275
 Project: Hive
  Issue Type: Bug
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo


autolocal1.q is failing only on hadoop0.23 MR2, due to cluster initialization 
problem:

Begin query: autolocal1.q
diff -a 
/var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/build/ql/test/logs/clientnegative/autolocal1.q.out
 
/var/lib/jenkins/workspace/zhenxiao-CDH4-Hive-0.9.0/ql/src/test/results/clientnegative/autolocal1.q.out
5c5
< Job Submission failed with exception 'java.io.IOException(Cannot initialize 
Cluster. Please check your configuration for mapreduce.framework.name and the 
correspond server addresses.)'
---
> Job Submission failed with exception 'java.lang.IllegalArgumentException(Does 
> not contain a valid host:port authority: abracadabra)'
Exception: Client execution results failed with error code = 1
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
more logs.
Failed query: autolocal1.q






[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

2012-07-18 Thread David Worms (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417872#comment-13417872
 ] 

David Worms commented on HIVE-2843:
---

Thanks for offering your help. I should have a little time this week, so I'll 
try to follow the recommendation posted on the wiki.

> UDAF to convert an aggregation to a map
> ---
>
> Key: HIVE-2843
> URL: https://issues.apache.org/jira/browse/HIVE-2843
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.9.0
>Reporter: David Worms
>Priority: Minor
>  Labels: features, udf
>
> I propose the addition of two new Hive UDAFs to help with maps in Apache Hive. 
> The source code is available on GitHub at https://github.com/wdavidw/hive-udf 
> in two Java classes: "UDAFToMap" and "UDAFToOrderedMap". The first function 
> converts an aggregation into a map and internally uses a Java `HashMap`. 
> The second function extends the first one; it converts an aggregation into an 
> ordered map and internally uses a Java `TreeMap`. They both extend the 
> `AbstractGenericUDAFResolver` class.
> Also, I have covered the motivations and usages of those UDAFs in a blog post 
> at http://adaltas.com/blog/2012/03/06/hive-udaf-map-conversion/
> If you are interested in my proposal, I'll take the time to update this issue 
> while following the guideline posted on the wiki to create an appropriate 
> patch.





[jira] [Updated] (HIVE-3273) Add avro jars into hive execution classpath

2012-07-18 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3273:
---

Attachment: HIVE-3273.1.patch.txt

> Add avro jars into hive execution classpath
> ---
>
> Key: HIVE-3273
> URL: https://issues.apache.org/jira/browse/HIVE-3273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3273.1.patch.txt
>
>
> avro*.jar should be added to hive execution classpath





[jira] [Updated] (HIVE-3273) Add avro jars into hive execution classpath

2012-07-18 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3273:
---

Status: Patch Available  (was: Open)

> Add avro jars into hive execution classpath
> ---
>
> Key: HIVE-3273
> URL: https://issues.apache.org/jira/browse/HIVE-3273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-3273.1.patch.txt
>
>
> avro*.jar should be added to hive execution classpath





[jira] [Commented] (HIVE-3273) Add avro jars into hive execution classpath

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417807#comment-13417807
 ] 

Zhenxiao Luo commented on HIVE-3273:


review request submitted:
https://reviews.facebook.net/D4209

> Add avro jars into hive execution classpath
> ---
>
> Key: HIVE-3273
> URL: https://issues.apache.org/jira/browse/HIVE-3273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
>
> avro*.jar should be added to hive execution classpath





[jira] [Commented] (HIVE-3273) Add avro jars into hive execution classpath

2012-07-18 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417805#comment-13417805
 ] 

Zhenxiao Luo commented on HIVE-3273:


Since the Hadoop classpath is not set correctly in the build-common.xml "test" 
target, tests currently still run OK even though we did not add the avro jars 
to the hive execution classpath.

However, we may get a class-not-found error if the avro jars are not put on the 
hive mapreduce job execution classpath:

Caused by: java.lang.ClassNotFoundException: org.apache.avro.io.DatumReader

> Add avro jars into hive execution classpath
> ---
>
> Key: HIVE-3273
> URL: https://issues.apache.org/jira/browse/HIVE-3273
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
>
> avro*.jar should be added to hive execution classpath





[jira] [Created] (HIVE-3274) Fix the Hadoop classpath in build-common.xml "test" target

2012-07-18 Thread Zhenxiao Luo (JIRA)
Zhenxiao Luo created HIVE-3274:
--

 Summary: Fix the Hadoop classpath in build-common.xml "test" target
 Key: HIVE-3274
 URL: https://issues.apache.org/jira/browse/HIVE-3274
 Project: Hive
  Issue Type: Bug
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo


The Hadoop classpath is set incorrectly in the "test" target in 
build-common.xml. 
This classpath should match the classpath that bin/hadoop would run with.





[jira] [Created] (HIVE-3273) Add avro jars into hive execution classpath

2012-07-18 Thread Zhenxiao Luo (JIRA)
Zhenxiao Luo created HIVE-3273:
--

 Summary: Add avro jars into hive execution classpath
 Key: HIVE-3273
 URL: https://issues.apache.org/jira/browse/HIVE-3273
 Project: Hive
  Issue Type: Bug
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo


avro*.jar should be added to hive execution classpath





[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417757#comment-13417757
 ] 

Rohini Palaniswamy commented on HIVE-3098:
--

Sorry. Ignore 3). That is valid. 

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.
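
To make the proposed cache concrete, a minimal sketch of a UGI cache; the key choice and class shape are illustrative assumptions, not the attached Hive_3098.patch.

{code}
// Illustrative sketch only. Assumes equivalent UGIs can be keyed by user name,
// so repeated requests reuse one UGI (and therefore one cached FileSystem)
// instead of creating new instances each time.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiCacheSketch {
  private final ConcurrentMap<String, UserGroupInformation> cache =
      new ConcurrentHashMap<String, UserGroupInformation>();

  public UserGroupInformation get(String userName) {
    UserGroupInformation ugi = cache.get(userName);
    if (ugi == null) {
      UserGroupInformation newUgi = UserGroupInformation.createRemoteUser(userName);
      // putIfAbsent returns the previously cached value, or null if newUgi was inserted.
      ugi = cache.putIfAbsent(userName, newUgi);
      if (ugi == null) {
        ugi = newUgi;
      }
    }
    return ugi;
  }
}
{code}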





Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #79

2012-07-18 Thread Apache Jenkins Server
See 

--
[...truncated 36573 lines...]
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2012-07-18_15-20-28_993_2812778166127156239/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] Copying file: 

[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2012-07-18_15-20-33_348_9028743055537807573/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2012-07-18_15-20-33_348_9028743055537807573/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POS

[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417753#comment-13417753
 ] 

Rohini Palaniswamy commented on HIVE-3098:
--

Few comments: 

1) Better to have AGE_THRESHOLD configurable instead of a constant.

2) LOG.debug("Cleaning up file-system handles for: " + ugi); - Can be info

3) if (ugi == null) ugi = newUgi; is redundant. There is no way ugi can be null 
after putIfAbsent call.
{code}
ugi = ugiCache.putIfAbsent(key, newUgi);
+   if (ugi == null) // New entry.
+ ugi = newUgi;
{code}

4) Would be good to move UGICache to a separate class.

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.
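
To make the failure mode above concrete: a cache keyed on objects whose equals() compares a 
member by reference identity grows a new entry for every equivalent-but-distinct key. The 
following is a self-contained, hypothetical illustration (KeyByIdentity stands in for a UGI 
whose Subject is compared with ==; it is not Hive or Hadoop code):

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of identity-based equals() defeating a cache,
// mirroring how equivalent UGIs produced new FileSystem instances in FileSystem.CACHE.
public class IdentityKeyCacheDemo {

  static final class KeyByIdentity {
    private final String subject; // stands in for the UGI's Subject member

    KeyByIdentity(String subject) { this.subject = subject; }

    @Override
    public boolean equals(Object o) {
      // Reference comparison of the member, analogous to comparing Subject with ==.
      return o instanceof KeyByIdentity && ((KeyByIdentity) o).subject == this.subject;
    }

    @Override
    public int hashCode() {
      return System.identityHashCode(subject);
    }
  }

  public static void main(String[] args) {
    Map<KeyByIdentity, Object> cache = new HashMap<KeyByIdentity, Object>();
    for (int i = 0; i < 3; i++) {
      // Each key wraps an equivalent but distinct "subject" instance.
      cache.put(new KeyByIdentity(new String("hcat-client")), new Object());
    }
    // Prints 3, not 1: every logically equivalent key produced a fresh entry.
    System.out.println("Cache size: " + cache.size());
  }
}
{code}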

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417728#comment-13417728
 ] 

Ashutosh Chauhan commented on HIVE-3098:


[~mithun] Thanks, Mithun, for addressing the concerns. I will take a look 
shortly. Will it be easy for you to also update the phabricator link?

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3098:
---

Status: Patch Available  (was: Open)

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3098:
---

Attachment: (was: Hive_3098.patch)

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3098:
---

Attachment: Hive_3098.patch

(Re-upload, fixing the ASF license.)

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3098:
---

Attachment: Hive_3098.patch

Here's an extension of the old patch, complete with FileSystem.closeAllForUGI() 
and cache-aging. This should address concerns about cache-cleanup and further 
memory-leaks.
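
A rough, hypothetical sketch of what age-based cleanup around FileSystem.closeAllForUGI() can 
look like (the threshold, entry wrapper, and method names here are illustrative assumptions, 
not the attached patch):

{code}
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical age-based UGI cache cleanup; AGE_THRESHOLD_MS and Entry are illustrative only.
public class UgiCacheCleanupSketch {
  private static final long AGE_THRESHOLD_MS = 60 * 60 * 1000L; // assumed 1 hour

  private static final class Entry {
    final UserGroupInformation ugi;
    final long createdAtMillis = System.currentTimeMillis();
    Entry(UserGroupInformation ugi) { this.ugi = ugi; }
  }

  private final ConcurrentMap<String, Entry> ugiCache =
      new ConcurrentHashMap<String, Entry>();

  /** Drops entries older than the threshold and closes the FileSystems cached for them. */
  public void evictExpired() {
    long now = System.currentTimeMillis();
    for (Iterator<Map.Entry<String, Entry>> it = ugiCache.entrySet().iterator(); it.hasNext();) {
      Map.Entry<String, Entry> e = it.next();
      if (now - e.getValue().createdAtMillis > AGE_THRESHOLD_MS) {
        it.remove();
        try {
          // Releases the FileSystem instances held in FileSystem.CACHE for this UGI.
          FileSystem.closeAllForUGI(e.getValue().ugi);
        } catch (IOException ioe) {
          // Log and continue; one failed close should not abort the whole sweep.
        }
      }
    }
  }
}
{code}

Where the eviction runs (a background thread, or inline on each cache access) is a separate 
choice; the sketch only shows the age check plus the closeAllForUGI() call.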

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: Hive_3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-07-18 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-3098:
---

Attachment: (was: HIVE-3098.patch)

> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> -
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
> 100 instances of FileSystem, whose combined retained-mem consumed the 
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared for equality ("=="), and not 
> equivalence (".equals()"). This causes equivalent UGI instances to compare as 
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another 
> problem (HADOOP-6670); so it is unlikely that that implementation can be 
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory-leak has been arrested.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-3272) RetryingRawStore will perform partial transaction on retry

2012-07-18 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-3272:
---

 Summary: RetryingRawStore will perform partial transaction on retry
 Key: HIVE-3272
 URL: https://issues.apache.org/jira/browse/HIVE-3272
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Priority: Critical


By the time the RetryingRawStore retries a command, the transaction encompassing 
it has already been rolled back. This means that it will perform the remainder 
of the raw store commands outside of a transaction (unless there is another one 
encapsulating it, which is definitely not always the case), and then fail when 
it tries to commit the transaction, as there is none open.
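
Stated generically, the failing shape is a retry loop that wraps a single call while the 
transaction is opened and committed outside it; after a rollback, the retried call and 
everything after it run without a transaction. A minimal, hypothetical sketch of retrying the 
whole open/work/commit unit instead follows; TxnStore and UnitOfWork are made-up types for 
illustration, not Hive's RawStore API:

{code}
// Hypothetical illustration of retrying a whole transactional unit of work,
// rather than retrying one call inside an already-rolled-back transaction.
// TxnStore and UnitOfWork are made-up types, not Hive's RawStore API.
public final class RetryingTxnRunner {

  public interface TxnStore {
    void openTransaction();
    void commitTransaction();
    void rollbackTransaction();
  }

  public interface UnitOfWork<T> {
    T run(TxnStore store) throws Exception;
  }

  // Assumes maxAttempts >= 1.
  public static <T> T runWithRetries(TxnStore store, UnitOfWork<T> work, int maxAttempts)
      throws Exception {
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      store.openTransaction();             // a fresh transaction for every attempt
      try {
        T result = work.run(store);
        store.commitTransaction();
        return result;
      } catch (Exception e) {
        store.rollbackTransaction();       // roll back before deciding whether to retry
        last = e;
      }
    }
    throw last;
  }
}
{code}

The important property is that openTransaction() and commitTransaction() sit inside the 
retried unit, so a retry never runs store calls outside a transaction.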

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues

2012-07-18 Thread Kanna Karanam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kanna Karanam updated HIVE-3126:


Status: Patch Available  (was: Open)

> Generate & build the velocity based Hive tests on windows by fixing the path 
> issues
> ---
>
> Key: HIVE-3126
> URL: https://issues.apache.org/jira/browse/HIVE-3126
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.9.0, 0.10.0, 0.9.1
>Reporter: Kanna Karanam
>Assignee: Kanna Karanam
>  Labels: Windows, test
> Fix For: 0.10.0
>
> Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, 
> HIVE-3126.3.patch.txt, HIVE-3126.4.patch.txt, HIVE-3126.5.patch.txt, 
> HIVE-3126.6.patch.txt, HIVE-3126.7.patch.txt, HIVE-3126.8.patch.txt
>
>
> 1) Escape the backward slash in the canonical path if the unit test runs on 
> Windows.
> 2) Diff comparison – 
>    a. Ignore the extra spacing on Windows
>    b. Ignore the different line endings on Windows & Unix
>    c. Convert the file paths to Windows-specific form (handle spaces etc.)
> 3) Set the right file scheme & class path separators while invoking the junit 
> task from 
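
A small, hypothetical sketch of the kind of normalization points 1 and 2 describe (escaping 
backslashes in a canonical path and smoothing CRLF and trailing-whitespace differences before 
diffing); the class and method names are illustrative, not the patch's:

{code}
import java.io.File;
import java.io.IOException;

// Hypothetical helpers for the normalization steps described above; not the attached patch.
public final class WindowsTestPathUtil {

  /** Escapes backslashes so a Windows canonical path can be embedded in generated test code. */
  public static String escapeForGeneratedSource(File f) throws IOException {
    return f.getCanonicalPath().replace("\\", "\\\\");
  }

  /** Normalizes line endings and trailing whitespace so Windows and Unix outputs diff cleanly. */
  public static String normalizeForDiff(String text) {
    return text.replace("\r\n", "\n").replaceAll("[ \t]+\n", "\n");
  }

  public static void main(String[] args) throws IOException {
    System.out.println(escapeForGeneratedSource(new File(".")));
    System.out.println(normalizeForDiff("a  \r\nb\r\n").equals("a\nb\n")); // prints true
  }
}
{code}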

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues

2012-07-18 Thread Kanna Karanam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kanna Karanam updated HIVE-3126:


Attachment: HIVE-3126.8.patch.txt

Thanks Carl. Updated the diff with some of your suggestions.

> Generate & build the velocity based Hive tests on windows by fixing the path 
> issues
> ---
>
> Key: HIVE-3126
> URL: https://issues.apache.org/jira/browse/HIVE-3126
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.9.0, 0.10.0, 0.9.1
>Reporter: Kanna Karanam
>Assignee: Kanna Karanam
>  Labels: Windows, test
> Fix For: 0.10.0
>
> Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, 
> HIVE-3126.3.patch.txt, HIVE-3126.4.patch.txt, HIVE-3126.5.patch.txt, 
> HIVE-3126.6.patch.txt, HIVE-3126.7.patch.txt, HIVE-3126.8.patch.txt
>
>
> 1) Escape the backward slash in the canonical path if the unit test runs on 
> Windows.
> 2) Diff comparison – 
>    a. Ignore the extra spacing on Windows
>    b. Ignore the different line endings on Windows & Unix
>    c. Convert the file paths to Windows-specific form (handle spaces etc.)
> 3) Set the right file scheme & class path separators while invoking the junit 
> task from 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3025) Fix Hive ARCHIVE command on 0.22 and 0.23

2012-07-18 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417333#comment-13417333
 ] 

Vikram Dixit K commented on HIVE-3025:
--

I am unable to apply the patch on trunk as it says files are missing.

ql/src/test/queries/clientpositive/archive_mr_1806.q
ql/src/test/queries/clientpositive/archive_multi_mr_1806.q
ql/src/test/results/clientpositive/archive_mr_1806.q.out
ql/src/test/results/clientpositive/archive_multi_mr_1806.q.out

Please let me know if I am missing anything. I am trying to apply it on the svn 
repo.

Thanks,
Vikram

> Fix Hive ARCHIVE command on 0.22 and 0.23
> -
>
> Key: HIVE-3025
> URL: https://issues.apache.org/jira/browse/HIVE-3025
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-3025.D3195.1.patch
>
>
> archive.q and archive_multi.q fail when Hive is run on top of Hadoop 0.22 or 
> 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #79

2012-07-18 Thread Apache Jenkins Server
See 


--
[...truncated 10106 lines...]
 [echo] Project: odbc
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/odbc/src/conf
 does not exist.

ivy-resolve-test:
 [echo] Project: odbc

ivy-retrieve-test:
 [echo] Project: odbc

compile-test:
 [echo] Project: odbc

create-dirs:
 [echo] Project: serde
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/serde/src/test/resources
 does not exist.

init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:
 [echo] Project: serde
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-serde-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-serde-default.html

ivy-retrieve:
 [echo] Project: serde

dynamic-serde:

compile:
 [echo] Project: serde

ivy-resolve-test:
 [echo] Project: serde

ivy-retrieve-test:
 [echo] Project: serde

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/serde/test/classes
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/service/src/test/resources
 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/service/test/classes

test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common/java;/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common/jav

Hive-trunk-h0.21 - Build # 1551 - Still Failing

2012-07-18 Thread Apache Jenkins Server
Changes for Build #1549

Changes for Build #1550
[namit] HIVE-3230 Make logging of plan progress in HadoopJobExecHelper 
configurable
(Kevin Wilfong via namit)


Changes for Build #1551



No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1551)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1551/ to 
view the results.

[jira] [Commented] (HIVE-3246) java primitive type for binary datatype should be byte[]

2012-07-18 Thread Travis Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417155#comment-13417155
 ] 

Travis Crawford commented on HIVE-3246:
---

I patched v2 of this patch into a clean trunk and was able to run the query 
that failed in HIVE-3266. It was a simple "select *" from a table using 
ThriftDeserializer that has a binary field.

> java primitive type for binary datatype should be byte[]
> 
>
> Key: HIVE-3246
> URL: https://issues.apache.org/jira/browse/HIVE-3246
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-3246.1.patch, HIVE-3246.2.patch
>
>
> PrimitiveObjectInspector.getPrimitiveJavaObject is supposed to return a java 
> object. But in case of binary datatype, it returns ByteArrayRef (not java 
> standard type). The suitable java object for it would be byte[]. 
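
From a caller's point of view the difference is simple: today the call hands back a wrapper 
that must be cast and unwrapped, whereas a byte[] return is directly usable. The following 
self-contained, hypothetical sketch illustrates just that contrast; ByteArrayRefLike is a 
stand-in, not Hive's ByteArrayRef, and this is not the attached patch:

{code}
// Hypothetical, self-contained illustration of the API difference being discussed:
// returning a wrapper type vs. returning the raw byte[] for a binary value.
public final class BinaryJavaObjectSketch {

  /** Stand-in for ByteArrayRef: a non-standard wrapper around the bytes. */
  static final class ByteArrayRefLike {
    private byte[] data;
    byte[] getData() { return data; }
    void setData(byte[] data) { this.data = data; }
  }

  // Current behavior: callers get an Object that is really the wrapper.
  static Object getPrimitiveJavaObjectOld(ByteArrayRefLike ref) {
    return ref;
  }

  // Proposed behavior: callers get a plain byte[], the natural Java type for binary.
  static byte[] getPrimitiveJavaObjectNew(ByteArrayRefLike ref) {
    return ref.getData();
  }

  public static void main(String[] args) {
    ByteArrayRefLike ref = new ByteArrayRefLike();
    ref.setData(new byte[] {1, 2, 3});

    byte[] viaWrapper = ((ByteArrayRefLike) getPrimitiveJavaObjectOld(ref)).getData(); // cast + unwrap
    byte[] direct = getPrimitiveJavaObjectNew(ref);
    System.out.println(viaWrapper.length == direct.length); // prints true
  }
}
{code}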

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3266) Hive queries over thrift binary fields fail

2012-07-18 Thread Travis Crawford (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Travis Crawford updated HIVE-3266:
--

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking as a duplicate of HIVE-3246. I patched that in and no longer 
experienced the issue.

> Hive queries over thrift binary fields fail
> ---
>
> Key: HIVE-3266
> URL: https://issues.apache.org/jira/browse/HIVE-3266
> Project: Hive
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3266.1.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

2012-07-18 Thread Philip Tromans (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417144#comment-13417144
 ] 

Philip Tromans commented on HIVE-2843:
--

Hi David,

I think that this would be a really useful addition to Hive. I was just about 
to write the same UDAF when I came across yours. I think implode_to_map is a 
good name for it, because as you say, implode has quite a few meanings, but I'm 
interested in what others have to say.

I don't mind preparing / submitting the patch / test cases if you're working on 
other things.

Cheers,

Phil.

> UDAF to convert an aggregation to a map
> ---
>
> Key: HIVE-2843
> URL: https://issues.apache.org/jira/browse/HIVE-2843
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.9.0
>Reporter: David Worms
>Priority: Minor
>  Labels: features, udf
>
> I propose the addition of two new Hive UDAFs to help with maps in Apache Hive. 
> The source code is available on GitHub at https://github.com/wdavidw/hive-udf 
> in two Java classes: "UDAFToMap" and "UDAFToOrderedMap". The first function 
> converts an aggregation into a map and internally uses a Java `HashMap`. 
> The second function extends the first one; it converts an aggregation into an 
> ordered map and internally uses a Java `TreeMap`. They both extend the 
> `AbstractGenericUDAFResolver` class.
> Also, I have covered the motivations and usage of these UDAFs in a blog post 
> at http://adaltas.com/blog/2012/03/06/hive-udaf-map-conversion/
> If you are interested in my proposal, I'll take the time to update this issue 
> while following the guidelines posted on the wiki to create an appropriate 
> patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3271) Previledge can be granted by any user(not owner) to any user(even to the same user)

2012-07-18 Thread Unnikrishnan V T (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417063#comment-13417063
 ] 

Unnikrishnan V T commented on HIVE-3271:


Another user who is not a supergroup member can also view the table.

> Previledge can be granted by any user(not owner) to any user(even to the same 
> user)
> ---
>
> Key: HIVE-3271
> URL: https://issues.apache.org/jira/browse/HIVE-3271
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.8.1
>Reporter: Unnikrishnan V T
>Priority: Critical
> Attachments: Screenshot.png
>
>
> I have created two users, 'unni' and 'sachin'. User unni created a table 
> 'test3' so that user sachin cannot view that table. But user sachin is able 
> to grant all permissions on the table test3.
> I have set 
> 1) hive.security.authorization.enabled=true (in hive-site.xml)
> 2) dfs.permissions=true (in hdfs-site.xml)
> 3) dfs.permissions.supergroup=supergroup (in hdfs-site.xml)
> Users sachin and unni are in the supergroup group.
> User sachin is even able to revoke all permissions from the owner of the 
> table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3271) Previledge can be granted by any user(not owner) to any user(even to the same user)

2012-07-18 Thread Unnikrishnan V T (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Unnikrishnan V T updated HIVE-3271:
---

Attachment: Screenshot.png

> Previledge can be granted by any user(not owner) to any user(even to the same 
> user)
> ---
>
> Key: HIVE-3271
> URL: https://issues.apache.org/jira/browse/HIVE-3271
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.8.1
>Reporter: Unnikrishnan V T
>Priority: Critical
> Attachments: Screenshot.png
>
>
> I have created two users, 'unni' and 'sachin'. User unni created a table 
> 'test3' so that user sachin cannot view that table. But user sachin is able 
> to grant all permissions on the table test3.
> I have set 
> 1) hive.security.authorization.enabled=true (in hive-site.xml)
> 2) dfs.permissions=true (in hdfs-site.xml)
> 3) dfs.permissions.supergroup=supergroup (in hdfs-site.xml)
> Users sachin and unni are in the supergroup group.
> User sachin is even able to revoke all permissions from the owner of the 
> table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-3271) Previledge can be granted by any user(not owner) to any user(even to the same user)

2012-07-18 Thread Unnikrishnan V T (JIRA)
Unnikrishnan V T created HIVE-3271:
--

 Summary: Previledge can be granted by any user(not owner) to any 
user(even to the same user)
 Key: HIVE-3271
 URL: https://issues.apache.org/jira/browse/HIVE-3271
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.8.1
Reporter: Unnikrishnan V T
Priority: Critical


I have created two users, 'unni' and 'sachin'. User unni created a table 
'test3' so that user sachin cannot view that table. But user sachin is able 
to grant all permissions on the table test3.
I have set 
1) hive.security.authorization.enabled=true (in hive-site.xml)
2) dfs.permissions=true (in hdfs-site.xml)
3) dfs.permissions.supergroup=supergroup (in hdfs-site.xml)
Users sachin and unni are in the supergroup group.
User sachin is even able to revoke all permissions from the owner of the 
table, user unni.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3230) Make logging of plan progress in HadoopJobExecHelper configurable

2012-07-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417021#comment-13417021
 ] 

Hudson commented on HIVE-3230:
--

Integrated in Hive-trunk-h0.21 #1550 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1550/])
HIVE-3230 Make logging of plan progress in HadoopJobExecHelper configurable
(Kevin Wilfong via namit) (Revision 1362782)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1362782
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java


> Make logging of plan progress in HadoopJobExecHelper configurable
> -
>
> Key: HIVE-3230
> URL: https://issues.apache.org/jira/browse/HIVE-3230
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 0.10.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3230.1.patch.txt, HIVE-3230.2.patch.txt, 
> HIVE-3230.3.patch.txt, HIVE-3230.4.patch.txt
>
>
> Currently, by default, every second a job is run, a massive JSON string 
> containing the query plan, the tasks, and some counters is logged to the 
> hive_job_log. For large, long-running jobs, that can easily reach gigabytes 
> of data. This logging should be configurable, as the average user doesn't 
> need it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Hive-trunk-h0.21 - Build # 1550 - Still Failing

2012-07-18 Thread Apache Jenkins Server
Changes for Build #1549

Changes for Build #1550
[namit] HIVE-3230 Make logging of plan progress in HadoopJobExecHelper 
configurable
(Kevin Wilfong via namit)




1 tests failed.
REGRESSION:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... 
-Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
more logs.
at junit.framework.Assert.fail(Assert.java:50)
at 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:10644)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build 
#$BUILD_NUMBER)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1550/ to 
view the results.

[jira] [Commented] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE

2012-07-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416992#comment-13416992
 ] 

Navis commented on HIVE-3205:
-

@Namit Jain, commented

> Bucketed mapjoin on partitioned table which has no partition throws NPE
> ---
>
> Key: HIVE-3205
> URL: https://issues.apache.org/jira/browse/HIVE-3205
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.04
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>
> {code}
> create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> set hive.optimize.bucketmapjoin = true;
> set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> explain
> SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2
> FROM hive_test_smb_bucket1 a JOIN
> hive_test_smb_bucket2 b
> ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and  b.key 
> IS NOT NULL;
> {code}
> throws NPE
> {noformat}
> 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - 
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100)
>   at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3252) Add environment context to metastore Thrift calls

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416990#comment-13416990
 ] 

Namit Jain commented on HIVE-3252:
--

@John, see comments on phabricator.

> Add environment context to metastore Thrift calls
> -
>
> Key: HIVE-3252
> URL: https://issues.apache.org/jira/browse/HIVE-3252
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: John Reese
>Assignee: John Reese
>Priority: Minor
>
> Currently in the Hive Thrift metastore API create_table, add_partition, 
> alter_table, alter_partition have with_environment_context analogs.  It would 
> be really useful to add similar methods from drop_partition, drop_table, and 
> append_partition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3244) Add table property which constraints sorting/bucketing for data loading

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416986#comment-13416986
 ] 

Namit Jain commented on HIVE-3244:
--

Let us use strict mode for that.
Adding more and more properties may be more confusing.

> Add table property which constraints sorting/bucketing for data loading
> ---
>
> Key: HIVE-3244
> URL: https://issues.apache.org/jira/browse/HIVE-3244
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.10
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>
> This ticket is intended to implement "INSERT INTO" for bucketed tables.
> With the hive.enforce.bucketing option, a user can append data to a bucketed 
> table. But the current implementation depends on the lexical order of file 
> names for determining a file's bucket number, which is not always correct.
> So if the file name is suffixed with the bucket number when inserting (moving), 
> it can be recovered correctly when needed, such as in BucketMapJoinOptimizer.
> With simple prototype codes, which will be attached after writing this, the 
> test query
> {noformat}
> create table bucket_test (key int, value string) clustered by (key) sorted by 
> (key) into 4 buckets TBLPROPERTIES
> ('FORCEDBUCKETING'='TRUE', 'FORCEDSORTING'='TRUE');
> set hive.optimize.bucketmapjoin = true;
> insert into table bucket_test select key, value from src1;
> explain extended select /*+MAPJOIN(b)*/ * from bucket_test a join bucket_test 
> b on a.key=b.key;
> insert into table bucket_test select key, value from src1;
> explain extended select /*+MAPJOIN(b)*/ * from bucket_test a join bucket_test 
> b on a.key=b.key;
> {noformat}
> resulted as below
> {noformat}
> 1. first plan
>  b {00_0_[0]=[00_0_[0]], 01_0_[1]=[01_0_[1]], 
> 02_0_[2]=[02_0_[2]], 03_0_[3]=[03_0_[3]]}
> 2. second plan
>  b {00_0_[0]=[00_0_[0], 00_0_copy_1_[0]], 
> 00_0_copy_1_[0]=[00_0_[0], 00_0_copy_1_[0]], 
> 01_0_[1]=[01_0_[1], 01_0_copy_1_[1]], 
> 01_0_copy_1_[1]=[01_0_[1], 01_0_copy_1_[1]], 
> 02_0_[2]=[02_0_[2], 02_0_copy_1_[2]], 
> 02_0_copy_1_[2]=[02_0_[2], 02_0_copy_1_[2]], 
> 03_0_[3]=[03_0_[3], 03_0_copy_1_[3]], 
> 03_0_copy_1_[3]=[03_0_[3], 03_0_copy_1_[3]]}
> {noformat}
> Currently, I've prevented direct loading via 'LOAD DATA' for forced bucket 
> table. But with proper name validation, that could be allowed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416984#comment-13416984
 ] 

Namit Jain commented on HIVE-3205:
--

@Navis, comments


> Bucketed mapjoin on partitioned table which has no partition throws NPE
> ---
>
> Key: HIVE-3205
> URL: https://issues.apache.org/jira/browse/HIVE-3205
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.04
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>
> {code}
> create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
> string) clustered by (key) sorted by (key) into 2 buckets;
> set hive.optimize.bucketmapjoin = true;
> set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> explain
> SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2
> FROM hive_test_smb_bucket1 a JOIN
> hive_test_smb_bucket2 b
> ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and  b.key 
> IS NOT NULL;
> {code}
> throws NPE
> {noformat}
> 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - 
> FAILED: NullPointerException null
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
>   at 
> org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100)
>   at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3269) selectlist needs to be superset of the cluster list

2012-07-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416979#comment-13416979
 ] 

Navis commented on HIVE-3269:
-

Yes, adding a dummy column was not hard, but the alias problem was quite difficult.

> selectlist needs to be superset of the cluster list
> ---
>
> Key: HIVE-3269
> URL: https://issues.apache.org/jira/browse/HIVE-3269
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>
> The query:
> create table T (key1 string, key2 string, key3 string);
> select key1, key2 from T cluster by key1, key2, key3;
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3269) selectlist needs to be superset of the cluster list

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416978#comment-13416978
 ] 

Namit Jain commented on HIVE-3269:
--

Won't it be good to take a similar approach to group by?
I mean, have a dummy select first, and then the cluster by/order by can resolve.

> selectlist needs to be superset of the cluster list
> ---
>
> Key: HIVE-3269
> URL: https://issues.apache.org/jira/browse/HIVE-3269
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>
> The query:
> create table T (key1 string, key2 string, key3 string);
> select key1, key2 from T cluster by key1, key2, key3;
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Comment Edited] (HIVE-3269) selectlist needs to be superset of the cluster list

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416978#comment-13416978
 ] 

Namit Jain edited comment on HIVE-3269 at 7/18/12 10:27 AM:


Won't it be good to take a similar approach to group by?
I mean, have a dummy select first, and then the cluster by/order by can resolve.

But then, it would be difficult to support column aliases.

  was (Author: namit):
Won't it be good to take a similar approach to group by?
I mean, have a dummy select first, and then the cluster by/order by can resolve.
  
> selectlist needs to be superset of the cluster list
> ---
>
> Key: HIVE-3269
> URL: https://issues.apache.org/jira/browse/HIVE-3269
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>
> The query:
> create table T (key1 string, key2 string, key3 string);
> select key1, key2 from T cluster by key1, key2, key3;
> fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-3270) SMBJoin should be applied atomically

2012-07-18 Thread Navis (JIRA)
Navis created HIVE-3270:
---

 Summary: SMBJoin should be applied atomically 
 Key: HIVE-3270
 URL: https://issues.apache.org/jira/browse/HIVE-3270
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Priority: Minor


To use smbjoin, the mapjoin hint (/*+MAPJOIN(..)*/) and the bucket mapjoin 
configuration (hive.optimize.bucketmapjoin) must both be set beforehand.

But when BucketMapJoinOptimizer or SortedMergeBucketMapJoinOptimizer fails to 
convert the mapjoin to an smbjoin for some reason, it will be executed as a 
(bucket) mapjoin, which can possibly cause an OOM in the map task.

I think there should be a hint for SMBJoin which tries to convert the common-join 
to an smbjoin and, when that fails, simply falls back to the common-join.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-3269) selectlist needs to be superset of the cluster list

2012-07-18 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3269:


 Summary: selectlist needs to be superset of the cluster list
 Key: HIVE-3269
 URL: https://issues.apache.org/jira/browse/HIVE-3269
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain


The query:

create table T (key1 string, key2 string, key3 string);
select key1, key2 from T cluster by key1, key2, key3;

fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3244) Add table property which constraints sorting/bucketing for data loading

2012-07-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3244:


Status: Patch Available  (was: Open)

Rebased on trunk.

@Namit Jain
This is not backward-compatible functionality, so making this the default 
behavior seemed to introduce problems, IMHO.
I'll change it if you insist.

> Add table property which constraints sorting/bucketing for data loading
> ---
>
> Key: HIVE-3244
> URL: https://issues.apache.org/jira/browse/HIVE-3244
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.10
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>
> This ticket is intended to implement "INSERT INTO" for bucketed tables.
> With the hive.enforce.bucketing option, a user can append data to a bucketed 
> table. But the current implementation depends on the lexical order of file 
> names for determining a file's bucket number, which is not always correct.
> So if the file name is suffixed with the bucket number when inserting (moving), 
> it can be recovered correctly when needed, such as in BucketMapJoinOptimizer.
> With simple prototype codes, which will be attached after writing this, the 
> test query
> {noformat}
> create table bucket_test (key int, value string) clustered by (key) sorted by 
> (key) into 4 buckets TBLPROPERTIES
> ('FORCEDBUCKETING'='TRUE', 'FORCEDSORTING'='TRUE');
> set hive.optimize.bucketmapjoin = true;
> insert into table bucket_test select key, value from src1;
> explain extended select /*+MAPJOIN(b)*/ * from bucket_test a join bucket_test 
> b on a.key=b.key;
> insert into table bucket_test select key, value from src1;
> explain extended select /*+MAPJOIN(b)*/ * from bucket_test a join bucket_test 
> b on a.key=b.key;
> {noformat}
> resulted as below
> {noformat}
> 1. first plan
>  b {00_0_[0]=[00_0_[0]], 01_0_[1]=[01_0_[1]], 
> 02_0_[2]=[02_0_[2]], 03_0_[3]=[03_0_[3]]}
> 2. second plan
>  b {00_0_[0]=[00_0_[0], 00_0_copy_1_[0]], 
> 00_0_copy_1_[0]=[00_0_[0], 00_0_copy_1_[0]], 
> 01_0_[1]=[01_0_[1], 01_0_copy_1_[1]], 
> 01_0_copy_1_[1]=[01_0_[1], 01_0_copy_1_[1]], 
> 02_0_[2]=[02_0_[2], 02_0_copy_1_[2]], 
> 02_0_copy_1_[2]=[02_0_[2], 02_0_copy_1_[2]], 
> 03_0_[3]=[03_0_[3], 03_0_copy_1_[3]], 
> 03_0_copy_1_[3]=[03_0_[3], 03_0_copy_1_[3]]}
> {noformat}
> Currently, I've prevented direct loading via 'LOAD DATA' for forced bucket 
> table. But with proper name validation, that could be allowed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-3268) expressions in cluster by are not working

2012-07-18 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3268:


 Summary: expressions in cluster by are not working
 Key: HIVE-3268
 URL: https://issues.apache.org/jira/browse/HIVE-3268
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain


The following query fails:

select key+key, value from src cluster by key+key, value;


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3267) escaped columns in cluster by are not working

2012-07-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416917#comment-13416917
 ] 

Namit Jain commented on HIVE-3267:
--

https://reviews.facebook.net/D4203

> escaped columns in cluster by are not working
> -
>
> Key: HIVE-3267
> URL: https://issues.apache.org/jira/browse/HIVE-3267
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>
> The following query:
> select `key`, value from src cluster by `key`, value;
> fails

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira