[jira] [Updated] (HIVE-6564) WebHCat E2E tests that launch MR jobs fail on check job completion timeout

2014-03-06 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-6564:
-

Attachment: HIVE-6564.patch

Attaching a patch that deals with JSON module incompatibility.

 WebHCat E2E tests that launch MR jobs fail on check job completion timeout
 --

 Key: HIVE-6564
 URL: https://issues.apache.org/jira/browse/HIVE-6564
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6564.patch


 WebHCat E2E tests that fire off an MR job are not correctly detected as 
 complete, so those tests time out.
 The problem is caused by the JSON module available through CPAN, which 
 returns 1 or 0 instead of true or false.
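The incompatibility above can be tolerated on the harness side by accepting both spellings of a boolean. A minimal Python sketch of the idea (hypothetical helper, not the actual Perl e2e code):

```python
def is_complete(value):
    """Treat JSON booleans leniently: some JSON libraries (like the CPAN
    JSON module mentioned above) serialize true/false as 1/0."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return value.lower() in ("true", "1")
    return False

# Both spellings of a completed job status pass the check:
print(is_complete(True), is_complete(1), is_complete("true"))  # True True True
print(is_complete(0), is_complete("false"))                    # False False
```

A check written this way keeps working whether the parser emits native booleans or 1/0.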



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6564) WebHCat E2E tests that launch MR jobs fail on check job completion timeout

2014-03-06 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-6564:
-

Status: Patch Available  (was: Open)

 WebHCat E2E tests that launch MR jobs fail on check job completion timeout
 --

 Key: HIVE-6564
 URL: https://issues.apache.org/jira/browse/HIVE-6564
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6564.patch


 WebHCat E2E tests that fire off an MR job are not correctly detected as 
 complete, so those tests time out.
 The problem is caused by the JSON module available through CPAN, which 
 returns 1 or 0 instead of true or false.





[jira] [Updated] (HIVE-6564) WebHCat E2E tests that launch MR jobs fail on check job completion timeout

2014-03-06 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-6564:
-

Description: 
WebHCat E2E tests that fire off an MR job are not correctly detected as 
complete, so those tests time out.
The problem is caused by the JSON module available through CPAN, which 
returns 1 or 0 instead of true or false.
NO PRECOMMIT TESTS


  was:
WebHCat E2E tests that fire off an MR job are not correctly detected as 
complete, so those tests time out.
The problem is caused by the JSON module available through CPAN, which 
returns 1 or 0 instead of true or false.



 WebHCat E2E tests that launch MR jobs fail on check job completion timeout
 --

 Key: HIVE-6564
 URL: https://issues.apache.org/jira/browse/HIVE-6564
 Project: Hive
  Issue Type: Bug
  Components: Tests, WebHCat
Affects Versions: 0.13.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Attachments: HIVE-6564.patch


 WebHCat E2E tests that fire off an MR job are not correctly detected as 
 complete, so those tests time out.
 The problem is caused by the JSON module available through CPAN, which 
 returns 1 or 0 instead of true or false.
 NO PRECOMMIT TESTS





[jira] [Created] (HIVE-6565) OrcSerde should be added as NativeSerDe in SerDeUtils

2014-03-06 Thread Branky Shao (JIRA)
Branky Shao created HIVE-6565:
-

 Summary: OrcSerde should be added as NativeSerDe in SerDeUtils
 Key: HIVE-6565
 URL: https://issues.apache.org/jira/browse/HIVE-6565
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.12.0
Reporter: Branky Shao


If the table is defined in ORC format, the column info can be fetched from the 
StorageDescriptor; there is no need to get it from the SerDe.

And since ORC is obviously one of Hive's native file formats, OrcSerde should 
be added as a NativeSerDe in SerDeUtils.

The fix is fairly simple: add a single line in SerDeUtils:

nativeSerDeNames.add(org.apache.hadoop.hive.ql.io.orc.OrcSerde.class.getName());
 





[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

Attachment: HIVE-6430.patch

The new code probably has tons of bugs, but the old tests I ran have passed, 
so let's try HiveQA. I will run the Tez tests.

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 
 for the row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need a Java hash table there.  We can either use a primitive-friendly 
 hash table like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row-storage structure without an object per row 
 (similar to vectorization).
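The structure described above can be sketched as an open-addressing map from a primitive key to an offset into a flat row store, with no Entry object per row. A toy Python illustration of the HPPC-style layout (hypothetical structure, not Hive's actual implementation; no resizing, for brevity):

```python
from array import array

class LongToOffsetMap:
    """Open-addressing hash map: long key -> row-store offset, held in two
    flat primitive arrays so there is no per-entry object (HPPC-style)."""
    EMPTY = -1

    def __init__(self, capacity=16):
        self.keys = array('q', [0] * capacity)
        self.offsets = array('q', [self.EMPTY] * capacity)

    def put(self, key, offset):
        i = hash(key) % len(self.keys)
        # Linear probing; toy version assumes the table never fills up.
        while self.offsets[i] != self.EMPTY and self.keys[i] != key:
            i = (i + 1) % len(self.keys)
        self.keys[i] = key
        self.offsets[i] = offset

    def get(self, key):
        i = hash(key) % len(self.keys)
        while self.offsets[i] != self.EMPTY:
            if self.keys[i] == key:
                return self.offsets[i]
            i = (i + 1) % len(self.keys)
        return self.EMPTY

m = LongToOffsetMap()
m.put(42, 0)      # key 42 -> row bytes starting at offset 0
m.put(7, 128)     # key 7  -> row bytes starting at offset 128
print(m.get(42), m.get(7), m.get(99))  # 0 128 -1
```

The point of the layout is that each entry costs two machine words rather than a boxed key, a boxed value, and an Entry object with header and pointers.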





[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

Status: Patch Available  (was: Open)

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 
 for the row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need a Java hash table there.  We can either use a primitive-friendly 
 hash table like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row-storage structure without an object per row 
 (similar to vectorization).





[jira] [Created] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-6566:


 Summary: Incorrect union-all plan with map-joins on Tez
 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner


The tez dag is hooked up incorrectly for some union all queries involving map 
joins.





[jira] [Commented] (HIVE-887) Allow SELECT col without a mapreduce job

2014-03-06 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922156#comment-13922156
 ] 

Navis commented on HIVE-887:


We should make more the default for hive.fetch.task.conversion, someday.
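For reference, the setting in question can already be flipped per session; a minimal HiveQL sketch (table and column names are placeholders):

```sql
-- With fetch-task conversion enabled, simple projections are served by a
-- fetch task instead of a MapReduce job:
set hive.fetch.task.conversion=more;
SELECT col FROM some_table LIMIT 20;
```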

 Allow SELECT col without a mapreduce job
 --

 Key: HIVE-887
 URL: https://issues.apache.org/jira/browse/HIVE-887
 Project: Hive
  Issue Type: New Feature
 Environment: All
Reporter: Eric Sun
Assignee: Ning Zhang
 Fix For: 0.10.0


 I often find myself needing to take a quick look at a particular column of a 
 Hive table.
 I usually do this by doing a 
 SELECT * from table LIMIT 20;
 from the CLI.  Doing this is pretty fast since it doesn't require a mapreduce 
 job.  However, it's tough to examine just 1 or 2 columns when the table is 
 very wide.
 So, I might do
 SELECT col from table LIMIT 20;
 but it's much slower since it requires a map-reduce.  It'd be really 
 convenient if a map-reduce wasn't necessary.
 Currently a good workaround is to do
 hive -e "select * from table" | cut --key=n
 but it'd be more convenient if it were built in, since it alleviates the need 
 for column counting.





[jira] [Updated] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6566:
-

Attachment: HIVE-6566.1.patch

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.





[jira] [Updated] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6566:
-

Status: Patch Available  (was: Open)

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.





[jira] [Commented] (HIVE-6332) HCatConstants Documentation needed

2014-03-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922159#comment-13922159
 ] 

Lefty Leverenz commented on HIVE-6332:
--

Looks good overall.  Of course I have some editorial nits, but they shouldn't 
clutter up this jira.  One typo you could fix now:  
hcat.dynamic.partitioning.custom.patttern (triple t).

An introduction would be helpful, mentioning the HCatConstants.java file and 
explaining basic usage.

Why are cache parameters hcatalog.hive.xxx while all other parameters are 
hcat.xxx?  (I'm asking about hcat vs. hcatalog, not the hive part.)

This sentence in the first section confuses me:  "An override to specify where 
HCatStorer will write to, defined from pig jobs, either directly by user, or by 
using org.apache.hive.hcatalog.pig.HCatStorerWrapper."  Does it mean that Pig 
jobs specify hcat.pig.storer.external.location?  Could you give examples of 
specifying by user and by HCatStorerWrapper?

In the Data Promotion section, this sentence seems a bit off:  "On the write 
side, it is expected that the user pass in valid HCatRecords with data 
correctly."  Does that mean with data correctly typed for Hive?

That's it for my first pass.  I'll take another look later.

 HCatConstants Documentation needed
 --

 Key: HIVE-6332
 URL: https://issues.apache.org/jira/browse/HIVE-6332
 Project: Hive
  Issue Type: Task
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan

 HCatConstants documentation is near non-existent, defined only as 
 comments in code for the various parameters. Given that a lot of the API winds 
 up being exposed as knobs that can be tweaked here, we should have a 
 public-facing doc for this.





[jira] [Commented] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes

2014-03-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922295#comment-13922295
 ] 

Hive QA commented on HIVE-6551:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12632789/HIVE-6551.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5358 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket5
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1637/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1637/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12632789

 group by after join with skew join optimization references invalid task 
 sometimes
 -

 Key: HIVE-6551
 URL: https://issues.apache.org/jira/browse/HIVE-6551
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6551.1.patch.txt


 For example,
 {noformat}
 hive> set hive.auto.convert.join = true;
 hive> set hive.optimize.skewjoin = true;
 hive> set hive.skewjoin.key = 3;
 hive> 
  EXPLAIN FROM 
  (SELECT src.* FROM src) x
  JOIN 
  (SELECT src.* FROM src) Y
  ON (x.key = Y.key)
  SELECT sum(hash(Y.key)), sum(hash(Y.value));
 OK
 STAGE DEPENDENCIES:
   Stage-8 is a root stage
   Stage-6 depends on stages: Stage-8
   Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2
   Stage-4
   Stage-2 depends on stages: Stage-4, Stage-1
   Stage-0 is a root stage
 ...
 {noformat}
 Stage-2 references the non-existent Stage-1





[jira] [Commented] (HIVE-6508) Mismatched results between vector and non-vector mode with decimal field

2014-03-06 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922304#comment-13922304
 ] 

Remus Rusanu commented on HIVE-6508:


The value 0 comes in the input vector unscaled (scale 0). As aggregates (SUM, 
STDxx) are updated, they use the scale of the input value, not the scale of the 
input column. So any 0 in the input drops the fractional part of the 
intermediate result, and the final result is off. AVG uses a special scale so 
it is not affected. MIN/MAX use the input value scale, but that has no side 
effects. The fix is to pass in the column scale explicitly, rather than assume 
the input value's scale matches the column's scale. Ultimately the behavior of 
passing in unscaled 0s is wrong, but this comes from the row-mode join modus 
operandi and I don't want to change that. Hardening the aggregates against this 
case is more robust.
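The failure mode can be illustrated with Python's decimal module: if the accumulator adopts each input value's scale instead of the column's declared scale (decimal(7,2) here), an unscaled 0 truncates the fractional part of the running sum. A hedged sketch of the effect, not Hive's actual aggregate code:

```python
from decimal import Decimal

def buggy_sum(values):
    # Bug analogue: rescale the accumulator to each *value's* scale,
    # mirroring the aggregate taking its scale from the input value.
    acc = Decimal(0)
    for v in values:
        acc = (acc + v).quantize(v)        # adopt v's scale
    return acc

def fixed_sum(values, column_scale=2):
    # Fix analogue: always keep the accumulator at the column's scale.
    q = Decimal(1).scaleb(-column_scale)   # Decimal('0.01') for scale 2
    acc = Decimal(0)
    for v in values:
        acc = (acc + v).quantize(q)
    return acc

vals = [Decimal('1.25'), Decimal('0'), Decimal('2.50')]  # the 0 is unscaled
print(buggy_sum(vals))   # 3.50 -- the unscaled 0 dropped the .25
print(fixed_sum(vals))   # 3.75 -- correct
```

The moment the unscaled 0 arrives, the buggy accumulator is rounded to scale 0 and the fractional cents are gone for good.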

 Mismatched results between vector and non-vector mode with decimal field
 

 Key: HIVE-6508
 URL: https://issues.apache.org/jira/browse/HIVE-6508
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu

 The following query has a small mismatch in its result compared to non-vector 
 mode.
 {code}
 select d_year, i_brand_id, i_brand,
sum(ss_ext_sales_price) as sum_agg
 from date_dim
 join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
 join item on store_sales.ss_item_sk = item.i_item_sk
 where i_manufact_id = 128
   and d_moy = 11
 group by d_year, i_brand, i_brand_id
 order by d_year, sum_agg desc, i_brand_id
 limit 100;
 {code}
 This query is on tpcds data.
 The field ss_ext_sales_price is of type decimal(7,2) and everything else is 
 an integer.





[jira] [Updated] (HIVE-6508) Mismatched results between vector and non-vector mode with decimal field

2014-03-06 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6508:
---

Status: Patch Available  (was: Open)

 Mismatched results between vector and non-vector mode with decimal field
 

 Key: HIVE-6508
 URL: https://issues.apache.org/jira/browse/HIVE-6508
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Attachments: HIVE-6508.1.patch


 The following query has a small mismatch in its result compared to non-vector 
 mode.
 {code}
 select d_year, i_brand_id, i_brand,
sum(ss_ext_sales_price) as sum_agg
 from date_dim
 join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
 join item on store_sales.ss_item_sk = item.i_item_sk
 where i_manufact_id = 128
   and d_moy = 11
 group by d_year, i_brand, i_brand_id
 order by d_year, sum_agg desc, i_brand_id
 limit 100;
 {code}
 This query is on tpcds data.
 The field ss_ext_sales_price is of type decimal(7,2) and everything else is 
 an integer.





[jira] [Updated] (HIVE-6508) Mismatched results between vector and non-vector mode with decimal field

2014-03-06 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6508:
---

Attachment: HIVE-6508.1.patch

 Mismatched results between vector and non-vector mode with decimal field
 

 Key: HIVE-6508
 URL: https://issues.apache.org/jira/browse/HIVE-6508
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Attachments: HIVE-6508.1.patch


 The following query has a small mismatch in its result compared to non-vector 
 mode.
 {code}
 select d_year, i_brand_id, i_brand,
sum(ss_ext_sales_price) as sum_agg
 from date_dim
 join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
 join item on store_sales.ss_item_sk = item.i_item_sk
 where i_manufact_id = 128
   and d_moy = 11
 group by d_year, i_brand, i_brand_id
 order by d_year, sum_agg desc, i_brand_id
 limit 100;
 {code}
 This query is on tpcds data.
 The field ss_ext_sales_price is of type decimal(7,2) and everything else is 
 an integer.





[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files

2014-03-06 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-5998:
---

Status: Open  (was: Patch Available)

 Add vectorized reader for Parquet files
 ---

 Key: HIVE-5998
 URL: https://issues.apache.org/jira/browse/HIVE-5998
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: Parquet, vectorization
 Attachments: HIVE-5998.1.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, 
 HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, 
 HIVE-5998.8.patch


 HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar 
 format, it makes sense to provide a vectorized reader, similar to how RC and 
 ORC formats have, to benefit from vectorized execution engine.





[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files

2014-03-06 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-5998:
---

Status: Patch Available  (was: Open)

Looks like Jenkins lost its queue; resubmitting.

 Add vectorized reader for Parquet files
 ---

 Key: HIVE-5998
 URL: https://issues.apache.org/jira/browse/HIVE-5998
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: Parquet, vectorization
 Attachments: HIVE-5998.1.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, 
 HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, 
 HIVE-5998.8.patch, HIVE-5998.9.patch


 HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar 
 format, it makes sense to provide a vectorized reader, similar to how RC and 
 ORC formats have, to benefit from vectorized execution engine.





[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files

2014-03-06 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-5998:
---

Attachment: HIVE-5998.9.patch

.8 resubmitted

 Add vectorized reader for Parquet files
 ---

 Key: HIVE-5998
 URL: https://issues.apache.org/jira/browse/HIVE-5998
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: Parquet, vectorization
 Attachments: HIVE-5998.1.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, 
 HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, 
 HIVE-5998.8.patch, HIVE-5998.9.patch


 HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar 
 format, it makes sense to provide a vectorized reader, similar to how RC and 
 ORC formats have, to benefit from vectorized execution engine.





[jira] [Updated] (HIVE-6536) Reduce dependencies of org.apache.hive:hive-jdbc maven module

2014-03-06 Thread Kevin Minder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Minder updated HIVE-6536:
---

Attachment: hive-hdbc-maven-dependencies-0-13.log

Here is the tree for 0.13.0.
Note that this appears to be missing 
org.apache.hadoop:hadoop-mapreduce-client-core, which seems to be required to 
use the driver.
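A common way for a consumer to keep such transitive bloat out while the module is being cleaned up is a POM exclusion. A hypothetical sketch (the exact artifacts to exclude would come from the attached dependency:tree output):

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>0.12.0</version>
  <exclusions>
    <!-- Hypothetical: drop the CLI module pulled in transitively -->
    <exclusion>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-cli</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```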

 Reduce dependencies of org.apache.hive:hive-jdbc maven module
 -

 Key: HIVE-6536
 URL: https://issues.apache.org/jira/browse/HIVE-6536
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0
 Environment: org.apache.hive:hive-jdbc:jar:0.12.0
Reporter: Kevin Minder
 Attachments: hive-hdbc-maven-dependencies-0-13.log, 
 hive-jdbc-maven-dependencies.log


 The Hive JDBC driver maven module requires a significant number of 
 dependencies that are likely unnecessary and will result in bloat for 
 consumers.  Most of this is a result of the dependency on 
 org.apache.hive:hive-cli.  I have attached a portion of the mvn 
 dependency:tree output for a client that depends on the 
 org.apache.hive:hive-jdbc module.  Note that the extra 2.0.6.1-102 in the 
 output is the result of our local build and publish to a local nexus repo.





[jira] [Updated] (HIVE-6536) Reduce dependencies of org.apache.hive:hive-jdbc maven module

2014-03-06 Thread Kevin Minder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Minder updated HIVE-6536:
---

Attachment: (was: hive-hdbc-maven-dependencies-0-13.log)

 Reduce dependencies of org.apache.hive:hive-jdbc maven module
 -

 Key: HIVE-6536
 URL: https://issues.apache.org/jira/browse/HIVE-6536
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0
 Environment: org.apache.hive:hive-jdbc:jar:0.12.0
Reporter: Kevin Minder
 Attachments: hive-jdbc-maven-dependencies-0-13.log, 
 hive-jdbc-maven-dependencies.log


 The Hive JDBC driver maven module requires a significant number of 
 dependencies that are likely unnecessary and will result in bloat for 
 consumers.  Most of this is a result of the dependency on 
 org.apache.hive:hive-cli.  I have attached a portion of the mvn 
 dependency:tree output for a client that depends on the 
 org.apache.hive:hive-jdbc module.  Note that the extra 2.0.6.1-102 in the 
 output is the result of our local build and publish to a local nexus repo.





[jira] [Updated] (HIVE-6536) Reduce dependencies of org.apache.hive:hive-jdbc maven module

2014-03-06 Thread Kevin Minder (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Minder updated HIVE-6536:
---

Attachment: hive-jdbc-maven-dependencies-0-13.log

 Reduce dependencies of org.apache.hive:hive-jdbc maven module
 -

 Key: HIVE-6536
 URL: https://issues.apache.org/jira/browse/HIVE-6536
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0
 Environment: org.apache.hive:hive-jdbc:jar:0.12.0
Reporter: Kevin Minder
 Attachments: hive-jdbc-maven-dependencies-0-13.log, 
 hive-jdbc-maven-dependencies.log


 The Hive JDBC driver maven module requires a significant number of 
 dependencies that are likely unnecessary and will result in bloat for 
 consumers.  Most of this is a result of the dependency on 
 org.apache.hive:hive-cli.  I have attached a portion of the mvn 
 dependency:tree output for a client that depends on the 
 org.apache.hive:hive-jdbc module.  Note that the extra 2.0.6.1-102 in the 
 output is the result of our local build and publish to a local nexus repo.





Re: Review Request 15873: Query cancel should stop running MR tasks

2014-03-06 Thread Thejas Nair


 On Feb. 27, 2014, 11:08 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java, line 110
  https://reviews.apache.org/r/15873/diff/3/?file=478815#file478815line110
 
  When pollFinished is running, this shutdown() function will not be able 
  to make progress, which means that query cancellation will happen only 
  after a task (which could be an MR task) is complete.
  
  It seems synchronizing around shutdown should be sufficient, either by 
  making it volatile or having synchronized methods around it.
  
  Since thread safe concurrent collection classes are being used here, I 
  don't see other concurrency issues that would make it necessary to make all 
  these functions synchronized. 
  
  
 
 
 Navis Ryu wrote:
 It just polls the status of the running tasks and goes into the wait state 
 quite quickly, so it would not hinder the shutdown process. Furthermore, the 
 two threads, polling and shutdown, have a race condition on both collections, 
 runnable and running, so those should be guarded by something shared.
 
 Thejas Nair wrote:
 Yes, it will go into the wait state quickly. But I haven't understood how 
 the wait helps here. There is no notify in this code, so the wait will always 
 last for 2 seconds; it is no different from a sleep(2000).
 So it looks like the outer polling loop will continue until all the 
 currently running jobs are complete.

 
 Navis Ryu wrote:
 From the javadoc of Object.wait():
 
 "The current thread must own this object's monitor. The thread 
 releases ownership of this monitor and waits until another thread 
 notifies threads waiting on this object's monitor."
 
 In the wait state, any other thread can take the monitor (with sleep, that's 
 not possible), so the shutdown thread does not need to wait for 2 seconds. The 
 polling thread might notice the shutdown 2 seconds late, as you said, because 
 it's not notified. But I think that's not a big deal, is it?
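The monitor-release point can be demonstrated with Python's threading.Condition, whose wait() has the same release-while-waiting semantics as Java's Object.wait() (a cross-language sketch, not the Hive code):

```python
import threading
import time

cond = threading.Condition()
events = []

def poller():
    with cond:                      # like a synchronized block
        cond.wait(timeout=2.0)      # releases the lock while waiting
        events.append('poller woke')

def shutdown():
    time.sleep(0.3)                 # let the poller enter wait() first
    with cond:                      # acquires the lock the poller released
        events.append('shutdown ran')
        cond.notify_all()

t1 = threading.Thread(target=poller)
t2 = threading.Thread(target=shutdown)
t1.start(); t2.start()
t1.join(); t2.join()

# The shutdown thread got the lock long before the 2-second timeout; had the
# poller used time.sleep(2) while holding the lock, shutdown would have been
# blocked for the full interval.
print(events)  # ['shutdown ran', 'poller woke']
```

With a plain sleep the lock is never released, which is exactly why wait() does not hinder the shutdown path the way the review comment initially feared.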

Thanks for the explanation!


- Thejas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15873/#review35625
---


On March 4, 2014, 8:02 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15873/
 ---
 
 (Updated March 4, 2014, 8:02 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5901
 https://issues.apache.org/jira/browse/HIVE-5901
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Currently, query canceling does not stop running MR job immediately.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 332cadb 
   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java c51a9c8 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java 854cd52 
   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java ead7b59 
 
 Diff: https://reviews.apache.org/r/15873/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Navis Ryu
 




Re: Review Request 15873: Query cancel should stop running MR tasks

2014-03-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15873/#review36359
---



ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
https://reviews.apache.org/r/15873/#comment67310

This second addToRunnable(tsk) is redundant.



- Thejas Nair


On March 4, 2014, 8:02 a.m., Navis Ryu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15873/
 ---
 
 (Updated March 4, 2014, 8:02 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-5901
 https://issues.apache.org/jira/browse/HIVE-5901
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Currently, query canceling does not stop running MR job immediately.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 332cadb 
   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java c51a9c8 
   ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java 854cd52 
   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java ead7b59 
 
 Diff: https://reviews.apache.org/r/15873/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Navis Ryu
 




[jira] [Commented] (HIVE-5901) Query cancel should stop running MR tasks

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922571#comment-13922571
 ] 

Thejas M Nair commented on HIVE-5901:
-

[~navis] I have added a comment on ReviewBoard on the latest update there. 
But there seems to be a slight difference between the HIVE-5901.6.patch.txt 
patch here and the latest one on ReviewBoard. In HIVE-5901.6.patch.txt the 
'boolean shutdown' has been made volatile, which is not necessary, as you 
pointed out, given the changes to add synchronization.


 Query cancel should stop running MR tasks
 -

 Key: HIVE-5901
 URL: https://issues.apache.org/jira/browse/HIVE-5901
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-5901.1.patch.txt, HIVE-5901.2.patch.txt, 
 HIVE-5901.3.patch.txt, HIVE-5901.4.patch.txt, HIVE-5901.5.patch.txt, 
 HIVE-5901.6.patch.txt


 Currently, query canceling does not stop running MR job immediately.





[jira] [Comment Edited] (HIVE-5901) Query cancel should stop running MR tasks

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922571#comment-13922571
 ] 

Thejas M Nair edited comment on HIVE-5901 at 3/6/14 2:42 PM:
-

[~navis] I have added a comment on ReviewBoard on the latest update there. 
But there seems to be a slight difference between the HIVE-5901.6.patch.txt 
patch here and the latest one on ReviewBoard. In HIVE-5901.6.patch.txt the 
'boolean shutdown' has been made volatile, which is not necessary, as you 
pointed out, given the changes to make the functions synchronized.



was (Author: thejas):
[~navis] I have added a comment on reviewboard on the latest update there. 
But there seems to be slight difference in the HIVE-5901.6.patch.txt patch here 
and latest one in reviewboard. In HIVE-5901.6.patch.txt the 'boolean shutdown' 
has been made volatile, which is not necessary as you pointed out with the 
changes to add synchronization.


 Query cancel should stop running MR tasks
 -

 Key: HIVE-5901
 URL: https://issues.apache.org/jira/browse/HIVE-5901
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-5901.1.patch.txt, HIVE-5901.2.patch.txt, 
 HIVE-5901.3.patch.txt, HIVE-5901.4.patch.txt, HIVE-5901.5.patch.txt, 
 HIVE-5901.6.patch.txt


 Currently, query canceling does not stop the running MR job immediately.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore

2014-03-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5218:


Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking as duplicate. HIVE-5099 has patch that upgrades to a newer datanucleus 
version.


 datanucleus does not work with MS SQLServer in Hive metastore
 -

 Key: HIVE-5218
 URL: https://issues.apache.org/jira/browse/HIVE-5218
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.12.0
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 0.13.0

 Attachments: 
 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, 
 HIVE-5218-trunk.patch, HIVE-5218-trunk.patch, HIVE-5218-v2.patch, 
 HIVE-5218.2.patch, HIVE-5218.patch


 HIVE-3632 upgraded the datanucleus version to 3.2.x; however, this version of 
 datanucleus doesn't work with SQLServer as the metastore. The problem is that 
 datanucleus tries to use the fully qualified object name to find a table in the 
 database but cannot find it.
 If I downgrade to the datanucleus version from HIVE-2084, SQLServer works fine.
 It could be a bug in datanucleus.
 This is the detailed exception I'm getting when using datanucleus 3.2.x with 
 SQL Server:
 {noformat}
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDOException: 
 Exception thrown calling table.exists() for 
 a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE
 at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
 at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
 …
 at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
 at $Proxy0.createTable(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1071)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1104)
 …
 at $Proxy11.create_table_with_environment_context(Unknown Source)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401)
 NestedThrowablesStackTrace:
 com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object 
 named 'SEQUENCE_TABLE' in the database.
 at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
 at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615)
 at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
 at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649)
 at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
 at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711)
 at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425)
 at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488)
 at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryExists(TableGenerator.java:242)
 at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:86)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)
 at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)
 at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019)
 at 
 

[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server

2014-03-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5099:


Attachment: HIVE-5099.3.patch

 Some partition publish operation cause OOM in metastore backed by SQL Server
 

 Key: HIVE-5099
 URL: https://issues.apache.org/jira/browse/HIVE-5099
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch, HIVE-5099.3.patch


 For certain combinations of metastore operations, a metastore operation hangs and 
 the metastore server eventually fails due to OOM. This happens when the metastore 
 is backed by SQL Server. Here is a testcase to reproduce:
 {code}
 CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d 
 STRING);
 CREATE TABLE tbl_repro_oom_2 (a STRING ) PARTITIONED BY (e STRING);
 ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
 ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
 ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
 ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
 {code}
 The code causing the issue is in ExpressionTree.java:
 {code}
 valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + 
 "\")+" + keyEqualLength + ").substring(0, " + 
 "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + 
 keyEqualLength + ").indexOf(\"/\"))";
 {code}
 The snapshot of table partition before the drop partition statement is:
 {code}
 PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME      SD_ID  TBL_ID
 93       1376526718   0                 c=France/d=4   127    33
 94       1376526718   0                 c=Russia/d=3   128    33
 95       1376526718   0                 e=Russia       129    34
 The Datanucleus query tries to find the value of a particular key by locating 
 "$key=" as the start and "/" as the end. For example, it finds the value of c in 
 "c=France/d=4" by locating "c=" as the start and the following "/" as the end. 
 However, this query fails when trying to find the value of e in "e=Russia", since 
 there is no trailing "/". 
 Other databases work because their query plans first filter out the partitions not 
 belonging to tbl_repro_oom1; whether this error surfaces depends on the 
 query optimizer.
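 The failure mode described above can be reproduced with a minimal sketch. This is plain Java mirroring the same string logic, not Hive's actual generated JDOQL/SQL: for the last key in a partition name there is no trailing "/", so the end-index search fails and the substring call blows up (the SQL translation of the same expression is what produces the "Invalid length parameter" error on SQL Server).

```java
// Minimal reproduction of the parsing logic from the description (a sketch,
// not Hive's generated query): take the text between "key=" and the next "/".
public class PartitionValueParser {
    static String valueOf(String partitionName, String key) {
        String keyEqual = key + "=";
        String tail = partitionName.substring(
                partitionName.indexOf(keyEqual) + keyEqual.length());
        // BUG: for the last key there is no trailing "/", so indexOf returns
        // -1 and substring(0, -1) throws StringIndexOutOfBoundsException.
        return tail.substring(0, tail.indexOf("/"));
    }

    public static void main(String[] args) {
        System.out.println(valueOf("c=France/d=4", "c"));  // France
        try {
            valueOf("e=Russia", "e");  // last (only) key: no trailing "/"
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("fails without a trailing slash");
        }
    }
}
```

 A database whose planner filters rows to the right table first never feeds "e=Russia" into this expression, which is why only some backends hit the error.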
 When this exception happens, the metastore keeps retrying and throwing exceptions. 
 The memory image of the metastore contains a large number of exception objects:
 {code}
 com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter 
 passed to the LEFT or SUBSTRING function.
   at 
 com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
   at 
 com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
   at 
 org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
   at 
 org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
   at 
 org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
   at 
 org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
   at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
   at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
   at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
   at 
 org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
   at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
   at $Proxy4.getPartitionsByFilter(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at $Proxy5.get_partitions_by_filter(Unknown Source)
   at 
 

[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server

2014-03-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5099:


Status: Patch Available  (was: Open)

 Some partition publish operation cause OOM in metastore backed by SQL Server
 

 Key: HIVE-5099
 URL: https://issues.apache.org/jira/browse/HIVE-5099
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch, HIVE-5099.3.patch


 

[jira] [Commented] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922640#comment-13922640
 ] 

Thejas M Nair commented on HIVE-5099:
-

+1 . 
Rebased the patch, changes were only to surrounding line version numbers.


 Some partition publish operation cause OOM in metastore backed by SQL Server
 

 Key: HIVE-5099
 URL: https://issues.apache.org/jira/browse/HIVE-5099
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch, HIVE-5099.3.patch



[jira] [Updated] (HIVE-6487) PTest2 do not copy failed source directories

2014-03-06 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6487:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Thank you Szehon! I have committed this to trunk!!

 PTest2 do not copy failed source directories
 

 Key: HIVE-6487
 URL: https://issues.apache.org/jira/browse/HIVE-6487
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Szehon Ho
 Fix For: 0.14.0

 Attachments: HIVE-6487.2.patch, HIVE-6487.patch


 Right now we copy the entire source directory for failed tests back to the 
 master (up to 5). They are about 10GB each, so it takes a very long time. We should 
 remove this feature.
 Remove the cp command from batch-exec.vm:
 https://github.com/apache/hive/blob/trunk/testutils/ptest2/src/main/resources/batch-exec.vm#L91
 also don't publish the number of failed tests as a template variable:
 NO_PRECOMMIT_TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922684#comment-13922684
 ] 

Thejas M Nair commented on HIVE-5099:
-

FYI, the new version of datanucleus in this patch has been tested to work with 
the following databases:

MySQL 5.x
Oracle11g r2
Postgres 8.x
Postgres 9.x
SQL Server - Windows

 Some partition publish operation cause OOM in metastore backed by SQL Server
 

 Key: HIVE-5099
 URL: https://issues.apache.org/jira/browse/HIVE-5099
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Windows
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch, HIVE-5099.3.patch


 

[jira] [Commented] (HIVE-6495) TableDesc.getDeserializer() should use correct classloader when calling Class.forName()

2014-03-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922685#comment-13922685
 ] 

Ashutosh Chauhan commented on HIVE-6495:


[~jdere] Looks like this needs to be re-uploaded for jenkins to pick it up.

 TableDesc.getDeserializer() should use correct classloader when calling 
 Class.forName()
 ---

 Key: HIVE-6495
 URL: https://issues.apache.org/jira/browse/HIVE-6495
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6495.1.patch


 User is getting an error with the following stack trace below.  It looks like 
 when Class.forName() is called, it may not be using the correct class loader 
 (JavaUtils.getClassLoader() is used in other contexts when the loaded jar may 
 be required).
 {noformat}
 FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
 Failed with exception java.lang.ClassNotFoundException: my.serde.ColonSerde
 java.lang.RuntimeException: java.lang.ClassNotFoundException: my.serde.ColonSerde
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:231)
 at 
 org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:608)
 at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:80)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:497)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:352)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:995)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1038)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.ClassNotFoundException: my.serde.ColonSerde
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:190)
 at 
 org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:66)
 ... 20 more
 {noformat}
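 The direction suggested above (using JavaUtils.getClassLoader() rather than the default Class.forName(String) lookup) can be sketched as follows. This is an illustrative, assumed fix, not the actual HIVE-6495 patch: resolve the class through an explicitly chosen loader, such as the thread context classloader, which sees jars added at runtime.

```java
// Hedged sketch of the fix direction (assumed, not the committed patch):
// pass an explicit classloader instead of letting Class.forName(String)
// default to the calling class's defining loader.
public class DeserializerResolver {
    public static Class<?> resolve(String className) throws ClassNotFoundException {
        // The context classloader typically includes jars registered at
        // runtime (e.g. via ADD JAR), unlike the defining loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = DeserializerResolver.class.getClassLoader();
        }
        // The key point is the third argument: the chosen loader.
        return Class.forName(className, true, loader);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println(resolve("java.lang.String").getName());  // java.lang.String
    }
}
```

 With the one-argument Class.forName, a serde jar visible only to the session classloader is invisible, producing exactly the ClassNotFoundException shown in the trace.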



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-3635) allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for the boolean hive type

2014-03-06 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922710#comment-13922710
 ] 

Brock Noland commented on HIVE-3635:


+1 LGTM

  allow 't', 'T', '1', 'f', 'F', and '0' to be allowable true/false values for 
 the boolean hive type
 ---

 Key: HIVE-3635
 URL: https://issues.apache.org/jira/browse/HIVE-3635
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.9.0
Reporter: Alexander Alten-Lorenz
Assignee: Xuefu Zhang
 Attachments: HIVE-3635.1.patch, HIVE-3635.2.patch, HIVE-3635.patch


 Interpret 't' as true and 'f' as false for boolean types. PostgreSQL exports 
 represent booleans that way.
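 The lenient parsing the issue asks for can be sketched in a few lines. This is illustrative only (hypothetical helper, not the committed HIVE-3635 patch): accept 't'/'T'/'1' as true and 'f'/'F'/'0' as false, alongside the usual "true"/"false" spellings.

```java
// Hedged sketch of lenient boolean parsing (illustrative, not the actual
// patch): PostgreSQL-style 't'/'f' and numeric '1'/'0' map to true/false.
public class LenientBoolean {
    /** Returns Boolean.TRUE/FALSE for a recognized token, or null otherwise. */
    static Boolean parse(String s) {
        if (s == null) return null;
        String t = s.trim();
        if (t.equalsIgnoreCase("true") || t.equals("t") || t.equals("T") || t.equals("1")) {
            return Boolean.TRUE;
        }
        if (t.equalsIgnoreCase("false") || t.equals("f") || t.equals("F") || t.equals("0")) {
            return Boolean.FALSE;
        }
        return null;  // not a recognized boolean literal
    }

    public static void main(String[] args) {
        System.out.println(parse("t"));    // true
        System.out.println(parse("F"));    // false
        System.out.println(parse("yes"));  // null
    }
}
```

 Returning null (rather than defaulting to false) lets the caller decide how to treat unrecognized tokens.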



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6446) Ability to specify hadoop.bin.path from command line -D

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6446:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Remus!

 Ability to specify hadoop.bin.path from command line -D
 ---

 Key: HIVE-6446
 URL: https://issues.apache.org/jira/browse/HIVE-6446
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6446.1.patch, HIVE-6446.2.patch


 the surefire plugin configures hadoop.bin.path as a system property:
 {code}
 <hadoop.bin.path>${basedir}/${hive.path.to.root}/testutils/hadoop</hadoop.bin.path>
 {code}
 On Windows testing, this should be: 
 {code}
 <hadoop.bin.path>${basedir}/${hive.path.to.root}/testutils/hadoop.cmd</hadoop.bin.path>
 {code}
 Additionally, it would be useful to be able to specify the Hadoop CLI 
 location via -D on the mvn command line.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6567) show grant ... on all fails with NPE

2014-03-06 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6567:
---

 Summary: show grant ... on all fails with NPE
 Key: HIVE-6567
 URL: https://issues.apache.org/jira/browse/HIVE-6567
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair


With sql std auth -
{code}
hive> show grant user user1 on all;
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. null

2014-03-06 08:52:39,238 ERROR exec.DDLTask (DDLTask.java:execute(423)) - 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.Utilities.getDbTableName(Utilities.java:2033)
at 
org.apache.hadoop.hive.ql.exec.DDLTask.getHivePrivilegeObject(DDLTask.java:819)
at org.apache.hadoop.hive.ql.exec.DDLTask.showGrantsV2(DDLTask.java:612)
at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:515)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:388)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1456)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1229)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:874)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:864)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{code}





[jira] [Commented] (HIVE-5933) SQL std auth - add support to metastore api to list all privileges for a user

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922748#comment-13922748
 ] 

Thejas M Nair commented on HIVE-5933:
-

show grant user hive_test_user on all; added in HIVE-6122 provides the 
equivalent functionality of 
SHOW GRANTS FOR user;
SHOW GRANTS FOR role;
But the command fails with sql std auth with an NPE. Created HIVE-6567 to 
track it.


 SQL std auth - add support to metastore api to list all privileges for a user
 -

 Key: HIVE-5933
 URL: https://issues.apache.org/jira/browse/HIVE-5933
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
   Original Estimate: 24h
  Remaining Estimate: 24h

 This is for supporting SHOW GRANTS statements -
 SHOW GRANTS;
 SHOW GRANTS FOR user;
 SHOW GRANTS FOR role;





[jira] [Commented] (HIVE-5933) SQL std auth - add support to metastore api to list all privileges for a user

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922752#comment-13922752
 ] 

Thejas M Nair commented on HIVE-5933:
-

Duplicates HIVE-6122, except for the 'show grants;' command which is not supported 
via HIVE-6122.


 SQL std auth - add support to metastore api to list all privileges for a user
 -

 Key: HIVE-5933
 URL: https://issues.apache.org/jira/browse/HIVE-5933
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
   Original Estimate: 24h
  Remaining Estimate: 24h

 This is for supporting SHOW GRANTS statements -
 SHOW GRANTS;
 SHOW GRANTS FOR user;
 SHOW GRANTS FOR role;





[jira] [Commented] (HIVE-6508) Mismatched results between vector and non-vector mode with decimal field

2014-03-06 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922775#comment-13922775
 ] 

Jitendra Nath Pandey commented on HIVE-6508:


+1. The patch looks good to me.

 Mismatched results between vector and non-vector mode with decimal field
 

 Key: HIVE-6508
 URL: https://issues.apache.org/jira/browse/HIVE-6508
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Attachments: HIVE-6508.1.patch


 The following query has a slight mismatch in results compared to non-vector 
 mode.
 {code}
 select d_year, i_brand_id, i_brand,
sum(ss_ext_sales_price) as sum_agg
 from date_dim
 join store_sales on date_dim.d_date_sk = store_sales.ss_sold_date_sk
 join item on store_sales.ss_item_sk = item.i_item_sk
 where i_manufact_id = 128
   and d_moy = 11
 group by d_year, i_brand, i_brand_id
 order by d_year, sum_agg desc, i_brand_id
 limit 100;
 {code}
 This query is on tpcds data.
 The field ss_ext_sales_price is of type decimal(7,2) and everything else is 
 an integer.





[jira] [Created] (HIVE-6568) Vectorized cast of decimal to string produces strings with trailing zeros.

2014-03-06 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-6568:
--

 Summary: Vectorized cast of decimal to string produces strings 
with trailing zeros.
 Key: HIVE-6568
 URL: https://issues.apache.org/jira/browse/HIVE-6568
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


A decimal value 1.23 with scale 5 is represented as the string 1.23000. This 
behavior differs from HiveDecimal's behavior.
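As a minimal illustration of the two behaviors (using java.math.BigDecimal as an analogy only; HiveDecimal is Hive's own class, which per this report does not emit the trailing zeros):

```java
import java.math.BigDecimal;

public class TrailingZerosDemo {
    public static void main(String[] args) {
        // A value carrying scale 5 keeps trailing zeros when printed.
        BigDecimal withScale = new BigDecimal("1.23").setScale(5);
        System.out.println(withScale.toPlainString()); // 1.23000

        // Stripping trailing zeros yields the shorter form.
        System.out.println(withScale.stripTrailingZeros().toPlainString()); // 1.23
    }
}
```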





[jira] [Commented] (HIVE-6551) group by after join with skew join optimization references invalid task sometimes

2014-03-06 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922810#comment-13922810
 ] 

Ashutosh Chauhan commented on HIVE-6551:


+1

 group by after join with skew join optimization references invalid task 
 sometimes
 -

 Key: HIVE-6551
 URL: https://issues.apache.org/jira/browse/HIVE-6551
 Project: Hive
  Issue Type: Bug
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6551.1.patch.txt


 For example,
 {noformat}
 hive> set hive.auto.convert.join = true;
 hive> set hive.optimize.skewjoin = true;
 hive> set hive.skewjoin.key = 3;
 hive>
  EXPLAIN FROM 
  (SELECT src.* FROM src) x
  JOIN 
  (SELECT src.* FROM src) Y
  ON (x.key = Y.key)
  SELECT sum(hash(Y.key)), sum(hash(Y.value));
 OK
 STAGE DEPENDENCIES:
   Stage-8 is a root stage
   Stage-6 depends on stages: Stage-8
   Stage-5 depends on stages: Stage-6 , consists of Stage-4, Stage-2
   Stage-4
   Stage-2 depends on stages: Stage-4, Stage-1
   Stage-0 is a root stage
 ...
 {noformat}
 Stage-2 references not-existing Stage-1





[jira] [Commented] (HIVE-6558) HiveServer2 Plain SASL authentication broken after hadoop 2.3 upgrade

2014-03-06 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922823#comment-13922823
 ] 

Prasad Mujumdar commented on HIVE-6558:
---

[~thejas],  [~ashutoshc] would you mind taking a look. This should be a blocker 
for 0.13 release. Thanks!

 HiveServer2 Plain SASL authentication broken after hadoop 2.3 upgrade
 -

 Key: HIVE-6558
 URL: https://issues.apache.org/jira/browse/HIVE-6558
 Project: Hive
  Issue Type: Bug
  Components: Authentication, HiveServer2
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Priority: Blocker
 Attachments: HIVE-6558.2.patch


 Java only includes Plain SASL client and not server. Hence HiveServer2 
 includes a Plain SASL server implementation. Now Hadoop has its own Plain 
 SASL server [HADOOP-9020|https://issues.apache.org/jira/browse/HADOOP-9020] 
 which is part of Hadoop 2.3 
 [release|http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/releasenotes.html].
 The two servers use different SASL callbacks, and both are registered 
 in java.security.Provider via static code. As a result, the HiveServer2 
 instance could end up using Hadoop's Plain SASL server, which breaks 
 authentication.
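As a hedged illustration (not HiveServer2 code), the JVM's registered SaslServerFactory services can be enumerated as below; with Hadoop 2.3 on the classpath, its statically registered PLAIN server factory would appear alongside Hive's, and whichever one the SASL lookup finds first wins:

```java
import java.security.Provider;
import java.security.Security;

public class SaslServerProviders {
    public static void main(String[] args) {
        // List every SaslServerFactory service registered with the JVM's
        // security providers, in lookup order.
        for (Provider p : Security.getProviders()) {
            for (Provider.Service s : p.getServices()) {
                if ("SaslServerFactory".equals(s.getType())) {
                    System.out.println(p.getName() + ": " + s.getAlgorithm());
                }
            }
        }
    }
}
```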





Re: Thoughts on new metastore APIs

2014-03-06 Thread Thejas Nair
Thanks for discussing this, Brock. I agree that it is important to
consider this while writing a new metastore api call.
But I think this (single input/output struct)  should be a guideline,
I am not sure if this should be used in every case.

What you are saying shows that there is a tradeoff between ending up
with more functions vs ending up with more input/output
structs/classes. I am not sure if having more input/output structs is
any better.

Take the case of create_table/create_table_with_environment_context
that you mentioned. Even though create_table had a single input
argument Table, instead of adding  EnvironmentContext contents to
Table struct, the authors decided to create a new function with
additional  EnvironmentContext argument. This makes sense because the
Table struct is used by other functions as well such as get_table, and
EnvironmentContext fields don't make sense for those cases as it is
not persisted as part of table.

Which means that the only way to prevent creation of the
create_table_with_environment_context method would have been to have a
CreateTableArg struct as the input argument instead of Table. I.e.,
creating a different struct for the single input/output of every
function is the only way you can be sure that you don't need more
functions.

This approach of reducing the number of functions also means that you
would start encoding different types of actions within the single
input argument.

Consider the case of get_partition vs get_partition_by_name. It would
need a single struct with an enum that tells it to look up based on
the partition key-values or the name, and based on the enum it would
use different fields in the struct. I feel having different functions
is more readable for this case.
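For illustration, the enum-based single-struct version described above might look like this (a hypothetical sketch with made-up names, not proposed IDL):

```thrift
// Hypothetical sketch of folding get_partition and get_partition_by_name
// into a single request struct.
enum PartitionLookupType {
  BY_VALUES = 1,   // use partVals
  BY_NAME = 2      // use partName
}

struct GetPartitionRequest {
  1: required string dbName;
  2: required string tblName;
  3: required PartitionLookupType lookupType;
  4: optional list<string> partVals;
  5: optional string partName;
}
```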


For example, the api in HIVE-5931 would need to change from

list<RolePrincipalGrant> get_principals_in_role(1:string role_name)

to


struct GetPrincipalInRoleOutput{
 1: list<RolePrincipalGrant> rolePrincList;
}

struct GetPrincipalInRoleInput{
1:string role_name;
}
GetPrincipalInRoleOutput get_principals_in_role(1:
GetPrincipalInRoleInput input);


I am not sure if the insurance cost in terms of readability is low
here. I think we should consider the risk in each case of function
proliferation and pay the costs accordingly.
Let me know if I have misunderstood what you are proposing here.

Thanks,
Thejas



On Wed, Mar 5, 2014 at 11:39 AM, Brock Noland br...@cloudera.com wrote:
 Hi,


 There is a ton of great work going into the 0.13 release.
 Specifically, we are adding a ton of APIs to the metastore:

 https://github.com/apache/hive/blame/trunk/metastore/if/hive_metastore.thrift

 Few of these new API's follow the best practice of a single request
 and response struct. Some follow this partially by having a single
 response object but take no arguments while others return void and
 take a single request object.  Still others, mostly related to
 authorization, do not even partially follow this pattern.

 The single request/response struct model is extremely important as
 changing the number of arguments is a backwards incompatible change.
 Therefore the only way to change an api is to add *new* method calls.
 This is why we have so many crazy APIs in the hive metastore such as
 create_table/create_table_with_environment_context and 12 (yes,
 twelve) ways to get partitions.

 I would like to suggest that we require all new APIs to follow the
 single request/response struct model. That is any new API that would
 be committed *after* today.

 I have heard the following arguments against this approach which I
 believe to be invalid:

 *This API will never change (or never return a value or never take
 another value)*
 We all have been writing code enough that we don't know, there are
 unknown unknowns. By following the single request/response struct
 model for *all* APIs we can future proof ourselves. Why wouldn't we
 want to buy insurance now when it's cheap?

 *The performance impact of wrapping an object is too much*
 These calls are being made over the network which is orders of
 magnitude slower than creating a small, simple, and lightweight object
 to wrap method arguments and response values.

 Cheers,
 Brock



Hung Precommit Jenkins Jobs

2014-03-06 Thread Brock Noland
Ashutosh informed me that the precommit build was hung. Long story
short, PTest2 had completed. For example, see this job which took 8
hours:

http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1636/

That corresponds to:

http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1637/execution.txt

which if you scroll all the way to the bottom, you will note it
finished long before that:

2014-03-06 05:48:22,194  INFO PTest.run:207 Executed 5358 tests
2014-03-06 05:48:22,194  INFO PTest.run:209 PERF: Phase ExecutionPhase
took 101 minutes
2014-03-06 05:48:22,194  INFO PTest.run:209 PERF: Phase PrepPhase took 5 minutes
2014-03-06 05:48:22,194  INFO PTest.run:209 PERF: Phase ReportingPhase
took 0 minutes
2014-03-06 05:48:22,194  INFO JIRAService.postComment:136 Comment:

{color:red}Overall{color}: -1 at least one tests failed



Long story short, it looks like Bigtop Jenkins is not as reliable as
we would like.


[jira] [Updated] (HIVE-6555) TestSchemaTool is failing on trunk after branching

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6555:
---

Status: Patch Available  (was: Open)

 TestSchemaTool is failing on trunk after branching
 --

 Key: HIVE-6555
 URL: https://issues.apache.org/jira/browse/HIVE-6555
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6555-branch13.patch, HIVE-6555.patch


 This is because the version was bumped to 0.14 in the pom file and there are no 
 metastore scripts for 0.14 yet.





[jira] [Updated] (HIVE-6338) Improve exception handling in createDefaultDb() in Metastore

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6338:
---

Status: Open  (was: Patch Available)

 Improve exception handling in createDefaultDb() in Metastore
 

 Key: HIVE-6338
 URL: https://issues.apache.org/jira/browse/HIVE-6338
 Project: Hive
  Issue Type: Task
  Components: Metastore
Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0, 0.8.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6338.1.patch, HIVE-6338.patch


 There is a suggestion on HIVE-5959 comment list on possible improvements.





[jira] [Updated] (HIVE-6417) sql std auth - new users in admin role config should get added

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6417:
---

Attachment: HIVE-6417.1.patch

Re-attaching for Hive QA to pick up.

 sql std auth - new users in admin role config should get added
 --

 Key: HIVE-6417
 URL: https://issues.apache.org/jira/browse/HIVE-6417
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6417.1.patch, HIVE-6417.patch


 if the metastore is started with hive.users.in.admin.role=user1, then user1 is 
 added to the admin role in the metastore.
 If the value is changed to hive.users.in.admin.role=user2, then user2 should 
 get added to the role in the metastore. Right now, if the admin role exists, new 
 users don't get added.
 A workaround is for user1 to add user2 to the admin role using a grant role 
 statement.





[jira] [Updated] (HIVE-6338) Improve exception handling in createDefaultDb() in Metastore

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6338:
---

Attachment: HIVE-6338.1.patch

Reattaching for Hive QA to pick up.

 Improve exception handling in createDefaultDb() in Metastore
 

 Key: HIVE-6338
 URL: https://issues.apache.org/jira/browse/HIVE-6338
 Project: Hive
  Issue Type: Task
  Components: Metastore
Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6338.1.patch, HIVE-6338.patch


 There is a suggestion on HIVE-5959 comment list on possible improvements.





[jira] [Updated] (HIVE-6529) Tez output files are out of date

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6529:
---

Resolution: Implemented
Status: Resolved  (was: Patch Available)

Looks like it was already done as part of 60ff41c Tue Feb 25 07:58:52 2014 
+ Merge latest trunk into branch. (Gunther Hagleitner)

 Tez output files are out of date
 

 Key: HIVE-6529
 URL: https://issues.apache.org/jira/browse/HIVE-6529
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Attachments: HIVE-6529.patch








[jira] [Reopened] (HIVE-6538) yet another annoying exception in test logs

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-6538:



sorry wrong jira

 yet another annoying exception in test logs
 ---

 Key: HIVE-6538
 URL: https://issues.apache.org/jira/browse/HIVE-6538
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Szehon Ho
Priority: Trivial
 Attachments: HIVE-6538.2.patch, HIVE-6538.patch


 Whenever you look at failed q tests you have to go thru this useless 
 exception.
 {noformat}
 2014-03-03 11:22:54,872 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - 
 MetaException(message:NoSuchObjectException(message:Function 
 default.qtest_get_java_boolean does not exist))
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4575)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4702)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at $Proxy8.get_function(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1526)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at $Proxy9.getFunction(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2603)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:578)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:599)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:606)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:94)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:60)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:345)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1078)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1121)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1014)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
   at org.apache.hadoop.hive.ql.QTestUtil.runCmd(QTestUtil.java:655)
   at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:772)
   at 
 org.apache.hadoop.hive.cli.TestCliDriver.<clinit>(TestCliDriver.java:46)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.internal.runners.SuiteMethod.testFromSuiteMethod(SuiteMethod.java:34)
   at org.junit.internal.runners.SuiteMethod.<init>(SuiteMethod.java:23)
   at 
 org.junit.internal.builders.SuiteMethodBuilder.runnerForClass(SuiteMethodBuilder.java:14)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:29)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:262)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
   at 
 

[jira] [Updated] (HIVE-6538) yet another annoying exception in test logs

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6538:
---

Resolution: Implemented
Status: Resolved  (was: Patch Available)

Looks like this was done as part of 60ff41c Tue Feb 25 07:58:52 2014 + 
Merge latest trunk into branch. (Gunther Hagleitner)

 yet another annoying exception in test logs
 ---

 Key: HIVE-6538
 URL: https://issues.apache.org/jira/browse/HIVE-6538
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Szehon Ho
Priority: Trivial
 Attachments: HIVE-6538.2.patch, HIVE-6538.patch


 Whenever you look at failed q tests you have to go thru this useless 
 exception.
 {noformat}
 2014-03-03 11:22:54,872 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - 
 MetaException(message:NoSuchObjectException(message:Function 
 default.qtest_get_java_boolean does not exist))
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4575)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4702)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at $Proxy8.get_function(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1526)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at $Proxy9.getFunction(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2603)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:578)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:599)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:606)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:94)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:60)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:345)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1078)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1121)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1014)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
   at org.apache.hadoop.hive.ql.QTestUtil.runCmd(QTestUtil.java:655)
   at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:772)
   at 
 org.apache.hadoop.hive.cli.TestCliDriver.<clinit>(TestCliDriver.java:46)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.internal.runners.SuiteMethod.testFromSuiteMethod(SuiteMethod.java:34)
   at org.junit.internal.runners.SuiteMethod.<init>(SuiteMethod.java:23)
   at 
 org.junit.internal.builders.SuiteMethodBuilder.runnerForClass(SuiteMethodBuilder.java:14)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:29)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:262)
   at 
 

Re: Thoughts on new metastore APIs

2014-03-06 Thread Brock Noland
On Thu, Mar 6, 2014 at 12:13 PM, Thejas Nair the...@hortonworks.com wrote:
 Thanks for discussing this, Brock. I agree that it is important to
 consider this while writing a new metastore api call.
 But I think this (single input/output struct)  should be a guideline,
 I am not sure if this should be used in every case.

As with any rule, there are always exceptions. However, looking at the
new API's I don't see an instance where it would have been
harmful to use this model. Exceptions should be extremely rare since
thousands of RPC implementations successfully use the request/response
model. However, as always, it would be up to the developers working on
the change to make this call. They'd do so knowing they are going
against the guideline and could be asked to justify why they are doing
so.

 What you are saying shows that there is a tradeoff between ending up
 with more functions vs ending up with more input/output
 structs/classes. I am not sure if having more input/output structs is
 any better.

 Take the case of create_table/create_table_with_environment_context
 that you mentioned. Even though create_table had a single input
 argument Table, instead of adding  EnvironmentContext contents to
 Table struct, the authors decided to create a new function with
 additional  EnvironmentContext argument. This makes sense because the
 Table struct is used by other functions as well such as get_table, and
 EnvironmentContext fields don't make sense for those cases as it is
 not persisted as part of table.

 Which means that the only way to prevent creation of the
 create_table_with_environment_context method would have been to have a
 CreateTableArg struct as the input argument instead of Table. I.e.,
 creating a different struct for the single input/output of every
 function is the only way you can be sure that you don't need more
 functions.

RPC methods are special. They are published to the world and
therefore cannot be easily modified or refactored. Once we create a
new RPC method, we are stuck with it for a very long time. In this
way, Thrift is rather strange in that it allows you to expose the api
signatures. The request/response model is far more common.

Therefore, if we were creating create_table from scratch, I would
suggest we use:

CreateTableRequest/CreateTableResponse

That way, we can add optional arguments such as environment context
very easily. Although, I'd love to take credit for this idea, I didn't
just come up with this myself. This is a standard way of handling RPC.
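As a sketch of that shape (hypothetical struct and method names, not the actual hive_metastore.thrift definitions):

```thrift
// Hypothetical sketch of a request/response pair for create_table. New
// optional fields can be added to either struct later without breaking
// existing clients, so the method signature never has to change.
struct CreateTableRequest {
  1: required Table table;
  2: optional EnvironmentContext environmentContext;
}

struct CreateTableResponse {
}

CreateTableResponse create_table_req(1: CreateTableRequest request)
```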

 This approach of reducing the number of functions also means that you
 would start encoding different types of actions within the single
 input argument.

 Consider the case of get_partition vs get_partition_by_name. It would
 need a single struct with an enum that tells it to look up based on
 the partition key-values or the name, and based on the enum it would
 use different fields in the struct. I feel having different functions
 is more readable for this case.

There will be cases where we'll need similar methods. The point is
that with the request/response model, adding a single parameter doesn't
require an entirely new method. The developers working on the change
would have to make the call as to whether a new functionality requires
a new API or can be handled within the current API.

 For example, the api in HIVE-5931 would need to change from
 
 list<RolePrincipalGrant> get_principals_in_role(1:string role_name)
 
 to

 
 struct GetPrincipalInRoleOutput{
  1: list<RolePrincipalGrant> rolePrincList;
 }

 struct GetPrincipalInRoleInput{
 1:string role_name;
 }
 GetPrincipalInRoleOutput get_principals_in_role(1:
 GetPrincipalInRoleInput input);
 

 I am not sure if the insurance cost in terms of readability is low
 here. I think we should consider the risk in each case of function
 proliferation and pay the costs accordingly.
 Let me know if I have misunderstood what you are proposing here.

The Output and Input names do feel odd. As I said above,
Request/Response are standard names for these kinds of objects. Today,
for a method like the one above, it may feel like overhead or extra
work. However, in the future when you want to add another parameter
such as isFilter or encoding etc, then the insurance pays off big
time.
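
To make the insurance concrete, here is a sketch of what create_table could
look like under this model. This is illustrative only, not the actual
metastore IDL; the type and field names here are assumptions:

```thrift
// Hypothetical sketch, not the real ThriftHiveMetastore IDL.
struct CreateTableRequest {
  1: required Table table;
  // Added later: optional fields with new ids are backward compatible,
  // so old clients that never set this field keep working unchanged.
  2: optional EnvironmentContext envContext;
}

struct CreateTableResponse {
}

service ThriftHiveMetastore {
  CreateTableResponse create_table_req(1: CreateTableRequest req);
}
```

Adding a positional argument to a Thrift method signature breaks existing
callers, while adding an optional struct field does not; that asymmetry is
the payoff being described.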

Brock


[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-03-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922869#comment-13922869
 ] 

Xuefu Zhang commented on HIVE-6468:
---

I agree that guarding against this is good. Just curious, however: why would an HTTP GET 
request put HS2 into OOM? It's understandable that HS2 doesn't understand 
the request, but how it runs out of memory is interesting.
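
One plausible mechanism, offered as an assumption rather than a confirmed
diagnosis: the binary transport's SASL negotiation reads a one-byte status
followed by a four-byte big-endian frame length, then allocates a buffer of
that size. Fed the opening bytes of a plain HTTP request, that "length" is
enormous:

```python
import struct

# Opening bytes of a plain-text HTTP request arriving on the Thrift port.
payload = b"GET / HTTP/1.1\r\n"

status = payload[0]                               # consumed as the SASL status byte
(frame_len,) = struct.unpack(">i", payload[1:5])  # bytes "ET /" read as a frame length

print(frame_len)  # 1163141167, i.e. an attempt to allocate a ~1.1 GB buffer
```

Guarding the transport against absurd frame lengths, or rejecting non-SASL
bytes early, avoids the allocation; presumably that is what the attached
patch does.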

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Attachments: HIVE-6468.1.patch.txt


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6434) Restrict function create/drop to admin roles

2014-03-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922877#comment-13922877
 ] 

Thejas M Nair commented on HIVE-6434:
-

+1

 Restrict function create/drop to admin roles
 

 Key: HIVE-6434
 URL: https://issues.apache.org/jira/browse/HIVE-6434
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization, UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6434.1.patch, HIVE-6434.2.patch, HIVE-6434.3.patch, 
 HIVE-6434.4.patch, HIVE-6434.5.patch


 Restrict function create/drop to admin roles, if sql std auth is enabled. 
 This would include temp/permanent functions, as well as macros.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6417) sql std auth - new users in admin role config should get added

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6417:
---

Status: Open  (was: Patch Available)

 sql std auth - new users in admin role config should get added
 --

 Key: HIVE-6417
 URL: https://issues.apache.org/jira/browse/HIVE-6417
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6417.patch


 If the metastore is started with hive.users.in.admin.role=user1, then user1 is 
 added to the admin role in the metastore.
 If the value is changed to hive.users.in.admin.role=user2, then user2 should 
 get added to the role in the metastore. Right now, if the admin role already 
 exists, new users don't get added.
 A workaround is for user1 to add user2 to the admin role using a grant role 
 statement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6338) Improve exception handling in createDefaultDb() in Metastore

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6338:
---

Status: Patch Available  (was: Open)

 Improve exception handling in createDefaultDb() in Metastore
 

 Key: HIVE-6338
 URL: https://issues.apache.org/jira/browse/HIVE-6338
 Project: Hive
  Issue Type: Task
  Components: Metastore
Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0, 0.8.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6338.1.patch, HIVE-6338.patch


 There is a suggestion on HIVE-5959 comment list on possible improvements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6417) sql std auth - new users in admin role config should get added

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6417:
---

Status: Patch Available  (was: Open)

 sql std auth - new users in admin role config should get added
 --

 Key: HIVE-6417
 URL: https://issues.apache.org/jira/browse/HIVE-6417
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6417.1.patch, HIVE-6417.patch


 If the metastore is started with hive.users.in.admin.role=user1, then user1 is 
 added to the admin role in the metastore.
 If the value is changed to hive.users.in.admin.role=user2, then user2 should 
 get added to the role in the metastore. Right now, if the admin role already 
 exists, new users don't get added.
 A workaround is for user1 to add user2 to the admin role using a grant role 
 statement.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18162: HIVE-6434: Restrict function create/drop to admin roles

2014-03-06 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18162/#review36395
---

Ship it!


Ship It!

- Thejas Nair


On Feb. 26, 2014, 2:10 a.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18162/
 ---
 
 (Updated Feb. 26, 2014, 2:10 a.m.)
 
 
 Review request for hive and Thejas Nair.
 
 
 Bugs: HIVE-6434
 https://issues.apache.org/jira/browse/HIVE-6434
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add output entity of DB object to make sure only admin roles can add/drop 
 functions/macros.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
 68a25e0 
   ql/src/java/org/apache/hadoop/hive/ql/parse/MacroSemanticAnalyzer.java 
 0ae07e3 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/Operation2Privilege.java
  c43bcea 
   ql/src/test/queries/clientnegative/authorization_create_func1.q 
 PRE-CREATION 
   ql/src/test/queries/clientnegative/authorization_create_func2.q 
 PRE-CREATION 
   ql/src/test/queries/clientnegative/authorization_create_macro1.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_create_func1.q 
 PRE-CREATION 
   ql/src/test/queries/clientpositive/authorization_create_macro1.q 
 PRE-CREATION 
   ql/src/test/results/clientnegative/authorization_create_func1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientnegative/authorization_create_func2.q.out 
 PRE-CREATION 
   ql/src/test/results/clientnegative/authorization_create_macro1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientnegative/cluster_tasklog_retrieval.q.out 747aa6a 
   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
 393a3e8 
   ql/src/test/results/clientnegative/create_function_nonexistent_db.q.out 
 ebb069e 
   ql/src/test/results/clientnegative/create_function_nonudf_class.q.out 
 dd66afc 
   ql/src/test/results/clientnegative/create_udaf_failure.q.out 3fc3d36 
   ql/src/test/results/clientnegative/create_unknown_genericudf.q.out af3d50b 
   ql/src/test/results/clientnegative/create_unknown_udf_udaf.q.out e138fd0 
   ql/src/test/results/clientnegative/drop_native_udf.q.out 1913df9 
   
 ql/src/test/results/clientnegative/udf_function_does_not_implement_udf.q.out 
 9ea8668 
   ql/src/test/results/clientnegative/udf_local_resource.q.out b6ea77d 
   ql/src/test/results/clientnegative/udf_nonexistent_resource.q.out ad70d54 
   ql/src/test/results/clientnegative/udf_test_error.q.out a788a10 
   ql/src/test/results/clientnegative/udf_test_error_reduce.q.out 98b42e0 
   ql/src/test/results/clientpositive/authorization_create_func1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_create_macro1.q.out 
 PRE-CREATION 
   ql/src/test/results/clientpositive/autogen_colalias.q.out a074b96 
   ql/src/test/results/clientpositive/compile_processor.q.out 7e9bb29 
   ql/src/test/results/clientpositive/create_func1.q.out 5a249c3 
   ql/src/test/results/clientpositive/create_genericudaf.q.out 96fe2fa 
   ql/src/test/results/clientpositive/create_genericudf.q.out bf1f4ac 
   ql/src/test/results/clientpositive/create_udaf.q.out 2e86a36 
   ql/src/test/results/clientpositive/create_view.q.out ecc7618 
   ql/src/test/results/clientpositive/drop_udf.q.out 422933a 
   ql/src/test/results/clientpositive/macro.q.out c483029 
   ql/src/test/results/clientpositive/ptf_register_tblfn.q.out 11c9724 
   ql/src/test/results/clientpositive/udaf_sum_list.q.out b1922d9 
   ql/src/test/results/clientpositive/udf_compare_java_string.q.out 8e6e365 
   ql/src/test/results/clientpositive/udf_context_aware.q.out 10414fa 
   ql/src/test/results/clientpositive/udf_logic_java_boolean.q.out 88c1984 
   ql/src/test/results/clientpositive/udf_testlength.q.out 4d75482 
   ql/src/test/results/clientpositive/udf_testlength2.q.out 8a1e03e 
   ql/src/test/results/clientpositive/udf_using.q.out 69e5f3b 
   ql/src/test/results/clientpositive/windowing_udaf2.q.out 5043a45 
 
 Diff: https://reviews.apache.org/r/18162/diff/
 
 
 Testing
 ---
 
 positive/negative q files added
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Updated] (HIVE-6538) yet another annoying exception in test logs

2014-03-06 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6538:


Attachment: HIVE-6538.2.patch

Looks like the pre-commit queue died last night; I'm resubmitting.

 yet another annoying exception in test logs
 ---

 Key: HIVE-6538
 URL: https://issues.apache.org/jira/browse/HIVE-6538
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Szehon Ho
Priority: Trivial
 Attachments: HIVE-6538.2.patch, HIVE-6538.2.patch, HIVE-6538.patch


 Whenever you look at failed q tests you have to go thru this useless 
 exception.
 {noformat}
 2014-03-03 11:22:54,872 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - 
 MetaException(message:NoSuchObjectException(message:Function 
 default.qtest_get_java_boolean does not exist))
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4575)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4702)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at $Proxy8.get_function(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1526)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at $Proxy9.getFunction(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2603)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:578)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:599)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:606)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:94)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:60)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:345)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1078)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1121)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1014)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
   at org.apache.hadoop.hive.ql.QTestUtil.runCmd(QTestUtil.java:655)
   at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:772)
   at 
 org.apache.hadoop.hive.cli.TestCliDriver.<clinit>(TestCliDriver.java:46)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.internal.runners.SuiteMethod.testFromSuiteMethod(SuiteMethod.java:34)
   at org.junit.internal.runners.SuiteMethod.<init>(SuiteMethod.java:23)
   at 
 org.junit.internal.builders.SuiteMethodBuilder.runnerForClass(SuiteMethodBuilder.java:14)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:29)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:262)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
   at 
 

[jira] [Commented] (HIVE-6565) OrcSerde should be added as NativeSerDe in SerDeUtils

2014-03-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922895#comment-13922895
 ] 

Xuefu Zhang commented on HIVE-6565:
---

[~branky] Thanks for pointing this out. I'm curious whether this solves any 
particular problem; if so, putting the problem in the description would be very 
helpful. I asked because I thought this might solve HIVE-4703, but I tried it 
and it didn't seem to help. Thanks.

 OrcSerde should be added as NativeSerDe in SerDeUtils
 -

 Key: HIVE-6565
 URL: https://issues.apache.org/jira/browse/HIVE-6565
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.12.0
Reporter: Branky Shao

 If the table is defined in ORC format, the column info can be fetched from the 
 StorageDescriptor; there is no need to get it from the SerDe.
 And since ORC is one of Hive's native file formats, OrcSerde should 
 be added as a NativeSerDe in SerDeUtils.
 The fix is fairly simple: just add a single line in SerDeUtils:
 nativeSerDeNames.add(org.apache.hadoop.hive.ql.io.orc.OrcSerde.class.getName());
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Timeline for the Hive 0.13 release?

2014-03-06 Thread Harish Butani
Ok, sure.
Tracking these with the JQL below. I don’t have permission to set up a shared 
filter; can someone help with this?
Of the 35 issues, 11 are still open, 22 are patch available, and 2 are resolved.

regards,
Harish.

JQL:

id in (HIVE-5317, HIVE-5843, HIVE-6060, HIVE-6319, HIVE-6460, HIVE-5687, 
HIVE-5943, HIVE-5942, HIVE-6547, HIVE-5155, HIVE-6486, HIVE-6455, HIVE-4177, 
HIVE-4764, HIVE-6306, HIVE-6350, HIVE-6485, HIVE-6507, HIVE-6499, HIVE-6325, 
HIVE-6558, HIVE-6403, HIVE-4790, HIVE-4293, HIVE-6551, HIVE-6359, HIVE-6314, 
HIVE-6241, HIVE-5768, HIVE-2752, HIVE-6312, HIVE-6129, HIVE-6012, HIVE-6434, 
HIVE-6562) ORDER BY status ASC, assignee

On Mar 5, 2014, at 6:50 PM, Prasanth Jayachandran 
pjayachand...@hortonworks.com wrote:

 Can you consider HIVE-6562 as well?
 
 HIVE-6562 - Protection from exceptions in ORC predicate evaluation
 
 Thanks
 Prasanth Jayachandran
 
 On Mar 5, 2014, at 5:56 PM, Jason Dere jd...@hortonworks.com wrote:
 
 
 Would like to get these in, if possible:
 
 HIVE-6012 restore backward compatibility of arithmetic operations
 HIVE-6434 Restrict function create/drop to admin roles
 
 On Mar 5, 2014, at 5:41 PM, Navis류승우 navis@nexr.com wrote:
 
 I have really big wish list(65 pending) but it would be time to focus on
 finalization.
 
 - Small bugs
 HIVE-6403 uncorrelated subquery is failing with auto.convert.join=true
 HIVE-4790 MapredLocalTask task does not make virtual columns
 HIVE-4293 Predicates following UDTF operator are removed by PPD
 
 - Trivials
 HIVE-6551 group by after join with skew join optimization references
 invalid task sometimes
 HIVE-6359 beeline -f fails on scripts with tabs in them.
 HIVE-6314 The logging (progress reporting) is too verbose
 HIVE-6241 Remove direct reference of Hadoop23Shims inQTestUtil
 HIVE-5768 Beeline connection cannot be closed with !close command
 HIVE-2752 Index names are case sensitive
 
 - Memory leakage
 HIVE-6312 doAs with plain sasl auth should be session aware
 
 - Implementation is not accord with document
 HIVE-6129 alter exchange is implemented in inverted manner
 
 I'll update the wiki, too.
 
 
 
 
 2014-03-05 12:18 GMT+09:00 Harish Butani hbut...@hortonworks.com:
 
 Tracking jiras to be applied to branch 0.13 here:
 https://cwiki.apache.org/confluence/display/Hive/Hive+0.13+release+status
 
 On Mar 4, 2014, at 5:45 PM, Harish Butani hbut...@hortonworks.com wrote:
 
 the branch is created.
 have changed the poms in both branches.
 Planning to setup a wikipage to track jiras that will get ported to 0.13
 
 regards,
 Harish.
 
 
 On Mar 4, 2014, at 5:05 PM, Harish Butani hbut...@hortonworks.com
 wrote:
 
 branching now. Will be changing the pom files on trunk.
 Will send another email when the branch and trunk changes are in.
 
 
 On Mar 4, 2014, at 4:03 PM, Sushanth Sowmyan khorg...@gmail.com
 wrote:
 
 I have two patches still as patch-available, that have had +1s as
 well, but are waiting on pre-commit tests picking them up go in to
 0.13:
 
 https://issues.apache.org/jira/browse/HIVE-6507 (refactor of table
 property names from string constants to an enum in OrcFile)
 https://issues.apache.org/jira/browse/HIVE-6499 (fixes bug where calls
 like create table and drop table can fail if metastore-side
 authorization is used in conjunction with custom
 inputformat/outputformat/serdes that are not loadable from the
 metastore-side)
 
 
 
 
 --
 CONFIDENTIALITY NOTICE
 NOTICE: This message is intended for the use of the individual or entity to
 which it is addressed and may contain information that is confidential,
 privileged and exempt from disclosure under applicable law. If the reader
 of this message is not the intended recipient, you are hereby notified that
 any printing, copying, dissemination, distribution, disclosure or
 forwarding of this communication is strictly prohibited. If you have
 received this communication in error, please contact the sender immediately
 and delete it from your system. Thank You.
 
 
 
 
 

[jira] [Updated] (HIVE-6414) ParquetInputFormat provides data values that do not match the object inspectors

2014-03-06 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6414:


Attachment: HIVE-6414.3.patch

Re-submitting the patch, on behalf of Justin, to retrigger the pre-commit test.

 ParquetInputFormat provides data values that do not match the object 
 inspectors
 ---

 Key: HIVE-6414
 URL: https://issues.apache.org/jira/browse/HIVE-6414
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Justin Coffey
  Labels: Parquet
 Fix For: 0.13.0

 Attachments: HIVE-6414.2.patch, HIVE-6414.3.patch, HIVE-6414.3.patch, 
 HIVE-6414.patch


 While working on HIVE-5998 I noticed that the ParquetRecordReader returns 
 IntWritable for all 'int like' types, in disagreement with the row object 
 inspectors. I thought fine, and worked my way around it. But I now see that 
 the issue triggers failures in other places, e.g. in aggregates:
 {noformat}
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row 
 {cint:528534767,ctinyint:31,csmallint:4963,cfloat:31.0,cdouble:4963.0,cstring1:cvLH6Eat2yFsyy7p}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast 
 to java.lang.Short
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:808)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 9 more
 Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable 
 cannot be cast to java.lang.Short
 at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaShortObjectInspector.get(JavaShortObjectInspector.java:41)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:671)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.compare(ObjectInspectorUtils.java:631)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.merge(GenericUDAFMin.java:109)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMin$GenericUDAFMinEvaluator.iterate(GenericUDAFMin.java:96)
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:183)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:641)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:838)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:735)
 at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
 ... 15 more
 {noformat}
 My test is (I'm writing a test .q from HIVE-5998, but the repro does not 
 involve vectorization):
 {noformat}
 create table if not exists alltypes_parquet (
   cint int,
   ctinyint tinyint,
   csmallint smallint,
   cfloat float,
   cdouble double,
   cstring1 string) stored as parquet;
 insert overwrite table alltypes_parquet
   select cint,
 ctinyint,
 csmallint,
 cfloat,
 cdouble,
 cstring1
   from alltypesorc;
 explain select * from alltypes_parquet limit 10; select * from 
 alltypes_parquet limit 10;
 explain select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 select ctinyint,
   max(cint),
   min(csmallint),
   count(cstring1),
   avg(cfloat),
   stddev_pop(cdouble)
   from alltypes_parquet
   group by ctinyint;
 {noformat}
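
The ClassCastException above is mechanical: Parquet stores Hive's smaller
integer types with a 32-bit physical representation, so a reader that returns
the storage value for every column hands an int-backed writable to inspectors
expecting Short or Byte. A minimal sketch, in plain Python rather than Hive
code, of the narrowing the reader needs to apply per declared type (the range
table is illustrative):

```python
# Narrow an INT32-decoded value to the declared Hive integer type, so the
# value handed downstream matches what the object inspector promises.
RANGES = {
    "tinyint":  (-2**7,  2**7  - 1),
    "smallint": (-2**15, 2**15 - 1),
    "int":      (-2**31, 2**31 - 1),
}

def narrow(value: int, hive_type: str) -> int:
    lo, hi = RANGES[hive_type]
    if not lo <= value <= hi:
        raise ValueError(f"{value} out of range for {hive_type}")
    return value

# csmallint=4963 from the failing row fits a short; the reader should emit
# a short-backed value here rather than the IntWritable it actually returns.
print(narrow(4963, "smallint"))  # 4963
```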



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6549) removed templeton.jar from webhcat-default.xml

2014-03-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922914#comment-13922914
 ] 

Lefty Leverenz commented on HIVE-6549:
--

More advantages of wikidocs:

* Access -- you can look up the information even when the code isn't available.
* Elaboration -- additional notes and guidance.
* Search -- well, I'd like to say you can always find a config variable by 
googling it, but a random check of Hive config properties had more misses than 
hits.  And one search found an svn copy of hive-default.xml.
* Review -- after initial release, descriptions are more likely to get reviewed 
in the wiki and corrections are easier.  Of course, that leads to a major 
disadvantage:  divergence of the wikidoc from the source file.

 removed templeton.jar from webhcat-default.xml
 --

 Key: HIVE-6549
 URL: https://issues.apache.org/jira/browse/HIVE-6549
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Minor

 this property is no longer used
 also removed corresponding AppConfig.TEMPLETON_JAR_NAME



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6537) NullPointerException when loading hashtable for MapJoin directly

2014-03-06 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922933#comment-13922933
 ] 

Gunther Hagleitner commented on HIVE-6537:
--

+1

 NullPointerException when loading hashtable for MapJoin directly
 

 Key: HIVE-6537
 URL: https://issues.apache.org/jira/browse/HIVE-6537
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6537.01.patch, HIVE-6537.2.patch.txt, 
 HIVE-6537.patch


 We see the following error:
 {noformat}
 2014-02-20 23:33:15,743 FATAL [main] 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:103)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:149)
 at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:164)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.NullPointerException
 at java.util.Arrays.fill(Arrays.java:2685)
 at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:155)
 at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:81)
 ... 15 more
 {noformat}
 It appears that the tables array in the Arrays.fill call is null. I don't 
 really have a full understanding of this path, but what I have gleaned so far 
 is this:
 From what I see, tables would be set unconditionally in initializeOp of the 
 sink, and in no other place, so I assume that for this code to ever work, 
 startForward must call it at least some of the time.
 Here, it doesn't call it, so it's null. 
 The previous loop also uses tables, and should have NPE-d before fill was ever 
 called; it didn't, so I'd assume it never executed. 
 There's a little bit of inconsistency in the above code, where directWorks are 
 added to parents unconditionally but the sink is only added as a child 
 conditionally. I think it may be that some of the direct works are not table 
 scans; in fact, given that the loop never executes, they may be null (which is 
 rather strange). 
 Regardless, it seems that the logic should be fixed; it may be the root cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler

2014-03-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922962#comment-13922962
 ] 

Xuefu Zhang commented on HIVE-6411:
---

[~navis], would you mind updating the review board with the latest patch? Thanks.

 Support more generic way of using composite key for HBaseHandler
 

 Key: HIVE-6411
 URL: https://issues.apache.org/jira/browse/HIVE-6411
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.2.patch.txt, 
 HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt


 HIVE-2599 introduced using a custom object for the row key, but it forces key 
 objects to extend HBaseCompositeKey, which is in turn an extension of LazyStruct. 
 If the user provides a proper Object and OI, we can replace the internal key 
 and keyOI with those. 
 The initial implementation is based on a factory interface:
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6566:
-

Attachment: HIVE-6566.2.patch

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch, HIVE-6566.2.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922968#comment-13922968
 ] 

Gunther Hagleitner commented on HIVE-6566:
--

.2 adds comments (per review request).

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch, HIVE-6566.2.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6566:
-

Status: Open  (was: Patch Available)

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch, HIVE-6566.2.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6566:
-

Status: Patch Available  (was: Open)

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch, HIVE-6566.2.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6338) Improve exception handling in createDefaultDb() in Metastore

2014-03-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923008#comment-13923008
 ] 

Hive QA commented on HIVE-6338:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12633195/HIVE-6338.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5358 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1638/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1638/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12633195

 Improve exception handling in createDefaultDb() in Metastore
 

 Key: HIVE-6338
 URL: https://issues.apache.org/jira/browse/HIVE-6338
 Project: Hive
  Issue Type: Task
  Components: Metastore
Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6338.1.patch, HIVE-6338.patch


 There is a suggestion on HIVE-5959 comment list on possible improvements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive

2014-03-06 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-5728:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Patch is committed. Marking it as resolved.

 Make ORC InputFormat/OutputFormat usable outside Hive
 -

 Key: HIVE-5728
 URL: https://issues.apache.org/jira/browse/HIVE-5728
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.13.0

 Attachments: HIVE-5728-1.patch, HIVE-5728-10.patch, 
 HIVE-5728-2.patch, HIVE-5728-3.patch, HIVE-5728-4.patch, HIVE-5728-5.patch, 
 HIVE-5728-6.patch, HIVE-5728-7.patch, HIVE-5728-8.patch, HIVE-5728-9.patch, 
 HIVE-5728.10.patch, HIVE-5728.11.patch, HIVE-5728.12.patch, HIVE-5728.13.patch


 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are 
 several issues to solve:
 1. Several classes are not public, e.g. OrcStruct
 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig 
 need the new API)
 3. There is no way to push WriteOptions to the OutputFormat outside Hive



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6507) OrcFile table property names are specified as strings

2014-03-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6507:
---

Status: Open  (was: Patch Available)

 OrcFile table property names are specified as strings
 -

 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6507.2.patch, HIVE-6507.patch


 In HIVE-5504, we had to do some special casing in HCatalog to add a 
 particular set of orc table properties from table properties to job 
 properties.
 In doing so, it's obvious that that is a bit cumbersome, and ideally, the 
 list of all orc file table properties should really be an enum, rather than 
 individual loosely tied constant strings. If we were to clean this up, we can 
 clean up other code that references this to reference the entire enum, and 
 avoid future errors when new table properties are introduced, but other 
 referencing code is not updated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6507) OrcFile table property names are specified as strings

2014-03-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6507:
---

Status: Patch Available  (was: Open)

 OrcFile table property names are specified as strings
 -

 Key: HIVE-6507
 URL: https://issues.apache.org/jira/browse/HIVE-6507
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6507.2.patch, HIVE-6507.patch


 In HIVE-5504, we had to do some special casing in HCatalog to add a 
 particular set of orc table properties from table properties to job 
 properties.
 In doing so, it's obvious that that is a bit cumbersome, and ideally, the 
 list of all orc file table properties should really be an enum, rather than 
 individual loosely tied constant strings. If we were to clean this up, we can 
 clean up other code that references this to reference the entire enum, and 
 avoid future errors when new table properties are introduced, but other 
 referencing code is not updated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6566) Incorrect union-all plan with map-joins on Tez

2014-03-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923033#comment-13923033
 ] 

Sergey Shelukhin commented on HIVE-6566:


+1

 Incorrect union-all plan with map-joins on Tez
 --

 Key: HIVE-6566
 URL: https://issues.apache.org/jira/browse/HIVE-6566
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-6566.1.patch, HIVE-6566.2.patch


 The tez dag is hooked up incorrectly for some union all queries involving map 
 joins.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6137) Hive should report that the file/path doesn’t exist when it doesn’t

2014-03-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-6137:


Attachment: HIVE-6137.6.patch

cc-ing [~ashutoshc]. Slight difference from the previous patch, which caused 
e.getCause() to return null.

 Hive should report that the file/path doesn’t exist when it doesn’t
 ---

 Key: HIVE-6137
 URL: https://issues.apache.org/jira/browse/HIVE-6137
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-6137.1.patch, HIVE-6137.2.patch, HIVE-6137.3.patch, 
 HIVE-6137.4.patch, HIVE-6137.5.patch, HIVE-6137.6.patch


 Hive should report that the file/path doesn’t exist when it doesn’t (it now 
 reports SocketTimeoutException):
 Execute a Hive DDL query with a reference to a non-existent blob (such as 
 CREATE EXTERNAL TABLE...) and check Hive logs (stderr):
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
 java.io.IOException)
 This error message is not detailed enough. If a file doesn't exist, Hive 
 should report that it received an error while trying to locate the file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923044#comment-13923044
 ] 

Sergey Shelukhin commented on HIVE-6060:


is it possible to post rb?

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, acid-io.patch, h-5317.patch, 
 h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6338) Improve exception handling in createDefaultDb() in Metastore

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6338:
---

   Resolution: Fixed
Fix Version/s: (was: 0.13.0)
   0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 Improve exception handling in createDefaultDb() in Metastore
 

 Key: HIVE-6338
 URL: https://issues.apache.org/jira/browse/HIVE-6338
 Project: Hive
  Issue Type: Task
  Components: Metastore
Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Blocker
 Fix For: 0.14.0

 Attachments: HIVE-6338.1.patch, HIVE-6338.patch


 There is a suggestion on HIVE-5959 comment list on possible improvements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6569:
--

 Summary: HCatalog still has references to deprecated property 
hive.metastore.local
 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
 Attachments: HIVE-6569.patch

HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
has references to it. Most of it is in tests, but one is in PigHCatUtil, which 
leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6569:
---

Attachment: HIVE-6569.patch

Patch attached.

 HCatalog still has references to deprecated property hive.metastore.local
 -

 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
  Labels: cleanup, hcatalog
 Attachments: HIVE-6569.patch


 HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
 has references to it. Most of it is in tests, but one is in PigHCatUtil, 
 which leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove 
 them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6569:
---

Status: Patch Available  (was: Open)

 HCatalog still has references to deprecated property hive.metastore.local
 -

 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
  Labels: cleanup, hcatalog
 Attachments: HIVE-6569.patch


 HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
 has references to it. Most of it is in tests, but one is in PigHCatUtil, 
 which leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove 
 them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923077#comment-13923077
 ] 

Eugene Koifman commented on HIVE-6569:
--

webhcat-default.xml has a reference to it as well; it should probably be removed too.

 HCatalog still has references to deprecated property hive.metastore.local
 -

 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
  Labels: cleanup, hcatalog
 Attachments: HIVE-6569.patch


 HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
 has references to it. Most of it is in tests, but one is in PigHCatUtil, 
 which leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove 
 them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-06 Thread Anthony Hsu (JIRA)
Anthony Hsu created HIVE-6570:
-

 Summary: Hive variable substitution does not work with the 
source command
 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu


The following does not work:
{code}
source ${hivevar:test-dir}/test.q;
{code}
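The CLI expands ${hivevar:...} references before executing most commands; the bug is that the `source` path bypasses that expansion. A minimal stand-alone sketch of the kind of substitution involved (a hypothetical helper, not Hive's actual VariableSubstitution code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of ${hivevar:name}-style expansion. Hive's real logic also
// handles hiveconf/system/env namespaces; this version only illustrates the
// expansion the `source` command is missing.
public class VarSubstDemo {
    private static final Pattern VAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    static String substitute(String input, Map<String, String> vars) {
        Matcher m = VAR.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            // Substitute known variables; leave unknown references untouched.
            String value = vars.containsKey(name) ? vars.get(name) : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("test-dir", "/tmp/queries");
        System.out.println(substitute("source ${hivevar:test-dir}/test.q;", vars));
        // prints: source /tmp/queries/test.q;
    }
}
```

A fix would presumably run the `source` command's file path through this same expansion before opening the file.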



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-06 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923079#comment-13923079
 ] 

Anthony Hsu commented on HIVE-6570:
---

I have a fix for this issue and will upload a patch shortly.

 Hive variable substitution does not work with the source command
 --

 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu

 The following does not work:
 {code}
 source ${hivevar:test-dir}/test.q;
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6538) yet another annoying exception in test logs

2014-03-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923087#comment-13923087
 ] 

Sergey Shelukhin commented on HIVE-6538:


has long line, +1 otherwise

 yet another annoying exception in test logs
 ---

 Key: HIVE-6538
 URL: https://issues.apache.org/jira/browse/HIVE-6538
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Szehon Ho
Priority: Trivial
 Attachments: HIVE-6538.2.patch, HIVE-6538.2.patch, HIVE-6538.patch


 Whenever you look at failed q tests you have to go thru this useless 
 exception.
 {noformat}
 2014-03-03 11:22:54,872 ERROR metastore.RetryingHMSHandler 
 (RetryingHMSHandler.java:invoke(143)) - 
 MetaException(message:NoSuchObjectException(message:Function 
 default.qtest_get_java_boolean does not exist))
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4575)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4702)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
   at $Proxy8.get_function(Unknown Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1526)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at $Proxy9.getFunction(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2603)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:578)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:599)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:606)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:94)
   at 
 org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:60)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:445)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:345)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1078)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1121)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1014)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
   at org.apache.hadoop.hive.ql.QTestUtil.runCmd(QTestUtil.java:655)
   at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:772)
   at 
 org.apache.hadoop.hive.cli.TestCliDriver.<clinit>(TestCliDriver.java:46)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.internal.runners.SuiteMethod.testFromSuiteMethod(SuiteMethod.java:34)
   at org.junit.internal.runners.SuiteMethod.<init>(SuiteMethod.java:23)
   at 
 org.junit.internal.builders.SuiteMethodBuilder.runnerForClass(SuiteMethodBuilder.java:14)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:29)
   at 
 org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
   at 
 org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:262)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
   at 
 

[jira] [Updated] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6569:
---

Attachment: HIVE-6569.2.patch

 HCatalog still has references to deprecated property hive.metastore.local
 -

 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
  Labels: cleanup, hcatalog
 Attachments: HIVE-6569.2.patch, HIVE-6569.patch


 HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
 has references to it. Most of it is in tests, but one is in PigHCatUtil, 
 which leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove 
 them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6569) HCatalog still has references to deprecated property hive.metastore.local

2014-03-06 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923118#comment-13923118
 ] 

Sushanth Sowmyan commented on HIVE-6569:


Good catch - updating patch with a couple more instances I found in .xml files.

 HCatalog still has references to deprecated property hive.metastore.local
 -

 Key: HIVE-6569
 URL: https://issues.apache.org/jira/browse/HIVE-6569
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor
  Labels: cleanup, hcatalog
 Attachments: HIVE-6569.2.patch, HIVE-6569.patch


 HIVE-2585 removed the conf parameter hive.metastore.local, but HCatalog still 
 has references to it. Most of it is in tests, but one is in PigHCatUtil, 
 which leads to HCatLoader/HCatStorer jobs giving warnings. We need to remove 
 them.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6147) Support avro data stored in HBase columns

2014-03-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923121#comment-13923121
 ] 

Xuefu Zhang commented on HIVE-6147:
---

[~swarnim] I'm not totally convinced that these tests are unrelated, as they 
consistently appear in the test results. In addition, I manually ran 
TestHCatLoader and got errors like the following:
{code}
testProjectionsBasic(org.apache.hive.hcatalog.pig.TestHCatLoader)  Time 
elapsed: 0.184 sec  <<< ERROR!
java.io.IOException: Failed to execute create table 
junit_unparted_complex(name string, studentid int, contact 
struct<phno:string,email:string>, currently_registered_courses array<string>, 
current_grades map<string,string>, phnos 
array<struct<phno:string,type:string>>) stored as RCFILE 
tblproperties('hcat.isd'='org.apache.hive.hcatalog.rcfile.RCFileInputDriver','hcat.osd'='org.apache.hive.hcatalog.rcfile.RCFileOutputDriver').
 Driver returned 1 Error: FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
at 
org.apache.hive.hcatalog.pig.TestHCatLoader.executeStatementOnDriver(TestHCatLoader.java:125)
at 
org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:111)
at 
org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:101)
at 
org.apache.hive.hcatalog.pig.TestHCatLoader.createTable(TestHCatLoader.java:115)
at 
org.apache.hive.hcatalog.pig.TestHCatLoader.setup(TestHCatLoader.java:154)
{code}

Please further investigate.

 Support avro data stored in HBase columns
 -

 Key: HIVE-6147
 URL: https://issues.apache.org/jira/browse/HIVE-6147
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-6147.1.patch.txt, HIVE-6147.2.patch.txt, 
 HIVE-6147.3.patch.txt, HIVE-6147.3.patch.txt


 Presently, the HBase Hive integration supports querying only primitive data 
 types in columns. It would be nice to be able to store and query Avro objects 
 in HBase columns by making them visible as structs to Hive. This will allow 
 Hive to perform ad hoc analysis of HBase data which can be deeply structured.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6571) query id should be available for logging during query compilation

2014-03-06 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-6571:


 Summary: query id should be available for logging during query 
compilation
 Key: HIVE-6571
 URL: https://issues.apache.org/jira/browse/HIVE-6571
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor


Would be nice to have the query id set during compilation to tie logs together 
etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6417) sql std auth - new users in admin role config should get added

2014-03-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923144#comment-13923144
 ] 

Hive QA commented on HIVE-6417:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12633196/HIVE-6417.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5359 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hive.beeline.TestSchemaTool.testSchemaInit
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1640/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1640/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12633196

 sql std auth - new users in admin role config should get added
 --

 Key: HIVE-6417
 URL: https://issues.apache.org/jira/browse/HIVE-6417
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6417.1.patch, HIVE-6417.patch


 If the metastore is started with hive.users.in.admin.role=user1, then user1 is 
 added to the admin role in the metastore.
 If the value is changed to hive.users.in.admin.role=user2, then user2 should 
 get added to the role in the metastore. Right now, if the admin role already 
 exists, new users don't get added.
 A workaround is for user1 to add user2 to the admin role using a grant role 
 statement.
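The behavior the description asks for boils down to reconciling the configured user list with the role's current membership at startup, instead of skipping the step when the role already exists. A hypothetical sketch of that reconciliation (names and shapes are illustrative, not Hive's actual metastore API):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of the desired startup behavior: compute which users
// from hive.users.in.admin.role are not yet members of the admin role and
// add only those.
public class AdminRoleSync {
    static Set<String> usersToAdd(String configValue, Set<String> currentMembers) {
        Set<String> configured = new LinkedHashSet<>(Arrays.asList(configValue.split(",")));
        configured.removeAll(currentMembers); // keep only the missing users
        return configured;
    }

    public static void main(String[] args) {
        Set<String> members = new HashSet<>(Arrays.asList("user1"));
        // Config was changed from user1 to user2: user2 must still be added.
        System.out.println(usersToAdd("user2", members)); // [user2]
    }
}
```

The metastore startup code would then grant the admin role only to the returned users, making the existing-role case a no-op rather than an early exit.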



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6571) query id should be available for logging during query compilation

2014-03-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923150#comment-13923150
 ] 

Sergey Shelukhin commented on HIVE-6571:


+1

 query id should be available for logging during query compilation
 -

 Key: HIVE-6571
 URL: https://issues.apache.org/jira/browse/HIVE-6571
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-6571.1.patch


 Would be nice to have the query id set during compilation to tie logs 
 together etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6571) query id should be available for logging during query compilation

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6571:
-

Attachment: HIVE-6571.1.patch

 query id should be available for logging during query compilation
 -

 Key: HIVE-6571
 URL: https://issues.apache.org/jira/browse/HIVE-6571
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-6571.1.patch


 Would be nice to have the query id set during compilation to tie logs 
 together etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6571) query id should be available for logging during query compilation

2014-03-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6571:
-

Status: Patch Available  (was: Open)

 query id should be available for logging during query compilation
 -

 Key: HIVE-6571
 URL: https://issues.apache.org/jira/browse/HIVE-6571
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-6571.1.patch


 Would be nice to have the query id set during compilation to tie logs 
 together etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6571) query id should be available for logging during query compilation

2014-03-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923152#comment-13923152
 ] 

Sergey Shelukhin commented on HIVE-6571:


this queryId really ties the logs together...

 query id should be available for logging during query compilation
 -

 Key: HIVE-6571
 URL: https://issues.apache.org/jira/browse/HIVE-6571
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Priority: Minor
 Attachments: HIVE-6571.1.patch


 Would be nice to have the query id set during compilation to tie logs 
 together etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6555) TestSchemaTool is failing on trunk after branching

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6555:
---

Status: Open  (was: Patch Available)

 TestSchemaTool is failing on trunk after branching
 --

 Key: HIVE-6555
 URL: https://issues.apache.org/jira/browse/HIVE-6555
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6555-branch13.patch, HIVE-6555.1.patch, 
 HIVE-6555.patch


 This is because the version was bumped to 0.14 in the pom file and there are 
 no metastore scripts for 0.14 yet.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6555) TestSchemaTool is failing on trunk after branching

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6555:
---

Status: Patch Available  (was: Open)

 TestSchemaTool is failing on trunk after branching
 --

 Key: HIVE-6555
 URL: https://issues.apache.org/jira/browse/HIVE-6555
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6555-branch13.patch, HIVE-6555.1.patch, 
 HIVE-6555.patch


 This is because the version was bumped to 0.14 in the pom file and there are 
 no metastore scripts for 0.14 yet.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6555) TestSchemaTool is failing on trunk after branching

2014-03-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6555:
---

Attachment: HIVE-6555.1.patch

Same patch; re-uploading so Hive QA picks it up.

 TestSchemaTool is failing on trunk after branching
 --

 Key: HIVE-6555
 URL: https://issues.apache.org/jira/browse/HIVE-6555
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6555-branch13.patch, HIVE-6555.1.patch, 
 HIVE-6555.patch


 This is because the version was bumped to 0.14 in the pom file and there are 
 no metastore scripts for 0.14 yet.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

Attachment: HIVE-6430.patch

Reattaching the patch, with some fixes in new code (not working yet). It looks 
like QA didn't pick it up.

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 
 for the row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need a Java hash table there. We can either use a primitive-friendly hash 
 table like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row storage structure without an object per row 
 (similar to vectorization).
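
The primitive-friendly hash table idea above can be sketched as an open-addressing long-to-int map with no boxed objects. This is an illustration in the spirit of HPPC-style primitive collections, not Hive's actual MapJoin implementation; the class and method names are hypothetical, and resizing is omitted for brevity.

```java
// Minimal open-addressing long -> int hash map: parallel primitive arrays,
// linear probing, no per-entry objects. Illustrative sketch only.
public class LongIntHashMap {
    private final long[] keys;
    private final int[] values;
    private final boolean[] used;
    private int size;

    public LongIntHashMap(int capacity) {
        // Power-of-two capacity so we can mask instead of mod; no resizing here.
        int cap = Integer.highestOneBit(Math.max(capacity, 8) * 2);
        keys = new long[cap];
        values = new int[cap];
        used = new boolean[cap];
    }

    // Find the slot holding this key, or the first free slot for it.
    private int slot(long key) {
        int mask = keys.length - 1;
        int i = (int) (key ^ (key >>> 32)) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing
        }
        return i;
    }

    public void put(long key, int value) {
        int i = slot(key);
        if (!used[i]) { used[i] = true; keys[i] = key; size++; }
        values[i] = value;
    }

    public int getOrDefault(long key, int dflt) {
        int i = slot(key);
        return used[i] ? values[i] : dflt;
    }

    public int size() { return size; }
}
```

Compared with java.util.HashMap&lt;Long, Integer&gt;, this layout avoids an Entry object, a boxed key, and a boxed value per row, which is the overhead the issue describes.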



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-06 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-6570:
-

Assignee: Anthony Hsu

 Hive variable substitution does not work with the source command
 --

 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu

 The following does not work:
 {code}
 source ${hivevar:test-dir}/test.q;
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6572) Use shimmed version of hadoop conf names for mapred.{min,max}.split.size{.*}

2014-03-06 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6572:
--

 Summary: Use shimmed version of hadoop conf names for 
mapred.{min,max}.split.size{.*}
 Key: HIVE-6572
 URL: https://issues.apache.org/jira/browse/HIVE-6572
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan


HadoopShims has a method to fetch config parameters by name so that it returns 
the appropriate config param name for the appropriate Hadoop version. We need 
to be consistent about using these.

For example, mapred.min.split.size is deprecated in Hadoop 2.x and is instead 
called mapreduce.input.fileinputformat.split.minsize.

Also, there is a bug in Hadoop20SShims and Hadoop20Shims that defines 
MAPREDMINSPLITSIZEPERNODE as mapred.min.split.size.per.rack and 
MAPREDMINSPLITSIZEPERRACK as mapred.min.split.size.per.node. This is wrong and 
confusing.
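
The shim lookup described above can be sketched as a per-version table from a logical constant to the concrete property name. The class, enum, and method names below are hypothetical, not the actual HadoopShims API; note the per.node/per.rack pairing, which the reported bug had swapped.

```java
import java.util.Map;

// Illustrative sketch: resolve a logical config key to the name used by
// the running Hadoop version. Not the real HadoopShims interface.
public class ConfNameShim {
    enum ConfVar { MINSPLITSIZE, MAXSPLITSIZE, MINSPLITSIZEPERNODE, MINSPLITSIZEPERRACK }

    private static final Map<ConfVar, String> HADOOP1 = Map.of(
        ConfVar.MINSPLITSIZE,        "mapred.min.split.size",
        ConfVar.MAXSPLITSIZE,        "mapred.max.split.size",
        ConfVar.MINSPLITSIZEPERNODE, "mapred.min.split.size.per.node", // per.node, not per.rack
        ConfVar.MINSPLITSIZEPERRACK, "mapred.min.split.size.per.rack");

    private static final Map<ConfVar, String> HADOOP2 = Map.of(
        ConfVar.MINSPLITSIZE,        "mapreduce.input.fileinputformat.split.minsize",
        ConfVar.MAXSPLITSIZE,        "mapreduce.input.fileinputformat.split.maxsize",
        ConfVar.MINSPLITSIZEPERNODE, "mapreduce.input.fileinputformat.split.minsize.per.node",
        ConfVar.MINSPLITSIZEPERRACK, "mapreduce.input.fileinputformat.split.minsize.per.rack");

    // Callers ask for the logical key; the shim picks the version-specific name.
    public static String name(ConfVar var, boolean hadoop2) {
        return (hadoop2 ? HADOOP2 : HADOOP1).get(var);
    }
}
```

Centralizing the mapping this way means a swapped constant (the per.node/per.rack bug) only needs fixing in one table, and callers never hard-code a deprecated name.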



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command

2014-03-06 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6570:
--

Attachment: HIVE-6570.1.patch.txt

Added support for Hive variable substitution with the source command, and 
added a test for this in source.q.
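
The kind of substitution the fix needs can be sketched as a regex pass over the source command's argument before the file path is resolved. This is a minimal illustration handling only the ${hivevar:...} form; the class and method names are hypothetical, not Hive's actual VariableSubstitution code.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: expand ${hivevar:name} references in a string
// (e.g. the argument of "source ${hivevar:test-dir}/test.q;").
public class VarSubstitution {
    private static final Pattern VAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    public static String substitute(String input, Map<String, String> hivevars) {
        Matcher m = VAR.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown variables are left as-is rather than replaced with nothing.
            String value = hivevars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With hivevars = {"test-dir": "/tmp/tests"}, the failing example from the issue would expand to /tmp/tests/test.q before the source command opens the file.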

 Hive variable substitution does not work with the source command
 --

 Key: HIVE-6570
 URL: https://issues.apache.org/jira/browse/HIVE-6570
 Project: Hive
  Issue Type: Bug
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6570.1.patch.txt


 The following does not work:
 {code}
 source ${hivevar:test-dir}/test.q;
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6060) Define API for RecordUpdater and UpdateReader

2014-03-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923235#comment-13923235
 ] 

Owen O'Malley commented on HIVE-6060:
-

I'm not sure why it didn't link, but here:

https://reviews.apache.org/r/18810/diff/

 Define API for RecordUpdater and UpdateReader
 -

 Key: HIVE-6060
 URL: https://issues.apache.org/jira/browse/HIVE-6060
 Project: Hive
  Issue Type: Sub-task
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-6060.patch, acid-io.patch, h-5317.patch, 
 h-5317.patch, h-5317.patch, h-6060.patch, h-6060.patch


 We need to define some new APIs for how Hive interacts with the file formats 
 since it needs to be much richer than the current RecordReader and 
 RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

