[jira] [Created] (HIVE-9618) Deduplicate RS keys for ptf/windowing

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9618:
---

 Summary: Deduplicate RS keys for ptf/windowing
 Key: HIVE-9618
 URL: https://issues.apache.org/jira/browse/HIVE-9618
 Project: Hive
  Issue Type: Improvement
  Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial


Currently, a partition spec that contains the same column in both partition-by and 
order-by produces a duplicated key column in the ReduceSink (RS) operator. For example, 
{noformat}
explain
select p_mfgr, p_name, p_size, 
rank() over (partition by p_mfgr order by p_name) as r, 
dense_rank() over (partition by p_mfgr order by p_name) as dr, 
sum(p_retailprice) over (partition by p_mfgr order by p_name rows between 
unbounded preceding and current row)  as s1
from noop(on noopwithmap(on noop(on part 
partition by p_mfgr 
order by p_mfgr, p_name
)))
{noformat}

"partition by p_mfgr order by p_mfgr, p_name" produces duplicated key columns, as shown 
below:
{noformat}
Reduce Output Operator
key expressions: p_mfgr (type: string), p_mfgr (type: string), p_name 
(type: string)
sort order: +++
Map-reduce partition columns: p_mfgr (type: string)
value expressions: p_size (type: int), p_retailprice (type: double)
{noformat}
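The intended optimization can be sketched as follows: drop repeated key columns from the RS key list while keeping each column's first sort flag. This is a conceptual Python illustration with a hypothetical function name, not Hive's actual optimizer code.

```python
def dedup_rs_keys(key_exprs, sort_order):
    """Drop duplicate key columns from a ReduceSink key list, keeping the
    first occurrence of each column and its sort flag.
    (Conceptual sketch only; not Hive's PTF/windowing planner code.)"""
    seen = set()
    keys, order = [], []
    for expr, flag in zip(key_exprs, sort_order):
        if expr not in seen:
            seen.add(expr)
            keys.append(expr)
            order.append(flag)
    return keys, "".join(order)

# the example above: key expressions p_mfgr, p_mfgr, p_name with sort order +++
print(dedup_rs_keys(["p_mfgr", "p_mfgr", "p_name"], "+++"))
```

Applied to the plan above, the duplicated p_mfgr key would collapse to key expressions (p_mfgr, p_name) with sort order ++.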



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311907#comment-14311907
 ] 

Hive QA commented on HIVE-9499:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697379/HIVE-9499.3.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7520 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2715/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2715/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2715/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697379 - PreCommit-HIVE-TRUNK-Build

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>Assignee: Navis
> Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt, 
> HIVE-9499.3.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the number of 
> partitions that can be queried, it makes queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}
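The likely shape of the fix is to apply the partition-count check only when the table actually has partitions, so a non-partitioned table no longer hits the null dereference. A minimal Python sketch of that guard (hypothetical function and parameter names, not Hive's SemanticAnalyzer code):

```python
def check_partition_limit(table_partitions, max_table_partition):
    """Apply hive.limit.query.max.table.partition only when the table is
    partitioned. `table_partitions` is None for a non-partitioned table,
    which is the case that previously raised the NPE. (Sketch only.)"""
    if max_table_partition < 0:
        return  # limit disabled (the default, -1)
    if table_partitions is None:
        return  # non-partitioned table: nothing to limit
    if len(table_partitions) > max_table_partition:
        raise ValueError(
            "Query scans %d partitions, exceeding the limit of %d"
            % (len(table_partitions), max_table_partition))
```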





[jira] [Updated] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9617:
--
Attachment: HIVE-9617.1.patch

patch #1

> UDF from_utc_timestamp throws NPE if the second argument is null
> 
>
> Key: HIVE-9617
> URL: https://issues.apache.org/jira/browse/HIVE-9617
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9617.1.patch
>
>
> UDF from_utc_timestamp throws NPE if the second argument is null
> {code}
> select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string));
> FAILED: NullPointerException null
> {code}
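A null-safe UDF of this kind is expected to return NULL when either argument is NULL instead of dereferencing it. The sketch below models that behavior in Python with a simplified fixed-offset timezone table; it mirrors the likely fix, not Hive's actual GenericUDFFromUtcTimestamp Java code.

```python
from datetime import datetime, timedelta

# simplified fixed-offset table for illustration only
# (the real UDF resolves full timezone names via the Java timezone database)
_OFFSETS = {"UTC": 0, "PST": -8, "EST": -5}

def from_utc_timestamp(ts, tz_name):
    """Null-safe sketch: NULL in either argument yields NULL, never an NPE."""
    if ts is None or tz_name is None:
        return None
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    shifted = dt + timedelta(hours=_OFFSETS[tz_name])
    return shifted.strftime("%Y-%m-%d %H:%M:%S")
```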





[jira] [Updated] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9617:
--
Status: Patch Available  (was: In Progress)

> UDF from_utc_timestamp throws NPE if the second argument is null
> 
>
> Key: HIVE-9617
> URL: https://issues.apache.org/jira/browse/HIVE-9617
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
> Attachments: HIVE-9617.1.patch
>
>
> UDF from_utc_timestamp throws NPE if the second argument is null
> {code}
> select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string));
> FAILED: NullPointerException null
> {code}





Review Request 30788: HIVE-9617 UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30788/
---

Review request for hive, Jason Dere and Siying Dong.


Bugs: HIVE-9617
https://issues.apache.org/jira/browse/HIVE-9617


Repository: hive-git


Description
---

HIVE-9617 UDF from_utc_timestamp throws NPE if the second argument is null


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java
 f76fc104774cf77597d8467c9dcf3fe8d05cddce 

Diff: https://reviews.apache.org/r/30788/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Updated] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9617:
--
Description: 
UDF from_utc_timestamp throws NPE if the second argument is null
{code}
select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string));
FAILED: NullPointerException null
{code}

  was:
UDF from_utc_timestamp throws NPE if the second argument is null
{code}
select from_utc_timestamp(1423465304, cast(null as string));
FAILED: NullPointerException null
{code}


> UDF from_utc_timestamp throws NPE if the second argument is null
> 
>
> Key: HIVE-9617
> URL: https://issues.apache.org/jira/browse/HIVE-9617
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
>
> UDF from_utc_timestamp throws NPE if the second argument is null
> {code}
> select from_utc_timestamp('2015-02-06 10:30:00', cast(null as string));
> FAILED: NullPointerException null
> {code}





[jira] [Work started] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-9617 started by Alexander Pivovarov.
-
> UDF from_utc_timestamp throws NPE if the second argument is null
> 
>
> Key: HIVE-9617
> URL: https://issues.apache.org/jira/browse/HIVE-9617
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Minor
>
> UDF from_utc_timestamp throws NPE if the second argument is null
> {code}
> select from_utc_timestamp(1423465304, cast(null as string));
> FAILED: NullPointerException null
> {code}





[jira] [Created] (HIVE-9617) UDF from_utc_timestamp throws NPE if the second argument is null

2015-02-08 Thread Alexander Pivovarov (JIRA)
Alexander Pivovarov created HIVE-9617:
-

 Summary: UDF from_utc_timestamp throws NPE if the second argument 
is null
 Key: HIVE-9617
 URL: https://issues.apache.org/jira/browse/HIVE-9617
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Priority: Minor


UDF from_utc_timestamp throws NPE if the second argument is null
{code}
select from_utc_timestamp(1423465304, cast(null as string));
FAILED: NullPointerException null
{code}





[jira] [Commented] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311859#comment-14311859
 ] 

Hive QA commented on HIVE-9507:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697377/HIVE-9507.2.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7526 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_inline
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2714/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2714/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2714/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697377 - PreCommit-HIVE-TRUNK-Build

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
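The requested behavior is for inline() to treat a NULL array like an empty one, emitting zero rows instead of throwing, so the explicit IS NOT NULL filter becomes unnecessary. A conceptual Python model of that semantics (illustration only, not Hive's GenericUDTFInline):

```python
def inline(array_of_structs):
    """Null-tolerant inline() sketch: a NULL array explodes to zero rows
    rather than raising an exception."""
    if array_of_structs is None:
        return []  # tolerate NULL: no rows produced for this record
    return [tuple(struct) for struct in array_of_structs if struct is not None]

# a NULL entities.media field simply contributes no rows to the lateral view
print(inline(None))
```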





[jira] [Assigned] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov reassigned HIVE-6069:
-

Assignee: Alexander Pivovarov

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/





[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-6069:
--
Status: Patch Available  (was: In Progress)

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/





[jira] [Work started] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6069 started by Alexander Pivovarov.
-
> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/





[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-6069:
--
Attachment: HIVE-6069.1.patch

patch #1

> Improve error message in GenericUDFRound
> 
>
> Key: HIVE-6069
> URL: https://issues.apache.org/jira/browse/HIVE-6069
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Xuefu Zhang
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Attachments: HIVE-6069.1.patch
>
>
> Suggested in HIVE-6039 review board.
> https://reviews.apache.org/r/16329/





Review Request 30786: HIVE-6069 Improve error message in GenericUDFRound

2015-02-08 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30786/
---

Review request for hive, Jason Dere and Xuefu Zhang.


Bugs: HIVE-6069
https://issues.apache.org/jira/browse/HIVE-6069


Repository: hive-git


Description
---

HIVE-6069 Improve error message in GenericUDFRound


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java 
387de5e3103e2399a3fdb9b2376ac790501eab8e 

Diff: https://reviews.apache.org/r/30786/diff/


Testing
---


Thanks,

Alexander Pivovarov



[jira] [Updated] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9228:

Attachment: HIVE-9228.3.patch.txt

Updated gold file

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
> HIVE-9228.3.patch.txt, create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails, although the inner query works 
> fine:
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 
> then 1 end) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs: 
> 1. The first job calculates the window function for col5.  
> 2. The second job calculates the window function for col6 and produces the output.
> The plan says the first job writes the columns (col1, col2, col3, col4) to a 
> tmp file, since only these columns are used in the later stage. However, the PTF 
> operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
> _wcol0 holding the result of the window function even though it is not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file according to the plan. That mismatch causes the 
> exception.
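The fix implied by the description is column pruning on the PTF output: keep only the columns the downstream plan references, so the writer's and reader's schemas agree. A conceptual Python sketch (hypothetical function name, not Hive's ColumnPruner code):

```python
def prune_ptf_output(ptf_columns, referenced_columns):
    """Keep only the PTF output columns that the downstream stage actually
    references, so the intermediate file schema matches what the second
    job's map operator expects to read. (Conceptual sketch only.)"""
    referenced = set(referenced_columns)
    return [c for c in ptf_columns if c in referenced]

# first job emits _wcol0 plus the four source columns, but only the four
# source columns are referenced downstream
print(prune_ptf_output(["_wcol0", "col1", "col2", "col3", "col4"],
                       ["col1", "col2", "col3", "col4"]))
```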





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Attachment: HIVE-9039.23.patch

Rebased the patch on the most recent changes.

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, 
> HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch
>
>
> CLEAR LIBRARY CACHE
> The current version (Hive 0.14) does not support union (i.e. union distinct); it 
> only supports union all. This patch adds the feature by 
> rewriting union distinct as union all followed by a group by.
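The rewrite strategy can be modeled directly: take the union-all result and group by all columns, which keeps one representative per distinct row. A small Python illustration of the approach (not the actual Hive rewrite code):

```python
def union_distinct(left_rows, right_rows):
    """UNION DISTINCT modeled as UNION ALL followed by GROUP BY on all
    columns: one row survives per distinct tuple. (Conceptual sketch.)"""
    union_all = list(left_rows) + list(right_rows)
    seen, result = set(), []
    for row in union_all:
        if row not in seen:  # grouping on every column == distinct
            seen.add(row)
            result.append(row)
    return result

print(union_distinct([(1,), (2,)], [(2,), (3,)]))
```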





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Patch Available  (was: Open)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, 
> HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch
>
>
> CLEAR LIBRARY CACHE
> The current version (Hive 0.14) does not support union (i.e. union distinct); it 
> only supports union all. This patch adds the feature by 
> rewriting union distinct as union all followed by a group by.





[jira] [Updated] (HIVE-9039) Support Union Distinct

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9039:
--
Status: Open  (was: Patch Available)

> Support Union Distinct
> --
>
> Key: HIVE-9039
> URL: https://issues.apache.org/jira/browse/HIVE-9039
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, 
> HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, 
> HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, 
> HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, 
> HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, 
> HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, 
> HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, 
> HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch
>
>
> CLEAR LIBRARY CACHE
> The current version (Hive 0.14) does not support union (i.e. union distinct); it 
> only supports union all. This patch adds the feature by 
> rewriting union distinct as union all followed by a group by.





[jira] [Commented] (HIVE-9228) Problem with subquery using windowing functions

2015-02-08 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311850#comment-14311850
 ] 

Navis commented on HIVE-9228:
-

[~aihuaxu] Sorry for breaking in on this issue. I've been working on the code 
around column pruning (CP) for other issues and didn't want others to waste time 
trying to understand the complicated PTF operation. I think the fix is almost done. Sorry again.

> Problem with subquery using windowing functions
> ---
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.13.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt, 
> create_table_tab1.sql, tab1.csv
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The following query with window functions fails, although the inner query works 
> fine:
> select col1, col2, col3 from (select col1, col2, col3, count(case when col4=1 
> then 1 end) over (partition by col1, col2) as col5, row_number() over 
> (partition by col1, col2 order by col4) as col6 from tab1) t;
> Hive generates an execution plan with 2 jobs: 
> 1. The first job calculates the window function for col5.  
> 2. The second job calculates the window function for col6 and produces the output.
> The plan says the first job writes the columns (col1, col2, col3, col4) to a 
> tmp file, since only these columns are used in the later stage. However, the PTF 
> operator for the first job outputs (_wcol0, col1, col2, col3, col4), with 
> _wcol0 holding the result of the window function even though it is not used. 
> In the second job, the map operator still reads the 4 columns (col1, col2, 
> col3, col4) from the temp file according to the plan. That mismatch causes the 
> exception.





[jira] [Commented] (HIVE-8119) Implement Date in ParquetSerde

2015-02-08 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311838#comment-14311838
 ] 

Mohit Sabharwal commented on HIVE-8119:
---

Thanks, [~dongc]! LGTM +1 (non-binding)

[~rdblue], could you take a quick look as well ?

> Implement Date in ParquetSerde
> --
>
> Key: HIVE-8119
> URL: https://issues.apache.org/jira/browse/HIVE-8119
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Mohit Sabharwal
> Attachments: HIVE-8119.1.patch, HIVE-8119.patch
>
>
> Date type in Parquet is discussed here: 
> http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E





[jira] [Updated] (HIVE-7998) Enhance JDBC Driver to not require class specification

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-7998:
--
 Description: 
The hotspot VM offers a way to avoid having to specify the driver class 
explicitly when using the JDBC driver. 

The DriverManager methods getConnection and getDrivers have been enhanced to 
support the Java Standard Edition Service Provider mechanism. JDBC 4.0 Drivers 
must include the file META-INF/services/java.sql.Driver. This file contains the 
name of the JDBC drivers implementation of java.sql.Driver. For example, to 
load the my.sql.Driver class, the META-INF/services/java.sql.Driver file would 
contain the entry: `my.sql.Driver`
 
Applications no longer need to explicitly load JDBC drivers using 
Class.forName(). Existing programs which currently load JDBC drivers using 
Class.forName() will continue to work without modification.

via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html

  was:
The hotspot VM offers a way to avoid having to specify the driver class 
explicitly when using the JDBC driver. 

The DriverManager methods getConnection and getDrivers have been enhanced to 
support the Java Standard Edition Service Provider mechanism. JDBC 4.0 Drivers 
must include the file META-INF/services/java.sql.Driver. This file contains the 
name of the JDBC drivers implementation of java.sql.Driver. For example, to 
load the my.sql.Driver class, the META-INF/services/java.sql.Driver file would 
contain the entry: `my.sql.Driver`
 
Applications no longer need to explictly load JDBC drivers using 
Class.forName(). Existing programs which currently load JDBC drivers using 
Class.forName() will continue to work without modification.

via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html

  Labels:   (was: TODOC1.2)
Release Note: Applications no longer need to explicitly load JDBC drivers 
using Class.forName()

> Enhance JDBC Driver to not require class specification
> --
>
> Key: HIVE-7998
> URL: https://issues.apache.org/jira/browse/HIVE-7998
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Prateek Rungta
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-7998.1.patch
>
>
> The hotspot VM offers a way to avoid having to specify the driver class 
> explicitly when using the JDBC driver. 
> The DriverManager methods getConnection and getDrivers have been enhanced to 
> support the Java Standard Edition Service Provider mechanism. JDBC 4.0 
> Drivers must include the file META-INF/services/java.sql.Driver. This file 
> contains the name of the JDBC drivers implementation of java.sql.Driver. For 
> example, to load the my.sql.Driver class, the 
> META-INF/services/java.sql.Driver file would contain the entry: 
> `my.sql.Driver`
>  
> Applications no longer need to explicitly load JDBC drivers using 
> Class.forName(). Existing programs which currently load JDBC drivers using 
> Class.forName() will continue to work without modification.
> via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html





[jira] [Commented] (HIVE-7998) Enhance JDBC Driver to not require class specification

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311830#comment-14311830
 ] 

Alexander Pivovarov commented on HIVE-7998:
---

I documented this on the wiki

> Enhance JDBC Driver to not require class specification
> --
>
> Key: HIVE-7998
> URL: https://issues.apache.org/jira/browse/HIVE-7998
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Prateek Rungta
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Fix For: 1.2.0
>
> Attachments: HIVE-7998.1.patch
>
>
> The hotspot VM offers a way to avoid having to specify the driver class 
> explicitly when using the JDBC driver. 
> The DriverManager methods getConnection and getDrivers have been enhanced to 
> support the Java Standard Edition Service Provider mechanism. JDBC 4.0 
> Drivers must include the file META-INF/services/java.sql.Driver. This file 
> contains the name of the JDBC drivers implementation of java.sql.Driver. For 
> example, to load the my.sql.Driver class, the 
> META-INF/services/java.sql.Driver file would contain the entry: 
> `my.sql.Driver`
>  
> Applications no longer need to explicitly load JDBC drivers using 
> Class.forName(). Existing programs which currently load JDBC drivers using 
> Class.forName() will continue to work without modification.
> via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html





[jira] [Commented] (HIVE-9586) Too verbose log can hurt performance, we should always check log level first

2015-02-08 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311827#comment-14311827
 ] 

Xin Hao commented on HIVE-9586:
---

Thank you Xuefu.

> Too verbose log can hurt performance, we should always check log level first
> 
>
> Key: HIVE-9586
> URL: https://issues.apache.org/jira/browse/HIVE-9586
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: spark-branch, 1.2.0
>
> Attachments: HIVE-9586.1.patch
>
>
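The pattern the issue title refers to is guarding expensive log-message construction behind a level check, so the message is only built when it will actually be emitted. A minimal Python sketch of that idiom (Hive's fix applies the equivalent check in its Java logging calls):

```python
import logging

logger = logging.getLogger("hive.example")

def expensive_repr(rows):
    # building the message may be costly (joins, serialization, etc.)
    return ",".join(str(r) for r in rows)

def process(rows):
    # check the log level first: expensive_repr() is skipped entirely
    # unless DEBUG logging is actually enabled
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("processing rows: %s", expensive_repr(rows))
    return len(rows)
```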






[jira] [Commented] (HIVE-9143) select user(), current_user()

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311813#comment-14311813
 ] 

Alexander Pivovarov commented on HIVE-9143:
---

Added the function description to the Misc. Functions section on the wiki.

> select user(), current_user()
> -
>
> Key: HIVE-9143
> URL: https://issues.apache.org/jira/browse/HIVE-9143
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Hari Sekhon
>Assignee: Alexander Pivovarov
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch, HIVE-9143.3.patch
>
>
> Feature request to add support for determining in HQL session which user I am 
> currently connected as - an old MySQL ability:
> {code}mysql> select user(), current_user();
> +++
> | user() | current_user() |
> +++
> | root@localhost | root@localhost |
> +++
> 1 row in set (0.00 sec)
> {code}
> which doesn't seem to have a counterpart in Hive at this time:
> {code}0: jdbc:hive2://:100> select user();
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Invalid function 'user' (state=42000,code=4)
> 0: jdbc:hive2://:100> select current_user();
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10011]: Line 1:7 Invalid function 'current_user' 
> (state=42000,code=10011){code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon





[jira] [Updated] (HIVE-9358) Create LAST_DAY UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9358:
--
Fix Version/s: 1.1.0

> Create LAST_DAY UDF
> ---
>
> Key: HIVE-9358
> URL: https://issues.apache.org/jira/browse/HIVE-9358
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0, 1.1.0
>
> Attachments: HIVE-9358.1.patch, HIVE-9358.2.patch
>
>
> LAST_DAY returns the date of the last day of the month that contains date:
> last_day('2015-01-14') = '2015-01-31'
> last_day('2016-02-01') = '2016-02-29'
> The last_day function comes from Oracle: 
> http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions072.htm
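The described semantics are straightforward to sketch with the standard library; this Python illustration matches the two examples above (not Hive's GenericUDFLastDay Java code):

```python
import calendar
from datetime import date

def last_day(day_str):
    """Return the last day of the month containing the given date,
    as an ISO date string. (Illustrative sketch of the UDF semantics.)"""
    d = date.fromisoformat(day_str)
    last = calendar.monthrange(d.year, d.month)[1]  # days in that month
    return d.replace(day=last).isoformat()

print(last_day('2015-01-14'))  # leap years are handled by monthrange
```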





[jira] [Updated] (HIVE-9357) Create ADD_MONTHS UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9357:
--
Fix Version/s: 1.1.0

> Create ADD_MONTHS UDF
> -
>
> Key: HIVE-9357
> URL: https://issues.apache.org/jira/browse/HIVE-9357
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0, 1.1.0
>
> Attachments: HIVE-9357.1.patch, HIVE-9357.2.patch, HIVE-9357.3.patch
>
>
> ADD_MONTHS adds a number of months to startdate: 
> add_months('2015-01-14', 1) = '2015-02-14'
> add_months('2015-01-31', 1) = '2015-02-28'
> add_months('2015-02-28', 2) = '2015-04-30'
> add_months('2015-02-28', 12) = '2016-02-29'
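A hedged Python sketch of the month arithmetic implied by the examples above; the last-day-of-month snapping follows Oracle's rule, and the helper name and string signature are assumptions for illustration:

```python
import calendar
from datetime import date

def add_months(start: str, n: int) -> str:
    y, m, d = map(int, start.split("-"))
    months = y * 12 + (m - 1) + n          # total month count since year 0
    y2, m2 = divmod(months, 12)
    m2 += 1
    last_src = calendar.monthrange(y, m)[1]
    last_dst = calendar.monthrange(y2, m2)[1]
    # If start is the last day of its month, snap to the last day of the
    # result month (this reproduces add_months('2015-02-28', 2) = '2015-04-30')
    d2 = last_dst if d == last_src else min(d, last_dst)
    return date(y2, m2, d2).isoformat()
```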



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9402) Create GREATEST and LEAST udf

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9402:
--
Fix Version/s: 1.1.0

> Create GREATEST and LEAST udf
> -
>
> Key: HIVE-9402
> URL: https://issues.apache.org/jira/browse/HIVE-9402
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0, 1.1.0
>
> Attachments: HIVE-9402.1.patch, HIVE-9402.2.patch, HIVE-9402.3.patch, 
> HIVE-9402.4.patch, HIVE-9402.4.patch, HIVE-9402.5.patch, HIVE-9402.5.patch, 
> HIVE-9402.6.patch, HIVE-9402.7.patch
>
>
> GREATEST function returns the greatest value in a list of values
> Signature: T greatest(T v1, T v2, ...)
> all values should be the same type (like in COALESCE)
> LEAST returns the least value in a list of values
> Signature: T least(T v1, T v2, ...)
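A minimal Python sketch of the two signatures described above; the NULL handling shown (any NULL argument yields NULL) is an assumption not stated in the description:

```python
def greatest(*vals):
    # Greatest value in the list; any NULL (None) argument yields NULL
    return None if any(v is None for v in vals) else max(vals)

def least(*vals):
    # Least value in the list; same NULL handling as greatest()
    return None if any(v is None for v in vals) else min(vals)
```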



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3405:
--
Fix Version/s: 1.1.0

> UDF initcap to obtain a string with the first letter of each word in 
> uppercase other letters in lowercase
> -
>
> Key: HIVE-3405
> URL: https://issues.apache.org/jira/browse/HIVE-3405
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 
> 0.15.0, 0.14.1
>Reporter: Archana Nair
>Assignee: Alexander Pivovarov
>  Labels: patch
> Fix For: 0.15.0, 1.1.0
>
> Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, 
> HIVE-3405.3.patch, HIVE-3405.4.patch, HIVE-3405.5.patch, HIVE-3405.5.patch
>
>
> Hive's current releases lack an INITCAP function, which returns a string with 
> the first letter of each word in uppercase and all other letters in lowercase. 
> Words are delimited by white space. This will be useful for report generation.
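The described behavior can be sketched in Python (space-delimited words, first letter uppercased, the rest lowercased); exact whitespace and Unicode handling in the real UDF may differ:

```python
def initcap(s: str) -> str:
    # Capitalize the first letter of each space-delimited word and lowercase
    # the rest; runs of spaces are preserved by splitting on single spaces
    return " ".join(w[:1].upper() + w[1:].lower() for w in s.split(" "))

initcap("hello hive WORLD")  # 'Hello Hive World'
```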



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9143) select user(), current_user()

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9143:
--
  Labels:   (was: TODOC1.2)
Release Note: Returns the current user name

> select user(), current_user()
> -
>
> Key: HIVE-9143
> URL: https://issues.apache.org/jira/browse/HIVE-9143
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Hari Sekhon
>Assignee: Alexander Pivovarov
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch, HIVE-9143.3.patch
>
>
> Feature request to add support for determining in HQL session which user I am 
> currently connected as - an old MySQL ability:
> {code}mysql> select user(), current_user();
> +----------------+----------------+
> | user()         | current_user() |
> +----------------+----------------+
> | root@localhost | root@localhost |
> +----------------+----------------+
> 1 row in set (0.00 sec)
> {code}
> which doesn't seem to have a counterpart in Hive at this time:
> {code}0: jdbc:hive2://:100> select user();
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Invalid function 'user' (state=42000,code=4)
> 0: jdbc:hive2://:100> select current_user();
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10011]: Line 1:7 Invalid function 'current_user' 
> (state=42000,code=10011){code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon
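For illustration only, the requested semantics resemble the sketch below; a real HiveServer2 UDF would return the session's authenticated user, not the OS user, so this stand-in is an assumption:

```python
import getpass

def current_user() -> str:
    # Stand-in: the OS-level user of the current process; HiveServer2 would
    # instead report the user the JDBC session is authenticated as
    return getpass.getuser()
```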



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9402) Create GREATEST and LEAST udf

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311809#comment-14311809
 ] 

Alexander Pivovarov commented on HIVE-9402:
---

Added the function description to the wiki

> Create GREATEST and LEAST udf
> -
>
> Key: HIVE-9402
> URL: https://issues.apache.org/jira/browse/HIVE-9402
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9402.1.patch, HIVE-9402.2.patch, HIVE-9402.3.patch, 
> HIVE-9402.4.patch, HIVE-9402.4.patch, HIVE-9402.5.patch, HIVE-9402.5.patch, 
> HIVE-9402.6.patch, HIVE-9402.7.patch
>
>
> GREATEST function returns the greatest value in a list of values
> Signature: T greatest(T v1, T v2, ...)
> all values should be the same type (like in COALESCE)
> LEAST returns the least value in a list of values
> Signature: T least(T v1, T v2, ...)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9402) Create GREATEST and LEAST udf

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9402:
--
  Labels:   (was: TODOC15)
Release Note: Returns the greatest/least value of the list of values

> Create GREATEST and LEAST udf
> -
>
> Key: HIVE-9402
> URL: https://issues.apache.org/jira/browse/HIVE-9402
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9402.1.patch, HIVE-9402.2.patch, HIVE-9402.3.patch, 
> HIVE-9402.4.patch, HIVE-9402.4.patch, HIVE-9402.5.patch, HIVE-9402.5.patch, 
> HIVE-9402.6.patch, HIVE-9402.7.patch
>
>
> GREATEST function returns the greatest value in a list of values
> Signature: T greatest(T v1, T v2, ...)
> all values should be the same type (like in COALESCE)
> LEAST returns the least value in a list of values
> Signature: T least(T v1, T v2, ...)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9616) Hive 0.14

2015-02-08 Thread srinivas (JIRA)
srinivas created HIVE-9616:
--

 Summary: Hive 0.14
 Key: HIVE-9616
 URL: https://issues.apache.org/jira/browse/HIVE-9616
 Project: Hive
  Issue Type: Bug
Reporter: srinivas


Hi, 

I am using Hive 0.14, which supports all CRUD operations as stated by the 
support team.

I am not able to select specific columns to insert into, for example:

insert into table table1 id,name,sal select id,name,sal from table2 where 
table1.id = table2.id



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-08 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reopened HIVE-9350:
-

Reverted the patch to address a class-not-found issue in hive-exec.


> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9358) Create LAST_DAY UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311803#comment-14311803
 ] 

Alexander Pivovarov commented on HIVE-9358:
---

Added the function description to the wiki.

> Create LAST_DAY UDF
> ---
>
> Key: HIVE-9358
> URL: https://issues.apache.org/jira/browse/HIVE-9358
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9358.1.patch, HIVE-9358.2.patch
>
>
> LAST_DAY returns the date of the last day of the month that contains date:
> last_day('2015-01-14') = '2015-01-31'
> last_day('2016-02-01') = '2016-02-29'
> The last_day function comes from Oracle:
> http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions072.htm



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9358) Create LAST_DAY UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9358:
--
  Labels:   (was: TODOC15)
Release Note: Returns the last day of the month which the date belongs to.

> Create LAST_DAY UDF
> ---
>
> Key: HIVE-9358
> URL: https://issues.apache.org/jira/browse/HIVE-9358
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9358.1.patch, HIVE-9358.2.patch
>
>
> LAST_DAY returns the date of the last day of the month that contains date:
> last_day('2015-01-14') = '2015-01-31'
> last_day('2016-02-01') = '2016-02-29'
> The last_day function comes from Oracle:
> http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions072.htm



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8119) Implement Date in ParquetSerde

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311802#comment-14311802
 ] 

Hive QA commented on HIVE-8119:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697375/HIVE-8119.1.patch

{color:green}SUCCESS:{color} +1 7526 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2713/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2713/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2713/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697375 - PreCommit-HIVE-TRUNK-Build

> Implement Date in ParquetSerde
> --
>
> Key: HIVE-8119
> URL: https://issues.apache.org/jira/browse/HIVE-8119
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Mohit Sabharwal
> Attachments: HIVE-8119.1.patch, HIVE-8119.patch
>
>
> Date type in Parquet is discussed here: 
> http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3405:
--
Release Note: Returns string, with the first letter of each word in 
uppercase, all other letters in lowercase.

> UDF initcap to obtain a string with the first letter of each word in 
> uppercase other letters in lowercase
> -
>
> Key: HIVE-3405
> URL: https://issues.apache.org/jira/browse/HIVE-3405
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 
> 0.15.0, 0.14.1
>Reporter: Archana Nair
>Assignee: Alexander Pivovarov
>  Labels: patch
> Fix For: 0.15.0
>
> Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, 
> HIVE-3405.3.patch, HIVE-3405.4.patch, HIVE-3405.5.patch, HIVE-3405.5.patch
>
>
> Hive's current releases lack an INITCAP function, which returns a string with 
> the first letter of each word in uppercase and all other letters in lowercase. 
> Words are delimited by white space. This will be useful for report generation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9357) Create ADD_MONTHS UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311797#comment-14311797
 ] 

Alexander Pivovarov commented on HIVE-9357:
---

Added the function description to the wiki.

> Create ADD_MONTHS UDF
> -
>
> Key: HIVE-9357
> URL: https://issues.apache.org/jira/browse/HIVE-9357
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9357.1.patch, HIVE-9357.2.patch, HIVE-9357.3.patch
>
>
> ADD_MONTHS adds a number of months to startdate: 
> add_months('2015-01-14', 1) = '2015-02-14'
> add_months('2015-01-31', 1) = '2015-02-28'
> add_months('2015-02-28', 2) = '2015-04-30'
> add_months('2015-02-28', 12) = '2016-02-29'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9357) Create ADD_MONTHS UDF

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9357:
--
  Labels:   (was: TODOC15)
Release Note: Returns the date that is num_months after start_date

> Create ADD_MONTHS UDF
> -
>
> Key: HIVE-9357
> URL: https://issues.apache.org/jira/browse/HIVE-9357
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 0.15.0
>
> Attachments: HIVE-9357.1.patch, HIVE-9357.2.patch, HIVE-9357.3.patch
>
>
> ADD_MONTHS adds a number of months to startdate: 
> add_months('2015-01-14', 1) = '2015-02-14'
> add_months('2015-01-31', 1) = '2015-02-28'
> add_months('2015-02-28', 2) = '2015-04-30'
> add_months('2015-02-28', 12) = '2016-02-29'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9569) Enable more unit tests for UNION ALL [Spark Branch]

2015-02-08 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-9569:
---
Attachment: HIVE-9569.2.patch

> Enable more unit tests for UNION ALL [Spark Branch]
> ---
>
> Key: HIVE-9569
> URL: https://issues.apache.org/jira/browse/HIVE-9569
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Chao
>Assignee: Chao
> Attachments: HIVE-9569.1-spark.patch, HIVE-9569.1.patch, 
> HIVE-9569.2.patch
>
>
> Currently, we have enabled only a subset of the union tests. We should try to 
> enable the rest and see if any issues arise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase

2015-02-08 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3405:
--
Labels: patch  (was: TODOC15 patch)

> UDF initcap to obtain a string with the first letter of each word in 
> uppercase other letters in lowercase
> -
>
> Key: HIVE-3405
> URL: https://issues.apache.org/jira/browse/HIVE-3405
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 
> 0.15.0, 0.14.1
>Reporter: Archana Nair
>Assignee: Alexander Pivovarov
>  Labels: patch
> Fix For: 0.15.0
>
> Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, 
> HIVE-3405.3.patch, HIVE-3405.4.patch, HIVE-3405.5.patch, HIVE-3405.5.patch
>
>
> Hive's current releases lack an INITCAP function, which returns a string with 
> the first letter of each word in uppercase and all other letters in lowercase. 
> Words are delimited by white space. This will be useful for report generation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3405) UDF initcap to obtain a string with the first letter of each word in uppercase other letters in lowercase

2015-02-08 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311794#comment-14311794
 ] 

Alexander Pivovarov commented on HIVE-3405:
---

Added function description to the wiki

> UDF initcap to obtain a string with the first letter of each word in 
> uppercase other letters in lowercase
> -
>
> Key: HIVE-3405
> URL: https://issues.apache.org/jira/browse/HIVE-3405
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.13.0, 0.14.0, 
> 0.15.0, 0.14.1
>Reporter: Archana Nair
>Assignee: Alexander Pivovarov
>  Labels: patch
> Fix For: 0.15.0
>
> Attachments: HIVE-3405.1.patch.txt, HIVE-3405.2.patch, 
> HIVE-3405.3.patch, HIVE-3405.4.patch, HIVE-3405.5.patch, HIVE-3405.5.patch
>
>
> Hive's current releases lack an INITCAP function, which returns a string with 
> the first letter of each word in uppercase and all other letters in lowercase. 
> Words are delimited by white space. This will be useful for report generation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9454) Test failures due to new Calcite version

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9454:
---
Attachment: HIVE-9454.07.patch

Updated the patch to use Calcite 1.0.0, which has been released.

> Test failures due to new Calcite version
> 
>
> Key: HIVE-9454
> URL: https://issues.apache.org/jira/browse/HIVE-9454
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-9454.02.patch, HIVE-9454.03.patch, 
> HIVE-9454.04.patch, HIVE-9454.05.patch, HIVE-9454.06.patch, 
> HIVE-9454.07.patch, HIVE-9454.1.patch
>
>
> A bunch of failures have started appearing in patches that seem unrelated. I 
> am thinking we've picked up a new version of Calcite. E.g.:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2488/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_auto_join12/
> {noformat}
> Running: diff -a 
> /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/auto_join12.q.out
>  
> /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/auto_join12.q.out
> 32c32
> < $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
> ---
> > $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
> 35c35
> < $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src 
> ---
> > $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src 
> 39c39
> < $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
> ---
> > $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 
> 54c54
> < $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src 
> ---
> > $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Description: Propagate limit context generated from GlobalLimitOptimizer to 
storage handlers.  (was: Propagate limit context generated from 
GlobalLimitOptimizer to strorage handlers.)

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Attachment: HIVE-9615.1.patch.txt

Old patch recovered from git stash.

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9615:

Status: Patch Available  (was: Open)

> Provide limit context for storage handlers
> --
>
> Key: HIVE-9615
> URL: https://issues.apache.org/jira/browse/HIVE-9615
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-9615.1.patch.txt
>
>
> Propagate limit context generated from GlobalLimitOptimizer to storage 
> handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9615) Provide limit context for storage handlers

2015-02-08 Thread Navis (JIRA)
Navis created HIVE-9615:
---

 Summary: Provide limit context for storage handlers
 Key: HIVE-9615
 URL: https://issues.apache.org/jira/browse/HIVE-9615
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial


Propagate limit context generated from GlobalLimitOptimizer to storage 
handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2015-02-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311775#comment-14311775
 ] 

Gopal V commented on HIVE-9350:
---

SessionState is referred to in the AM, but MetaException isn't included in the 
hive-exec.jar.

> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9614) Encrypt mapjoin tables

2015-02-08 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9614:
--

 Summary: Encrypt mapjoin tables
 Key: HIVE-9614
 URL: https://issues.apache.org/jira/browse/HIVE-9614
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland


When performing a MapJoin we store data in the distributed cache, which lives on 
local disk. Ideally we would encrypt these tables to the same degree that they 
are encrypted in HDFS, or find some other way to ensure they are encrypted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311761#comment-14311761
 ] 

Hive QA commented on HIVE-9611:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697374/HIVE-9611.patch

{color:green}SUCCESS:{color} +1 7526 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2712/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2712/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2712/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697374 - PreCommit-HIVE-TRUNK-Build

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 30783: JDBC should provide metadata for columns whether a column is a partition column or not

2015-02-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30783/
---

Review request for hive.


Bugs: HIVE-3050
https://issues.apache.org/jira/browse/HIVE-3050


Repository: hive-git


Description
---

Trivial request from UI developers. 
{code}
DatabaseMetaData databaseMetaData = connection.getMetaData();
ResultSet rs = databaseMetaData.getColumns(null, null, "tableName", null);

boolean partitionKey = rs.getBoolean("IS_PARTITION_COLUMN");
{code}
It's not a standard JDBC column, but it seems useful.


Diffs
-

  
service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 
92ea7b0 

Diff: https://reviews.apache.org/r/30783/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-3050) JDBC should provide metadata for columns whether a column is a partition column or not

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3050:

Attachment: HIVE-3050.1.patch.txt

> JDBC should provide metadata for columns whether a column is a partition 
> column or not
> --
>
> Key: HIVE-3050
> URL: https://issues.apache.org/jira/browse/HIVE-3050
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3050.1.patch.txt
>
>
> Trivial request from UI developers. 
> {code}
> DatabaseMetaData databaseMetaData = connection.getMetaData();
> ResultSet rs = databaseMetaData.getColumns(null, null, "tableName", null);
> 
> boolean partitionKey = rs.getBoolean("IS_PARTITION_COLUMN");
> {code}
> It's not a standard JDBC column, but it seems useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9613) Left join query plan outputs wrong column when using subquery

2015-02-08 Thread Li Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Xin updated HIVE-9613:
-
Description: 
I have a query that outputs a column with wrong contents when using a 
subquery: the contents of that column are equal to another column's, not its own.

I have three tables, as follows:

table 1: _hivetemp.category_city_rank_:
||category||city||rank||
|jinrongfuwu|shanghai|1|
|ktvjiuba|shanghai|2|

table 2:_hivetemp.category_match_:
||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
|danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
|zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|

table 3:_hivetemp.city_match_:
||src_city_name_en||dst_city_name_en||city_name_cn||
|sh|shanghai|上海|

And the query is:
{code}
select
a.category,
a.city,
a.rank,
b.src_category_en,
c.src_city_name_en
from
hivetemp.category_city_rank a
left outer join
(select
src_category_en,
dst_category_en
from
hivetemp.category_match) b
on  a.category = b.dst_category_en
left outer join
(select
src_city_name_en,
dst_city_name_en
from
hivetemp.city_match) c
on  a.city = c.dst_city_name_en
{code}

which should output the results as follows (I tested it in Hive 0.13):
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
|ktvjiuba|shanghai|2|zpwentiyingshi|sh|

but in Hive 0.14, the contents of the column *src_category_en* are wrong, and 
are just the *city* contents:
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|shanghai|sh|
|ktvjiuba|shanghai|2|shanghai|sh|

Using explain to examine the execution plan, I can see that the first subquery 
outputs only the column *dst_category_en*; *src_category_en* is missing.

{quote}
   b:category_match
  TableScan
alias: category_match
Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE
Select Operator
  expressions: dst_category_en (type: string)
  outputColumnNames: _col1
  Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE

{quote}

  was:
I have a query that outputs a column with wrong contents,which equals to 
another column.  

I have three tables,as follows:

table 1: _hivetemp.category_city_rank_:
||category||city||rank||
|jinrongfuwu|shanghai|1|
|ktvjiuba|shanghai|2|

table 2:_hivetemp.category_match_:
||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
|danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
|zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|

table 3:_hivetemp.city_match_:
||src_city_name_en||dst_city_name_en||city_name_cn||
|sh|shanghai|上海|

And the query is :
{code}
select
a.category,
a.city,
a.rank,
b.src_category_en,
c.src_city_name_en
from
hivetemp.category_city_rank a
left outer join
(select
src_category_en,
dst_category_en
from
hivetemp.category_match) b
on  a.category = b.dst_category_en
left outer join
(select
src_city_name_en,
dst_city_name_en
from
hivetemp.city_match) c
on  a.city = c.dst_city_name_en
{code}

which should output the following results (I verified this in Hive 0.13):
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
|ktvjiuba|shanghai|2|zpwentiyingshi|sh|

but in Hive 0.14, the values in the column *src_category_en* are wrong; they 
are just the *city* contents:
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|shanghai|sh|
|ktvjiuba|shanghai|2|shanghai|sh|

Using EXPLAIN to examine the execution plan, I can see the first subquery 
outputs only the *dst_category_en* column; *src_category_en* is simply missing.

{quote}
   b:category_match
  TableScan
alias: category_match
Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE
Select Operator
  expressions: dst_category_en (type: string)
  outputColumnNames: _col1
  Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE

{quote}


> Left join query plan outputs  wrong column when using subquery
> --
>
> Key: HIVE-9613
> URL: https://issues.apache.org/jira/browse/HIVE-9613
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Planning
>Affects Versions: 0.14.0, 1.0.0
> Environment: apache hadoop 2.5.1 
>Reporter: Li Xin
>
> I have a query that outputs a column with wrong contents when using a 
> subquery; the contents of that column are equal to another column's, not its 
> own.
> I have three tables, as follows:
> table 1: _hivetemp.category_city_rank_:
> ||category||city||rank||
> |jinrongfuwu|shanghai|1|
> |

[jira] [Created] (HIVE-9613) Left join query plan outputs wrong column when using subquery

2015-02-08 Thread Li Xin (JIRA)
Li Xin created HIVE-9613:


 Summary: Left join query plan outputs  wrong column when using 
subquery
 Key: HIVE-9613
 URL: https://issues.apache.org/jira/browse/HIVE-9613
 Project: Hive
  Issue Type: Bug
  Components: Parser, Query Planning
Affects Versions: 0.14.0, 1.0.0
 Environment: apache hadoop 2.5.1 
Reporter: Li Xin


I have a query that outputs a column with wrong contents, which are equal to 
those of another column.

I have three tables, as follows:

table 1: _hivetemp.category_city_rank_:
||category||city||rank||
|jinrongfuwu|shanghai|1|
|ktvjiuba|shanghai|2|

table 2:_hivetemp.category_match_:
||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
|danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
|zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|

table 3:_hivetemp.city_match_:
||src_city_name_en||dst_city_name_en||city_name_cn||
|sh|shanghai|上海|

And the query is:
{code}
select
a.category,
a.city,
a.rank,
b.src_category_en,
c.src_city_name_en
from
hivetemp.category_city_rank a
left outer join
(select
src_category_en,
dst_category_en
from
hivetemp.category_match) b
on  a.category = b.dst_category_en
left outer join
(select
src_city_name_en,
dst_city_name_en
from
hivetemp.city_match) c
on  a.city = c.dst_city_name_en
{code}

which should output the following results (I verified this in Hive 0.13):
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
|ktvjiuba|shanghai|2|zpwentiyingshi|sh|

but in Hive 0.14, the values in the column *src_category_en* are wrong; they 
are just the *city* contents:
||category||city||rank||src_category_en||src_city_name_en||
|jinrongfuwu|shanghai|1|shanghai|sh|
|ktvjiuba|shanghai|2|shanghai|sh|

Using EXPLAIN to examine the execution plan, I can see the first subquery 
outputs only the *dst_category_en* column; *src_category_en* is simply missing.

{quote}
   b:category_match
  TableScan
alias: category_match
Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE
Select Operator
  expressions: dst_category_en (type: string)
  outputColumnNames: _col1
  Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
Column stats: NONE

{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Rui Li


> On Feb. 9, 2015, 2:51 a.m., Rui Li wrote:
> >

A couple of high-level questions: do we still need two buffers? And does it 
make sense to use something like a queue instead of an array as the buffer?


- Rui


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71597
---
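Rui's queue suggestion could look something like the following; this is a minimal sketch assuming a single in-memory FIFO with a spill threshold (the class and constant names here are hypothetical, not Hive's actual code):

```java
import java.util.ArrayDeque;

public class ResultBufferSketch {
    // A single FIFO queue as the in-memory buffer instead of two arrays
    // with read/write cursors. Rows are appended by the processor and
    // drained by the iterator; spilling to disk would kick in only once
    // the queue crosses a threshold.
    private final ArrayDeque<Object> queue = new ArrayDeque<>();
    static final int SPILL_THRESHOLD = 1024;  // assumed tuning knob

    boolean add(Object kv) {
        if (queue.size() >= SPILL_THRESHOLD) {
            return false;  // caller should spill to disk before adding more
        }
        return queue.offer(kv);
    }

    Object next() { return queue.poll(); }      // null when empty
    boolean hasNext() { return !queue.isEmpty(); }
}
```

With a single queue there is no buffer switch at all; whether that matches the spill/iteration pattern the cache needs is exactly the open question above.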


On Feb. 7, 2015, 3:09 a.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> ---
> 
> (Updated Feb. 7, 2015, 3:09 a.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9574
> https://issues.apache.org/jira/browse/HIVE-9574
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The result KV cache doesn't use RowContainer any more, since RowContainer 
> has logic we don't need, which adds some overhead. We don't do lazy computing 
> right away; instead we wait a little, until the cache is close to spilling.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  78ab680 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
> 8ead0cb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 
> 7a09b4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java
>  e92e299 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 
> 070ea4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java
>  d4ff37c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 
> 286816b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
> 0df4598 
> 
> Diff: https://reviews.apache.org/r/30739/diff/
> 
> 
> Testing
> ---
> 
> Unit test, test on cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Commented] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311752#comment-14311752
 ] 

Xuefu Zhang commented on HIVE-9611:
---

+1

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
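The fallback being proposed can be sketched as follows (the class and method names are hypothetical; SparkClientImpl's actual resolution logic may differ):

```java
public class SparkHomeResolver {
    // Prefer an explicitly configured spark.home property; fall back to the
    // SPARK_HOME environment variable when the property is absent.
    static String resolveSparkHome(String sparkHomeProp) {
        if (sparkHomeProp != null && !sparkHomeProp.isEmpty()) {
            return sparkHomeProp;
        }
        return System.getenv("SPARK_HOME"); // may still be null if neither is set
    }
}
```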


Re: Review Request 30750: HIVE-9605 Remove parquet nested objects from wrapper writable objects

2015-02-08 Thread Dong Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30750/#review71603
---


Looks good, Sergio!

I only have one question. In the 4 inspectors, I think we don't need to check 
(array length == 0); that check may miss the empty map/list case. The original 
code does this check, but it was for the container. What do you think?
I left detailed comments for 1 inspector below.


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java


In the original wrapper, it had to check null and length to ensure there was 
an ArrayWritable for the Map before getting it.

Since we removed the wrapper and already have the mapArray, maybe we just need 
to check null here. If the length is 0, I think we should return an empty map 
instead of null.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java


How about just checking null? 
A length of 0 might be a normal case.


- Dong Chen
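Dong's suggestion distinguishes a missing map from an empty one. A minimal sketch under that assumption (hypothetical names, simplified to plain Java types rather than Hive's ObjectInspector API):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class MapInspectorSketch {
    // Once the wrapper is gone and we already hold the map's entry array,
    // only null means "no map"; a zero-length array is a legitimate empty
    // map and should not be collapsed to null.
    static Map<String, String> getMap(Object[] mapEntries) {
        if (mapEntries == null) {
            return null;                       // map is absent
        }
        if (mapEntries.length == 0) {
            return Collections.emptyMap();     // map is present but empty
        }
        Map<String, String> m = new HashMap<>();
        for (Object e : mapEntries) {
            String[] kv = (String[]) e;        // assumed [key, value] pairs
            m.put(kv[0], kv[1]);
        }
        return m;
    }
}
```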


On Feb. 8, 2015, 6:13 a.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30750/
> ---
> 
> (Updated Feb. 8, 2015, 6:13 a.m.)
> 
> 
> Review request for hive, Ryan Blue, cheng xu, and Dong Chen.
> 
> 
> Bugs: HIVE-9605
> https://issues.apache.org/jira/browse/HIVE-9605
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Remove wrapper object from parquet nested types (map/array)
> 
> 
> Diffs
> -
> 
>   
> itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java
>  d335716be3d286a1b9221dcbd8ccd799f4c6dc66 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveCollectionConverter.java
>  6621a8768953a9bef54e7a144ae045abcc32f458 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java
>  4809f9b5882ae409159b422c08c665aa24f796d8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/Repeated.java 
> fdea782167d63593f6cbde5e7154d771761757f7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java
>  62c61fc7502f24e6a032076f384b5a946c1cc9a6 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/DeepParquetHiveMapInspector.java
>  d38c64192e01371c0c98b339113348d2e52cedc3 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveArrayInspector.java
>  53ca31d0b516c4a941e048e98e7f8f763752c436 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/StandardParquetHiveMapInspector.java
>  5aa14482899fed5711b40c5554b056d07818afb5 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapStructures.java 
> ca4805082fd717d15ed41ca15a730e19da267c8a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestAbstractParquetMapInspector.java
>  ef05150494027ddd70790dcf26b772ebc4cd2b8b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestDeepParquetHiveMapInspector.java
>  8646ff4d3413d7d642e2559e1a485d77472b156a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveArrayInspector.java
>  f3a24af2e5f4eeb24e1e286ada19fc9592daacb6 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestStandardParquetHiveMapInspector.java
>  278419f73b311322dcf3c70abb340bf63d8a4337 
> 
> Diff: https://reviews.apache.org/r/30750/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



[jira] [Created] (HIVE-9612) Turn off DEBUG logging for Lazy Objects for tests

2015-02-08 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9612:
--

 Summary: Turn off DEBUG logging for Lazy Objects for tests
 Key: HIVE-9612
 URL: https://issues.apache.org/jira/browse/HIVE-9612
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland


Our tests are collecting a tremendous amount of logs:
{noformat}
[root@ip-10-152-185-204 TestRCFile]# pwd
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2709/succeeded/TestRCFile
[root@ip-10-152-185-204 TestRCFile]# ls -lh hive.log 
-rw-r--r-- 1 hiveptest hiveptest 143M Feb  8 03:54 hive.log
{noformat}

Much of this logging is due to stack traces printed at DEBUG. 

{noformat}
2015-02-08 00:54:07,942 DEBUG [main]: lazy.LazyDouble 
(LazyDouble.java:init(55)) - Data not in the Double data type range so 
converted to null. Given data is :
java.lang.NumberFormatException: empty String
at 
sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1011)
at java.lang.Double.parseDouble(Double.java:540)
at 
org.apache.hadoop.hive.serde2.lazy.LazyDouble.init(LazyDouble.java:51)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase.getFieldsAsList(ColumnarStructBase.java:224)
at 
org.apache.hadoop.hive.serde2.objectinspector.ColumnarStructObjectInspector.getStructFieldsDataAsList(ColumnarStructObjectInspector.java:76)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.serialize(ColumnarSerDe.java:144)
at 
org.apache.hadoop.hive.ql.io.TestRCFile.partialReadTest(TestRCFile.java:598)
at 
org.apache.hadoop.hive.ql.io.TestRCFile.testWriteAndPartialRead(TestRCFile.java:417)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

2015-02-08 00:54:17,992 DEBUG [main]: lazy.LazyPrimitive 
(LazyPrimitive.java:logExceptionMessage(81)) - Data not in the INT data type 
range so converted to null. Given data is :
java.lang.Exception: For debugging purposes
at 
org.apache.hadoop.hive.serde2.lazy.LazyPrimitive.logExceptionMessage(LazyPrimitive.java:81)
at 
org.apache.hadoop.hive.serde2.lazy.LazyInteger.init(LazyInteger.java:59)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase$FieldInfo.uncheckedGetField(ColumnarStructBase.java:111)
at 
org.apache.hadoop.hive.serde2.columnar.ColumnarStructBase.getField(ColumnarStructBase.java:172)
at 
org.apache.hadoop.hive.serde2.objectinspector.ColumnarStructObjectInspector.getStructFieldData(ColumnarStructObjectInspector.java:67)
at 
org.apache.hadoop.hive.ql.io.TestRCFile.testSimpleReadAndWrite(TestRCFile.java:232)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccess

[jira] [Commented] (HIVE-9586) Too verbose log can hurt performance, we should always check log level first

2015-02-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311742#comment-14311742
 ] 

Xuefu Zhang commented on HIVE-9586:
---

Also merged to spark branch.

> Too verbose log can hurt performance, we should always check log level first
> 
>
> Key: HIVE-9586
> URL: https://issues.apache.org/jira/browse/HIVE-9586
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: spark-branch, 1.2.0
>
> Attachments: HIVE-9586.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
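The check the issue title refers to, testing the log level before building an expensive message, follows a standard pattern. Here is a sketch with java.util.logging rather than Hive's actual logging framework, with a counter to show that the expensive argument is never evaluated when the level is disabled:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuard {
    static final Logger LOG = Logger.getLogger(LogGuard.class.getName());
    static int expensiveCalls = 0;  // counts how often the message is built

    static String expensiveDump() {
        expensiveCalls++;
        return "full row dump: ...";  // stands in for serializing a row + stack trace
    }

    static void process() {
        // Unguarded, the argument would be evaluated even when FINE (debug)
        // is disabled:
        //   LOG.fine("state " + expensiveDump());

        // Guarded: the expensive dump is only built when the level is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("state " + expensiveDump());
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);  // FINE (debug) disabled
        process();
        System.out.println(expensiveCalls);  // the dump was never built
    }
}
```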


[jira] [Updated] (HIVE-9586) Too verbose log can hurt performance, we should always check log level first

2015-02-08 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9586:
--
Fix Version/s: spark-branch

> Too verbose log can hurt performance, we should always check log level first
> 
>
> Key: HIVE-9586
> URL: https://issues.apache.org/jira/browse/HIVE-9586
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: spark-branch, 1.2.0
>
> Attachments: HIVE-9586.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Rui Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71604
---



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java


What happens if input!=null and we're creating the temp file again?


- Rui Li


On Feb. 7, 2015, 3:09 a.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> ---
> 
> (Updated Feb. 7, 2015, 3:09 a.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9574
> https://issues.apache.org/jira/browse/HIVE-9574
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The result KV cache doesn't use RowContainer any more, since RowContainer 
> has logic we don't need, which adds some overhead. We don't do lazy computing 
> right away; instead we wait a little, until the cache is close to spilling.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  78ab680 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
> 8ead0cb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 
> 7a09b4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java
>  e92e299 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 
> 070ea4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java
>  d4ff37c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 
> 286816b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
> 0df4598 
> 
> Diff: https://reviews.apache.org/r/30739/diff/
> 
> 
> Testing
> ---
> 
> Unit test, test on cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Open  (was: Patch Available)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Status: Patch Available  (was: Open)

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar

2015-02-08 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-6617:
--
Attachment: HIVE-6617.09.patch

Addressed the PTF issue. Now there should be 0 warnings, excluding the 
reserved keywords.

> Reduce ambiguity in grammar
> ---
>
> Key: HIVE-6617
> URL: https://issues.apache.org/jira/browse/HIVE-6617
> Project: Hive
>  Issue Type: Task
>Reporter: Ashutosh Chauhan
>Assignee: Pengcheng Xiong
> Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, 
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, 
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number, 
> ideally to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Rui Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71597
---



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java


If I understand correctly, this can be renamed to something like 
IN_MEMORY_NUM_ROWS?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java


Do we need a parameter here? Seems it can just use writeCursor?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java


Also close input and output here?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java


I suppose this is to avoid frequent buffer switching? But why the magic number 
1?


- Rui Li


On Feb. 7, 2015, 3:09 a.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> ---
> 
> (Updated Feb. 7, 2015, 3:09 a.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9574
> https://issues.apache.org/jira/browse/HIVE-9574
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The result KV cache doesn't use RowContainer any more, since RowContainer 
> has logic we don't need, which adds some overhead. We don't do lazy computing 
> right away; instead we wait a little, until the cache is close to spilling.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  78ab680 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
> 8ead0cb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 
> 7a09b4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java
>  e92e299 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 
> 070ea4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java
>  d4ff37c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 
> 286816b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
> 0df4598 
> 
> Diff: https://reviews.apache.org/r/30739/diff/
> 
> 
> Testing
> ---
> 
> Unit test, test on cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Commented] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311727#comment-14311727
 ] 

Hive QA commented on HIVE-2573:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697373/HIVE-2573.14.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7526 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacro
org.apache.hadoop.hive.ql.parse.TestMacroSemanticAnalyzer.testDropMacroExistsDoNotIgnoreErrors
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2711/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2711/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2711/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697373 - PreCommit-HIVE-TRUNK-Build

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. Providing a per-session function 
> registry would prevent this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
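The idea of a session-scoped overlay on top of a shared registry can be sketched like this (hypothetical names; Hive's real registry stores UDF metadata rather than Runnables, and sessions are not necessarily one-per-thread):

```java
import java.util.HashMap;
import java.util.Map;

public class FunctionRegistrySketch {
    // Shared (static) registry visible to every session.
    static final Map<String, Runnable> SYSTEM = new HashMap<>();

    // Per-session overlay: lookups try the session first, then fall back,
    // so one user's temporary function cannot clobber another user's view.
    static final ThreadLocal<Map<String, Runnable>> SESSION =
            ThreadLocal.withInitial(HashMap::new);

    static void registerSessionFunction(String name, Runnable fn) {
        SESSION.get().put(name, fn);  // only shadows, never mutates SYSTEM
    }

    static Runnable lookup(String name) {
        Runnable fn = SESSION.get().get(name);
        return fn != null ? fn : SYSTEM.get(name);
    }
}
```

The key property is that a session registration shadows the shared entry for that session alone; the shared registry itself is never mutated by session-level registrations.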


Review Request 30782: Window clause ROW BETWEEN for PRECEDING does not work

2015-02-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30782/
---

Review request for hive.


Bugs: HIVE-9412
https://issues.apache.org/jira/browse/HIVE-9412


Repository: hive-git


Description
---

When a window clause with ROWS is used between two proper PRECEDING 
boundaries, Hive reports errors, as in the following examples.

--For example 1
SELECT name, dept_num, salary,
MAX(salary) OVER (PARTITION BY dept_num ORDER BY 
name ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) win4_alter
FROM employee_contract;

Error: Error while compiling statement: FAILED: SemanticException Failed to 
breakup Windowing invocations into Groups. At least 1 group must only depend on 
input columns. Also check for circular dependencies.
Underlying error: Window range invalid, start boundary is greater than end 
boundary: window(start=range(2 PRECEDING), end=range(1 PRECEDING)) 
(state=42000,code=4)

--For example 2
SELECT name, dept_num, salary,
MAX(salary) OVER (PARTITION BY dept_num ORDER BY 
name ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) win1
FROM employee_contract;

Error: Error while compiling statement: FAILED: SemanticException End of a 
WindowFrame cannot be UNBOUNDED PRECEDING (state=42000,code=4)
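One way to see why both example frames are in fact valid: encode each ROWS boundary as a signed offset from the current row and compare. This is a sketch of the validity condition only (a hypothetical encoding, not Hive's WindowingSpec code):

```java
public class FrameCheck {
    // Encode a ROWS boundary as an offset from the current row:
    // UNBOUNDED PRECEDING -> Integer.MIN_VALUE, n PRECEDING -> -n,
    // CURRENT ROW -> 0, n FOLLOWING -> +n,
    // UNBOUNDED FOLLOWING -> Integer.MAX_VALUE.
    static boolean isValidFrame(int start, int end) {
        return start <= end;  // a frame is valid when it is not reversed
    }

    public static void main(String[] args) {
        // ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING -> (-2, -1): valid
        System.out.println(isValidFrame(-2, -1));
        // ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING: valid
        System.out.println(isValidFrame(Integer.MIN_VALUE, -1));
    }
}
```

Under this encoding, "2 PRECEDING AND 1 PRECEDING" is (-2, -1) and "UNBOUNDED PRECEDING AND 1 PRECEDING" is (MIN_VALUE, -1), so both satisfy start <= end and should be accepted rather than rejected.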


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java e95505c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java 28afc6b 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ptf/WindowFunctionDef.java e4ea358 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java 
12a327f 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFFirstValue.java 
f679387 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFLastValue.java 
e099154 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java a153818 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java d931d52 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStreamingEvaluator.java
 d68c085 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java ffb7093 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java 
903a9b0 
  ql/src/test/queries/clientpositive/windowing_windowspec.q 6d8ce67 
  ql/src/test/results/clientpositive/windowing_windowspec.q.out 00af6b8 

Diff: https://reviews.apache.org/r/30782/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-9586) Too verbose log can hurt performance, we should always check log level first

2015-02-08 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311709#comment-14311709
 ] 

Xin Hao commented on HIVE-9586:
---

Hi Xuefu, could you please also consider committing it to the Spark branch? 
Thanks.

> Too verbose log can hurt performance, we should always check log level first
> 
>
> Key: HIVE-9586
> URL: https://issues.apache.org/jira/browse/HIVE-9586
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: 1.2.0
>
> Attachments: HIVE-9586.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 30780: Use session classloader instead of application loader

2015-02-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30780/
---

Review request for hive.


Bugs: HIVE-9486
https://issues.apache.org/jira/browse/HIVE-9486


Repository: hive-git


Description
---

PlanUtils.java: use the correct classloader when calling Class.forName()
?? ? Tue, 27 Jan 2015 01:08:10 -0800


Hi All,
I am now having a Hive classpath issue; I think this is a bug.
I wrote my own SerDe, "com.stanley.MySerde", which is a simple JSON 
serializer; it is essentially the same as the built-in SerDe 
"org.apache.hadoop.hive.serde2.DelimitedJSONSerDe".
Then I issued these commands (I am sure the "add jar" command worked):

add jar /path/to/myjar.jar;

create table t1.json_1 row format serde "com.stanley.MySerde" location 
'/user/stanley/test-data-1/' as select * from t1.plain_table;

create table t1.json_2 row format serde 
"org.apache.hadoop.hive.serde2.DelimitedJSONSerDe" location 
'/user/stanley/test-data-2/' as select * from t1.plain_table;

The second command succeeds, but the first one fails with a 
ClassNotFoundException. However, if I put myjar.jar in $HIVE_HOME/lib, both 
commands succeed. I went through the code of 
org.apache.hadoop.hive.ql.plan.PlanUtils.java; it seems to use 
Class.forName(clzname) to load the class. I think it should use the thread 
context classloader instead; am I right? There's a similar issue here: 
https://issues.apache.org/jira/browse/HIVE-6495
Here's the exception trace:

java.lang.ClassNotFoundException: com.ebay.p13n.hive.bexbat.serde.JsonLazySimpleSerDe
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:190)
  at org.apache.hadoop.hive.ql.plan.PlanUtils.getTableDesc(PlanUtils.java:310)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:5874)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8278)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8169)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9001)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9267)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:427)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980)
  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
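The fix the reporter suggests, preferring the thread context classloader over a
plain Class.forName call, can be sketched as follows. This is a minimal
illustration under the reporter's assumption, not the actual patch; the class
and method names here are made up.

```java
// Sketch: resolve a SerDe class via the thread context classloader first,
// falling back to the defining classloader. Jars registered at runtime with
// "add jar" are typically visible only through the context classloader, which
// is why a bare Class.forName(name) can fail with ClassNotFoundException.
public final class ClassLoading {
  public static Class<?> loadClass(String name) throws ClassNotFoundException {
    ClassLoader ctx = Thread.currentThread().getContextClassLoader();
    if (ctx != null) {
      try {
        return Class.forName(name, true, ctx);
      } catch (ClassNotFoundException e) {
        // fall through to the defining classloader below
      }
    }
    return Class.forName(name);
  }
}
```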


Diffs
-

  accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/Utils.java 16abac2 
  
accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java
 ef459aa 
  
accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/AccumuloSerDeParameters.java
 ef77697 
  common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDeHelper.java 
9f2f02f 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDeParameters.java 
a43520c 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657 
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatSplit.java 
d3d5a0f 
  
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageFactory.java
 88df982 
  
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java
 8c4bca0 
  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobState.java
 36b64da 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 3a2a6ee 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 adb50f0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 
3518edc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HivePreWarmProcessor.java 
ce3b1d6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
afe83d9 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
0ca5d22 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveRecordReader.java ede3b6e 
  ql/src/java/org/apache/hadoo

Review Request 30779: hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-02-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30779/
---

Review request for hive.


Repository: hive-git


Description
---

If you use hive.limit.query.max.table.partition to limit the number of 
partitions that can be queried, queries on non-partitioned tables fail.

Example:
{noformat}
CREATE TABLE tmp(test INT);
SELECT COUNT(*) FROM TMP; -- works fine
SET hive.limit.query.max.table.partition=20;
SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
SET hive.limit.query.max.table.partition=-1;
SELECT COUNT(*) FROM TMP; -- works fine again
{noformat}
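The shape of the guard that avoids this NPE can be sketched as below: enforce
the partition limit only when the table actually has a partition list. This is
a hypothetical simplification; the real fix touches StatsRulesProcFactory and
SemanticAnalyzer, and the names here are illustrative.

```java
import java.util.List;

public final class PartitionLimitCheck {
  // Sketch: a non-partitioned table has no partition list (null here), so the
  // limit check must be skipped rather than dereferencing the list, which is
  // the kind of unconditional access that produces the NPE in the example.
  public static void check(List<String> partitions, int maxPartitions) {
    if (maxPartitions < 0 || partitions == null) {
      return; // limit disabled, or non-partitioned table
    }
    if (partitions.size() > maxPartitions) {
      throw new IllegalStateException("Query would read " + partitions.size()
          + " partitions, but the limit is " + maxPartitions);
    }
  }
}
```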


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 d18e1a7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 6c1ab07 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 2466d78 

Diff: https://reviews.apache.org/r/30779/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-9499) hive.limit.query.max.table.partition makes queries fail on non-partitioned tables

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9499:

Attachment: HIVE-9499.3.patch.txt

Rebased to trunk

> hive.limit.query.max.table.partition makes queries fail on non-partitioned 
> tables
> -
>
> Key: HIVE-9499
> URL: https://issues.apache.org/jira/browse/HIVE-9499
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
>Reporter: Alexander Kasper
>Assignee: Navis
> Attachments: HIVE-9499.1.patch.txt, HIVE-9499.2.patch.txt, 
> HIVE-9499.3.patch.txt
>
>
> If you use hive.limit.query.max.table.partition to limit the number of 
> partitions that can be queried, queries on non-partitioned tables 
> fail.
> Example:
> {noformat}
> CREATE TABLE tmp(test INT);
> SELECT COUNT(*) FROM TMP; -- works fine
> SET hive.limit.query.max.table.partition=20;
> SELECT COUNT(*) FROM TMP; -- generates NPE (FAILED: NullPointerException null)
> SET hive.limit.query.max.table.partition=-1;
> SELECT COUNT(*) FROM TMP; -- works fine again
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9507) Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9507:

Attachment: HIVE-9507.2.patch.txt

Reattaching for test

> Make "LATERAL VIEW inline(expression) mytable" tolerant to nulls
> 
>
> Key: HIVE-9507
> URL: https://issues.apache.org/jira/browse/HIVE-9507
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, UDF
>Affects Versions: 0.14.0
> Environment: hdp 2.2
> Windows server 2012 R2 64-bit
>Reporter: Moustafa Aboul Atta
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, 
> parial_log.log
>
>
> I have tweets stored with avro on hdfs with the default twitter status 
> (tweet) schema.
> There's an object called "entities" that contains arrays of structs.
> When I run
>  
> {{SELECT mytable.*}}
> {{FROM tweets}}
> {{LATERAL VIEW INLINE(entities.media) mytable}}
> I get the exception attached as partial_log.log, however, if I add
> {{WHERE entities.media IS NOT NULL}}
> it runs perfectly.
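The behavior being requested, that inline() tolerate a null array instead of
throwing, can be sketched as follows. This is a simplified stand-in for the
actual change (which lives in Hive's UDTF code); the class and method names
are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

public final class NullTolerantInline {
  // Sketch: expand an array of structs into rows, emitting nothing when the
  // input is null (e.g. a tweet with no entities.media). This mimics what the
  // WHERE entities.media IS NOT NULL workaround achieves by filtering.
  public static List<Object[]> inline(Object[][] structs) {
    List<Object[]> rows = new ArrayList<>();
    if (structs == null) {
      return rows; // tolerate null input: forward no rows
    }
    for (Object[] struct : structs) {
      rows.add(struct);
    }
    return rows;
  }
}
```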





Review Request 30778: NPE for invalid union all

2015-02-08 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30778/
---

Review request for hive.


Bugs: HIVE-9513
https://issues.apache.org/jira/browse/HIVE-9513


Repository: hive-git


Description
---

NPE during parsing of:

{noformat}
select * from (
 select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f
  union all
 select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g
) e ;
{noformat}
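One way to turn this NPE into a meaningful error is a defensive schema check
when the union branches are resolved: the two branches above expose different
aliases (str_1 vs str_2), so "select *" expansion cannot match columns by name.
The sketch below is hypothetical and does not mirror the actual patch.

```java
import java.util.List;

public final class UnionSchemaCheck {
  // Sketch: reject mismatched branch schemas with a clear message instead of
  // dereferencing a failed column lookup.
  public static void validate(List<String> left, List<String> right) {
    if (!left.equals(right)) {
      throw new IllegalArgumentException(
          "Schema of both sides of union should match: " + left + " vs " + right);
    }
  }
}
```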


Diffs
-

  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 7c09fcc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 3a613a2 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 2466d78 
  ql/src/test/queries/clientnegative/union3.q ce65747 
  ql/src/test/queries/clientpositive/union35.q PRE-CREATION 
  ql/src/test/results/clientnegative/union3.q.out de1c62b 
  ql/src/test/results/clientpositive/union35.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/30778/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-9513) NULL POINTER EXCEPTION

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-9513:

Attachment: HIVE-9513.2.patch.txt

> NULL POINTER EXCEPTION
> --
>
> Key: HIVE-9513
> URL: https://issues.apache.org/jira/browse/HIVE-9513
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.13.1
>Reporter: ErwanMAS
>Assignee: Navis
> Attachments: HIVE-9513.1.patch.txt, HIVE-9513.2.patch.txt
>
>
> NPE during parsing of:
> {noformat}
> select * from (
>  select * from ( select 1 as id , "foo" as str_1 from staging.dual ) f
>   union   all
>  select * from ( select 2 as id , "bar" as str_2 from staging.dual ) g
> ) e ;
> {noformat}





[jira] [Updated] (HIVE-8119) Implement Date in ParquetSerde

2015-02-08 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-8119:

Attachment: HIVE-8119.1.patch

Patch v1. A slight change to import class DateWritable.

> Implement Date in ParquetSerde
> --
>
> Key: HIVE-8119
> URL: https://issues.apache.org/jira/browse/HIVE-8119
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Brock Noland
>Assignee: Mohit Sabharwal
> Attachments: HIVE-8119.1.patch, HIVE-8119.patch
>
>
> Date type in Parquet is discussed here: 
> http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E





[jira] [Updated] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9611:
---
Affects Version/s: spark-branch

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.





[jira] [Created] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Brock Noland (JIRA)
Brock Noland created HIVE-9611:
--

 Summary: Allow SPARK_HOME as well as spark.home to define sparks 
location
 Key: HIVE-9611
 URL: https://issues.apache.org/jira/browse/HIVE-9611
 Project: Hive
  Issue Type: Bug
  Components: Spark
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
 Attachments: HIVE-9611.patch

Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We should 
allow {{SPARK_HOME}} as well.





[jira] [Updated] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9611:
---
Attachment: HIVE-9611.patch

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.





[jira] [Commented] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311696#comment-14311696
 ] 

Brock Noland commented on HIVE-9611:


FYI [~xuefuz]

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: spark-branch, 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.





[jira] [Updated] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9611:
---
Affects Version/s: 1.1.0
   Status: Patch Available  (was: Open)

> Allow SPARK_HOME as well as spark.home to define sparks location
> 
>
> Key: HIVE-9611
> URL: https://issues.apache.org/jira/browse/HIVE-9611
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9611.patch
>
>
> Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We 
> should allow {{SPARK_HOME}} as well.





[jira] [Updated] (HIVE-9610) Continuation of HIVE-9438 - The standalone-jdbc jar missing some classes

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9610:
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   Status: Resolved  (was: Patch Available)

Thank you Vaibhav! I committed this to trunk and branch1.1

> Continuation of HIVE-9438 - The standalone-jdbc jar missing some classes
> 
>
> Key: HIVE-9610
> URL: https://issues.apache.org/jira/browse/HIVE-9610
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0
>
> Attachments: HIVE-9610.patch
>
>
> We've not had success including only specific shim classes in the 
> standalone jdbc jar. Since all the shim classes together shouldn't be too 
> large, we'll include them all.





[jira] [Commented] (HIVE-9432) CBO (Calcite Return Path): Removing QB from ParseContext

2015-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311685#comment-14311685
 ] 

Hive QA commented on HIVE-9432:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12697368/HIVE-9432.03.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7526 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2710/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2710/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2710/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12697368 - PreCommit-HIVE-TRUNK-Build

> CBO (Calcite Return Path): Removing QB from ParseContext
> 
>
> Key: HIVE-9432
> URL: https://issues.apache.org/jira/browse/HIVE-9432
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9432.01.patch, HIVE-9432.02.patch, 
> HIVE-9432.03.patch, HIVE-9432.patch
>
>






[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: (was: HIVE-2573.14.patch.txt)

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.





[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.





Re: Review Request 26854: HIVE-2573 Create per-session function registry

2015-02-08 Thread Navis Ryu


> On Nov. 14, 2014, 9:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g, line 1626
> > 
> >
> > The message here should be "reload function statement"

Fixed


> On Nov. 14, 2014, 9:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/session/SessionConf.java, line 24
> > 
> >
> > What about the idea of moving static call to resolveFunctions() to 
> > SessionState? I thought that would remove the need for SessionConf, because 
> > then Hive class would once again be usable during query runtime. Unless you 
> > think it's cleaner to use SessionConf to get HiveConf rather than the Hive 
> > object.

Done


> On Nov. 14, 2014, 9:24 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java, line 105
> > 
> >
> > Ok, so this allows the persistent function list to be reloaded, with an 
> > explicit RELOAD command. This should work for now, I suppose it's always 
> > possible to add more automatic refreshing of functions later on if folks 
> > want that.

Yes, let's do that in a follow-up issue.


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/#review61331
---


On Nov. 13, 2014, 10:15 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26854/
> ---
> 
> (Updated Nov. 13, 2014, 10:15 p.m.)
> 
> 
> Review request for hive, Navis Ryu and Thejas Nair.
> 
> 
> Bugs: HIVE-2573
> https://issues.apache.org/jira/browse/HIVE-2573
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Small updates to Navis' changes:
> - session registry doesn't lookup metastore for UDFs
> - my feedback from Navis' original patch
> - metastore udfs should not be considered native. This allows them to be 
> added/removed from registry
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 9aa917c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java cca57d2 
>   contrib/src/test/results/clientnegative/invalid_row_sequence.q.out 8f3c0b3 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  88b0791 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 292c83c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 6323387 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 913288f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 18e40b3 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 13277a9 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 211ab6c 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IndexUtils.java e2768ff 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java
>  7f52c29 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
> 4b2a81a 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
> 22e5b47 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f1365fa 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IndexUpdater.java 2b239ab 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
> 6962ee9 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FunctionWork.java f968bc1 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ReloadFunctionDesc.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionConf.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2806bd1 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 
> 46f8052 
>   ql/src/test/queries/clientnegative/drop_native_udf.q ae047bb 
>   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
> c7405ed 
>   ql/src/test/results/clientnegative/create_function_nonudf_class.q.out 
> d0dd50a 
>   ql/src/test/results/clientnegat

[jira] [Updated] (HIVE-2573) Create per-session function registry

2015-02-08 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2573:

Attachment: HIVE-2573.14.patch.txt

Forgot this for a long time. Rebased to trunk.

> Create per-session function registry 
> -
>
> Key: HIVE-2573
> URL: https://issues.apache.org/jira/browse/HIVE-2573
> Project: Hive
>  Issue Type: Improvement
>  Components: Server Infrastructure
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, 
> HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, 
> HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, 
> HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, 
> HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, 
> HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt
>
>
> Currently the function registry is a shared resource and could be overridden 
> by other users when using HiveServer. If a per-session function registry is 
> provided, this situation could be prevented.





[jira] [Updated] (HIVE-9432) CBO (Calcite Return Path): Removing QB from ParseContext

2015-02-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9432:
--
Attachment: HIVE-9432.03.patch

> CBO (Calcite Return Path): Removing QB from ParseContext
> 
>
> Key: HIVE-9432
> URL: https://issues.apache.org/jira/browse/HIVE-9432
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9432.01.patch, HIVE-9432.02.patch, 
> HIVE-9432.03.patch, HIVE-9432.patch
>
>






[jira] [Updated] (HIVE-9432) CBO (Calcite Return Path): Removing QB from ParseContext

2015-02-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9432:
--
Attachment: (was: HIVE-9432.03.patch)

> CBO (Calcite Return Path): Removing QB from ParseContext
> 
>
> Key: HIVE-9432
> URL: https://issues.apache.org/jira/browse/HIVE-9432
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9432.01.patch, HIVE-9432.02.patch, HIVE-9432.patch
>
>






[jira] [Updated] (HIVE-9432) CBO (Calcite Return Path): Removing QB from ParseContext

2015-02-08 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9432:
--
Attachment: HIVE-9432.03.patch

> CBO (Calcite Return Path): Removing QB from ParseContext
> 
>
> Key: HIVE-9432
> URL: https://issues.apache.org/jira/browse/HIVE-9432
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 0.15.0
>
> Attachments: HIVE-9432.01.patch, HIVE-9432.02.patch, 
> HIVE-9432.03.patch, HIVE-9432.patch
>
>






[jira] [Resolved] (HIVE-8493) Access to HiveClientCache$CacheableHiveMetaStoreClient#isClosed should be synchronized

2015-02-08 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HIVE-8493.
--
Resolution: Later

> Access to HiveClientCache$CacheableHiveMetaStoreClient#isClosed should be 
> synchronized
> --
>
> Key: HIVE-8493
> URL: https://issues.apache.org/jira/browse/HIVE-8493
> Project: Hive
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> Update to isClosed is protected by synchronizing on 
> CacheableHiveMetaStoreClient.this
> {code}
> public boolean isClosed() {
>   return isClosed;
> }
> {code}
> The above access should be synchronized as well.
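The requested change is to make the read synchronize on the same monitor as
the writes, so the flag's latest value is visible across threads. A minimal
sketch, with an illustrative class name rather than the real
CacheableHiveMetaStoreClient:

```java
public final class CacheableClient {
  private boolean isClosed = false;

  // Writer: synchronizes on this instance.
  public synchronized void close() {
    isClosed = true;
  }

  // Reader: synchronizing on the same monitor guarantees visibility of the
  // write above. For a plain flag, a volatile field would be the
  // lighter-weight alternative.
  public synchronized boolean isClosed() {
    return isClosed;
  }
}
```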





Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Jimmy Xiang


> On Feb. 7, 2015, 6:38 p.m., Xuefu Zhang wrote:
> > High-level comments:
> > 1. File name convention needs to take care of multiple threads doing the 
> > same thing in a single JVM.
> > 2. Avoid creating Tuple2 objects. There is no need to have this object, but 
> > we create them just to wrap key/value, likely increasing GC pressure.
> > 3. Disk writing can be more performant. We should be able to put 
> > keys/values as bytes in a large byte[] (as the buffer). Then when we need to 
> > spill, we write the whole byte[] to disk in one shot.
> > 4. Related to #3, but I'm wondering why we need kryo at all.
> > 5. We need to take care of file closing in case of exceptions. The call 
> > usually goes into a finally block.
> > 6. It's better to allocate the buffer in the beginning and fail right away 
> > in case of OOM. We don't want the job to fail when only the last row needs 
> > to spill and incurs OOM then.

1. As in the javadoc, HiveKVResultCache is not thread safe; each 
HiveBaseFunctionResultList instance should have its own cache. Any suggestion 
as to how to change the file name?
2. next() returns a Tuple2, so we need to create a Tuple2 before returning the 
pair. Are you saying we put the pair in the buffer without creating a Tuple2?
3. Output has a buffer itself. With the current approach, we don't need to 
create another buffer.
4. Kryo is not used here. We just use Input and Output. The reason is that 
they both have buffers inside. Input has a way to tell if we are done with the 
file before trying to read from it.
5. Good point, will fix that.
6. The buffer itself doesn't need much memory. Yes, we can pre-allocate it. 
Will fix that.
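The buffer-then-spill behavior under discussion can be sketched, independently
of Kryo, as an in-memory list that flushes to a temp file once it reaches a
threshold. This is a simplified assumption-laden stand-in; the real
HiveKVResultCache stores key/value bytes and reads them back with Kryo's
Input/Output streams.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch: cache rows in memory until a threshold, then spill the whole buffer
// to disk in one shot, closing the file handle even on exceptions
// (review points 3 and 5 above).
public final class SpillingCache {
  private final int threshold;
  private final List<String> buffer = new ArrayList<>();
  private File spillFile; // created lazily, only if we actually spill

  public SpillingCache(int threshold) {
    this.threshold = threshold;
  }

  public void add(String row) throws IOException {
    buffer.add(row);
    if (buffer.size() >= threshold) {
      spill();
    }
  }

  public boolean spilled() {
    return spillFile != null;
  }

  private void spill() throws IOException {
    if (spillFile == null) {
      spillFile = File.createTempFile("hive-kv-cache", ".tmp");
      spillFile.deleteOnExit();
    }
    // try-with-resources guarantees the writer is closed on exceptions
    try (BufferedWriter out = new BufferedWriter(new FileWriter(spillFile, true))) {
      for (String row : buffer) {
        out.write(row);
        out.newLine();
      }
    }
    buffer.clear();
  }
}
```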


- Jimmy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/#review71546
---


On Feb. 7, 2015, 3:09 a.m., Jimmy Xiang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30739/
> ---
> 
> (Updated Feb. 7, 2015, 3:09 a.m.)
> 
> 
> Review request for hive, Rui Li and Xuefu Zhang.
> 
> 
> Bugs: HIVE-9574
> https://issues.apache.org/jira/browse/HIVE-9574
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The result KV cache doesn't use RowContainer any more, since it has logic we 
> don't need, which adds overhead. We don't do lazy computing right away; 
> instead we wait a little until the cache is close to spilling.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  78ab680 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
> 8ead0cb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 
> 7a09b4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java
>  e92e299 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 
> 070ea4d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java
>  d4ff37c 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 
> 286816b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
> 0df4598 
> 
> Diff: https://reviews.apache.org/r/30739/diff/
> 
> 
> Testing
> ---
> 
> Unit test, test on cluster
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>



[jira] [Updated] (HIVE-9608) Define SPARK_HOME if not defined automagically

2015-02-08 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-9608:
---
   Resolution: Fixed
Fix Version/s: 1.1.0
   Status: Resolved  (was: Patch Available)

Thank you Xuefu!

> Define SPARK_HOME if not defined automagically
> --
>
> Key: HIVE-9608
> URL: https://issues.apache.org/jira/browse/HIVE-9608
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Fix For: 1.1.0
>
> Attachments: HIVE-9608.patch, HIVE-9608.patch
>
>
> many hadoop installs are in {{dir/\{spark,hive,hadoop,..\}}}. We can infer 
> {{SPARK_HOME}} in these cases.





[jira] [Commented] (HIVE-9608) Define SPARK_HOME if not defined automagically

2015-02-08 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311422#comment-14311422
 ] 

Brock Noland commented on HIVE-9608:


I believe that error was related to the disk filling up. I re-ran the test 
locally and it passed.

> Define SPARK_HOME if not defined automagically
> --
>
> Key: HIVE-9608
> URL: https://issues.apache.org/jira/browse/HIVE-9608
> Project: Hive
>  Issue Type: Improvement
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Attachments: HIVE-9608.patch, HIVE-9608.patch
>
>
> Many Hadoop installs are laid out as {{dir/\{spark,hive,hadoop,..\}}}. We can 
> infer {{SPARK_HOME}} in these cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Hive QA error - No space left on device

2015-02-08 Thread Brock Noland
Cleaned up.

On Fri, Feb 6, 2015 at 6:21 PM, Alexander Pivovarov wrote:

> https://issues.apache.org/jira/browse/HIVE-9594
>
>
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2690/console
>


[jira] [Commented] (HIVE-9566) HiveServer2 fails to start with NullPointerException

2015-02-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311385#comment-14311385
 ] 

Xuefu Zhang commented on HIVE-9566:
---

Thanks for taking care of this. Just for my understanding, why would the conf 
object be null in this case? Does the problem happen every time, or only in a 
special case? Thanks.

> HiveServer2 fails to start with NullPointerException
> 
>
> Key: HIVE-9566
> URL: https://issues.apache.org/jira/browse/HIVE-9566
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.13.0, 0.14.0, 0.13.1
>Reporter: Na Yang
>Assignee: Na Yang
> Attachments: HIVE-9566-branch-0.13.patch, 
> HIVE-9566-branch-0.14.patch, HIVE-9566-trunk.patch
>
>
> HiveServer2 uses an embedded metastore with the default hive-site.xml 
> configuration. I used the "hive --stop --service hiveserver2" command to stop 
> the running HiveServer2 process and then the "hive --start --service 
> hiveserver2" command to start it again. I see the following exception in the 
> hive.log file:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:104)
> at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:138)
> at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:171)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
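The stack trace above shows cleanup code running against state that was never initialized. A minimal sketch of the defensive pattern that avoids this kind of NPE follows; the class and field names are invented stand-ins, not the actual HiveServer2 code or the attached patch.

```java
public class ServerStopSketch {
    // Stand-in for a resource that init() would normally create; it stays
    // null when startup fails before init() completes, which is the
    // situation the stack trace above hits.
    private Object webServer;

    // Returns true if cleanup actually ran, false if there was nothing
    // initialized to clean up.
    public boolean stop() {
        // Defensive check: stop() may run before init() ever succeeded,
        // so every field touched here can still be null.
        if (webServer == null) {
            return false;  // nothing was initialized; skip cleanup
        }
        // real cleanup (e.g. stopping the web server) would go here
        return true;
    }

    public static void main(String[] args) {
        ServerStopSketch server = new ServerStopSketch();
        // Without the null check this path would throw NullPointerException.
        System.out.println(server.stop() ? "cleaned up" : "nothing to clean up");
    }
}
```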


Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/
---

(Updated Feb. 6, 2015, 10:43 p.m.)


Review request for hive, Rui Li and Xuefu Zhang.


Changes
---

Removed aboutToSpill


Bugs: HIVE-9574
https://issues.apache.org/jira/browse/HIVE-9574


Repository: hive-git


Description
---

The result KV cache no longer uses RowContainer, since RowContainer carries 
logic we don't need and adds overhead. We don't start lazy computing right 
away; instead we wait until the cache is close to spilling.
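As a rough illustration of the "wait until close to spill" idea: pull rows into an in-memory buffer and only stop (where the real cache would spill to disk and deferred work would kick in) once a threshold is reached. The class, threshold, and types below are invented for the sketch and are not the actual HiveKVResultCache code.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

public class LazyResultBufferSketch {
    // Illustrative in-memory limit; the real cache uses its own threshold
    // before spilling rows to disk.
    static final int SPILL_THRESHOLD = 3;

    // Pull rows from the upstream iterator until the buffer approaches the
    // spill point; the deferred (lazy) work only matters once we get here.
    static int fill(Iterator<Integer> upstream, Queue<Integer> buffer) {
        while (upstream.hasNext() && buffer.size() < SPILL_THRESHOLD) {
            buffer.add(upstream.next());
        }
        return buffer.size();  // rows held in memory at the stop point
    }

    public static void main(String[] args) {
        Queue<Integer> buf = new ArrayDeque<>();
        int held = fill(List.of(1, 2, 3, 4, 5).iterator(), buf);
        System.out.println(held);  // stops at the threshold: 3
    }
}
```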


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 78ab680 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 8ead0cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 7a09b4d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java e92e299 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 070ea4d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java d4ff37c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 286816b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 0df4598 

Diff: https://reviews.apache.org/r/30739/diff/


Testing
---

Unit test, test on cluster


Thanks,

Jimmy Xiang



Review Request 30757: HIVE-9607 Remove unnecessary attach-jdbc-driver execution from package/pom.xml

2015-02-08 Thread Alexander Pivovarov

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30757/
---

Review request for hive and Jason Dere.


Bugs: HIVE-9607
https://issues.apache.org/jira/browse/HIVE-9607


Repository: hive-git


Description
---

HIVE-9607 Remove unnecessary attach-jdbc-driver execution from package/pom.xml


Diffs
-

  packaging/pom.xml 396ae3ce150747cff3070ba43c345fcd3cf6f4c5 

Diff: https://reviews.apache.org/r/30757/diff/


Testing
---


Thanks,

Alexander Pivovarov



Hive QA error - No space left on device

2015-02-08 Thread Alexander Pivovarov
https://issues.apache.org/jira/browse/HIVE-9594

http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2690/console


Re: Confluence documentation error

2015-02-08 Thread Lefty Leverenz
Good catch, Philippe.  TypedBytesRecordReader has been in 
org.apache.hadoop.hive.contrib.util.typedbytes.* since release 0.5, so I've 
fixed the wikidoc.  (Releases prior to 0.5 didn't include 
TypedBytesRecordReader.)

Thanks.

-- Lefty Leverenz

On Wed, Feb 4, 2015 at 3:43 PM, Philippe Kernévez wrote:

> Hi,
>
> On the page
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform
> Example #2 suggests using
> RECORDREADER 'org.apache.hadoop.hive.ql.exec.TypedBytesRecordReader'
>
> It seems that the class moved to :
> org.apache.hadoop.hive.contrib.util.typedbytes.TypedBytesRecordReader
>
> I don't have right to change it myself (pkernevez).
>
> Regards,
> Philippe
>


Re: Review Request 30739: HIVE-9574 Lazy computing in HiveBaseFunctionResultList may hurt performance [Spark Branch]

2015-02-08 Thread Jimmy Xiang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30739/
---

(Updated Feb. 7, 2015, midnight)


Review request for hive, Rui Li and Xuefu Zhang.


Bugs: HIVE-9574
https://issues.apache.org/jira/browse/HIVE-9574


Repository: hive-git


Description
---

The result KV cache no longer uses RowContainer, since RowContainer carries 
logic we don't need and adds overhead. We don't start lazy computing right 
away; instead we wait until the cache is close to spilling.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 78ab680 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 8ead0cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunction.java 7a09b4d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveMapFunctionResultList.java e92e299 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunction.java 070ea4d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveReduceFunctionResultList.java d4ff37c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/KryoSerializer.java 286816b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 0df4598 

Diff: https://reviews.apache.org/r/30739/diff/


Testing
---

Unit test, test on cluster


Thanks,

Jimmy Xiang


