[jira] [Commented] (HIVE-10120) Disallow create table with dot/colon in column name

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385159#comment-14385159
 ] 

Hive QA commented on HIVE-10120:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707863/HIVE-10120.01.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8673 tests executed
*Failed tests:*
{noformat}
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3190/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3190/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3190/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707863 - PreCommit-HIVE-TRUNK-Build

> Disallow create table with dot/colon in column name
> ---
>
> Key: HIVE-10120
> URL: https://issues.apache.org/jira/browse/HIVE-10120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10120.01.patch
>
>
> Since we don't allow users to query column names with a dot in the middle, such 
> as emp.no, we should not allow users to create tables with such columns, since 
> they cannot be queried. Fix the documentation to reflect this change.
> Here is an example. Consider this table:
> {code}
> CREATE TABLE a (`emp.no` string);
> SELECT `emp.no` FROM a;
> -- fails with this message:
> FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
> from [0:emp.no]
> {code}
> The Hive documentation needs to be fixed:
> {code}
>  (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
> to indicate that any Unicode character can go between the backticks in the 
> select statement, but it doesn’t like the dot/colon or even select * when 
> there is a column that has a dot/colon. 
> {code}
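As an illustration of the kind of check the issue asks for, here is a hypothetical sketch; the class, method, and error text are invented for the example and are not the actual HIVE-10120 change:

{code}
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: reject column names containing '.' or ':'
// at CREATE TABLE time, since such columns cannot be referenced afterwards.
public class ColumnNameCheckSketch {
    static void validateColumnNames(List<String> columnNames) {
        for (String name : columnNames) {
            if (name.contains(".") || name.contains(":")) {
                throw new IllegalArgumentException(
                    "Invalid column name '" + name + "': '.' and ':' are not allowed");
            }
        }
    }

    public static void main(String[] args) {
        validateColumnNames(Arrays.asList("emp_no", "name"));  // passes
        validateColumnNames(Arrays.asList("emp.no"));          // throws IllegalArgumentException
    }
}
{code}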



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10132) LLAP: Tez heartbeats are delayed by ~500+ ms due to Hadoop IPC client

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10132:
---
Attachment: HIVE-10132.NOT.A.patch

This turns off the client cache, which uses the socket factory object as the 
key.

But it is not the right fix.

> LLAP: Tez heartbeats are delayed by ~500+ ms due to Hadoop IPC client
> -
>
> Key: HIVE-10132
> URL: https://issues.apache.org/jira/browse/HIVE-10132
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Siddharth Seth
> Attachments: HIVE-10132.NOT.A.patch
>
>
> HADOOP-11772 has a clearer bug report of the core issue inside hadoop-common.
> Due to the delayed heartbeats reaching the AM, the reducers are losing up to 
> a couple of seconds for a 60ms (x10 parallel) mapper + 300ms reducer, instead 
> of finishing the query in under a second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10000) 10000 whoooohooo

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385146#comment-14385146
 ] 

Gunther Hagleitner commented on HIVE-10000:
---

+1

> 10000 whoooohooo
> 
>
> Key: HIVE-10000
> URL: https://issues.apache.org/jira/browse/HIVE-10000
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Damien Carol
>
> {noformat}
> [ASCII-art banner celebrating issue number 10000]
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10130:
---
Attachment: HIVE-10130.2-spark.patch

> Merge from Spark branch to trunk 03/27/2015
> ---
>
> Key: HIVE-10130
> URL: https://issues.apache.org/jira/browse/HIVE-10130
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-10130.1-spark.patch, HIVE-10130.2-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10132) LLAP: Tez heartbeats are delayed by ~500+ ms due to Hadoop IPC client

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10132:
---
Assignee: Siddharth Seth

> LLAP: Tez heartbeats are delayed by ~500+ ms due to Hadoop IPC client
> -
>
> Key: HIVE-10132
> URL: https://issues.apache.org/jira/browse/HIVE-10132
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Siddharth Seth
>
> HADOOP-11772 has a clearer bug report of the core issue inside hadoop-common.
> Due to the delayed heartbeats reaching the AM, the reducers are losing up to 
> a couple of seconds for a 60ms (x10 parallel) mapper + 300ms reducer, instead 
> of finishing the query in under a second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385130#comment-14385130
 ] 

Hive QA commented on HIVE-9518:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707843/HIVE-9518.8.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8674 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3189/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3189/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3189/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707843 - PreCommit-HIVE-TRUNK-Build

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch, 
> HIVE-9518.8.patch
>
>
> This is used to track work to build an Oracle-like months_between. Here are the 
> semantics:
> MONTHS_BETWEEN returns the number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in the time 
> components of date1 and date2.
> Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' 
> or 'yyyy-MM-dd HH:mm:ss'.
> The result should be rounded to 8 decimal places.
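To make those rules concrete, here is a small, self-contained Java sketch of the arithmetic described above; it is illustrative only, not the UDF from the attached patches, and the class and method names are invented for the example:

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class MonthsBetweenSketch {

    // Follows the stated rules: integer result when both dates share the same
    // day-of-month or are both month-ends; otherwise a fraction over a 31-day
    // month that includes the time-of-day difference; rounded to 8 decimals.
    static double monthsBetween(LocalDateTime d1, LocalDateTime d2) {
        boolean sameDayOfMonth = d1.getDayOfMonth() == d2.getDayOfMonth();
        boolean bothLastDays = d1.getDayOfMonth() == d1.toLocalDate().lengthOfMonth()
                            && d2.getDayOfMonth() == d2.toLocalDate().lengthOfMonth();
        // Whole months between the first days of the two months (signed).
        long wholeMonths = ChronoUnit.MONTHS.between(
                d2.toLocalDate().withDayOfMonth(1), d1.toLocalDate().withDayOfMonth(1));
        double result;
        if (sameDayOfMonth || bothLastDays) {
            result = wholeMonths;
        } else {
            // Remaining days plus time-of-day, over a 31-day month.
            double dayDiff = (d1.getDayOfMonth() - d2.getDayOfMonth())
                    + (d1.toLocalTime().toSecondOfDay() - d2.toLocalTime().toSecondOfDay()) / 86400.0;
            result = wholeMonths + dayDiff / 31.0;
        }
        return BigDecimal.valueOf(result).setScale(8, RoundingMode.HALF_UP).doubleValue();
    }

    public static void main(String[] args) {
        // Expected 3.94959677: 4 first-of-month steps plus (28 - 30 + 10.5/24)/31.
        System.out.println(monthsBetween(
                LocalDateTime.of(1997, 2, 28, 10, 30),
                LocalDateTime.of(1996, 10, 30, 0, 0)));
    }
}
{code}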



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-9780:
--
Attachment: HIVE-9780.06.patch

Following [~mmokhtar]'s suggestion, added CBO info to this explain. cc'ing 
[~jpullokkaran].

> Add another level of explain for RDBMS audience
> ---
>
> Key: HIVE-9780
> URL: https://issues.apache.org/jira/browse/HIVE-9780
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, 
> HIVE-9780.03.patch, HIVE-9780.04.patch, HIVE-9780.05.patch, HIVE-9780.06.patch
>
>
> Current Hive explain (default) is targeted at an MR audience. We need a new 
> level of explain plan targeted at an RDBMS audience. This explain should 
> satisfy the following:
> 1) The focus needs to be on what part of the query is being executed rather 
> than on the internals of the engines
> 2) There needs to be a clearly readable tree of operations
> 3) Examples - A table scan should mention the table being scanned, the Sarg, 
> the size of the table and the expected cardinality after the Sarg'ed read. A join 
> should mention the table being joined with and the join condition. An 
> aggregate should mention the columns in the group-by.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10128:
---
Attachment: hashmap-after.png

Fix looks good, the hashmap sync points have gone away from the inner loops.

!hashmap-after.png!

The orange sections are the IO elevator lagging behind the map-join operator.

Found other init-time lock sections, will file more JIRAs.

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10128.patch, hashmap-after.png, 
> hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.
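Since the hashtable is read-only once loaded, the general pattern behind such a fix is to move the read position out of the shared structure and into a per-probe cursor. Below is a minimal sketch of that idea; all names are illustrative, and none of this is Hive's actual BytesBytesMultiHashMap/WriteBuffers code:

{code}
// Illustrative pattern: keep the sealed byte store immutable and put the read
// position into a per-reader cursor, so parallel probes need no synchronization.
public class PerReaderCursorSketch {

    static final class SealedStore {
        private final byte[] data;                 // written once, then read-only
        SealedStore(byte[] data) { this.data = data; }

        static final class Cursor { int pos; }     // all mutable read state lives here

        // Reads a 1-byte-length-prefixed value at 'offset', advancing only the
        // caller's cursor; no shared field is mutated.
        byte[] read(Cursor c, int offset) {
            c.pos = offset;
            int len = data[c.pos++] & 0xFF;
            byte[] out = new byte[len];
            System.arraycopy(data, c.pos, out, 0, len);
            c.pos += len;
            return out;
        }
    }

    public static void main(String[] args) throws Exception {
        // Two length-prefixed values: "ab" at offset 0, "xyz" at offset 3.
        SealedStore store = new SealedStore(new byte[]{2, 'a', 'b', 3, 'x', 'y', 'z'});
        Runnable r1 = () -> System.out.println(new String(store.read(new SealedStore.Cursor(), 0)));
        Runnable r2 = () -> System.out.println(new String(store.read(new SealedStore.Cursor(), 3)));
        Thread t1 = new Thread(r1), t2 = new Thread(r2);
        t1.start(); t2.start(); t1.join(); t2.join();
    }
}
{code}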



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385096#comment-14385096
 ] 

Gopal V commented on HIVE-10128:


Setting the log level to WARN is not really a fix - as long as we log at 
INFO level, we have synchronization issues unless we switch to async 
logging.

Will gladly +1 if you have a patch that adds a {{--loglevel}} option to 
{{hive --service llap}}, which would avoid having to edit the JSON configs 
before running run.sh.

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
> hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385078#comment-14385078
 ] 

Hive QA commented on HIVE-10066:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707814/HIVE-10066.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3188/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3188/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3188/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707814 - PreCommit-HIVE-TRUNK-Build

> Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
> -
>
> Key: HIVE-10066
> URL: https://issues.apache.org/jira/browse/HIVE-10066
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-10066.2.patch, HIVE-10066.3.patch, HIVE-10066.patch
>
>
> From [~hitesh]:
> "Tez is a client-side only component ( no daemons, etc ) and therefore it is 
> meant to be installed on the gateway box ( or where its client libraries are 
> needed by any other services’ daemons). It does not have any cluster 
> dependencies both in terms of libraries/jars as well as configs. When it runs 
> on a worker node, everything was pre-packaged and made available to the 
> worker node via the distributed cache via the client code. Hence, its 
> client-side configs are also only needed on the same (client) node as where 
> it is installed. The only other install step needed is to have the tez 
> tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which 
> points to the HDFS path. "
> We need a way to pass client jars and tez-site.xml to the LaunchMapper.
> We should create a general purpose mechanism here which can supply additional 
> artifacts per job type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: HIVE-10123.02.patch

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch, HIVE-10123.02.patch
>
>
> Hybrid grace hash join is not using the estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap.
> Also add some logging to BytesBytesMultiHashMap to track get probes, and use 
> milliseconds for the expandAndRehash timing, since microseconds overflow.
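As a rough sketch of the sizing idea (a hypothetical helper, not the code in the attached patches): derive the initial power-of-two capacity from the stats-estimated key count and a load factor, so the build side rarely needs to expandAndRehash while loading.

{code}
public class HashMapSizingSketch {

    // Hypothetical helper: pick an initial power-of-two capacity from the
    // estimated key count so the table rarely needs to expand and rehash.
    static int initialCapacity(long estimatedKeys, float loadFactor, int maxCapacity) {
        long needed = (long) Math.ceil(estimatedKeys / loadFactor);
        if (needed >= maxCapacity) {
            return maxCapacity;                      // clamp very large estimates
        }
        int cap = Integer.highestOneBit((int) needed);
        if (cap < needed) {
            cap <<= 1;                               // round up to the next power of two
        }
        return Math.max(1, cap);
    }

    public static void main(String[] args) {
        // e.g. stats estimate ~1.5M build-side keys with a 0.75 load factor
        System.out.println(initialCapacity(1_500_000L, 0.75f, 1 << 30));   // 2097152
    }
}
{code}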



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385067#comment-14385067
 ] 

Hive QA commented on HIVE-10130:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707946/HIVE-10130.1-spark.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 8709 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_22
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_6_subq
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/808/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/808/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-808/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707946 - PreCommit-HIVE-SPARK-Build

> Merge from Spark branch to trunk 03/27/2015
> ---
>
> Key: HIVE-10130
> URL: https://issues.apache.org/jira/browse/HIVE-10130
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-10130.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385062#comment-14385062
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

As for logging, I have a patch for that that someone needs to +1 ;)

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
> hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385060#comment-14385060
 ] 

Sergey Shelukhin edited comment on HIVE-10128 at 3/28/15 2:21 AM:
--

Git is down, so the merge won't propagate. I will attach the patch here for now; 
to commit I need to merge, and I cannot deal with SVN any more today.


was (Author: sershe):
Git is down, so the merge won't propagate. I will attach the patch here for now; 
to commit I need to merge, and I cannot deal with SVN anymore.

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
> hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10128:

Attachment: HIVE-10128.patch

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10128.patch, hashmap-sync-source.png, 
> hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385060#comment-14385060
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

Git is down, so the merge won't propagate. I will attach the patch here for now; 
to commit I need to merge, and I cannot deal with SVN anymore.

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385042#comment-14385042
 ] 

Aihua Xu commented on HIVE-10093:
-

Thanks Szehon.

> Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
> -
>
> Key: HIVE-10093
> URL: https://issues.apache.org/jira/browse/HIVE-10093
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-10093.patch
>
>
> When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler 
> unnecessarily, right before the call to 
> HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the 
> DelegationTokenStore is configured to be a MemoryTokenStore, this step is not 
> needed.
> A side effect is the creation of a useless Derby database file on HiveServer2 in 
> secure clusters, causing confusion. This step could potentially be skipped if 
> MemoryTokenStore is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10093:

Attachment: (was: HIVE-10093.patch)

> Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
> -
>
> Key: HIVE-10093
> URL: https://issues.apache.org/jira/browse/HIVE-10093
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-10093.patch
>
>
> When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler 
> unnecessarily, right before the call to 
> HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the 
> DelegationTokenStore is configured to be a MemoryTokenStore, this step is not 
> needed.
> A side effect is the creation of a useless Derby database file on HiveServer2 in 
> secure clusters, causing confusion. This step could potentially be skipped if 
> MemoryTokenStore is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10093:

Attachment: HIVE-10093.patch

> Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
> -
>
> Key: HIVE-10093
> URL: https://issues.apache.org/jira/browse/HIVE-10093
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10093.patch, HIVE-10093.patch
>
>
> When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler 
> unnecessarily, right before the call to 
> HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the 
> DelegationTokenStore is configured to be a MemoryTokenStore, this step is not 
> needed.
> A side effect is the creation of a useless Derby database file on HiveServer2 in 
> secure clusters, causing confusion. This step could potentially be skipped if 
> MemoryTokenStore is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385033#comment-14385033
 ] 

Aihua Xu commented on HIVE-10093:
-

Whoops. I included it by accident. 

> Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
> -
>
> Key: HIVE-10093
> URL: https://issues.apache.org/jira/browse/HIVE-10093
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10093.patch
>
>
> When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler 
> unnecessarily, right before the call to 
> HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the 
> DelegationTokenStore is configured to be a MemoryTokenStore, this step is not 
> needed.
> A side effect is the creation of a useless Derby database file on HiveServer2 in 
> secure clusters, causing confusion. This step could potentially be skipped if 
> MemoryTokenStore is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10093) Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2

2015-03-27 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385016#comment-14385016
 ] 

Szehon Ho commented on HIVE-10093:
--

Aihua, .reviewboardrc is modified by mistake, right? If so, I can commit the 
patch without it. Thanks

> Unnecessary HMSHandler initialization for default MemoryTokenStore on HS2
> -
>
> Key: HIVE-10093
> URL: https://issues.apache.org/jira/browse/HIVE-10093
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-10093.patch
>
>
> When the HiveAuthFactory is constructed in HS2, it initializes an HMSHandler 
> unnecessarily, right before the call to 
> HadoopThriftAuthBridge.startDelegationTokenSecretManager(). If the 
> DelegationTokenStore is configured to be a MemoryTokenStore, this step is not 
> needed.
> A side effect is the creation of a useless Derby database file on HiveServer2 in 
> secure clusters, causing confusion. This step could potentially be skipped if 
> MemoryTokenStore is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385006#comment-14385006
 ] 

Hive QA commented on HIVE-10086:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707844/HIVE-10086.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8678 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_table_with_subschema
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3187/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3187/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3187/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707844 - PreCommit-HIVE-TRUNK-Build

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.5.patch, HiveGroup.parquet
>
>
> When Hive table schema contains a portion of the schema of a Parquet file, 
> then the access to the values should work if the field names match the 
> schema. This does not work when a struct<> data type is in the schema, and 
> the Hive schema contains just a portion of the struct elements. Hive throws 
> an error instead.
> This is the example and how to reproduce:
> First, create a parquet table, and add some values on it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
>     optional int32 number;
>     optional binary street (UTF8);
>     optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, I create a table that contains just a portion of the schema, and 
> load the Parquet file generated above, a query will fail on that table:
> {code}
> CREATE TABLE test1 (name string, address struct) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10130:
---
Attachment: HIVE-10130.1-spark.patch

> Merge from Spark branch to trunk 03/27/2015
> ---
>
> Key: HIVE-10130
> URL: https://issues.apache.org/jira/browse/HIVE-10130
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-10130.1-spark.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10130) Merge from Spark branch to trunk 03/27/2015

2015-03-27 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-10130:
--

Assignee: Xuefu Zhang

> Merge from Spark branch to trunk 03/27/2015
> ---
>
> Key: HIVE-10130
> URL: https://issues.apache.org/jira/browse/HIVE-10130
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: spark-branch
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10112:
---
Assignee: Gunther Hagleitner  (was: Sergey Shelukhin)

> LLAP: query 17 tasks fail due to mapjoin issue
> --
>
> Key: HIVE-10112
> URL: https://issues.apache.org/jira/browse/HIVE-10112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Gunther Hagleitner
>
> {noformat}
> 2015-03-26 18:16:38,833 
> [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map
>  1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
> java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.<init>(WriteBuffers.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
> {noformat}
> Tasks do appear to pass on retries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10129) LLAP: Fix ordering of execution modes

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10129.
--
Resolution: Fixed

Committed to llap branch.

> LLAP: Fix ordering of execution modes
> -
>
> Key: HIVE-10129
> URL: https://issues.apache.org/jira/browse/HIVE-10129
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10129.1.patch
>
>
> Execution modes are ordered uber > llap > container. Fix the ordering in the 
> in-place update UI.
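As a hypothetical illustration (not the attached patch): the precedence uber > llap > container can be encoded in enum declaration order, and the display can simply sort by it:

{code}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExecModeOrderSketch {
    // Declaration order encodes the intended display precedence.
    enum ExecMode { UBER, LLAP, CONTAINER }

    public static void main(String[] args) {
        List<ExecMode> modes = Arrays.asList(ExecMode.CONTAINER, ExecMode.UBER, ExecMode.LLAP);
        // Enum natural ordering is declaration order, so uber sorts first.
        Collections.sort(modes);
        System.out.println(modes);   // [UBER, LLAP, CONTAINER]
    }
}
{code}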



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10129) LLAP: Fix ordering of execution modes

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10129:
-
Attachment: HIVE-10129.1.patch

> LLAP: Fix ordering of execution modes
> -
>
> Key: HIVE-10129
> URL: https://issues.apache.org/jira/browse/HIVE-10129
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10129.1.patch
>
>
> Execution modes are ordered uber > llap > container. Fix the ordering in the 
> in-place update UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V resolved HIVE-10112.

Resolution: Not a Problem

Code that triggers HIVE-10128 fixed this inconsistency issue.

The correctness issue is resolved, but the performance problem remains.

> LLAP: query 17 tasks fail due to mapjoin issue
> --
>
> Key: HIVE-10112
> URL: https://issues.apache.org/jira/browse/HIVE-10112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> {noformat}
> 2015-03-26 18:16:38,833 
> [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map
>  1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
> java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.<init>(WriteBuffers.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
> {noformat}
> Tasks do appear to pass on retries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reopened HIVE-10128:


> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384979#comment-14384979
 ] 

Gopal V commented on HIVE-10128:


[~sershe]: we need to reverse the duplicate markers.

Re-opening this one, since this came up today & isn't related to the read-only 
issue.

I have a suggestion here - how about we return a shallow copy for seal()?



> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384970#comment-14384970
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

Rather, the bugs existed before the sync blocks were added earlier today.

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384961#comment-14384961
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

Oh, sync blocks were added by Gunther

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10128.
-
Resolution: Duplicate

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384954#comment-14384954
 ] 

Sergey Shelukhin commented on HIVE-10128:
-

Yeah, we are fixing this right now. It also causes bugs aplenty. I have a 
patch, just need to finish the bloody merge before I build and test it

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10128) LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access

2015-03-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10128:
---
Attachment: hashmap-sync-source.png
hashmap-sync.png

> LLAP: BytesBytesMultiHashMap does not allow concurrent read-only access
> ---
>
> Key: HIVE-10128
> URL: https://issues.apache.org/jira/browse/HIVE-10128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: hashmap-sync-source.png, hashmap-sync.png
>
>
> The multi-threaded performance takes a serious hit when LLAP shares 
> hashtables between the probe threads running in parallel. 
> !hashmap-sync.png!
> This is an explicit synchronized block inside ReusableRowContainer which 
> triggers this particular pattern.
> !hashmap-sync-source.png!
> Looking deeper into the code, the synchronization seems to be caused due to 
> the fact that WriteBuffers.setReadPoint modifies the otherwise read-only 
> hashtable.
> To generate this sort of result, run LLAP at a WARN log-level, to avoid all 
> the log synchronization that otherwise affects the thread sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10127) LLAP: Port changes to timestamp stream reader after timezone fix in trunk

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10127:
-
Affects Version/s: llap

> LLAP: Port changes to timestamp stream reader after timezone fix in trunk
> -
>
> Key: HIVE-10127
> URL: https://issues.apache.org/jira/browse/HIVE-10127
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Timezone fix changes in trunk (HIVE-8746) needs changes to llap stream 
> readers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10126) Upgrade Tez dependency to the latest released version

2015-03-27 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-10126:
---
Summary: Upgrade Tez dependency to the latest released version  (was: 
upgrade Tez dependency to the the latest released version)

> Upgrade Tez dependency to the latest released version
> -
>
> Key: HIVE-10126
> URL: https://issues.apache.org/jira/browse/HIVE-10126
> Project: Hive
>  Issue Type: Bug
>Reporter: Na Yang
>
> Tez 0.6 has been released. It would be nice to upgrade the Tez dependency to 
> the latest released version. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10038) Add Calcite's ProjectMergeRule.

2015-03-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10038:

Attachment: HIVE-10038.5.patch

More fixes.

> Add Calcite's ProjectMergeRule.
> ---
>
> Key: HIVE-10038
> URL: https://issues.apache.org/jira/browse/HIVE-10038
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10038.2.patch, HIVE-10038.3.patch, 
> HIVE-10038.4.patch, HIVE-10038.5.patch, HIVE-10038.patch
>
>
> Helps to improve latency by shortening the operator pipeline. Folds adjacent 
> projections into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10122:
---

Assignee: Sergey Shelukhin

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384929#comment-14384929
 ] 

Sergey Shelukhin commented on HIVE-10122:
-

Yeah, I guess this is the part you were seeing a couple of months ago that I said 
should not happen. I'll look next week.

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10022) DFS in authorization might take too long

2015-03-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384924#comment-14384924
 ] 

Thejas M Nair commented on HIVE-10022:
--

authorization_uri_import does not fail in other test runs in precommit - 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/
 . Maybe you need a clean build after removing the patch?
Can you also please upload the patch to reviewboard?

> DFS in authorization might take too long
> 
>
> Key: HIVE-10022
> URL: https://issues.apache.org/jira/browse/HIVE-10022
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.14.0
>Reporter: Pankit Thapar
>Assignee: Pankit Thapar
> Fix For: 1.0.1
>
> Attachments: HIVE-10022.2.patch, HIVE-10022.patch
>
>
> I am testing a query like : 
> set hive.test.authz.sstd.hs2.mode=true;
> set 
> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
> set 
> hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator;
> set hive.security.authorization.enabled=true;
> set user.name=user1;
> create table auth_noupd(i int) clustered by (i) into 2 buckets stored as orc 
> location '${OUTPUT}' TBLPROPERTIES ('transactional'='true');
> Now, in the above query,  since authorization is true, 
> we would end up calling doAuthorizationV2() which ultimately ends up calling 
> SQLAuthorizationUtils.getPrivilegesFromFS() which calls a recursive method : 
> FileUtils.isActionPermittedForFileHierarchy() with the object or the ancestor 
> of the object we are trying to authorize if the object does not exist. 
> The logic in FileUtils.isActionPermittedForFileHierarchy() is DFS.
> Now assume, we have a path as a/b/c/d that we are trying to authorize.
> In case, a/b/c/d does not exist, we would call 
> FileUtils.isActionPermittedForFileHierarchy() with say a/b/ assuming a/b/c 
> also does not exist.
> If under the subtree at a/b, we have millions of files, then 
> FileUtils.isActionPermittedForFileHierarchy()  is going to check file 
> permission on each of those objects. 
> I do not completely understand why we have to check file permissions 
> on all the objects in a branch of the tree that we are not trying to read 
> from or write to.
> We could have checked the file permission on the ancestor that exists and, if it 
> matches what we expect, then return true.
> Please confirm whether this is a bug so that I can submit a patch; otherwise let me 
> know what I am missing.
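
As an aside, a minimal sketch of the alternative the report proposes, written against the 
Hadoop FileSystem API (this is not Hive's current FileUtils logic, and the owner-only 
permission check is a deliberate simplification): climb to the nearest existing ancestor 
and check the required action on that single directory, instead of recursing over the 
whole subtree underneath it.

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public final class AncestorPermissionCheck {

  /**
   * Walk up from 'path' to the nearest ancestor that exists and check whether
   * 'user' has 'action' on that one directory. This avoids visiting the
   * (possibly huge) subtree under the ancestor.
   */
  public static boolean actionPermittedOnNearestAncestor(
      FileSystem fs, Path path, String user, FsAction action) throws IOException {
    Path current = path;
    while (current != null && !fs.exists(current)) {
      current = current.getParent();          // climb until something exists
    }
    if (current == null) {
      return false;                           // nothing exists all the way up to the root
    }
    FileStatus status = fs.getFileStatus(current);
    // Simplified: owner vs. other only; a full check would also consider group membership.
    FsAction granted = user.equals(status.getOwner())
        ? status.getPermission().getUserAction()
        : status.getPermission().getOtherAction();
    return granted.implies(action);
  }
}
{code}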



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384914#comment-14384914
 ] 

Thejas M Nair commented on HIVE-10122:
--

[~mmokhtar] For the partitioned table query in these cases, I expect you to see a 
query that gets all partitions for the table, then another one that gets the 
selected set of partitions based on partition names.
I don't see this error about parsing in Hive 0.13, so I think it is a 
regression since Hive 0.14.


> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384917#comment-14384917
 ] 

Mostafa Mokhtar commented on HIVE-10122:


[~sershe]
Ok, now I understand what is going on; this is the problematic metastore query:

{code}
SELECT 
A0.PART_NAME AS NUCORDER0
FROM
PARTITIONS A0
LEFT OUTER JOIN
TBLS B0 ON A0.TBL_ID = B0.TBL_ID
LEFT OUTER JOIN
DBS C0 ON B0.DB_ID = C0.DB_ID
WHERE
C0.NAME = 'tpcds_bin_partitioned_orc_3'
AND B0.TBL_NAME = 'store_sales'
ORDER BY NUCORDER0;
{code}

Then the remaining ones use the PART_NAME and PART_ID from the previous query
{code}
select 
PARTITIONS.PART_ID
from
PARTITIONS
inner join
TBLS ON PARTITIONS.TBL_ID = TBLS.TBL_ID
and TBLS.TBL_NAME = 'store_sales'
inner join
DBS ON TBLS.DB_ID = DBS.DB_ID
and DBS.NAME = 'tpcds_bin_partitioned_orc_3'
where
PARTITIONS.PART_NAME in ('ss_sold_date_sk=2450816' , 
'ss_sold_date_sk=2450817');
{code}
{code}
select 
PARTITIONS.PART_ID,
SDS.SD_ID,
SDS.CD_ID,
SERDES.SERDE_ID,
PARTITIONS.CREATE_TIME,
PARTITIONS.LAST_ACCESS_TIME,
SDS.INPUT_FORMAT,
SDS.IS_COMPRESSED,
SDS.IS_STOREDASSUBDIRECTORIES,
SDS.LOCATION,
SDS.NUM_BUCKETS,
SDS.OUTPUT_FORMAT,
SERDES.NAME,
SERDES.SLIB
from
PARTITIONS
left outer join
SDS ON PARTITIONS.SD_ID = SDS.SD_ID
left outer join
SERDES ON SDS.SERDE_ID = SERDES.SERDE_ID
where
PART_ID in (59203 , 58422)
order by PART_NAME asc;
{code}


If filters are on the partitioned column only as in 
{code} 
select ss_item_sk  rowcount from store_sales where ss_sold_date_sk between 
2450816 and 2450817  ;
{code}

Then PARTITIONS table is queried with a partition filter.
{code}

select 
PARTITIONS.PART_ID
from
PARTITIONS
inner join
TBLS ON PARTITIONS.TBL_ID = TBLS.TBL_ID
and TBLS.TBL_NAME = 'store_sales'
inner join
DBS ON TBLS.DB_ID = DBS.DB_ID
and DBS.NAME = 'tpcds_bin_partitioned_orc_3'
inner join
PARTITION_KEY_VALS FILTER0 ON FILTER0.PART_ID = PARTITIONS.PART_ID
and FILTER0.INTEGER_IDX = 0
where
(((case
when
TBLS.TBL_NAME = 'store_sales'
and DBS.NAME = 'tpcds_bin_partitioned_orc_3'
then
cast(FILTER0.PART_KEY_VAL as decimal (21 , 0 ))
else null
end) >= 2450816)
and ((case
when
TBLS.TBL_NAME = 'store_sales'
and DBS.NAME = 'tpcds_bin_partitioned_orc_3'
then
cast(FILTER0.PART_KEY_VAL as decimal (21 , 0 ))
else null
end) <= 2450817));
{code}


For 2K partitions there is no measurable performance difference between 
{code}
 explain select ss_item_sk  rowcount from store_sales where ss_sold_date_sk 
between 2450816 and 2450817  ;
{code}

and 
{code}
explain select ss_item_sk  rowcount from store_sales where ss_sold_date_sk 
between 2450816 and 2450817  and ss_ticket_number  > 1 and ss_item_sk > 
50;
{code}

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-10028:


Assignee: Prasanth Jayachandran

> LLAP: Create a fixed size execution queue for daemons
> -
>
> Key: HIVE-10028
> URL: https://issues.apache.org/jira/browse/HIVE-10028
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
>
> Currently, this is unbounded. This should be a configurable size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10125) LLAP: Print execution modes in tez in-place UI

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10125.
--
Resolution: Fixed

Committed to llap branch

> LLAP: Print execution modes in tez in-place UI
> --
>
> Key: HIVE-10125
> URL: https://issues.apache.org/jira/browse/HIVE-10125
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10125.1.patch
>
>
> There are different execution modes "container", "llap" and "uber". Print the 
> execution mode of the work in the in-place UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10125) LLAP: Print execution modes in tez in-place UI

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10125:
-
Attachment: HIVE-10125.1.patch

> LLAP: Print execution modes in tez in-place UI
> --
>
> Key: HIVE-10125
> URL: https://issues.apache.org/jira/browse/HIVE-10125
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10125.1.patch
>
>
> There are different execution modes "container", "llap" and "uber". Print the 
> execution mode of the work in the in-place UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10106) Regression : Dynamic partition pruning not working after HIVE-9976

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384886#comment-14384886
 ] 

Hive QA commented on HIVE-10106:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707704/HIVE-10106.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3186/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3186/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3186/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707704 - PreCommit-HIVE-TRUNK-Build

> Regression : Dynamic partition pruning not working after HIVE-9976
> --
>
> Key: HIVE-10106
> URL: https://issues.apache.org/jira/browse/HIVE-10106
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Siddharth Seth
> Fix For: 1.2.0
>
> Attachments: HIVE-10106.1.patch
>
>
> After HIVE-9976 got checked in, dynamic partition pruning doesn't work.
> Partitions are pruned and later show up in splits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384880#comment-14384880
 ] 

Sergey Shelukhin commented on HIVE-10122:
-

That is stats; do you see MySQL queries against the PARTITIONS table?

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384874#comment-14384874
 ] 

Mostafa Mokhtar commented on HIVE-10122:


[~sershe] [~thejas] [~hagleitn]

I ran explain for this query 
{code}
select 
ss_item_sk rowcount
from
store_sales
where
ss_sold_date_sk between 2450816 and 2450817
and ss_ticket_number > 1
and ss_item_sk > 50;
{code}

And the query that gets issued to MySQL looks correct to me, as only the 
qualified partitions are queried.
What am I missing?

{code}
select 
COLUMN_NAME,
COLUMN_TYPE,
min(LONG_LOW_VALUE),
max(LONG_HIGH_VALUE),
min(DOUBLE_LOW_VALUE),
max(DOUBLE_HIGH_VALUE),
min(BIG_DECIMAL_LOW_VALUE),
max(BIG_DECIMAL_HIGH_VALUE),
sum(NUM_NULLS),
max(NUM_DISTINCTS),
max(AVG_COL_LEN),
max(MAX_COL_LEN),
sum(NUM_TRUES),
sum(NUM_FALSES)
from
PART_COL_STATS
where
DB_NAME = 'tpcds_bin_partitioned_orc_3'
and TABLE_NAME = 'store_sales'
and COLUMN_NAME in ('ss_item_sk' , 'ss_ticket_number')
and PARTITION_NAME in ('ss_sold_date_sk=2450816' , 
'ss_sold_date_sk=2450817')
group by COLUMN_NAME , COLUMN_TYPE
{code}

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384870#comment-14384870
 ] 

Mostafa Mokhtar commented on HIVE-10123:


[~sershe]
Thanks!
Will update the patch.

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384866#comment-14384866
 ] 

Sergey Shelukhin commented on HIVE-10123:
-

If you are changing the metric to ms, why not use System.currentTimeMillis() instead of a 
more expensive nano call?
Also, there should be a better way to expand and rehash directly to the target size, 
without iterations.
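
For illustration, a small sketch of both points with made-up names (this is not the patch): 
size the table from the estimated key count up front so it does not need to expand and 
rehash, and time the work with System.currentTimeMillis() at millisecond granularity.

{code}
import java.util.HashMap;
import java.util.Map;

final class EstimatedSizeHashTable {

  // Hypothetical stand-in for the real hash table; the point is only the sizing math.
  static <K, V> Map<K, V> create(long estimatedKeyCount, float loadFactor) {
    // Round up so the table reaches its target size without intermediate rehashes.
    long target = (long) Math.ceil(estimatedKeyCount / (double) loadFactor);
    int initialCapacity = (int) Math.min(target, Integer.MAX_VALUE - 8);
    return new HashMap<>(initialCapacity, loadFactor);
  }

  static <K, V> Map<K, V> timedCreate(long estimatedKeyCount, float loadFactor) {
    long startMs = System.currentTimeMillis();   // millisecond granularity is enough here
    Map<K, V> table = create(estimatedKeyCount, loadFactor);
    long elapsedMs = System.currentTimeMillis() - startMs;
    System.out.println("allocated table for ~" + estimatedKeyCount + " keys in " + elapsedMs + " ms");
    return table;
  }
}
{code}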

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384851#comment-14384851
 ] 

Mostafa Mokhtar commented on HIVE-10123:


[~sershe] [~wzheng]
Can you please take a look?

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10050) Support overriding memory configuration for AM launched for TempletonControllerJob

2015-03-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384850#comment-14384850
 ] 

Eugene Koifman commented on HIVE-10050:
---

+1

> Support overriding memory configuration for AM launched for 
> TempletonControllerJob
> --
>
> Key: HIVE-10050
> URL: https://issues.apache.org/jira/browse/HIVE-10050
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: HIVE-10050.1.patch, HIVE-10050.2.patch, 
> HIVE-10050.3.patch
>
>
> The MR AM launched for the TempletonControllerJob does not do any heavy 
> lifting and therefore can be configured to use a small memory footprint ( as 
> compared to potentially using the default footprint for most MR jobs on a 
> cluster ). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: (was: HIVE-10123.01.patch)

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: HIVE-10123.01.patch

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10123) Hybrid grace Hash join : Use estimate key count from stats to initialize BytesBytesMultiHashMap

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10123:
---
Attachment: HIVE-10123.01.patch

> Hybrid grace Hash join : Use estimate key count from stats to initialize 
> BytesBytesMultiHashMap
> ---
>
> Key: HIVE-10123
> URL: https://issues.apache.org/jira/browse/HIVE-10123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Mostafa Mokhtar
> Fix For: 1.2.0
>
> Attachments: HIVE-10123.01.patch
>
>
> Hybrid grace Hash join is not using estimated number of rows from the 
> statistics to initialize BytesBytesMultiHashMap. 
> Add some logging to BytesBytesMultiHashMap to track get probes, and use msec 
> for expandAndRehash timing, since the microsecond value overflows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10122:
-
Affects Version/s: 1.1.0
   0.14.0
   1.0.0

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10050) Support overriding memory configuration for AM launched for TempletonControllerJob

2015-03-27 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated HIVE-10050:
---
Attachment: HIVE-10050.3.patch

Commented out properties in webhcat-default for now. 
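
For readers following along, a hedged sketch of what the override amounts to: the standard 
MapReduce AM settings are applied to the configuration used to launch the controller job. 
The property names are the stock MR AM keys; the class name and the values are illustrative 
only, and the WebHCat-facing property names are defined by the patch, not by this sketch.

{code}
import org.apache.hadoop.conf.Configuration;

public final class ControllerJobAmSizing {

  /** Give the controller job's AM a small footprint, since it does no heavy lifting. */
  public static void applySmallAmFootprint(Configuration conf) {
    conf.setInt("yarn.app.mapreduce.am.resource.mb", 512);        // AM container size
    conf.set("yarn.app.mapreduce.am.command-opts", "-Xmx400m");   // AM JVM heap
  }
}
{code}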

> Support overriding memory configuration for AM launched for 
> TempletonControllerJob
> --
>
> Key: HIVE-10050
> URL: https://issues.apache.org/jira/browse/HIVE-10050
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: HIVE-10050.1.patch, HIVE-10050.2.patch, 
> HIVE-10050.3.patch
>
>
> The MR AM launched for the TempletonControllerJob does not do any heavy 
> lifting and therefore can be configured to use a small memory footprint ( as 
> compared to potentially using the default footprint for most MR jobs on a 
> cluster ). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10122:

Component/s: Metastore

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10122) Hive metastore filter-by-expression is broken for non-partition expressions

2015-03-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384808#comment-14384808
 ] 

Sergey Shelukhin commented on HIVE-10122:
-

[~gopalv] [~mmokhtar] [~hagleitn] this could be a considerable perf regression 
starting from ~Hive 0.14 for metastore partition pruning.
It would seemingly (I am going based on what Thejas found for one query, 
haven't looked into it) affect every query combining partition + non-partition 
filter, causing pruning to go through the partition-name-based path on the client.
We need to fix this... I can look into it next week.
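
To make the intended behaviour concrete, a hedged sketch with hypothetical names (this is 
not the real PartitionPruner or ExprNodeDesc code): a combined filter should be split so 
that only the partition-column conjuncts go to the metastore, with everything else kept as 
a residual filter, rather than the whole expression falling back to the client-side 
partition-name path.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified expression model; Hive's real expression tree is richer.
abstract class Expr {}

final class Comparison extends Expr {
  final String column; final String op; final Object literal;
  Comparison(String column, String op, Object literal) {
    this.column = column; this.op = op; this.literal = literal;
  }
}

final class SplitResult {
  final List<Expr> pushToMetastore = new ArrayList<>();  // partition-column conjuncts only
  final List<Expr> residual = new ArrayList<>();         // evaluated later on the engine side
}

final class PredicateSplitter {
  /** Only conjuncts that reference a partition column are safe to send to the metastore;
   *  the rest must stay behind as a residual filter instead of disabling pruning. */
  static SplitResult split(List<Expr> conjuncts, Set<String> partitionColumns) {
    SplitResult result = new SplitResult();
    for (Expr conjunct : conjuncts) {
      boolean pushable = conjunct instanceof Comparison
          && partitionColumns.contains(((Comparison) conjunct).column);
      (pushable ? result.pushToMetastore : result.residual).add(conjunct);
    }
    return result;
  }
}
{code}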

> Hive metastore filter-by-expression is broken for non-partition expressions
> ---
>
> Key: HIVE-10122
> URL: https://issues.apache.org/jira/browse/HIVE-10122
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> See 
> https://issues.apache.org/jira/browse/HIVE-10091?focusedCommentId=14382413&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14382413
> These two lines of code
> {noformat}
> // Replace virtual columns with nulls. See javadoc for details.
> prunerExpr = removeNonPartCols(prunerExpr, extractPartColNames(tab), 
> partColsUsedInFilter);
> // Remove all parts that are not partition columns. See javadoc for 
> details.
> ExprNodeDesc compactExpr = compactExpr(prunerExpr.clone());
> {noformat}
> are supposed to take care of this; I see there were a bunch of changes to this 
> code over some time, and now it appears to be broken.
> Thanks to [~thejas] for info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10000) 10000 whoooohooo

2015-03-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384804#comment-14384804
 ] 

Lefty Leverenz commented on HIVE-1:
---

+1

Let's resolve this so [~damien.carol] will get credit for his bodacious artwork.

> 1 whhooo
> 
>
> Key: HIVE-1
> URL: https://issues.apache.org/jira/browse/HIVE-1
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Damien Carol
>
> {noformat}
> g▄µ,  
>   
>   ▄█▀²▀▀▀██▄▄,
>   
> ▄▓▓░/;,,...`▀▀▀█▄, ,gg,   
>   
>  ╓,,,g▄▓▓░░░║║╓w;,.`²▀██▄,  ,▄▀▀░░░▀█,
>   
>▄▓▓▀▀▓▓▓█▄▄║║Ü╓w,,...`▀▀█▓▒░░░╟▄░░▀█   
>   
>   █▓░░░▀▀▒░░░║╗ÿw,..░▀▓█▄░▀█░µ░█╥ 
>   
>  █▓░░░║╗y..²▓▓▌░█░║/▀▄
>   
>  ▓▌░░░║║╓░░╩B▒▀≡██▄▄░║;..▀▓║▐▒░░¡▓▌   
>   
>  ▀▓░░║»╓▄▒▓▓▓▒░░¡.▀▌ÿ▓`   
>   
>   ▀█░╓╓░;,|`▀▓░░║y^██M
>   
> ▀█║║y░▀░░░║░█▒░█▌ 
>   
>   ▀█▄░░░║░░░▒░░▓▌ 
>   
> ▓█▄▄░░░▒▓█
>   
>  ▀█░░░▄▄██▓▓░░░▄▄▒░░▒▓█   
>   
>   ▀▌░░░g▄▄██▀▀░░░░░▒░░▓█  
>   
>`²▀²² g▓▒║░░░▒▓▓▓▄██▄░░░▓█ 
>   
> ▄▓▓█░▒▓▓▒░░▒▒▒▓▓▒░░░╣▀██▓g▄   
>   
>╓▌░▒▓▓▓█▄░░░▒▒█▓▓▒▒▒▀▀▒▓`  
>   
>▒▀█░▒▌░░▒▀░▓░░▒▓   
>   
>▒.⌠▓▄▒▓▓▓░▀░░▒▓╛   
>   
>▒..░░▓█▄░░░▓▓█▒▒▒░░░▒▓▌
>   
>▐▓▄▀▓█▄░▀▓▓█░░░▓▓▄╣▒▓▓ 
>   
> ▓▓▓█▄░▀▓█▓▓▀▀▀▓█▒▓█▄░░░▒▒█▓▀  
>   
> ▀▓██▄░░░▓▓▓█,  ▀█▓▓▓█▀`▀▀██▓▀`
>   
>  ▒░▀███▄▓▓ ²▀▀φy  ▄▄▄╖µ▄▄▄¡▄▄▄╖ 
> ,▄▄▄╖▄▄▄.   
>   ╙µ¡░░░▀▀▓▄▄g╓.  ▓▓▓▌█▓▓▓░▓▓▓▌ 
> ▐▓▓▓░▓▓▓N   
> ▀█▄▄░▀▓▓µ ▓▓▓▌█▓▓▓░▓▓▓▌╟▓▓▓▌█▓▓▓ 
> ²`²
>   ▀███▓▓▓▌╛   ░▓▓▓▌ ▓▓▓Ñ 
> ▓▌ 
>  ▀▀▀▀▓▓▓██▓▓▓█]▓▓▓▌ ▐▓▓  
> ▓Ñ 
>  `╙╨╫░░░▄█▓▀▀²▓▓▓▌█▓▓▓░▓▓▓▌ ╘▓▌  
> ▄▄▄µ   
>   ▓▓▓▌█▓▓▓░▓▓▓▌  █`  
> ▓▓▓▌   
>    ```   
> ```
>  ██╗ ██╗  ██╗  ██╗  ██╗ 
> ███║██╔═╗██╔═╗██╔═╗██╔═╗
> ╚██║██║██╔██║██║██╔██║██║██╔██║██║██╔██║
>  ██║╔╝██║╔╝██║╔╝██║╔╝██║
>  ██║╚██╔╝╚██╔╝╚██╔╝╚██╔╝
>  ╚═╝ ╚═╝  ╚═╝  ╚═╝  ╚═╝ 
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-10050) Support overriding memory configuration for AM launched for TempletonControllerJob

2015-03-27 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated HIVE-10050:
---
Attachment: HIVE-10050.2.patch

> Support overriding memory configuration for AM launched for 
> TempletonControllerJob
> --
>
> Key: HIVE-10050
> URL: https://issues.apache.org/jira/browse/HIVE-10050
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Attachments: HIVE-10050.1.patch, HIVE-10050.2.patch
>
>
> The MR AM launched for the TempletonControllerJob does not do any heavy 
> lifting and therefore can be configured to use a small memory footprint ( as 
> compared to potentially using the default footprint for most MR jobs on a 
> cluster ). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10001) SMB join in reduce side

2015-03-27 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384780#comment-14384780
 ] 

Vikram Dixit K commented on HIVE-10001:
---

Address review comments.

> SMB join in reduce side
> ---
>
> Key: HIVE-10001
> URL: https://issues.apache.org/jira/browse/HIVE-10001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10001.1.patch, HIVE-10001.2.patch, 
> HIVE-10001.3.patch, HIVE-10001.4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10001) SMB join in reduce side

2015-03-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10001:
--
Attachment: HIVE-10001.4.patch

> SMB join in reduce side
> ---
>
> Key: HIVE-10001
> URL: https://issues.apache.org/jira/browse/HIVE-10001
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-10001.1.patch, HIVE-10001.2.patch, 
> HIVE-10001.3.patch, HIVE-10001.4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10022) DFS in authorization might take too long

2015-03-27 Thread Pankit Thapar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384753#comment-14384753
 ] 

Pankit Thapar commented on HIVE-10022:
--

Hi [~thejas], can you please comment on the failures? These tests pass on my 
local machine. Only testNegativeCliDriver_authorization_uri_import fails, but 
that fails even without the patch on my local machine.


> DFS in authorization might take too long
> 
>
> Key: HIVE-10022
> URL: https://issues.apache.org/jira/browse/HIVE-10022
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 0.14.0
>Reporter: Pankit Thapar
>Assignee: Pankit Thapar
> Fix For: 1.0.1
>
> Attachments: HIVE-10022.2.patch, HIVE-10022.patch
>
>
> I am testing a query like : 
> set hive.test.authz.sstd.hs2.mode=true;
> set 
> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
> set 
> hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.SessionStateConfigUserAuthenticator;
> set hive.security.authorization.enabled=true;
> set user.name=user1;
> create table auth_noupd(i int) clustered by (i) into 2 buckets stored as orc 
> location '${OUTPUT}' TBLPROPERTIES ('transactional'='true');
> Now, in the above query,  since authorization is true, 
> we would end up calling doAuthorizationV2() which ultimately ends up calling 
> SQLAuthorizationUtils.getPrivilegesFromFS() which calls a recursive method : 
> FileUtils.isActionPermittedForFileHierarchy() with the object or the ancestor 
> of the object we are trying to authorize if the object does not exist. 
> The logic in FileUtils.isActionPermittedForFileHierarchy() is DFS.
> Now assume, we have a path as a/b/c/d that we are trying to authorize.
> In case, a/b/c/d does not exist, we would call 
> FileUtils.isActionPermittedForFileHierarchy() with say a/b/ assuming a/b/c 
> also does not exist.
> If under the subtree at a/b, we have millions of files, then 
> FileUtils.isActionPermittedForFileHierarchy()  is going to check file 
> permission on each of those objects. 
> I do not completely understand why we have to check file permissions 
> on all the objects in a branch of the tree that we are not trying to read 
> from or write to.
> We could have checked the file permission on the ancestor that exists and, if it 
> matches what we expect, then return true.
> Please confirm whether this is a bug so that I can submit a patch; otherwise let me 
> know what I am missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10089) RCFile: lateral view explode caused ConcurrentModificationException

2015-03-27 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-10089:

Attachment: HIVE-10089.1.patch

> RCFile: lateral view explode caused ConcurrentModificationException
> ---
>
> Key: HIVE-10089
> URL: https://issues.apache.org/jira/browse/HIVE-10089
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-10089.1.patch
>
>
> CREATE TABLE test_table123 (a INT, b MAP<STRING, STRING>) STORED AS RCFILE;
> INSERT OVERWRITE TABLE test_table123 SELECT 1, MAP("a1", "b1", "c1", "d1") 
> FROM src LIMIT 1;
> The following query will lead to ConcurrentModificationException
> SELECT * FROM (SELECT b FROM test_table123) t1 LATERAL VIEW explode(b) x AS 
> b,c LIMIT 1;
> Failed with exception 
> java.io.IOException:java.util.ConcurrentModificationException
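
For background on the exception class itself (the RCFile-specific root cause may differ), 
java.util.ConcurrentModificationException is what fail-fast collections throw when they are 
structurally modified while being iterated. A self-contained reproduction:

{code}
import java.util.HashMap;
import java.util.Map;

public final class CmeDemo {
  public static void main(String[] args) {
    Map<String, String> m = new HashMap<>();
    m.put("a1", "b1");
    m.put("c1", "d1");
    try {
      for (Map.Entry<String, String> e : m.entrySet()) {
        System.out.println(e.getKey());
        m.put("e1", "f1");   // structural modification while the iterator is live
      }
    } catch (java.util.ConcurrentModificationException ex) {
      System.out.println("caught: " + ex);   // same exception class as in the failing query
    }
  }
}
{code}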



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10088) RCFIle: Lateral view with explode throws ConcurrentModificationException

2015-03-27 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang resolved HIVE-10088.
-
Resolution: Duplicate

> RCFIle: Lateral view with explode throws ConcurrentModificationException
> 
>
> Key: HIVE-10088
> URL: https://issues.apache.org/jira/browse/HIVE-10088
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>
> CREATE TABLE test_table123 (a INT, b MAP<STRING, STRING>) STORED AS RCFILE;
> INSERT OVERWRITE TABLE test_table123 SELECT 1, MAP("a1", "b1", "c1", "d1") 
> FROM src LIMIT 1;
> hive> select * from test_table123;
> 1
> {"a1":"b1","c1":"d1"}
> The following query will lead to ConcurrentModificationException
> SELECT * FROM (SELECT b FROM test_table123) t1 LATERAL VIEW explode(b) x AS 
> b,c LIMIT 1;
> Failed with exception 
> java.io.IOException:java.util.ConcurrentModificationException



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9780) Add another level of explain for RDBMS audience

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384732#comment-14384732
 ] 

Hive QA commented on HIVE-9780:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707691/HIVE-9780.05.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8680 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3185/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3185/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3185/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707691 - PreCommit-HIVE-TRUNK-Build

> Add another level of explain for RDBMS audience
> ---
>
> Key: HIVE-9780
> URL: https://issues.apache.org/jira/browse/HIVE-9780
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-9780.01.patch, HIVE-9780.02.patch, 
> HIVE-9780.03.patch, HIVE-9780.04.patch, HIVE-9780.05.patch
>
>
> The current Hive explain (default) is targeted at an MR audience. We need a new 
> level of explain plan targeted at an RDBMS audience. The new explain requires 
> these:
> 1) The focus needs to be on what part of the query is being executed rather 
> than internals of the engines
> 2) There needs to be a clearly readable tree of operations
> 3) Examples - Table scan should mention the table being scanned, the Sarg, 
> the size of table and expected cardinality after the Sarg'ed read. The join 
> should mention the table being joined with and the join condition. The 
> aggregate should mention the columns in the group-by. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9709) Hive should support replaying cookie from JDBC driver for beeline

2015-03-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-9709:

Attachment: HIVE-9709.2.patch

> Hive should support replaying cookie from JDBC driver for beeline
> -
>
> Key: HIVE-9709
> URL: https://issues.apache.org/jira/browse/HIVE-9709
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-9709.1.patch, HIVE-9709.2.patch
>
>
> Consider the following scenario:
> Beeline > Knox > HS2.
> Where Knox is going to LDAP for authentication. To avoid re-authentication, 
> Knox supports using a cookie to identify a request. However, the Beeline JDBC 
> client does not send back the cookie Knox sent, and this leads to Knox having 
> to re-create the LDAP authentication request on every connection.
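
A minimal sketch of the general mechanism, assuming the Apache HttpClient API commonly used 
underneath an HTTP transport (this only illustrates the idea; the actual driver change may 
be wired differently): attach a cookie store to the client so a cookie set by the gateway 
on the first response is replayed on every subsequent request.

{code}
import org.apache.http.client.CookieStore;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public final class CookieReplaySketch {

  /**
   * Build an HTTP client whose cookie store survives across requests: the Set-Cookie
   * returned by the gateway (e.g. Knox) on the first response is replayed automatically
   * on later requests, so the gateway does not have to redo LDAP authentication each time.
   */
  public static CloseableHttpClient newCookieAwareClient() {
    CookieStore cookies = new BasicCookieStore();   // shared across requests on this client
    return HttpClients.custom()
        .setDefaultCookieStore(cookies)
        .build();
  }
}
{code}

Every request executed on the returned client automatically replays whatever cookies 
earlier responses on that client set.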



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10082) LLAP: UnwrappedRowContainer throws exceptions

2015-03-27 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10082:
--
Attachment: HIVE-10082.2.patch

> LLAP: UnwrappedRowContainer throws exceptions
> -
>
> Key: HIVE-10082
> URL: https://issues.apache.org/jira/browse/HIVE-10082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: llap
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-10082.1.patch, HIVE-10082.2.patch
>
>
> TPC-DS Query27 run with map-joins enabled results in errors originating from 
> these lines in UnwrappedRowContainer::unwrap() 
> {code}
>for (int index : valueIndex) {
>   if (index >= 0) {
> unwrapped.add(currentKey == null ? null : currentKey[index]);
>   } else {
> unwrapped.add(values.get(-index - 1));
>   }
> }
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> at java.util.ArrayList.get(ArrayList.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:341)
> {code}
> This is intermittent and does not cause query failures as the retries 
> succeed, but slows down the query by an entire wave due to the retry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10082) LLAP: UnwrappedRowContainer throws exceptions

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384692#comment-14384692
 ] 

Gunther Hagleitner commented on HIVE-10082:
---

.2 is a temporary patch. [~sershe] will fix this properly, without the need for 
synchronization, in another patch.

> LLAP: UnwrappedRowContainer throws exceptions
> -
>
> Key: HIVE-10082
> URL: https://issues.apache.org/jira/browse/HIVE-10082
> Project: Hive
>  Issue Type: Bug
>Affects Versions: llap
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Fix For: llap
>
> Attachments: HIVE-10082.1.patch, HIVE-10082.2.patch
>
>
> TPC-DS Query27 run with map-joins enabled results in errors originating from 
> these lines in UnwrappedRowContainer::unwrap() 
> {code}
>for (int index : valueIndex) {
>   if (index >= 0) {
> unwrapped.add(currentKey == null ? null : currentKey[index]);
>   } else {
> unwrapped.add(values.get(-index - 1));
>   }
> }
> {code}
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:653)
> at java.util.ArrayList.get(ArrayList.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.unwrap(UnwrapRowContainer.java:79)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:62)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.first(UnwrapRowContainer.java:33)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:341)
> {code}
> This is intermittent and does not cause query failures as the retries 
> succeed, but slows down the query by an entire wave due to the retry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10114) Split strategies for ORC

2015-03-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10114:
-
Attachment: HIVE-10114.2.patch

Fix test failures.

> Split strategies for ORC
> 
>
> Key: HIVE-10114
> URL: https://issues.apache.org/jira/browse/HIVE-10114
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-10114.1.patch, HIVE-10114.2.patch
>
>
> ORC split generation does not have clearly defined strategies for different 
> scenarios (many small ORC files, few small ORC files, many large files, etc.). 
> A few strategies, like storing the file footer in the ORC split or making the 
> entire file a single ORC split, already exist. This JIRA is to make split generation 
> simpler, support different strategies for various use cases (BI, ETL, ACID 
> etc.) and to lay the foundation for HIVE-7428.
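
For illustration only, a sketch of the kind of decision this describes, with hypothetical 
strategy names and thresholds (the real strategies and their selection logic are defined by 
the patch): pick how to split based on the shape of the input files.

{code}
import java.util.List;

enum OrcSplitStrategyKind { SMALL_FILES_COMBINE, PER_STRIPE_ETL, WHOLE_FILE }

final class SplitStrategyChooser {
  // Hypothetical threshold; real values would come from configuration.
  private static final long SMALL_FILE_BYTES = 16L * 1024 * 1024;

  static OrcSplitStrategyKind choose(List<Long> fileSizes) {
    if (fileSizes.isEmpty()) {
      return OrcSplitStrategyKind.WHOLE_FILE;
    }
    long small = fileSizes.stream().filter(s -> s < SMALL_FILE_BYTES).count();
    if (small == fileSizes.size()) {
      return OrcSplitStrategyKind.SMALL_FILES_COMBINE;  // many small files: combine (BI-style)
    }
    if (small == 0) {
      return OrcSplitStrategyKind.PER_STRIPE_ETL;       // only large files: split per stripe (ETL-style)
    }
    return OrcSplitStrategyKind.WHOLE_FILE;             // mixed input: simple default
  }
}
{code}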



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384674#comment-14384674
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-9272:


[~ekoifman] Please find attached the patch with the renaming of input files 
automated (in deploy_e2e_artifacts.sh). I was a little hesitant to add this 
initially, as this looked like the only local file operation, but it will certainly 
come in handy for anyone running the test suite.

> Tests for utf-8 support
> ---
>
> Key: HIVE-9272
> URL: https://issues.apache.org/jira/browse/HIVE-9272
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
>Affects Versions: 0.14.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Aswathy Chellammal Sreekumar
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
> HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.6.patch, HIVE-9272.patch
>
>
> Including some test cases for utf8 support in webhcat. The first four tests 
> invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
> data processed, file names and job name. The last test case tests the 
> filtering of the job name with a utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10074) Ability to run HCat Client Unit tests in a system test setting

2015-03-27 Thread Deepesh Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384666#comment-14384666
 ] 

Deepesh Khandelwal commented on HIVE-10074:
---

[~sushanth] the error still seems unrelated. Can you suggest what the next 
steps here should be?

> Ability to run HCat Client Unit tests in a system test setting
> --
>
> Key: HIVE-10074
> URL: https://issues.apache.org/jira/browse/HIVE-10074
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Deepesh Khandelwal
>Assignee: Deepesh Khandelwal
> Attachments: HIVE-10074.1.patch, HIVE-10074.patch
>
>
> The following testsuite, 
> {{hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java}},
>  is a JUnit testsuite for testing some basic HCat client API. During setup it 
> brings up a Hive Metastore with embedded Derby. The testsuite, however, will be 
> even more useful if it can be run against a running Hive Metastore 
> (transparent to whatever backing DB it is running against).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-9272:
---
Attachment: HIVE-9272.6.patch

> Tests for utf-8 support
> ---
>
> Key: HIVE-9272
> URL: https://issues.apache.org/jira/browse/HIVE-9272
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
>Affects Versions: 0.14.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Aswathy Chellammal Sreekumar
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
> HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.6.patch, HIVE-9272.patch
>
>
> Including some test cases for utf8 support in webhcat. The first four tests 
> invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
> data processed, file names and job name. The last test case tests the 
> filtering of job name with utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384635#comment-14384635
 ] 

Matt McCline commented on HIVE-9937:


Thank you for the review comments.

I did have trouble with one of the old VectorSerDe classes buffering up 1024 
Object[] rows, which caused Writable overwrite problems.  But I stopped using 
that SerDe.  The singleRow trick has been used by VectorReduceSinkOperator and 
VectorFileSinkOperator for a while with no problems.
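
To illustrate the overwrite hazard mentioned above, here is a minimal, 
self-contained sketch (hypothetical types, not the actual Hive operators or 
SerDes): when a deserializer reuses one mutable Writable-backed row, buffering 
many references to it only preserves the last row, which is why forwarding a 
single row at a time avoids the problem.

{code}
import java.util.ArrayList;
import java.util.List;

public class WritableReuseSketch {
  // Stand-in for a mutable Writable that a (de)serializer reuses per row.
  static final class MutableRow {
    int value;
  }

  public static void main(String[] args) {
    MutableRow reused = new MutableRow();
    List<MutableRow> buffered = new ArrayList<>();
    for (int i = 0; i < 3; i++) {
      reused.value = i;      // deserializer overwrites the same object
      buffered.add(reused);  // buffering keeps 3 references to one object
    }
    // Prints "2 2 2": every buffered entry reflects the last row only.
    buffered.forEach(r -> System.out.print(r.value + " "));
  }
}
{code}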

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, 
> HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, 
> HIVE-9937.06.patch, HIVE-9937.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10112) LLAP: query 17 tasks fail due to mapjoin issue

2015-03-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10112:
---

Assignee: Sergey Shelukhin

> LLAP: query 17 tasks fail due to mapjoin issue
> --
>
> Key: HIVE-10112
> URL: https://issues.apache.org/jira/browse/HIVE-10112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> {noformat}
> 2015-03-26 18:16:38,833 
> [TezTaskRunner_attempt_1424502260528_1696_1_07_00_0(container_1_1696_01_000220_sershe_20150326181607_188ab263-0a13-4528-b778-c803f378640d:1_Map
>  1_0_0)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
> java.lang.RuntimeException: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:308)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AssertionError: Length is negative: -54
> at 
> org.apache.hadoop.hive.serde2.WriteBuffers$ByteSegmentRef.<init>(WriteBuffers.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.getValueRefs(BytesBytesMultiHashMap.java:270)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.setFromOutput(MapJoinBytesTableContainer.java:429)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$GetAdaptor.setFromVector(MapJoinBytesTableContainer.java:349)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.setMapJoinKey(VectorMapJoinOperator.java:222)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:310)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.process(VectorMapJoinOperator.java:252)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:163)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:83)
> {noformat}
> Tasks do appear to pass on retries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10120) Disallow create table with dot/colon in column name

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10120:
---
Attachment: HIVE-10120.01.patch

> Disallow create table with dot/colon in column name
> ---
>
> Key: HIVE-10120
> URL: https://issues.apache.org/jira/browse/HIVE-10120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10120.01.patch
>
>
> Since we don't allow users to query column names with dot in the middle such 
> as emp.no, don't allow users to create tables with such columns that cannot 
> be queried. Fix the documentation to reflect this fix.
> Here is an example. Consider this table:
> {code}
> CREATE TABLE a (`emp.no` string);
> select `emp.no` from a; fails with this message:
> FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
> from [0:emp.no]
> {code}
> The hive documentation needs to be fixed:
> {code}
>  (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
> to  indicate that any Unicode character can go between the backticks in the 
> select statement, but it doesn’t like the dot/colon or even select * when 
> there is a column that has a dot/colon. 
> {code}
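
A minimal sketch of the kind of up-front check the summary describes, assuming a 
simple name validation at CREATE TABLE time (illustrative only; the attached 
HIVE-10120.01.patch may implement the restriction differently):

{code}
public class ColumnNameCheck {
  // Reject names that could never be referenced later, mirroring the idea
  // that `emp.no` cannot be selected once the table exists.
  static void validateColumnName(String name) {
    if (name.contains(".") || name.contains(":")) {
      throw new IllegalArgumentException(
          "Invalid column name '" + name + "': '.' and ':' are not allowed");
    }
  }

  public static void main(String[] args) {
    validateColumnName("emp_no");   // accepted
    try {
      validateColumnName("emp.no"); // rejected up front
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}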



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10120) Disallow create table with dot/colon in column name

2015-03-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10120:
---
Description: 
Since we don't allow users to query column names with dot in the middle such as 
emp.no, don't allow users to create tables with such columns that cannot be 
queried. Fix the documentation to reflect this fix.

Here is an example. Consider this table:
{code}
CREATE TABLE a (`emp.no` string);
select `emp.no` from a; fails with this message:
FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from 
[0:emp.no]
{code}

The hive documentation needs to be fixed:
{code}
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems to 
 indicate that any Unicode character can go between the backticks in the select 
statement, but it doesn’t like the dot/colon or even select * when there is a 
column that has a dot/colon. 
{code}

  was:
Since we don't allow users to query column names with dot in the middle such as 
emp.no, don't allow users to create tables with such columns that cannot be 
queried. Fix the documentation to reflect this fix.

Here is an example. Consider this table:
{code}
CREATE TABLE a (`emp.no` string);
{code}
select `emp.no` from a; fails with this message:
FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from 
[0:emp.no]
{code}

The hive documentation needs to be fixed:
{code}
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems to 
 indicate that any Unicode character can go between the backticks in the select 
statement, but it doesn’t like the dot/colon or even select * when there is a 
column that has a dot/colon. 
{code}


> Disallow create table with dot/colon in column name
> ---
>
> Key: HIVE-10120
> URL: https://issues.apache.org/jira/browse/HIVE-10120
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Since we don't allow users to query column names with dot in the middle such 
> as emp.no, don't allow users to create tables with such columns that cannot 
> be queried. Fix the documentation to reflect this fix.
> Here is an example. Consider this table:
> {code}
> CREATE TABLE a (`emp.no` string);
> select `emp.no` from a; fails with this message:
> FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
> from [0:emp.no]
> {code}
> The hive documentation needs to be fixed:
> {code}
>  (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
> to  indicate that any Unicode character can go between the backticks in the 
> select statement, but it doesn’t like the dot/colon or even select * when 
> there is a column that has a dot/colon. 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-10116.
-
Resolution: Fixed

Committed to branch. Thanks, Jesus!

> CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
> actually a Semijoin [CBO branch]
> --
>
> Key: HIVE-10116
> URL: https://issues.apache.org/jira/browse/HIVE-10116
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: cbo-branch
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: cbo-branch
>
> Attachments: HIVE-10116.cbo.patch
>
>
> {{cbo_semijoin.q}} reproduces the error.
> Stacktrace:
> {noformat}
> 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
> (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
> java.lang.ArrayIndexOutOfBoundsException: 3
> at 
> org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
> at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
> at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at 
> at java.lang.reflect.Method.invoke(Method.java:606)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10116) CBO (Calcite Return Path): RelMdSize throws an Exception when Join is actually a Semijoin [CBO branch]

2015-03-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384552#comment-14384552
 ] 

Ashutosh Chauhan commented on HIVE-10116:
-

+1

> CBO (Calcite Return Path): RelMdSize throws an Exception when Join is 
> actually a Semijoin [CBO branch]
> --
>
> Key: HIVE-10116
> URL: https://issues.apache.org/jira/browse/HIVE-10116
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: cbo-branch
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: cbo-branch
>
> Attachments: HIVE-10116.cbo.patch
>
>
> {{cbo_semijoin.q}} reproduces the error.
> Stacktrace:
> {noformat}
> 2015-03-26 09:55:20,652 ERROR [main]: parse.CalcitePlanner 
> (CalcitePlanner.java:genOPTree(269)) - CBO failed, skipping CBO.
> java.lang.ArrayIndexOutOfBoundsException: 3
> at 
> org.apache.calcite.rel.metadata.RelMdSize.averageColumnSizes(RelMdSize.java:193)
> at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$2$1.invoke(ReflectiveRelMetadataProvider.java:194)
> at com.sun.proxy.$Proxy30.averageColumnSizes(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at 
> at java.lang.reflect.Method.invoke(Method.java:606)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10099) Enable constant folding for Decimal

2015-03-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384533#comment-14384533
 ] 

Prasanth Jayachandran commented on HIVE-10099:
--

LGTM, +1

> Enable constant folding for Decimal
> ---
>
> Key: HIVE-10099
> URL: https://issues.apache.org/jira/browse/HIVE-10099
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10099.2.patch, HIVE-10099.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9272) Tests for utf-8 support

2015-03-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384513#comment-14384513
 ] 

Eugene Koifman commented on HIVE-9272:
--

[~asreekumar], renaming files after checkout is a good idea, but this step 
needs to be automated just like renaming of the jar is automated.  Generally, 
we want to try to minimize manual steps as much as possible.

> Tests for utf-8 support
> ---
>
> Key: HIVE-9272
> URL: https://issues.apache.org/jira/browse/HIVE-9272
> Project: Hive
>  Issue Type: Test
>  Components: Tests, WebHCat
>Affects Versions: 0.14.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Aswathy Chellammal Sreekumar
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-9272.1.patch, HIVE-9272.2.patch, HIVE-9272.3.patch, 
> HIVE-9272.4.patch, HIVE-9272.5.patch, HIVE-9272.patch
>
>
> Including some test cases for utf8 support in webhcat. The first four tests 
> invoke hive, pig, mapred and streaming apis for testing the utf8 support for 
> data processed, file names and job name. The last test case tests the 
> filtering of job name with utf8 character



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384499#comment-14384499
 ] 

Hive QA commented on HIVE-10115:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707682/HIVE-10115.0.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3184/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3184/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3184/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707682 - PreCommit-HIVE-TRUNK-Build

> HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and 
> Delegation token(DIGEST) when alternate authentication is enabled
> ---
>
> Key: HIVE-10115
> URL: https://issues.apache.org/jira/browse/HIVE-10115
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 1.0.0
>Reporter: Mubashir Kazia
>  Labels: patch
> Fix For: 1.1.0
>
> Attachments: HIVE-10115.0.patch
>
>
> In a Kerberized cluster, when alternate authentication is enabled on HS2, it 
> should also accept Kerberos authentication. This matters because enabling LDAP 
> authentication makes HS2 stop accepting delegation token authentication, so we 
> are forced to put usernames and passwords in the Oozie configuration.
> The whole idea of SASL is that multiple authentication mechanisms can be 
> offered. Disabling Kerberos (GSSAPI) and delegation token (DIGEST) 
> authentication when LDAP authentication is enabled defeats the purpose of SASL.
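
A purely conceptual sketch of the SASL point above, using an invented negotiation 
helper (this is not HS2 or Thrift code): the server keeps several mechanisms 
enabled and accepts whichever one the client proposes from that set.

{code}
import java.util.LinkedHashSet;
import java.util.Set;

public class MechanismOffer {
  private final Set<String> enabled = new LinkedHashSet<>();

  public MechanismOffer enable(String mech) { enabled.add(mech); return this; }

  // Accept the first client-proposed mechanism that the server also enables.
  public String negotiate(String... clientProposals) {
    for (String m : clientProposals) {
      if (enabled.contains(m)) return m;
    }
    throw new IllegalStateException("no common SASL mechanism");
  }

  public static void main(String[] args) {
    MechanismOffer offer = new MechanismOffer()
        .enable("GSSAPI")      // Kerberos
        .enable("DIGEST-MD5")  // delegation tokens
        .enable("PLAIN");      // LDAP username/password
    System.out.println(offer.negotiate("DIGEST-MD5", "PLAIN"));  // DIGEST-MD5
  }
}
{code}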



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Fix Version/s: (was: 1.2.0)
   cbo-branch

> CBO (Calcite Return Path): Internal error: Cannot find common type for join 
> keys 
> -
>
> Key: HIVE-10118
> URL: https://issues.apache.org/jira/browse/HIVE-10118
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Mostafa Mokhtar
>Assignee: Jesus Camacho Rodriguez
> Fix For: cbo-branch
>
>
> Query 
> {code}
> explain
>   select  ss_items.item_id
>,ss_item_rev
>,ss_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ss_dev
>,cs_item_rev
>,cs_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 cs_dev
>,ws_item_rev
> "query58.sql.explain.out" 698L, 43463C
>   
>   1,1   Top
>,cs_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 cs_dev
>,ws_item_rev
>,ws_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ws_dev
>,(ss_item_rev+cs_item_rev+ws_item_rev)/3 average
> FROM
> ( select i_item_id item_id ,sum(ss_ext_sales_price) as ss_item_rev
>  from store_sales
>  JOIN item ON store_sales.ss_item_sk = item.i_item_sk
>  JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
>  JOIN (select d1.d_date
>  from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
> d2.d_week_seq
>  where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
> sub.d_date
>  group by i_item_id ) ss_items
> JOIN
> ( select i_item_id item_id ,sum(cs_ext_sales_price) as cs_item_rev
>  from catalog_sales
>  JOIN item ON catalog_sales.cs_item_sk = item.i_item_sk
>  JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
>  JOIN (select d1.d_date
>  from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
> d2.d_week_seq
>  where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
> sub.d_date
>  group by i_item_id ) cs_items
> ON ss_items.item_id=cs_items.item_id
> JOIN
> ( select i_item_id item_id ,sum(ws_ext_sales_price) as ws_item_rev
>  from web_sales
>  JOIN item ON web_sales.ws_item_sk = item.i_item_sk
>  JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
>  JOIN (select d1.d_date
>  from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
> d2.d_week_seq
>  where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
> sub.d_date
>  group by i_item_id ) ws_items
> ON ss_items.item_id=ws_items.item_id
>  where
>ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
>and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
>and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
>and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
>and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
>and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
>  order by item_id ,ss_item_rev
>  limit 100
> {code}
> Exception 
> {code}
>  limit 100
> 15/03/27 12:38:32 [main]: ERROR parse.CalcitePlanner: CBO failed, skipping 
> CBO.
> java.lang.RuntimeException: java.lang.AssertionError: Internal error: Cannot 
> find common type for join keys $1 (type INTEGER) and $1 (type 
> VARCHAR(2147483647))
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:677)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:586)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:238)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9998)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:201)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
>   at org.apache.hado

[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Description: 
Query 
{code}
explain
  select  ss_items.item_id
   ,ss_item_rev
   ,ss_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ss_dev
   ,cs_item_rev
   ,cs_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 cs_dev
   ,ws_item_rev
"query58.sql.explain.out" 698L, 43463C  

  1,1   Top
   ,cs_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 cs_dev
   ,ws_item_rev
   ,ws_item_rev/(ss_item_rev+cs_item_rev+ws_item_rev)/3 * 100 ws_dev
   ,(ss_item_rev+cs_item_rev+ws_item_rev)/3 average
FROM
( select i_item_id item_id ,sum(ss_ext_sales_price) as ss_item_rev
 from store_sales
 JOIN item ON store_sales.ss_item_sk = item.i_item_sk
 JOIN date_dim ON store_sales.ss_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) ss_items
JOIN
( select i_item_id item_id ,sum(cs_ext_sales_price) as cs_item_rev
 from catalog_sales
 JOIN item ON catalog_sales.cs_item_sk = item.i_item_sk
 JOIN date_dim ON catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) cs_items
ON ss_items.item_id=cs_items.item_id
JOIN
( select i_item_id item_id ,sum(ws_ext_sales_price) as ws_item_rev
 from web_sales
 JOIN item ON web_sales.ws_item_sk = item.i_item_sk
 JOIN date_dim ON web_sales.ws_sold_date_sk = date_dim.d_date_sk
 JOIN (select d1.d_date
 from date_dim d1 JOIN date_dim d2 ON d1.d_week_seq = 
d2.d_week_seq
 where d2.d_date = '1998-08-04') sub ON date_dim.d_date = 
sub.d_date
 group by i_item_id ) ws_items
ON ss_items.item_id=ws_items.item_id
 where
   ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
   and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
 order by item_id ,ss_item_rev
 limit 100


{code}

Exception 
{code}
 limit 100
15/03/27 12:38:32 [main]: ERROR parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.RuntimeException: java.lang.AssertionError: Internal error: Cannot 
find common type for join keys $1 (type INTEGER) and $1 (type 
VARCHAR(2147483647))
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:677)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:586)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:238)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9998)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:201)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1114)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1162)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1051)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1041)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
at 
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:403)
at org.apache.hadoop.hive.cli.

[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Summary: CBO (Calcite Return Path): Internal error: Cannot find common type 
for join keys   (was: CBO : )

> CBO (Calcite Return Path): Internal error: Cannot find common type for join 
> keys 
> -
>
> Key: HIVE-10118
> URL: https://issues.apache.org/jira/browse/HIVE-10118
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Mostafa Mokhtar
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10118) CBO (Calcite Return Path): Internal error: Cannot find common type for join keys

2015-03-27 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10118:
---
Assignee: Jesus Camacho Rodriguez

> CBO (Calcite Return Path): Internal error: Cannot find common type for join 
> keys 
> -
>
> Key: HIVE-10118
> URL: https://issues.apache.org/jira/browse/HIVE-10118
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Mostafa Mokhtar
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10108) Index#getIndexTableName() returns db.index_table_name

2015-03-27 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HIVE-10108:
--

Assignee: Jimmy Xiang

> Index#getIndexTableName() returns db.index_table_name
> -
>
> Key: HIVE-10108
> URL: https://issues.apache.org/jira/browse/HIVE-10108
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> Index#getIndexTableName() used to return just the index table name. Now it 
> returns a qualified table name.  This change was introduced in HIVE-3781.
> As a result:
> IMetaStoreClient#getTable(index.getDbName(), index.getIndexTableName())
> throws ObjectNotFoundException.
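
A minimal caller-side sketch, assuming getIndexTableName() may return a 
"db.table"-qualified string: split off the database prefix before calling 
IMetaStoreClient#getTable(db, table). The class and the example table name below 
are invented for illustration.

{code}
public class QualifiedIndexName {
  // Returns {dbName, tableName}, stripping a "db." prefix if present.
  static String[] normalize(String dbName, String indexTableName) {
    if (indexTableName.contains(".")) {
      String[] parts = indexTableName.split("\\.", 2);
      return new String[] {parts[0], parts[1]};
    }
    return new String[] {dbName, indexTableName};
  }

  public static void main(String[] args) {
    // "default.default__t_idx__" is a made-up example of a qualified name.
    String[] dbAndTable = normalize("default", "default.default__t_idx__");
    // Pass dbAndTable[0], dbAndTable[1] to IMetaStoreClient#getTable(...).
    System.out.println(dbAndTable[0] + " / " + dbAndTable[1]);
  }
}
{code}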



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4997) HCatalog doesn't allow multiple input tables

2015-03-27 Thread Suraj Nayak M (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384416#comment-14384416
 ] 

Suraj Nayak M commented on HIVE-4997:
-

Yes [~sushanth]. I was able to use this patch to read an ORC table. The filters 
worked perfectly fine for the first-level partition. How can I add a filter that 
selects only one nested partition, such as year=2015/month=03/day=01? I tried 
year='2015' and month='03' and day='01', but it did not work; the input read all 
the records from the table.

> HCatalog doesn't allow multiple input tables
> 
>
> Key: HIVE-4997
> URL: https://issues.apache.org/jira/browse/HIVE-4997
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Daniel Intskirveli
> Attachments: HIVE-4997.2.patch, HIVE-4997.3.patch, HIVE-4997.4.patch
>
>
> HCatInputFormat does not allow reading from multiple hive tables in the same 
> MapReduce job. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10100) Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384348#comment-14384348
 ] 

Gunther Hagleitner commented on HIVE-10100:
---

Thank you [~cnauroth]! Lowering priority. Will keep this open so we can switch from 
"hadoop jar" to "yarn jar" and get rid of the warning over time.

> Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0
> --
>
> Key: HIVE-10100
> URL: https://issues.apache.org/jira/browse/HIVE-10100
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Priority: Critical
>
> HADOOP-11257 adds a warning to stdout
> {noformat}
> WARNING: Use "yarn jar" to launch YARN applications.
> {noformat}
> which, if untreated, will cause issues for folks that programmatically parse 
> stdout for query results (i.e., CLI, silent mode, etc.).
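
A small illustration of the parsing hazard, assuming a hypothetical consumer that 
reads result rows from stdout: it has to drop the new warning line, otherwise the 
warning is parsed as the first row.

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StdoutFilterSketch {
  public static void main(String[] args) {
    // Simulated CLI stdout: the HADOOP-11257 warning followed by result rows.
    List<String> stdout = Arrays.asList(
        "WARNING: Use \"yarn jar\" to launch YARN applications.",
        "1\talice",
        "2\tbob");
    // A consumer must strip the warning or its first "row" is garbage.
    List<String> rows = stdout.stream()
        .filter(line -> !line.startsWith("WARNING:"))
        .collect(Collectors.toList());
    rows.forEach(System.out::println);
  }
}
{code}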



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10100) Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0

2015-03-27 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10100:
--
Priority: Critical  (was: Blocker)

> Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0
> --
>
> Key: HIVE-10100
> URL: https://issues.apache.org/jira/browse/HIVE-10100
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Priority: Critical
>
> HADOOP-11257 adds a warning to stdout
> {noformat}
> WARNING: Use "yarn jar" to launch YARN applications.
> {noformat}
> which, if untreated, will cause issues for folks that programmatically parse 
> stdout for query results (i.e., CLI, silent mode, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: HIVE-10086.5.patch

Attached HIVE-10086.5.patch, which fixes the issue with the parquet_joint.q test.

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.5.patch, HiveGroup.parquet
>
>
> When Hive table schema contains a portion of the schema of a Parquet file, 
> then the access to the values should work if the field names match the 
> schema. This does not work when a struct<> data type is in the schema, and 
> the Hive schema contains just a portion of the struct elements. Hive throws 
> an error instead.
> This is the example and how to reproduce:
> First, create a parquet table, and add some values on it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, I create a table that contains just a portion of the schema and 
> load the Parquet file generated above; a query will then fail on that table:
> {code}
> CREATE TABLE test1 (name string, address struct<street:string,zip:string>) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10086) Hive throws error when accessing Parquet file schema using field name match

2015-03-27 Thread Sergio Peña (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10086:
---
Attachment: (was: HIVE-10086.4.patch)

> Hive throws error when accessing Parquet file schema using field name match
> ---
>
> Key: HIVE-10086
> URL: https://issues.apache.org/jira/browse/HIVE-10086
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-10086.5.patch, HiveGroup.parquet
>
>
> When Hive table schema contains a portion of the schema of a Parquet file, 
> then the access to the values should work if the field names match the 
> schema. This does not work when a struct<> data type is in the schema, and 
> the Hive schema contains just a portion of the struct elements. Hive throws 
> an error instead.
> This is the example and how to reproduce:
> First, create a parquet table, and add some values on it:
> {code}
> CREATE TABLE test1 (id int, name string, address 
> struct<number:int,street:string,zip:string>) STORED AS PARQUET;
> INSERT INTO TABLE test1 SELECT 1, 'Roger', 
> named_struct('number',8600,'street','Congress Ave.','zip','87366') FROM 
> srcpart LIMIT 1;
> {code}
> Note: {{srcpart}} could be any table. It is just used to leverage the INSERT 
> statement.
> The above table example generates the following Parquet file schema:
> {code}
> message hive_schema {
>   optional int32 id;
>   optional binary name (UTF8);
>   optional group address {
> optional int32 number;
> optional binary street (UTF8);
> optional binary zip (UTF8);
>   }
> }
> {code} 
> Afterwards, I create a table that contains just a portion of the schema and 
> load the Parquet file generated above; a query will then fail on that table:
> {code}
> CREATE TABLE test1 (name string, address struct<street:string,zip:string>) STORED AS 
> PARQUET;
> LOAD DATA LOCAL INPATH '/tmp/HiveGroup.parquet' OVERWRITE INTO TABLE test1;
> hive> SELECT name FROM test1;
> OK
> Roger
> Time taken: 0.071 seconds, Fetched: 1 row(s)
> hive> SELECT address FROM test1;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.UnsupportedOperationException: Cannot inspect 
> org.apache.hadoop.io.IntWritable
> Time taken: 0.085 seconds
> {code}
> I would expect that Parquet can access the matched names, but Hive throws an 
> error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9518) Implement MONTHS_BETWEEN aligned with Oracle one

2015-03-27 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-9518:
--
Attachment: HIVE-9518.8.patch

2 improvements in GenericUDF.getTimestampValue() related to short date format 
support

> Implement MONTHS_BETWEEN aligned with Oracle one
> 
>
> Key: HIVE-9518
> URL: https://issues.apache.org/jira/browse/HIVE-9518
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Xiaobing Zhou
>Assignee: Alexander Pivovarov
> Attachments: HIVE-9518.1.patch, HIVE-9518.2.patch, HIVE-9518.3.patch, 
> HIVE-9518.4.patch, HIVE-9518.5.patch, HIVE-9518.6.patch, HIVE-9518.7.patch, 
> HIVE-9518.8.patch
>
>
> This is used to track work to build Oracle like months_between. Here's 
> semantics:
> MONTHS_BETWEEN returns number of months between dates date1 and date2. If 
> date1 is later than date2, then the result is positive. If date1 is earlier 
> than date2, then the result is negative. If date1 and date2 are either the 
> same days of the month or both last days of months, then the result is always 
> an integer. Otherwise Oracle Database calculates the fractional portion of 
> the result based on a 31-day month and considers the difference in the time 
> components of date1 and date2.
> Should accept date, timestamp and string arguments in the format 'yyyy-MM-dd' 
> or 'yyyy-MM-dd HH:mm:ss'.
> The result should be rounded to 8 decimal places.
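
A small, date-only sketch of the semantics described above, assuming time-of-day 
is ignored (this is not the actual Hive UDF): whole months plus a fraction 
computed over a 31-day month, rounded to 8 decimal places.

{code}
import java.time.LocalDate;

public class MonthsBetweenSketch {
  static double monthsBetween(LocalDate d1, LocalDate d2) {
    int months = (d1.getYear() - d2.getYear()) * 12
        + (d1.getMonthValue() - d2.getMonthValue());
    boolean sameDay = d1.getDayOfMonth() == d2.getDayOfMonth();
    boolean bothLastDay = d1.getDayOfMonth() == d1.lengthOfMonth()
        && d2.getDayOfMonth() == d2.lengthOfMonth();
    if (sameDay || bothLastDay) {
      return months;  // integer result per the description above
    }
    double fraction = (d1.getDayOfMonth() - d2.getDayOfMonth()) / 31.0;
    return Math.round((months + fraction) * 1e8) / 1e8;  // 8 decimal places
  }

  public static void main(String[] args) {
    // Prints 3.93548387: 4 whole months minus 2/31 of a month.
    System.out.println(monthsBetween(LocalDate.parse("1997-02-28"),
                                     LocalDate.parse("1996-10-30")));
  }
}
{code}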



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10117) LLAP: Use task number, attempt number to cache plans

2015-03-27 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10117:
--
Fix Version/s: llap

> LLAP: Use task number, attempt number to cache plans
> 
>
> Key: HIVE-10117
> URL: https://issues.apache.org/jira/browse/HIVE-10117
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
> Fix For: llap
>
>
> Key the cached plans by task number and attempt number instead of relying on 
> thread locals only. This can be used to share the work between Inputs / 
> Processor / Outputs in Tez.
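
An illustrative sketch of a cache keyed by vertex, task number, and attempt number 
rather than a thread-local; all names here are invented and this is not the actual 
LLAP code.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class AttemptKeyedCache<V> {
  private final ConcurrentMap<String, V> cache = new ConcurrentHashMap<>();

  // All inputs/processor/outputs of one attempt compute the same key, so the
  // first caller builds the value and later callers reuse it.
  public V get(String vertexName, int taskNum, int attemptNum, Supplier<V> loader) {
    String key = vertexName + "/" + taskNum + "/" + attemptNum;
    return cache.computeIfAbsent(key, k -> loader.get());
  }

  public static void main(String[] args) {
    AttemptKeyedCache<String> cache = new AttemptKeyedCache<>();
    System.out.println(cache.get("Map 1", 3, 0, () -> "parsed plan"));
    System.out.println(cache.get("Map 1", 3, 0, () -> "never evaluated"));
  }
}
{code}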



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10099) Enable constant folding for Decimal

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384262#comment-14384262
 ] 

Hive QA commented on HIVE-10099:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707679/HIVE-10099.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3183/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3183/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3183/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707679 - PreCommit-HIVE-TRUNK-Build

> Enable constant folding for Decimal
> ---
>
> Key: HIVE-10099
> URL: https://issues.apache.org/jira/browse/HIVE-10099
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10099.2.patch, HIVE-10099.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10106) Regression : Dynamic partition pruning not working after HIVE-9976

2015-03-27 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384249#comment-14384249
 ] 

Gunther Hagleitner commented on HIVE-10106:
---

That sounds correct to me. +1.

> Regression : Dynamic partition pruning not working after HIVE-9976
> --
>
> Key: HIVE-10106
> URL: https://issues.apache.org/jira/browse/HIVE-10106
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Siddharth Seth
> Fix For: 1.2.0
>
> Attachments: HIVE-10106.1.patch
>
>
> After HIVE-9976 got checked in dynamic partition pruning doesn't work.
> Partitions are pruned and later show up in splits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts

2015-03-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384246#comment-14384246
 ] 

Thejas M Nair commented on HIVE-10066:
--

+1

> Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
> -
>
> Key: HIVE-10066
> URL: https://issues.apache.org/jira/browse/HIVE-10066
> Project: Hive
>  Issue Type: Bug
>  Components: Tez, WebHCat
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-10066.2.patch, HIVE-10066.3.patch, HIVE-10066.patch
>
>
> From [~hitesh]:
> "Tez is a client-side only component ( no daemons, etc ) and therefore it is 
> meant to be installed on the gateway box ( or where its client libraries are 
> needed by any other services’ daemons). It does not have any cluster 
> dependencies both in terms of libraries/jars as well as configs. When it runs 
> on a worker node, everything was pre-packaged and made available to the 
> worker node via the distributed cache via the client code. Hence, its 
> client-side configs are also only needed on the same (client) node as where 
> it is installed. The only other install step needed is to have the tez 
> tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which 
> points to the HDFS path. "
> We need a way to pass client jars and tez-site.xml to the LaunchMapper.
> We should create a general purpose mechanism here which can supply additional 
> artifacts per job type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10113) LLAP: reducers running in LLAP starve out map retries

2015-03-27 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384247#comment-14384247
 ] 

Siddharth Seth commented on HIVE-10113:
---

Related: https://issues.apache.org/jira/browse/HIVE-10029

This is expected at the moment, until we support pre-empting tasks / removing 
tasks from queues.

> LLAP: reducers running in LLAP starve out map retries
> -
>
> Key: HIVE-10113
> URL: https://issues.apache.org/jira/browse/HIVE-10113
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Gunther Hagleitner
>
> When query 17 is run, some mappers from Map 1 currently fail (due to an unwrap 
> issue, and also due to HIVE-10112).
> This query has 1000+ reducers; if they are run in LLAP, they all queue up, 
> and the query locks up.
> If only the mappers run in LLAP, the query completes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

