[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-12-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Attachment: HIVE-15192.6.patch

Updated the subquery remove rule and de-correlation to support NOT IN queries and 
fixed a bunch of other issues. 

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, 
> HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-12-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Status: Patch Available  (was: Open)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, 
> HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.





[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries

2016-12-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15192:
---
Status: Open  (was: Patch Available)

> Use Calcite to de-correlate and plan subqueries
> ---
>
> Key: HIVE-15192
> URL: https://issues.apache.org/jira/browse/HIVE-15192
> Project: Hive
>  Issue Type: Task
>  Components: Logical Optimizer
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, 
> HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch
>
>
> Currently, support for subqueries is limited [Link to original spec | 
> https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf].
> Using Calcite to plan and de-correlate subqueries will help Hive get rid of 
> these limitations.





[jira] [Commented] (HIVE-15413) Primary key constraints forced to be unique across database and table names

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737109#comment-15737109
 ] 

Sergey Shelukhin commented on HIVE-15413:
-

I'm pretty sure most RDBMSes require a unique constraint name within one database.

> Primary key constraints forced to be unique across database and table names
> ---
>
> Key: HIVE-15413
> URL: https://issues.apache.org/jira/browse/HIVE-15413
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Priority: Critical
>
> In the RDBMS underlying the metastore, the table that stores primary and 
> foreign keys has its own primary key (at the RDBMS level) of 
> (constraint_name, position).  This means that a constraint name must be 
> unique across all tables and databases in a system.  This is not reasonable.  
> Database and table name should be included in the RDBMS primary key.
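
The collision can be illustrated with a toy sketch (this models the key shape only, not the metastore schema itself): a key of (constraint_name, position) collides when two tables pick the same constraint name, while a key widened with database and table names stays distinct.

```java
import java.util.HashSet;
import java.util.Set;

public class ConstraintKeyDemo {
    // Key as currently stored: (constraint_name, position) only.
    static String narrowKey(String constraint, int pos) {
        return constraint + "|" + pos;
    }

    // Key widened with database and table names, as the issue proposes.
    static String wideKey(String db, String table, String constraint, int pos) {
        return db + "|" + table + "|" + constraint + "|" + pos;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>();
        // Two tables each define a constraint named "pk" at position 0.
        System.out.println(keys.add(narrowKey("pk", 0)));            // true
        System.out.println(keys.add(narrowKey("pk", 0)));            // false: global collision
        System.out.println(keys.add(wideKey("db1", "t1", "pk", 0))); // true
        System.out.println(keys.add(wideKey("db1", "t2", "pk", 0))); // true: disambiguated
    }
}
```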





[jira] [Updated] (HIVE-15147) LLAP: use LLAP cache for non-columnar formats in a somewhat general way

2016-12-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15147:

Attachment: HIVE-15147.01.WIP.noout.patch

Second WIP patch with almost-proper cache support. Not sure if the cache 
actually works (need to check), but it does pass the test. Also rebased on top 
of HIVE-14453, which was committed recently.
The cleanup thread for cache-evicted buffers is not implemented yet, but it's a 
simple change.

Next steps:
0) Implementing cache cleanup of evicted buffers.
1) Separating the file into slices instead of caching at split granularity (the 
read-time support for slices is already there; write-time support is needed, so 
to speak).
2) Addressing the hack above of LlapTextInputFormat claiming to be vectorized.
3) Cleanup, logging, more testing, etc.

> LLAP: use LLAP cache for non-columnar formats in a somewhat general way
> ---
>
> Key: HIVE-15147
> URL: https://issues.apache.org/jira/browse/HIVE-15147
> Project: Hive
>  Issue Type: New Feature
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15147.01.WIP.noout.patch, HIVE-15147.WIP.noout.patch
>
>
> The primary goal for the first pass is caching text files. Nothing would 
> prevent other formats from using the same path, in principle, although, as 
> was originally done with ORC, it may be better to have native caching support 
> optimized for each particular format.
> Given that caching pure text is not smart, and we already have ORC-encoded 
> cache that is columnar due to ORC file structure, we will transform data into 
> columnar ORC.
> The general idea is to treat all the data in the world as merely ORC that was 
> compressed with some poor compression codec, such as csv. Using the original 
> IF and serde, as well as an ORC writer (with some heavyweight optimizations 
> disabled, potentially), we can "uncompress" the csv/whatever data into its 
> "original" ORC representation, then cache it efficiently, by column, and also 
> reuse a lot of the existing code.
> Various other points:
> 1) Caching granularity will have to be somehow determined (i.e. how do we 
> slice the file horizontally, to avoid caching entire columns). As with ORC 
> uncompressed files, the specific offsets don't really matter as long as they 
> are consistent between reads. The problem is that the file offsets will 
> actually need to be propagated to the new reader from the original 
> inputformat. Row counts are easier to use but there's a problem of how to 
> actually map them to missing ranges to read from disk.
> 2) Obviously, for row-based formats, if any one column that is to be read has 
> been evicted or is otherwise missing, "all the columns" have to be read for 
> the corresponding slice to cache and read that one column. The vague plan is 
> to handle this implicitly, similarly to how ORC reader handles CB-RG overlaps 
> - it will just so happen that a missing column in disk range list to retrieve 
> will expand the disk-range-to-read into the whole horizontal slice of the 
> file.
> 3) Granularity/etc. won't work for gzipped text. If anything at all is 
> evicted, the entire file has to be re-read. Gzipped text is a ridiculous 
> feature, so this is by design.
> 4) In future, it would be possible to also build some form or 
> metadata/indexes for this cached data to do PPD, etc. This is out of the 
> scope for now.
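
Point 2 above can be sketched as follows (all names here are invented for illustration; this is not Hive's API): a slice-granularity cache where a miss on any requested column expands into a read of the whole horizontal slice, since a row-based file cannot yield one column in isolation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SliceCache {
    // (file + "#" + slice) -> set of column names cached for that slice
    private final Map<String, Set<String>> cachedColumns = new HashMap<>();

    // Returns true if the read was served from cache, false if the whole
    // slice had to be fetched from disk.
    boolean read(String file, int slice, Set<String> columns) {
        String key = file + "#" + slice;
        Set<String> cached =
            cachedColumns.computeIfAbsent(key, k -> new HashSet<>());
        if (cached.containsAll(columns)) {
            return true; // every requested column present: pure cache hit
        }
        // Miss on at least one column: expand to a full-slice read and
        // cache every requested column, not only the missing ones.
        cached.addAll(columns);
        return false;
    }
}
```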





[jira] [Commented] (HIVE-15401) Import constraints into HBase metastore

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737066#comment-15737066
 ] 

Hive QA commented on HIVE-15401:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842638/HIVE-15401.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10769 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2532/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2532/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2532/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842638 - PreCommit-HIVE-Build

> Import constraints into HBase metastore
> ---
>
> Key: HIVE-15401
> URL: https://issues.apache.org/jira/browse/HIVE-15401
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore
>Affects Versions: 2.1.1
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-15401.patch
>
>
> Since HIVE-15342 added support for primary and foreign keys in the HBase 
> metastore we should support them in HBaseImport as well.





[jira] [Updated] (HIVE-15112) Implement Parquet vectorization reader for Struct type

2016-12-09 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-15112:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to upstream. Thanks [~csun] for the review.

> Implement Parquet vectorization reader for Struct type
> --
>
> Key: HIVE-15112
> URL: https://issues.apache.org/jira/browse/HIVE-15112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-15112.1.patch, HIVE-15112.2.patch, HIVE-15112.patch
>
>
> Like HIVE-14815, we need to support a Parquet vectorized reader for the struct type.





[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type

2016-12-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737061#comment-15737061
 ] 

ASF GitHub Bot commented on HIVE-15112:
---

Github user asfgit closed the pull request at:

https://github.com/apache/hive/pull/116


> Implement Parquet vectorization reader for Struct type
> --
>
> Key: HIVE-15112
> URL: https://issues.apache.org/jira/browse/HIVE-15112
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-15112.1.patch, HIVE-15112.2.patch, HIVE-15112.patch
>
>
> Like HIVE-14815, we need to support a Parquet vectorized reader for the struct type.





[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736990#comment-15736990
 ] 

Sergey Shelukhin commented on HIVE-15405:
-

Nm, it looks like a newly broken test today in all jiras. +1

> Improve FileUtils.isPathWithinSubtree
> -
>
> Key: HIVE-15405
> URL: https://issues.apache.org/jira/browse/HIVE-15405
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, 
> HIVE-15405.2.patch
>
>
> When running single-node LLAP with the following query multiple times (the 
> flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} 
> became a hot path. 
> {noformat}
> SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = 
> `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} 
> based on a path depth comparison.
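
The early-exit idea can be sketched as follows (an illustrative toy version only, not Hive's actual {{FileUtils}} implementation): if the candidate path has fewer components than the subtree root, it cannot possibly lie within the subtree, so no string comparison is needed.

```java
public class PathDepthCheck {
    // Depth = number of '/' separators in a normalized absolute path.
    static int depth(String path) {
        int d = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') d++;
        }
        return d;
    }

    static boolean isPathWithinSubtree(String candidate, String parent) {
        // Early exit: a path can only lie within a subtree rooted at a
        // path of smaller or equal depth.
        if (depth(candidate) < depth(parent)) return false;
        return candidate.equals(parent) || candidate.startsWith(parent + "/");
    }

    public static void main(String[] args) {
        System.out.println(isPathWithinSubtree(
            "/warehouse/flights/ds=1", "/warehouse/flights")); // true
        System.out.println(isPathWithinSubtree(
            "/warehouse", "/warehouse/flights"));              // false, via early exit
    }
}
```

With thousands of partition paths checked against a table root, the depth comparison rejects most non-matches before any character-by-character work.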





[jira] [Commented] (HIVE-15414) Fix batchSize for TestNegativeCliDriver

2016-12-09 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736979#comment-15736979
 ] 

Vihang Karajgaonkar commented on HIVE-15414:


[~sseth] [~spena] Do you know if batchSize is kept as 1000 intentionally?

> Fix batchSize for TestNegativeCliDriver
> ---
>
> Key: HIVE-15414
> URL: https://issues.apache.org/jira/browse/HIVE-15414
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> While analyzing the console output of the pre-commit logs, I noticed that the 
> TestNegativeCliDriver batch has ~770 qfiles, which doesn't look right.
> 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 
> PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, 
> queryFilesProperty=qfile, 
> name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more..
>   
> I think {{qFileTest.clientNegative.batchSize = 1000}} in 
> {{test-configuration2.properties}} is probably the reason. 
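
The arithmetic can be sketched like this (illustrative only; the actual ptest batching code is more involved): with roughly 774 clientNegative qfiles and a batchSize of 1000, everything lands in a single oversized batch, whereas a smaller batchSize would spread the files across parallelizable batches.

```java
public class BatchMath {
    // Number of batches needed to cover qfiles at the given batch size.
    static int numBatches(int qfiles, int batchSize) {
        return (qfiles + batchSize - 1) / batchSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(numBatches(774, 1000)); // 1  -> one huge batch
        System.out.println(numBatches(774, 30));   // 26 -> parallelizable
    }
}
```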





[jira] [Updated] (HIVE-15414) Fix batchSize for TestNegativeCliDriver

2016-12-09 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-15414:
---
Description: 
While analyzing the console output of the pre-commit logs, I noticed that the 
TestNegativeCliDriver batch has ~770 qfiles, which doesn't look right.

2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 PBatch: 
QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, 
queryFilesProperty=qfile, 
name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more..
  

I think {{qFileTest.clientNegative.batchSize = 1000}} in 
{{test-configuration2.properties}} is probably the reason. 


  was:
While analyzing the console output of pre-commit console logs, I noticed that 
TestNegativeCliDriver batchSize ~770 qfiles which doesn't look right.

2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 PBatch: 
QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, 
queryFilesProperty=qfile, 
name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more..
  

I think {{qFileTest.clientNegative.batchSize = 1000}} in 
{{test-configuration2.properties}} is probably the batchSize is the reason. 



> Fix batchSize for TestNegativeCliDriver
> ---
>
> Key: HIVE-15414
> URL: https://issues.apache.org/jira/browse/HIVE-15414
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> While analyzing the console output of the pre-commit logs, I noticed that the 
> TestNegativeCliDriver batch has ~770 qfiles, which doesn't look right.
> 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 
> PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, 
> queryFilesProperty=qfile, 
> name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more..
>   
> I think {{qFileTest.clientNegative.batchSize = 1000}} in 
> {{test-configuration2.properties}} is probably the reason. 





[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree

2016-12-09 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736970#comment-15736970
 ] 

Rajesh Balamohan commented on HIVE-15405:
-

Thanks [~sershe]. TestMiniLlapLocalCliDriver#stats_based_fetch_decision did not 
fail in my env. Will check and update.

> Improve FileUtils.isPathWithinSubtree
> -
>
> Key: HIVE-15405
> URL: https://issues.apache.org/jira/browse/HIVE-15405
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, 
> HIVE-15405.2.patch
>
>
> When running single-node LLAP with the following query multiple times (the 
> flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} 
> became a hot path. 
> {noformat}
> SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = 
> `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} 
> based on a path depth comparison.





[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736962#comment-15736962
 ] 

Hive QA commented on HIVE-15405:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842517/HIVE-15405.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10768 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2531/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2531/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2531/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842517 - PreCommit-HIVE-Build

> Improve FileUtils.isPathWithinSubtree
> -
>
> Key: HIVE-15405
> URL: https://issues.apache.org/jira/browse/HIVE-15405
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, 
> HIVE-15405.2.patch
>
>
> When running single-node LLAP with the following query multiple times (the 
> flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} 
> became a hot path. 
> {noformat}
> SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = 
> `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} 
> based on a path depth comparison.





[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736916#comment-15736916
 ] 

Sergey Shelukhin commented on HIVE-14007:
-

Another consideration is whether users' configs will break. Do the configs all 
have the same names? If not, perhaps it's possible to add a fallback to the old 
names in OrcConf.
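
Such a fallback lookup could look roughly like this (a hypothetical sketch; the key names below are examples only, and the real OrcConf mechanism may differ): check the new key name first, then any mapped legacy Hive-era name.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfFallback {
    // Hypothetical mapping of new ORC key names to legacy Hive names;
    // the actual names and mapping live in OrcConf, not here.
    static final Map<String, String> LEGACY_NAMES = new HashMap<>();
    static {
        LEGACY_NAMES.put("orc.compress", "hive.exec.orc.default.compress");
    }

    // Resolve a config value: new name wins, old name is the fallback.
    static String lookup(Map<String, String> conf, String newName) {
        String v = conf.get(newName);
        if (v != null) return v;
        String oldName = LEGACY_NAMES.get(newName);
        return oldName == null ? null : conf.get(oldName);
    }
}
```

A user config that only sets the old name would then keep working unchanged after the module swap.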

> Replace ORC module with ORC release
> ---
>
> Key: HIVE-14007
> URL: https://issues.apache.org/jira/browse/HIVE-14007
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, 
> HIVE-14007.patch
>
>
> This completes moving the core ORC reader & writer to the ORC project.





[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736905#comment-15736905
 ] 

Sergey Shelukhin commented on HIVE-15397:
-

Btw, I am too lazy to rerun it now, but I think the current master is 
inconsistent, because the out-file changes that removed the rows on the first 
run did so because I disabled metadata-only by default. So, the non-optimized 
path on master doesn't return such rows, but metadata-only does return them. 
After the patch, neither does.

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).
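
The conservative detection can be sketched like this (illustrative only, not the patch's actual code): a partition with no files, or only zero-length files, is provably empty, so the metadata-only shortcut is safe; any non-empty file forces a real read, since file presence alone cannot prove the table has rows.

```java
public class EmptyCheck {
    // Conservative emptiness check over a partition's file sizes.
    static boolean definitelyEmpty(long[] fileLengths) {
        for (long len : fileLengths) {
            if (len > 0) return false; // must actually read this file
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(definitelyEmpty(new long[] {}));        // true
        System.out.println(definitelyEmpty(new long[] {0, 0}));    // true
        System.out.println(definitelyEmpty(new long[] {0, 4096})); // false
    }
}
```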





[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736908#comment-15736908
 ] 

Sergey Shelukhin commented on HIVE-15397:
-

Now all I need is a +1 :P

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).





[jira] [Updated] (HIVE-6365) Alter a partition to be of a different fileformat than the Table's fileformat. Use insert overwrite to write data to this partition. The partition fileformat is converted

2016-12-09 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-6365:
--
Summary: Alter a partition to be of a different fileformat than the Table's 
fileformat. Use insert overwrite to write data to this partition. The partition 
fileformat is converted back to table's fileformat after the insert operation.  
 (was: Alter a partition to be of a different fileformat than the Table's 
fileformat. Use insert overwrite to write data to this partition. The partition 
fileformat is coverted back to table's fileformat after the insert operation. )

> Alter a partition to be of a different fileformat than the Table's 
> fileformat. Use insert overwrite to write data to this partition. The 
> partition fileformat is converted back to table's fileformat after the insert 
> operation. 
> --
>
> Key: HIVE-6365
> URL: https://issues.apache.org/jira/browse/HIVE-6365
> Project: Hive
>  Issue Type: Bug
> Environment: emr
>Reporter: Pavan Srinivas
>
> Let's say there is a partitioned table like:
> Step1:
> >> CREATE TABLE srcpart (key STRING, value STRING)
> PARTITIONED BY (ds STRING, hr STRING)
> STORED AS TEXTFILE;
> Step2:
> Alter the fileformat for a specific available partition. 
> >> alter table srcpart partition(ds="2008-04-08", hr="12") set fileformat  
> >> orc;
> Step3:
> Describe the partition.
> >> desc formatted srcpart partition(ds="2008-04-08", hr="12")
> .
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:   No
> Num Buckets:  -1
> Bucket Columns:   []
> Sort Columns: []
> Storage Desc Params:
>   serialization.format1
> Step4:
> Write the data to this partition using insert overwrite. 
> >>insert overwrite  table srcpart partition(ds="2008-04-08",hr="12") select 
> >>key, value from ... 
> Step5:
> Describe the partition again. 
> >> desc formatted srcpart partition(ds="2008-04-08", hr="12")
> .
> # Storage Information
> SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:  org.apache.hadoop.mapred.TextInputFormat
> OutputFormat: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:   No
> Num Buckets:  -1
> Bucket Columns:   []
> Sort Columns: []
> Storage Desc Params:
>   serialization.format1
> The fileformat of the partition is converted back to the table's original 
> fileformat. It should have retained and written the data in the modified 
> fileformat. 





[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release

2016-12-09 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736863#comment-15736863
 ] 

Owen O'Malley commented on HIVE-14007:
--

Ok, I've updated the pull request https://github.com/apache/hive/pull/81 and 
will post a new patch once ORC 1.2.3 is released.

The patch ensures that the moved config variables don't create an error by 
adding them to the exception map. 
https://github.com/omalley/hive/commit/d078aea84ecb1fe7d3e9b95cc845dfbdee63587c#diff-7f16e0de4170e5b6c031990da80f5643

As to why the patch is changing test files: the ORC project has a couple of 
fixes and features that hadn't been backported to Hive.

In particular,
* ORC-101 fixes the bloom filters to use utf-8 rather than the jvm default 
encoding. This changes the size of the ORC files and their write version. There 
is an option to write the old broken bloom filters in addition to the new ones.
* ORC-54 makes the default for schema evolution matching by name instead of by 
position if the ORC file has real column names. Some of the tests required the 
old behavior, so I changed the column names so that the intended matches line 
up. Note that Hive 1.x never got the fix to encode the real column names in the 
file metadata, so all files written by Hive 1.x will use positional matching.


> Replace ORC module with ORC release
> ---
>
> Key: HIVE-14007
> URL: https://issues.apache.org/jira/browse/HIVE-14007
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, 
> HIVE-14007.patch
>
>
> This completes moving the core ORC reader & writer to the ORC project.





[jira] [Commented] (HIVE-14870) OracleStore: RawStore implementation optimized for Oracle

2016-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736848#comment-15736848
 ] 

Alan Gates commented on HIVE-14870:
---

Do you have code you could post showing the OracleStore?  Obviously it won't be 
ready for inclusion but it would be helpful to see it.

Did you consider using array types to store things like columns and parameters? 
I know this would decrease portability, since every database does collection 
types differently (and some don't do them at all), but it would also remove 
additional calls or joins from a number of operations.

Did you do any experimentation on the trade-offs between duplicating data and 
performance? For example, one could try storing a storage descriptor in the 
partition or table object, or serdes inside a storage descriptor, etc. This 
would obviously grow the amount of data in the metastore, but again reduce the 
number of calls or joins needed to fetch data.

> OracleStore: RawStore implementation optimized for Oracle
> -
>
> Key: HIVE-14870
> URL: https://issues.apache.org/jira/browse/HIVE-14870
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Chris Drome
>Assignee: Chris Drome
> Attachments: OracleStoreDesignProposal.pdf
>
>
> The attached document is a proposal for a RawStore implementation which is 
> optimized for Oracle and replaces DataNucleus. The document outlines schema 
> changes, OracleStore implementation details, and performance tests against 
> ObjectStore, ObjectStore+DirectSQL, and OracleStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736821#comment-15736821
 ] 

Ashutosh Chauhan commented on HIVE-15397:
-

This is interesting, because Hive allows you to create partitions without any 
data, and that results in the partitioning column having a value. So, should we 
assume that the table has row(s) with the partitioning column taking the 
supplied value and all other columns being null? I think not. That was the 
earlier behavior, and I think it was wrong. The behavior we are getting now is 
correct: if a partition exists but is empty, we should consider the partition 
to have 0 rows, so the value of the partitioning column should not matter 
during query evaluation. Thus, max(partCol) from empty_table should be null 
even when there is a partition with partCol = 1. So I think the behavior we 
are getting after the patch is correct and desired.

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15401) Import constraints into HBase metastore

2016-12-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-15401:
--
Status: Patch Available  (was: Open)

> Import constraints into HBase metastore
> ---
>
> Key: HIVE-15401
> URL: https://issues.apache.org/jira/browse/HIVE-15401
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore
>Affects Versions: 2.1.1
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-15401.patch
>
>
> Since HIVE-15342 added support for primary and foreign keys in the HBase 
> metastore we should support them in HBaseImport as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15401) Import constraints into HBase metastore

2016-12-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-15401:
--
Attachment: HIVE-15401.patch

In addition to adding the import of constraints to HBaseImport, this patch 
fixes a bug in HBaseStore where the primary key and foreign key fields were 
reversed in HBaseStore.getForeignKeys(), causing it to return wrong results.

> Import constraints into HBase metastore
> ---
>
> Key: HIVE-15401
> URL: https://issues.apache.org/jira/browse/HIVE-15401
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Metastore
>Affects Versions: 2.1.1
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-15401.patch
>
>
> Since HIVE-15342 added support for primary and foreign keys in the HBase 
> metastore we should support them in HBaseImport as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15413) Primary key constraints forced to be unique across database and table names

2016-12-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15413:
-
Target Version/s: 2.2.0

> Primary key constraints forced to be unique across database and table names
> ---
>
> Key: HIVE-15413
> URL: https://issues.apache.org/jira/browse/HIVE-15413
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Priority: Critical
>
> In the RDBMS underlying the metastore the table that stores primary and 
> foreign keys has its own primary key (at the RDBMS level) of 
> (constraint_name, position).  This means that a constraint name must be 
> unique across all tables and databases in a system.  This is not reasonable.  
> Database and table name should be included in the RDBMS primary key.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15413) Primary key constraints forced to be unique across database and table names

2016-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736687#comment-15736687
 ] 

Alan Gates commented on HIVE-15413:
---

This applies to foreign keys as well, meaning that even a foreign key and a 
primary key cannot share a name.

> Primary key constraints forced to be unique across database and table names
> ---
>
> Key: HIVE-15413
> URL: https://issues.apache.org/jira/browse/HIVE-15413
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Alan Gates
>Priority: Critical
>
> In the RDBMS underlying the metastore the table that stores primary and 
> foreign keys has its own primary key (at the RDBMS level) of 
> (constraint_name, position).  This means that a constraint name must be 
> unique across all tables and databases in a system.  This is not reasonable.  
> Database and table name should be included in the RDBMS primary key.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15342) Add support for primary/foreign keys in HBase metastore

2016-12-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736677#comment-15736677
 ] 

Lefty Leverenz commented on HIVE-15342:
---

Okay, thanks Alan.

> Add support for primary/foreign keys in HBase metastore
> ---
>
> Key: HIVE-15342
> URL: https://issues.apache.org/jira/browse/HIVE-15342
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 2.2.0
>
> Attachments: HIVE-15342.patch
>
>
> When HIVE-13076 was committed the calls into the HBase metastore were stubbed 
> out.  We need to implement support for constraints in the HBase metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736659#comment-15736659
 ] 

Lefty Leverenz commented on HIVE-15403:
---

Okay, thanks Prasanth.

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736653#comment-15736653
 ] 

Prasanth Jayachandran commented on HIVE-15403:
--

I don't think it is required. This is mostly a bug fix.

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736648#comment-15736648
 ] 

Lefty Leverenz commented on HIVE-15403:
---

Does this need to be documented in the wiki?

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736608#comment-15736608
 ] 

Hive QA commented on HIVE-15397:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842617/HIVE-15397.01.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10793 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=91)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=213)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=215)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2529/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2529/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2529/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842617 - PreCommit-HIVE-Build

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736604#comment-15736604
 ] 

Prasanth Jayachandran commented on HIVE-15403:
--

Test failures are unrelated to this patch. Already tracked in HIVE-15058

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15403:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15329) NullPointerException might occur when create table

2016-12-09 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15329:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master! Thanks [~winningalong] for the contribution!

> NullPointerException might occur when create table
> --
>
> Key: HIVE-15329
> URL: https://issues.apache.org/jira/browse/HIVE-15329
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Meilong Huang
>Assignee: Meilong Huang
>  Labels: metastore
> Fix For: 2.2.0
>
> Attachments: HIVE-15329.1.patch
>
>
> NullPointerException might occur if table.getParameters() returns null when 
> method isNonNativeTable is invoked in class MetaStoreUtils.
> {code}
> public static boolean isNonNativeTable(Table table) {
> if (table == null) {
>   return false;
> }
> return 
> (table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != 
> null);
>   }
> {code}
> This will cause a stack trace without any helpful information at the client:
> {code}
> org.apache.hadoop.hive.metastore.api.MetaException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read...
> {code}
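A null-safe variant of the check quoted above could look like the following 
sketch. Note that {{Table}} here is a minimal stub standing in for 
org.apache.hadoop.hive.metastore.api.Table, and the constant's value is 
illustrative rather than taken from the patch; this is not the actual fix 
committed in HIVE-15329.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a null-safe isNonNativeTable. "Table" is a stub for the real
// metastore Thrift class; the storage-handler key value is an assumption.
public class NonNativeCheckSketch {
    static final String META_TABLE_STORAGE = "storage_handler";

    static class Table {
        Map<String, String> parameters; // may be null, as in the reported bug
        Map<String, String> getParameters() { return parameters; }
    }

    public static boolean isNonNativeTable(Table table) {
        // Guard both the table and its parameter map before dereferencing,
        // so a table with no parameters no longer triggers an NPE.
        if (table == null || table.getParameters() == null) {
            return false;
        }
        return table.getParameters().get(META_TABLE_STORAGE) != null;
    }

    public static void main(String[] args) {
        Table t = new Table();                       // parameters == null
        System.out.println(isNonNativeTable(t));     // false, no NPE
        t.parameters = new HashMap<>();
        t.parameters.put(META_TABLE_STORAGE, "x");
        System.out.println(isNonNativeTable(t));     // true
    }
}
```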



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15329) NullPointerException might occur when create table

2016-12-09 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736584#comment-15736584
 ] 

Prasanth Jayachandran commented on HIVE-15329:
--

The test failures are not related to this patch and are already failing in 
master. Failures are tracked in HIVE-15058

> NullPointerException might occur when create table
> --
>
> Key: HIVE-15329
> URL: https://issues.apache.org/jira/browse/HIVE-15329
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Meilong Huang
>Assignee: Meilong Huang
>  Labels: metastore
> Attachments: HIVE-15329.1.patch
>
>
> NullPointerException might occur if table.getParameters() returns null when 
> method isNonNativeTable is invoked in class MetaStoreUtils.
> {code}
> public static boolean isNonNativeTable(Table table) {
> if (table == null) {
>   return false;
> }
> return 
> (table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != 
> null);
>   }
> {code}
> This will cause a stack trace without any helpful information at the client:
> {code}
> org.apache.hadoop.hive.metastore.api.MetaException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-12-09 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736556#comment-15736556
 ] 

Zoltan Haindrich commented on HIVE-14735:
-

Thank you for the command [~stakiar], I've added it to the patch.

I've uploaded #3; I hope I didn't break anything. The ptest execution will 
shed light on this.

[~spena] I've addressed most of your comments (however, I still use fixed 
versions for the maven plugins; I forgot to fix that). I also missed your 
previous question about where the downloaded file is: it's inside the local 
maven repository.

I've changed the following:
* added a project under dev-support to repack the spark artifact, with a 
readme describing the procedure
* {{itests/thirparty}} is now a module; this way these maven "tricks" are 
isolated, and other modules can rely on the thirdparty downloads having 
already finished. This also makes it possible to support multiple spark 
versions, which may come in handy for people who switch between branches that 
pull different spark versions
* it now unpacks the spark assembly to only one place

[~spena] what do you think about the new changes?

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14735.1.patch, HIVE-14735.1.patch, 
> HIVE-14735.1.patch, HIVE-14735.1.patch, HIVE-14735.2.patch, HIVE-14735.3.patch
>
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release

2016-12-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736547#comment-15736547
 ] 

Gunther Hagleitner commented on HIVE-14007:
---

Moving the config vars over to ORC and removing them from HiveConf will still 
break existing apps, because, as a precaution, Hive throws an error when it 
encounters an unknown config var that starts with "hive", afaik.

Why is this patch changing test files? It seems this is changing behavior in 
addition to removing the files.


> Replace ORC module with ORC release
> ---
>
> Key: HIVE-14007
> URL: https://issues.apache.org/jira/browse/HIVE-14007
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, 
> HIVE-14007.patch
>
>
> This completes moving the core ORC reader & writer to the ORC project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15385:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~stakiar]. I committed this to master.

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures will be propagated 
> up to the caller, and the query will fail.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is set to {{true}}, and invokes the {{FileSystem}} 
> API when it is {{false}}, so the behavior is inconsistent depending on the 
> value of {{recursive}}.
> We should decide whether failure to inherit permissions should fail queries, 
> and then ensure the code consistently follows that decision.
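The "log a warning and keep going" policy of the old shell-out path can be 
sketched as below. This is illustrative only: it uses the JDK's java.nio.file 
API instead of the Hadoop {{FileSystem}} API that the real HdfsUtils code 
calls, and the class and method names are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Best-effort permission inheritance: failures are logged, never propagated,
// so the surrounding operation (the query, in Hive's case) still succeeds.
public class BestEffortInherit {
    // Returns true if the permissions were applied, false if inheritance
    // failed; either way it never throws.
    static boolean inheritPermissions(Path target, Set<PosixFilePermission> perms) {
        try {
            Files.setPosixFilePermissions(target, perms);
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            System.err.println("WARN: could not inherit permissions for "
                    + target + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("inherit", ".tmp");
        try {
            // Succeeds on POSIX file systems; warns and returns false elsewhere.
            System.out.println("applied=" + inheritPermissions(tmp,
                    PosixFilePermissions.fromString("rw-r--r--")));
            // A path that does not exist: inheritance fails, but we only warn.
            System.out.println("applied=" + inheritPermissions(
                    Paths.get("/no/such/path/xyz"),
                    PosixFilePermissions.fromString("rw-r--r--")));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```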



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14735:

Attachment: HIVE-14735.3.patch

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14735.1.patch, HIVE-14735.1.patch, 
> HIVE-14735.1.patch, HIVE-14735.1.patch, HIVE-14735.2.patch, HIVE-14735.3.patch
>
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736495#comment-15736495
 ] 

Sergey Shelukhin commented on HIVE-15403:
-

+1

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736491#comment-15736491
 ] 

Hive QA commented on HIVE-15403:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842611/HIVE-15403.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10792 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2528/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2528/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2528/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842611 - PreCommit-HIVE-Build

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as one user (other than 
> the default hive user) and some nodes are kinit'ed as the hive user, they will 
> end up under different paths in the ZK registry and may not be reported by the 
> llap status tool. The reason is that when creating ZK paths we use 
> UGI.getCurrentUser(), but the current user may not be the same across all 
> nodes (someone has to do a global kinit). Before bringing up the daemon, if 
> security is enabled, each daemon should log in using the kerberos principal 
> and keytab specified for the llap daemon service and update the current 
> logged-in user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736456#comment-15736456
 ] 

Sergey Shelukhin edited comment on HIVE-15397 at 12/9/16 9:59 PM:
--

Updated the out files, expanded one test to run with and without metadataonly 
enabled to make sure results are consistent, and fixed the typo.


was (Author: sershe):
Updated the out files, expanded one test to run with and without metadataonly 
enable to make sure results are consistent, fixed the typo.

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).





[jira] [Updated] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15397:

Attachment: HIVE-15397.01.patch

Updated the out files, expanded one test to run with and without metadataonly 
enabled to make sure results are consistent, and fixed the typo.

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.01.patch, HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).





[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736432#comment-15736432
 ] 

Sergey Shelukhin commented on HIVE-15397:
-

Interesting q file changes... according to our take on "1=1 group by 1=1", 
they are correct.
E.g. a table has 3 partitions, part=a, part=b, and part=c, and only a and c 
have data.
select distinct part from t
used to return "a, b, c". However, no rows in the table actually have the 
value b, so the result has changed to "a, c".
[~ashutoshc] [~jcamachorodriguez] would you say this is the correct change and 
the previous result was incorrect?
Same for max(partcol) on an empty table - should it be null? There are no rows 
in the table to derive the max from, similar to how there are no rows for 
"gby 1=1" to group by.
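The partition example above reduces to a small model (plain Python, not Hive's planner) of why the metadata-only path and the file-scan path disagree on empty partitions:

```python
# part value -> rows actually present in that partition's files
partitions = {"a": [1, 2], "b": [], "c": [3]}

# Metadata-only plan: SELECT DISTINCT part is answered from the
# metastore's partition listing, which includes the empty partition.
metadata_only = sorted(partitions)

# File-scan plan: an empty partition contributes no rows, so its
# value never appears in the result.
file_scan = sorted(p for p, rows in partitions.items() if rows)

assert metadata_only == ["a", "b", "c"]
assert file_scan == ["a", "c"]
```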

> metadata-only queries may return incorrect results with empty tables
> 
>
> Key: HIVE-15397
> URL: https://issues.apache.org/jira/browse/HIVE-15397
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15397.patch
>
>
> Queries like select 1=1 from t group by 1=1 may return rows, based on 
> OneNullRowInputFormat, even if the source table is empty. For now, add some 
> basic detection of empty tables and turn this off by default (since we can't 
> know whether a table is empty or not based on there being some files, without 
> reading them).





[jira] [Commented] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736388#comment-15736388
 ] 

Hive QA commented on HIVE-15118:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842601/HIVE-15118.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10792 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=91)
org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade (batchId=209)
org.apache.hive.hcatalog.api.TestHCatClientNotification.createTable 
(batchId=219)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2527/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2527/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2527/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842601 - PreCommit-HIVE-Build

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch
>
>
> The COLUMNS table is no longer used, and other databases' schemas have 
> already removed it. Remove it from the Derby schema as well.





[jira] [Commented] (HIVE-15411) ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES

2016-12-09 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736385#comment-15736385
 ] 

Anthony Hsu commented on HIVE-15411:


The proposal is to extend the ADD PARTITION grammar to support the following:
{noformat}
ALTER TABLE table_name ADD [IF NOT EXISTS]
PARTITION (part_col='part_value', ...)
  [FILEFORMAT <file_format>]  -- new
  [SERDEPROPERTIES ('key1'='val', ...)]  -- new
  [LOCATION 'location1']
PARTITION (part_col='part_value', ...)
  [FILEFORMAT <file_format>]  -- new
  [SERDEPROPERTIES ('key1'='val', ...)]  -- new
  [LOCATION 'location2']
...;
{noformat}

> ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES
> ---
>
> Key: HIVE-15411
> URL: https://issues.apache.org/jira/browse/HIVE-15411
> Project: Hive
>  Issue Type: Improvement
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
>
> Currently, {{ALTER TABLE ... ADD PARTITION}} only lets you set the 
> partition's LOCATION but not its FILEFORMAT or SERDEPROPERTIES. In order to 
> change the FILEFORMAT or SERDEPROPERTIES, you have to issue two additional 
> calls to {{ALTER TABLE ... PARTITION ... SET FILEFORMAT}} and {{ALTER TABLE 
> ... PARTITION ... SET SERDEPROPERTIES}}. This is not atomic, and queries that 
> interleave the ALTER TABLE commands may fail.
> We should extend the grammar to support setting FILEFORMAT and 
> SERDEPROPERTIES atomically as part of the ADD PARTITION command.





[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736347#comment-15736347
 ] 

Thejas M Nair commented on HIVE-15410:
--

[~ctang.ma] What is the complete valid set of property values? Where/how do we 
restrict the acceptable values in Hive SQL? Can it end with a "." or "-"?
Also, can you include a unit test for the validate function?
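One possible shape for such a validator (a hypothetical regex sketch, not WebHCat's actual rule) that allows interior periods and hyphens but rejects leading or trailing ones:

```python
import re

# Accept identifier segments joined by "." or "-"; a name may not
# start or end with either separator.
PROP_NAME = re.compile(r"^[A-Za-z0-9_]+([.-][A-Za-z0-9_]+)*$")

assert PROP_NAME.match("auto.purge")
assert PROP_NAME.match("my-prop.key_1")
assert not PROP_NAME.match(".leading")
assert not PROP_NAME.match("trailing-")
```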

> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table properties can have a period (.) or hyphen (-) in their names; 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting these properties, and instead throw the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or a 
> hyphen.





[jira] [Commented] (HIVE-15338) Wrong result from non-vectorized DATEDIFF with scalar parameter of type DATE/TIMESTAMP

2016-12-09 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736296#comment-15736296
 ] 

Jason Dere commented on HIVE-15338:
---

Changes look good, +1. Just see about the diff in vector_between_in
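The scalar-as-zero failure mode under review can be illustrated with plain Python dates (a model of the DATEDIFF semantics, not Hive's vectorized implementation):

```python
from datetime import date

def datediff(end, start):
    # DATEDIFF semantics: whole days between two dates.
    return (end - start).days

# Correct: the scalar date participates in the subtraction.
assert datediff(date(2016, 12, 9), date(2016, 12, 1)) == 8

# A bug that treats the scalar argument as 0 effectively substitutes
# the epoch, yielding the end date's epoch-day count instead.
buggy = datediff(date(2016, 12, 9), date(1970, 1, 1))
assert buggy != 8
```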

> Wrong result from non-vectorized DATEDIFF with scalar parameter of type 
> DATE/TIMESTAMP
> --
>
> Key: HIVE-15338
> URL: https://issues.apache.org/jira/browse/HIVE-15338
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15338.01.patch, HIVE-15338.02.patch, 
> HIVE-15338.03.patch, HIVE-15338.04.patch
>
>
> The vectorized DATEDIFF accidentally treated a scalar parameter of type DATE 
> (e.g. CURRENT_DATE) as 0.
> The current Q file test vectorized_date_funcs.q DOES NOT test the 
> DATE/TIMESTAMP scalar type case.
> Also, the non-vectorized cases of DATEDIFF are using UTF and returning the 
> wrong results.





[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736292#comment-15736292
 ] 

Hive QA commented on HIVE-14496:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842593/HIVE-14496.04.patch

{color:green}SUCCESS:{color} +1 due to 18 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 121 failed/errored test(s), 10784 tests 
executed
*Failed tests:*
{noformat}
TestHBaseImport - did not produce a TEST-*.xml file (likely timed out) 
(batchId=193)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=70)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_unionDistinct_2]
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[unionDistinct_2]
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[unionDistinct_2] 
(batchId=91)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.alterInvalidation
 (batchId=199)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.altersInvalidation
 (batchId=199)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.hit 
(batchId=199)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.invalidation
 (batchId=199)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.someWithStats
 (batchId=199)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.allWithStats
 (batchId=186)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.noneWithStats
 (batchId=186)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.nonexistentPartitions
 (batchId=186)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions
 (batchId=186)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCacheWithBitVector.allPartitions
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.MiddleOfPartitionsHaveBitVectorStatus
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDouble
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusLong
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsOfPartitionsHaveBitVectorStatus
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusDecimal
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusDouble
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusLong
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusString
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.noPartitionsHaveBitVectorStatus
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.MiddleOfPartitionsHaveBitVectorStatus
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDecimal
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDouble
 (batchId=187)
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusLong
 (batchId=187)

[jira] [Updated] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15403:
-
Attachment: HIVE-15403.2.patch

updated comment

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as some user (other 
> than the default hive user) and some nodes are kinit'ed as the hive user, 
> they will end up under different paths in the ZK registry and may not be 
> reported by the llap status tool. The reason is that when creating ZK paths 
> we use UGI.getCurrentUser(), but the current user may not be the same across 
> all nodes (someone has to do a global kinit). Before bringing up the daemon, 
> if security is enabled, each daemon should log in with the kerberos principal 
> and keytab specified for the LLAP daemon service and update the currently 
> logged-in user.





[jira] [Comment Edited] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736240#comment-15736240
 ] 

Eugene Koifman edited comment on HIVE-15048 at 12/9/16 8:26 PM:


WRT dynamic partitioning, that is also not new.  Update/delete statements have 
always run with dynamic partitioning regardless of what WriteEntity objects 
are there.
we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the 
lock management logic aware of it.  HIVE-15032 tracks improving this.


was (Author: ekoifman):
WRT dynamic partitioning, that is also not new.  Update/delete statements have 
always ran with dyn part regardless of what WriteEntity objects there are there.
we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the 
lock management logic aware of.

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> 
>
> Key: HIVE-15048
> URL: https://issues.apache.org/jira/browse/HIVE-15048
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, 
> HIVE-15048.03.patch, HIVE-15048.04.patch
>
>
> See TestDbTxnManager2 for referenced methods
> {noformat}
> checkCmdOnDriver(driver.run("create table target (a int, b int) " +
>   "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, 
> q1 int) clustered by (a1) into 2  buckets stored as orc TBLPROPERTIES 
> ('transactional'='true')"));
> checkCmdOnDriver(driver.run("insert into target partition(p,q) values 
> (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)"));
> checkCmdOnDriver(driver.run(
>   "update source set b1 = 1 where p1 in (select t.q from target t where 
> t.p=2)"));
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> Which is clearly wrong for outputs - the target table is not even 
> partitioned(or called 'target').
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze()
> I suspect 
> update T ... where T.p IN (select d from T where ...) 
> type query would also get messed up (but not necessarily fail) if T is 
> partitioned and the subquery filters out some partitions but that does not 
> mean that the same partitions are filtered out in the parent query.





[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736243#comment-15736243
 ] 

Sergey Shelukhin commented on HIVE-15403:
-

I don't think the "This is to avoid ... " comment is correct - we don't need 
to kinit on every node; rather, we have problems if someone kinits ;)
Otherwise looks good.

> LLAP: Login with kerberos before starting the daemon
> 
>
> Key: HIVE-15403
> URL: https://issues.apache.org/jira/browse/HIVE-15403
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-15403.1.patch
>
>
> In an LLAP cluster, if some of the nodes are kinit'ed as some user (other 
> than the default hive user) and some nodes are kinit'ed as the hive user, 
> they will end up under different paths in the ZK registry and may not be 
> reported by the llap status tool. The reason is that when creating ZK paths 
> we use UGI.getCurrentUser(), but the current user may not be the same across 
> all nodes (someone has to do a global kinit). Before bringing up the daemon, 
> if security is enabled, each daemon should log in with the kerberos principal 
> and keytab specified for the LLAP daemon service and update the currently 
> logged-in user.





[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736240#comment-15736240
 ] 

Eugene Koifman commented on HIVE-15048:
---

WRT dynamic partitioning, that is also not new.  Update/delete statements have 
always run with dynamic partitioning regardless of what WriteEntity objects 
are there.
we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the 
lock management logic aware of it.

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> 
>
> Key: HIVE-15048
> URL: https://issues.apache.org/jira/browse/HIVE-15048
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, 
> HIVE-15048.03.patch, HIVE-15048.04.patch
>
>
> See TestDbTxnManager2 for referenced methods
> {noformat}
> checkCmdOnDriver(driver.run("create table target (a int, b int) " +
>   "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, 
> q1 int) clustered by (a1) into 2  buckets stored as orc TBLPROPERTIES 
> ('transactional'='true')"));
> checkCmdOnDriver(driver.run("insert into target partition(p,q) values 
> (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)"));
> checkCmdOnDriver(driver.run(
>   "update source set b1 = 1 where p1 in (select t.q from target t where 
> t.p=2)"));
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> Which is clearly wrong for outputs - the target table is not even 
> partitioned(or called 'target').
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze()
> I suspect 
> update T ... where T.p IN (select d from T where ...) 
> type query would also get messed up (but not necessarily fail) if T is 
> partitioned and the subquery filters out some partitions but that does not 
> mean that the same partitions are filtered out in the parent query.





[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736227#comment-15736227
 ] 

Eugene Koifman commented on HIVE-15048:
---

That is not what it does.  The code removes the table-level WriteEntity for 
the target table and replaces it with some number of partition WriteEntity 
objects for that table.
So conceptually it does the same thing as before.

If you look at the new .q.out files, the output shows the set of 
inputs/outputs that it ends up with (not clearly highlighted, but they are 
there).
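That replacement can be sketched with toy structures (plain Python tuples rather than Hive's WriteEntity class; the table and partition names are hypothetical):

```python
# Start with a single table-level write entity for the target table.
outputs = {("table", "default@t")}

# Partitions the statement actually writes, as resolved at runtime.
touched = ["default@t@p=1/q=2", "default@t@p=2/q=2"]

# Replace the table entity with one partition entity per partition,
# so lock management still sees partition-level writes as before.
outputs = {("partition", p) for p in touched}

assert ("table", "default@t") not in outputs
assert sorted(outputs) == [("partition", "default@t@p=1/q=2"),
                           ("partition", "default@t@p=2/q=2")]
```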

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> 
>
> Key: HIVE-15048
> URL: https://issues.apache.org/jira/browse/HIVE-15048
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, 
> HIVE-15048.03.patch, HIVE-15048.04.patch
>
>
> See TestDbTxnManager2 for referenced methods
> {noformat}
> checkCmdOnDriver(driver.run("create table target (a int, b int) " +
>   "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, 
> q1 int) clustered by (a1) into 2  buckets stored as orc TBLPROPERTIES 
> ('transactional'='true')"));
> checkCmdOnDriver(driver.run("insert into target partition(p,q) values 
> (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)"));
> checkCmdOnDriver(driver.run(
>   "update source set b1 = 1 where p1 in (select t.q from target t where 
> t.p=2)"));
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> Which is clearly wrong for outputs - the target table is not even 
> partitioned(or called 'target').
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze()
> I suspect 
> update T ... where T.p IN (select d from T where ...) 
> type query would also get messed up (but not necessarily fail) if T is 
> partitioned and the subquery filters out some partitions but that does not 
> mean that the same partitions are filtered out in the parent query.





[jira] [Commented] (HIVE-15342) Add support for primary/foreign keys in HBase metastore

2016-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736104#comment-15736104
 ] 

Alan Gates commented on HIVE-15342:
---

I don't think so, as this just keeps the HBase metastore up to date with the 
RDBMS based one.

> Add support for primary/foreign keys in HBase metastore
> ---
>
> Key: HIVE-15342
> URL: https://issues.apache.org/jira/browse/HIVE-15342
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 2.2.0
>
> Attachments: HIVE-15342.patch
>
>
> When HIVE-13076 was committed the calls into the HBase metastore were stubbed 
> out.  We need to implement support for constraints in the HBase metastore.





[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved

2016-12-09 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736090#comment-15736090
 ] 

Alan Gates commented on HIVE-15048:
---

I'm not sure I understand the change here.  The previous code looks like it 
was trying to avoid locking the whole table by figuring out which partitions 
would be read and locking only those partitions.  It looks like this goes 
wrong when there's a subquery involved, but in general it should be sound.  If 
I understand your changes, you're just moving it to always use dynamic 
partitioning.  But that locks the whole table, which we don't want.

> Update/Delete statement using wrong WriteEntity when subqueries are involved
> 
>
> Key: HIVE-15048
> URL: https://issues.apache.org/jira/browse/HIVE-15048
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, 
> HIVE-15048.03.patch, HIVE-15048.04.patch
>
>
> See TestDbTxnManager2 for referenced methods
> {noformat}
> checkCmdOnDriver(driver.run("create table target (a int, b int) " +
>   "partitioned by (p int, q int) clustered by (a) into 2  buckets " +
>   "stored as orc TBLPROPERTIES ('transactional'='true')"));
> checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, 
> q1 int) clustered by (a1) into 2  buckets stored as orc TBLPROPERTIES 
> ('transactional'='true')"));
> checkCmdOnDriver(driver.run("insert into target partition(p,q) values 
> (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)"));
> checkCmdOnDriver(driver.run(
>   "update source set b1 = 1 where p1 in (select t.q from target t where 
> t.p=2)"));
> {noformat}
> The last Update stmt creates the following Entity objects in the QueryPlan
> inputs: [default@source, default@target, default@target@p=2/q=2]
> outputs: [default@target@p=2/q=2]
> Which is clearly wrong for outputs - the target table is not even 
> partitioned(or called 'target').
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze()
> I suspect 
> update T ... where T.p IN (select d from T where ...) 
> type query would also get messed up (but not necessarily fail) if T is 
> partitioned and the subquery filters out some partitions but that does not 
> mean that the same partitions are filtered out in the parent query.





[jira] [Updated] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)

2016-12-09 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-14731:

Status: Open  (was: Patch Available)

Resubmit patch for jenkins test.

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> 
>
> Key: HIVE-14731
> URL: https://issues.apache.org/jira/browse/HIVE-14731
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, 
> HIVE-14731.2.patch, HIVE-14731.3.patch, HIVE-14731.4.patch, 
> HIVE-14731.5.patch, HIVE-14731.6.patch, HIVE-14731.7.patch, 
> HIVE-14731.8.patch, HIVE-14731.9.patch
>
>
> Given that the cartesian product edge is now available in Tez (see 
> TEZ-3230), let's integrate it into Hive on Tez. This allows us to have more 
> than one reducer in cross-product queries.





[jira] [Updated] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)

2016-12-09 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-14731:

Status: Patch Available  (was: Open)

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> 
>
> Key: HIVE-14731
> URL: https://issues.apache.org/jira/browse/HIVE-14731
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, 
> HIVE-14731.2.patch, HIVE-14731.3.patch, HIVE-14731.4.patch, 
> HIVE-14731.5.patch, HIVE-14731.6.patch, HIVE-14731.7.patch, 
> HIVE-14731.8.patch, HIVE-14731.9.patch
>
>
> Given that the cartesian product edge is now available in Tez (see TEZ-3230), 
> let's integrate it into Hive on Tez. This allows cross-product queries to use 
> more than one reducer.





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736071#comment-15736071
 ] 

Sahil Takiar commented on HIVE-15385:
-

The test failures seem unrelated, and they are failing in other QA runs.

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures are propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is {{true}} and invokes the {{FileSystem}} API when 
> it is {{false}}, so the behavior is inconsistent depending on the value of 
> {{recursive}}.
> We should decide whether or not permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.
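The inconsistency described above can be sketched as follows. This is a minimal illustrative model, not Hive's actual code: the real code propagates a checked IOException from the FileSystem API, which an unchecked exception stands in for here to keep the sketch short.

```java
// Illustrative model of the two code paths: the shell-out path ignores
// failures (logs and continues), while the FileSystem-API path lets the
// failure propagate and fail the query.
class SetFullFileStatusSketch {

    // Stand-in for fs.setOwner / fs.setAcl / fs.setPermission failing.
    static void setPermissionViaApi(boolean fail) {
        if (fail) throw new RuntimeException("setPermission failed");
    }

    static boolean setFullFileStatus(boolean recursive, boolean fail) {
        if (recursive) {
            // Shell-out path ("-chgrp -R ..."): exit status is ignored, so a
            // failure is only logged and the query continues.
            try {
                setPermissionViaApi(fail);
            } catch (RuntimeException e) {
                System.err.println("WARN: failed to inherit permissions: " + e.getMessage());
            }
            return true;
        }
        // FileSystem-API path: the failure propagates and fails the query.
        setPermissionViaApi(fail);
        return true;
    }
}
```

With {{recursive=true}} a permission failure is swallowed; with {{recursive=false}} the same failure aborts the caller, which is exactly the inconsistency the patch addresses.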





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736061#comment-15736061
 ] 

Hive QA commented on HIVE-15385:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842388/HIVE-15385.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10798 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2525/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2525/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2525/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842388 - PreCommit-HIVE-Build

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures are propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is {{true}} and invokes the {{FileSystem}} API when 
> it is {{false}}, so the behavior is inconsistent depending on the value of 
> {{recursive}}.
> We should decide whether or not permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15118:

Attachment: (was: HIVE-15118.2.patch)

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch
>
>
> The COLUMNS table is no longer used, and the schemas for other databases 
> have already removed it. Remove it from the Derby schema as well.





[jira] [Issue Comment Deleted] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15118:

Comment: was deleted

(was: Patch-2: address comment to drop table for upgrade.)

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch
>
>
> The COLUMNS table is no longer used, and the schemas for other databases 
> have already removed it. Remove it from the Derby schema as well.





[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15118:

Attachment: HIVE-15118.2.patch

patch-2: address comments to remove table during upgrade.

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch
>
>
> The COLUMNS table is no longer used, and the schemas for other databases 
> have already removed it. Remove it from the Derby schema as well.





[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15118:

Attachment: HIVE-15118.2.patch

Patch-2: address comment to drop table for upgrade.

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch
>
>
> The COLUMNS table is no longer used, and the schemas for other databases 
> have already removed it. Remove it from the Derby schema as well.





[jira] [Commented] (HIVE-14798) MSCK REPAIR TABLE throws null pointer exception

2016-12-09 Thread Carl Laird (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736022#comment-15736022
 ] 

Carl Laird commented on HIVE-14798:
---

This appears to be fixed in 2.1.1.

> MSCK REPAIR TABLE throws null pointer exception
> ---
>
> Key: HIVE-14798
> URL: https://issues.apache.org/jira/browse/HIVE-14798
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Anbu Cheeralan
>
> The MSCK REPAIR TABLE statement throws a null pointer exception in Hive 2.1.
> I have tested this against external/internal tables created both in HDFS 
> and in Google Cloud.
> The error shown in the beeline/SQL client:
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Hive Logs:
> 2016-09-20T17:28:00,717 ERROR [HiveServer2-Background-Pool: Thread-92]: 
> metadata.HiveMetaStoreChecker (:()) - java.lang.NullPointerException
> 2016-09-20T17:28:00,717 WARN  [HiveServer2-Background-Pool: Thread-92]: 
> exec.DDLTask (:()) - Failed to run metacheck: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:444)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:388)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:309)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:285)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:230)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1814)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:403)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at 
> java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
> at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:432)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:418)
> ... 4 more
> Here are the steps to recreate this issue:
> use default;
> DROP TABLE IF EXISTS repairtable;
> CREATE TABLE repairtable(col STRING) PARTITIONED BY (p1 STRING, p2 STRING);
> MSCK REPAIR TABLE default.repairtable;
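The immediate cause shown in the stack trace, {{ConcurrentHashMap.putVal}} throwing NullPointerException, can be reproduced in isolation: java.util.concurrent.ConcurrentHashMap rejects null keys and null values. The sketch below is a minimal reproduction with a hypothetical key; in HiveMetaStoreChecker it indicates a null key or value reached the partition-result map.

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal reproduction of the failure mode: ConcurrentHashMap throws
// NullPointerException from putVal when given a null value (or null key).
class ConcurrentMapNullDemo {
    static boolean putNullValueThrows() {
        ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
        try {
            map.put("some/partition/path", null); // throws NullPointerException
            return false; // not reached
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Unlike HashMap, ConcurrentHashMap cannot represent "key present with null value", so any code migrating between the two must guard against nulls before calling put.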





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736021#comment-15736021
 ] 

Sahil Takiar commented on HIVE-15385:
-

Thanks Ashutosh!

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures are propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is {{true}} and invokes the {{FileSystem}} API when 
> it is {{false}}, so the behavior is inconsistent depending on the value of 
> {{recursive}}.
> We should decide whether or not permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736015#comment-15736015
 ] 

Ashutosh Chauhan commented on HIVE-15385:
-

Yeah, that was not a conscious choice to alter behavior. Sounds good to 
restore the documented behavior. Thanks for fixing this up. +1

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures are propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is {{true}} and invokes the {{FileSystem}} API when 
> it is {{false}}, so the behavior is inconsistent depending on the value of 
> {{recursive}}.
> We should decide whether or not permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735990#comment-15735990
 ] 

Sahil Takiar commented on HIVE-15385:
-

Thanks Sergio.

I believe [~ashutoshc] was working on HIVE-12988, HIVE-13716, and HIVE-13933 - 
any chance you could comment on this JIRA? To summarize, the Hive 
documentation claims that when {{hive.warehouse.subdir.inherit.perms}} is 
{{true}}, a failure to inherit permissions will not cause queries to fail; 
only a warning will be logged. It looks like the aforementioned JIRAs changed 
that by no longer catching exceptions thrown by 
{{HdfsUtils.setFullFileStatus}} - just wondering whether that was intentional.

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures are propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is {{true}} and invokes the {{FileSystem}} API when 
> it is {{false}}, so the behavior is inconsistent depending on the value of 
> {{recursive}}.
> We should decide whether or not permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl

2016-12-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735989#comment-15735989
 ] 

Ashutosh Chauhan commented on HIVE-14998:
-

+1

> Fix and update test: TestPluggableHiveSessionImpl
> -
>
> Key: HIVE-14998
> URL: https://issues.apache.org/jira/browse/HIVE-14998
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch
>
>
> This test either prints an exception to stdout ... or not; in its current 
> form it isn't really useful.





[jira] [Work started] (HIVE-14948) properly handle special characters in identifiers

2016-12-09 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-14948 started by Eugene Koifman.
-
> properly handle special characters in identifiers
> -
>
> Key: HIVE-14948
> URL: https://issues.apache.org/jira/browse/HIVE-14948
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14948.01.patch, HIVE-14948.02.patch
>
>
> The treatment of quoted identifiers in HIVE-14943 is inconsistent. We need 
> to clean this up and, if possible, quote only those identifiers that need 
> quoting in the generated SQL statement.





[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735983#comment-15735983
 ] 

Chaoyu Tang commented on HIVE-15410:


The failed tests are not related.

> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table property names can contain a period (.) or a hyphen (-); 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting such properties, instead returning the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or a 
> hyphen.
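The kind of identifier check such a fix relaxes can be sketched as below. The class, method names, and exact patterns are assumptions for illustration, not WebHCat's actual code: a word-characters-only pattern rejects {{prop.key1}}, while extending the character class with '.' and '-' accepts it.

```java
import java.util.regex.Pattern;

// Hypothetical identifier validation: OLD mimics a check that only accepts
// word characters, NEW extends it to also accept '.' and '-'.
class PropertyNameCheck {
    static final Pattern OLD = Pattern.compile("^\\w+$");
    static final Pattern NEW = Pattern.compile("^[\\w.-]+$");

    static boolean oldAccepts(String name) { return OLD.matcher(name).matches(); }
    static boolean newAccepts(String name) { return NEW.matcher(name).matches(); }
}
```

Note that inside a character class the '.' is literal and a trailing '-' needs no escaping, so the relaxed pattern accepts names such as auto.purge without allowing arbitrary characters.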





[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735970#comment-15735970
 ] 

Hive QA commented on HIVE-15410:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842591/HIVE-15410.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10792 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2524/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2524/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2524/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842591 - PreCommit-HIVE-Build

> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table property names can contain a period (.) or a hyphen (-); 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting such properties, instead returning the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or a 
> hyphen.





[jira] [Commented] (HIVE-15376) Improve heartbeater scheduling for transactions

2016-12-09 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735968#comment-15735968
 ] 

Eugene Koifman commented on HIVE-15376:
---

How is the heartbeater going to be started for read-only queries?


> Improve heartbeater scheduling for transactions
> ---
>
> Key: HIVE-15376
> URL: https://issues.apache.org/jira/browse/HIVE-15376
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-15376.1.patch, HIVE-15376.2.patch, 
> HIVE-15376.3.patch, HIVE-15376.4.patch
>
>






[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views

2016-12-09 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735938#comment-15735938
 ] 

Jesus Camacho Rodriguez commented on HIVE-14496:


[~ashutoshc], let's try to get this one checked in. I have addressed the most 
important comments in the last patch. In particular:
- Loading all materialized view definitions for all users when HS2 starts 
(instead of per session).
- Adding just an additional field for rewrite enabled (instead of creating a 
'view descriptor'). This greatly simplified the changes in the metastore 
upgrade scripts.

I left for a follow-up:
- Extension of rules to match new patterns.

> Enable Calcite rewriting with materialized views
> 
>
> Key: HIVE-14496
> URL: https://issues.apache.org/jira/browse/HIVE-14496
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, 
> HIVE-14496.03.patch, HIVE-14496.04.patch, HIVE-14496.patch
>
>
> Calcite already supports query rewriting using materialized views. We will 
> use it to support this feature in Hive.
> In order to do that, we need to register the existing materialized views 
> with the Calcite view service and enable the materialized-view rewriting 
> rules. 
> We should include a HiveConf flag to completely disable query rewriting 
> using materialized views if necessary.





[jira] [Updated] (HIVE-14496) Enable Calcite rewriting with materialized views

2016-12-09 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14496:
---
Attachment: HIVE-14496.04.patch

> Enable Calcite rewriting with materialized views
> 
>
> Key: HIVE-14496
> URL: https://issues.apache.org/jira/browse/HIVE-14496
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, 
> HIVE-14496.03.patch, HIVE-14496.04.patch, HIVE-14496.patch
>
>
> Calcite already supports query rewriting using materialized views. We will 
> use it to support this feature in Hive.
> In order to do that, we need to register the existing materialized views 
> with the Calcite view service and enable the materialized-view rewriting 
> rules. 
> We should include a HiveConf flag to completely disable query rewriting 
> using materialized views if necessary.





[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree

2016-12-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735927#comment-15735927
 ] 

Sergey Shelukhin commented on HIVE-15405:
-

+1 except for tests; is stats_based_fetch_decision new?

> Improve FileUtils.isPathWithinSubtree
> -
>
> Key: HIVE-15405
> URL: https://issues.apache.org/jira/browse/HIVE-15405
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, 
> HIVE-15405.2.patch
>
>
> When running the following query multiple times on single-node LLAP (the 
> flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} 
> became a hot path. 
> {noformat}
> SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = 
> `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} 
> based on a path-depth comparison.
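The proposed early exit can be sketched as follows. The method and helper names are assumptions, not the actual patch; real Hive operates on org.apache.hadoop.fs.Path, which is modeled here with plain strings.

```java
// Illustrative sketch of the proposed optimization: a path with fewer
// components than the candidate ancestor can never lie within that subtree,
// so the depth comparison lets us bail out before any string walking.
class PathWithinSubtree {
    // Count path components by counting separators.
    static int depth(String path) {
        int d = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') d++;
        }
        return d;
    }

    static boolean isPathWithinSubtree(String path, String subtree) {
        // Early exit based on path depth.
        if (depth(path) < depth(subtree)) {
            return false;
        }
        // Fall back to a component-aligned prefix check.
        String prefix = subtree.endsWith("/") ? subtree : subtree + "/";
        return path.equals(subtree) || path.startsWith(prefix);
    }
}
```

With thousands of partition paths checked against the same table root, the cheap integer comparison short-circuits most of the negative cases.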





[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735923#comment-15735923
 ] 

Chaoyu Tang commented on HIVE-15410:


[~daijy], [~thejas] you have done quite some work on WebHCat; could you 
review the patch? Thanks


> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table properties can have a period (.) or hyphen (-) in their names; 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting these properties; they throw the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or 
> hyphen.
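The relaxed identifier check can be sketched as follows; the pattern and method name are illustrative assumptions, not the actual WebHCat code:

```java
import java.util.regex.Pattern;

public class PropertyNameCheck {
    // Relaxed identifier pattern: a word character followed by word
    // characters, '.' or '-', so names like "auto.purge" and "prop-key"
    // are accepted while clearly malformed names are still rejected.
    private static final Pattern PROPERTY_NAME = Pattern.compile("\\w[\\w.\\-]*");

    static boolean isValidPropertyName(String name) {
        return name != null && PROPERTY_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPropertyName("auto.purge")); // true
        System.out.println(isValidPropertyName("prop-key"));   // true
        System.out.println(isValidPropertyName("bad name"));   // false: contains a space
    }
}
```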





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735908#comment-15735908
 ] 

Sergio Peña commented on HIVE-15385:


I deleted the HiveQA comment and allow retriggering the Jenkins job.

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failure is propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is set to {{true}}, and when it is false it invokes 
> the {{FileSystem}} API. So the behavior is inconsistent depending on the 
> value of {{recursive}}.
> We should decide whether permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.
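If the warn-and-continue policy of the shell-out path is the desired behavior, the non-recursive path could wrap the {{FileSystem}} calls the same way. A minimal sketch, assuming that policy; the interface and names are illustrative, not Hive's API:

```java
import java.io.IOException;

public class BestEffortStatus {
    // Stand-in for any of the fs.setOwner / fs.setAcl / fs.setPermission calls.
    interface StatusSetter {
        void apply() throws IOException;
    }

    // Best-effort wrapper: mirror the shell-out behavior by logging a warning
    // on failure instead of letting the IOException fail the query.
    static boolean trySetStatus(StatusSetter setter) {
        try {
            setter.apply();
            return true;
        } catch (IOException e) {
            System.err.println("WARN: unable to inherit permissions: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = trySetStatus(() -> {
            throw new IOException("permission denied");
        });
        System.out.println(ok); // false, but no exception escapes
    }
}
```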





[jira] [Issue Comment Deleted] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15385:
---
Comment: was deleted

(was: 

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842388/HIVE-15385.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10745 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=126)

[ppd_transform.q,auto_join9.q,auto_join1.q,vector_data_types.q,input14.q,union30.q,input12.q,union_remove_22.q,vectorization_3.q,groupby1_map_nomap.q,cbo_union.q,disable_merge_for_bucketing.q,reduce_deduplicate_exclude_join.q,filter_join_breaktask2.q,join30.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2498/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2498/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2498/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842388 - PreCommit-HIVE-Build)

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failure is propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is set to {{true}}, and when it is false it invokes 
> the {{FileSystem}} API. So the behavior is inconsistent depending on the 
> value of {{recursive}}.
> We should decide whether permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735896#comment-15735896
 ] 

Sergio Peña commented on HIVE-15385:


Thanks [~stakiar] for the explanation. I spent some time investigating the 
history of this permission issue, and as you mentioned, the places where the 
IOException is not ignored could have been by accident.

Given the history that permission failures should not throw an exception and 
make the query fail, I agree on just sending a warning to the log inside 
setFullFileStatus() instead of throwing an exception. The current behavior is 
confusing for people who want to use the method.

+1

I don't think the test failures are related, but could you re-attach the patch 
to see if the others stop failing?

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failure is propagated 
> up to the caller, and the query fails.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is set to {{true}}, and when it is false it invokes 
> the {{FileSystem}} API. So the behavior is inconsistent depending on the 
> value of {{recursive}}.
> We should decide whether permission inheritance failures should fail 
> queries, and then ensure the code consistently follows that decision.





[jira] [Updated] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15410:
---
Attachment: HIVE-15410.patch

> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table properties can have a period (.) or hyphen (-) in their names; 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting these properties; they throw the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or 
> hyphen.





[jira] [Updated] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen

2016-12-09 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-15410:
---
Status: Patch Available  (was: Open)

> WebHCat supports get/set table property with its name containing period and 
> hyphen
> --
>
> Key: HIVE-15410
> URL: https://issues.apache.org/jira/browse/HIVE-15410
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-15410.patch
>
>
> Hive table properties can have a period (.) or hyphen (-) in their names; 
> auto.purge is one example. But the WebHCat APIs support neither setting nor 
> getting these properties; they throw the error message "Invalid DDL 
> identifier :property". For example:
> {code}
> [root@ctang-1 ~]# curl -s 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser'
> {"error":"Invalid DDL identifier :property"}
> [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ 
> "value": "true" }' 
> 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/'
> {"error":"Invalid DDL identifier :property"}
> {code}
> This patch adds support for property names containing a period and/or 
> hyphen.





[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735842#comment-15735842
 ] 

Hive QA commented on HIVE-14496:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842587/HIVE-14496.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2523/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2523/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2523/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/util/Tool.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/conf/Configurable.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/ClassNotFoundException.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/curator/curator-framework/2.7.1/curator-framework-2.7.1.jar(org/apache/curator/framework/CuratorFrameworkFactory.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/curator/curator-client/2.7.1/curator-client-2.7.1.jar(org/apache/curator/retry/ExponentialBackoffRetry.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.2/hadoop-mapreduce-client-core-2.7.2.jar(org/apache/hadoop/mapreduce/Mapper.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Iterator.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/LinkedList.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/ExecutorService.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/Executors.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/TimeUnit.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.2/hadoop-mapreduce-client-core-2.7.2.jar(org/apache/hadoop/mapreduce/Mapper$Context.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/net/URLDecoder.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Enumeration.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Properties.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/UriBuilder.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-2.2.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/LogUtils.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Class.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Annotation.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-annotations/2.7.2/hadoop-annotations-2.7.2.jar(org/apache/hadoop/classification/InterfaceAudience.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-annotations/2.7.2/hadoop-annotations-2.7.2.jar(org/apache/hadoop/classification/InterfaceAudience$LimitedPrivate.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Retention.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/RetentionPolicy.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Target.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/ElementType.class)]]
[loading 
ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/HttpMethod.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/SuppressWarnings.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Override.class)]]
[loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(sun/misc/Contended.class)]]
[loading 

[jira] [Updated] (HIVE-14496) Enable Calcite rewriting with materialized views

2016-12-09 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14496:
---
Attachment: HIVE-14496.03.patch

> Enable Calcite rewriting with materialized views
> 
>
> Key: HIVE-14496
> URL: https://issues.apache.org/jira/browse/HIVE-14496
> Project: Hive
>  Issue Type: Sub-task
>  Components: Materialized views
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, 
> HIVE-14496.03.patch, HIVE-14496.patch
>
>
> Calcite already supports query rewriting using materialized views. We will 
> use it to support this feature in Hive.
> In order to do that, we need to register the existing materialized views with 
> Calcite view service and enable the materialized views rewriting rules. 
> We should include a HiveConf flag to completely disable query rewriting using 
> materialized views if necessary.





[jira] [Commented] (HIVE-15353) Metastore throws NPE if StorageDescriptor.cols is null

2016-12-09 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735806#comment-15735806
 ] 

Anthony Hsu commented on HIVE-15353:


HIVE-15353.3.patch seems to have been tested; the results just weren't 
auto-posted to this JIRA: 
https://builds.apache.org/job/PreCommit-HIVE-Build/2502/console. Looks like the 
PreCommit build is currently failing due to:
{noformat}
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] No compiler is provided in this environment. Perhaps you are running on 
a JRE rather than a JDK?
{noformat}

> Metastore throws NPE if StorageDescriptor.cols is null
> --
>
> Key: HIVE-15353
> URL: https://issues.apache.org/jira/browse/HIVE-15353
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-15353.1.patch, HIVE-15353.2.patch, 
> HIVE-15353.3.patch
>
>
> When using the HiveMetaStoreClient API directly to talk to the metastore, you 
> get NullPointerExceptions when StorageDescriptor.cols is null in the 
> Table/Partition object in the following calls:
> * create_table
> * alter_table
> * alter_partition
> Calling add_partition with StorageDescriptor.cols set to null causes null to 
> be stored in the metastore database and subsequent calls to alter_partition 
> for that partition to fail with an NPE.
> Null checks should be added to eliminate the NPEs in the metastore.
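The defensive pattern amounts to treating a null column list as empty. A minimal sketch; the nested class below is a stand-in for the Thrift-generated StorageDescriptor, not the real one:

```java
import java.util.Collections;
import java.util.List;

public class SdColsGuard {
    // Minimal stand-in for the Thrift StorageDescriptor; only the field we need.
    static class StorageDescriptor {
        List<String> cols; // may legitimately arrive as null from a client
    }

    // Defensive accessor: treat a null (or missing) cols list as empty so
    // metastore code can iterate over it without a NullPointerException.
    static List<String> colsOrEmpty(StorageDescriptor sd) {
        return (sd == null || sd.cols == null) ? Collections.emptyList() : sd.cols;
    }

    public static void main(String[] args) {
        StorageDescriptor sd = new StorageDescriptor(); // cols left null
        System.out.println(colsOrEmpty(sd).size()); // 0, no NPE
    }
}
```

For the add_partition case the same guard would also prevent null from ever being persisted to the metastore database.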





[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735685#comment-15735685
 ] 

Hive QA commented on HIVE-15053:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842555/HIVE-15053.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10760 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a]
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=92)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2522/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2522/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2522/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842555 - PreCommit-HIVE-Build

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch
>
>
> There is classpath-scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (though I am not 
> entirely sure).
> If it is indeed looking for JDBC drivers, then it can probably be removed 
> without any issues, because modern JDBC drivers advertise themselves as a 
> service-loadable class for {{java.sql.Driver}}:
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> Auto-Loading of JDBC Driver
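The JDBC 4 mechanism referred to above is plain {{java.util.ServiceLoader}} discovery; a sketch (which drivers are listed depends entirely on the runtime classpath):

```java
import java.sql.Driver;
import java.util.ServiceLoader;

public class DriverDiscovery {
    public static void main(String[] args) {
        // JDBC 4 drivers ship a META-INF/services/java.sql.Driver entry, so
        // DriverManager (which uses ServiceLoader underneath) can find them
        // without any generic classpath scanning.
        for (Driver d : ServiceLoader.load(Driver.class)) {
            System.out.println(d.getClass().getName());
        }
    }
}
```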





[jira] [Commented] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-12-09 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735657#comment-15735657
 ] 

Naveen Gangam commented on HIVE-15118:
--

[~aihuaxu] The main schema file changes look good, but they will not work for 
upgrade scenarios. When you upgrade from Hive 2.1, this table will still exist. 
Can you also add changes for the upgrade scenario? Thanks

> Remove unused 'COLUMNS' table from derby schema
> ---
>
> Key: HIVE-15118
> URL: https://issues.apache.org/jira/browse/HIVE-15118
> Project: Hive
>  Issue Type: Sub-task
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HIVE-15118.1.patch
>
>
> The COLUMNS table is no longer used, and the other database schemas have 
> already removed it. Remove it from Derby as well.





[jira] [Updated] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15392:

Attachment: HIVE-15392.2.patch

Patch 2: minor change (a string output fix) made during patch submission.

> Refactoring the validate function of HiveSchemaTool to make the output 
> consistent
> -
>
> Key: HIVE-15392
> URL: https://issues.apache.org/jira/browse/HIVE-15392
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15392.1.patch, HIVE-15392.2.patch
>
>
> The validate output is not consistent. Make it more consistent.
> {noformat}
> Starting metastore validationValidating schema version
> Succeeded in schema version validation.
> Validating sequence number for SEQUENCE_TABLE
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Validating tables in the schema for version 2.2.0
> Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 
> tables
> Schema table validation successful
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Validating columns for incorrect NULL values
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Done with metastore validationschemaTool completed
> {noformat}





[jira] [Updated] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent

2016-12-09 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15392:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Chaoyu for reviewing.

> Refactoring the validate function of HiveSchemaTool to make the output 
> consistent
> -
>
> Key: HIVE-15392
> URL: https://issues.apache.org/jira/browse/HIVE-15392
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15392.1.patch
>
>
> The validate output is not consistent. Make it more consistent.
> {noformat}
> Starting metastore validationValidating schema version
> Succeeded in schema version validation.
> Validating sequence number for SEQUENCE_TABLE
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Validating tables in the schema for version 2.2.0
> Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 
> tables
> Schema table validation successful
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Validating columns for incorrect NULL values
> Metastore connection URL:  
> jdbc:derby:;databaseName=metastore_db;create=true
> Metastore Connection Driver :  org.apache.derby.jdbc.EmbeddedDriver
> Metastore connection User: APP
> Done with metastore validationschemaTool completed
> {noformat}





[jira] [Commented] (HIVE-15391) Location validation for table should ignore the values for view.

2016-12-09 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735536#comment-15735536
 ] 

Yongzhi Chen commented on HIVE-15391:
-

The failures are not related.

> Location validation for table should ignore the values for view.
> 
>
> Key: HIVE-15391
> URL: https://issues.apache.org/jira/browse/HIVE-15391
> Project: Hive
>  Issue Type: Sub-task
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Minor
> Attachments: HIVE-15206.1.patch
>
>
> When using schematool to do location validation, we get error messages for 
> views, for example:
> {noformat}
> n DB with Name: viewa
> NULL Location for TABLE with Name: viewa
> In DB with Name: viewa
> NULL Location for TABLE with Name: viewb
> In DB with Name: viewa
> {noformat}





[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735538#comment-15735538
 ] 

Hive QA commented on HIVE-14998:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842552/HIVE-14998.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10789 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] 
(batchId=91)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=228)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2521/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2521/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2521/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842552 - PreCommit-HIVE-Build

> Fix and update test: TestPluggableHiveSessionImpl
> -
>
> Key: HIVE-14998
> URL: https://issues.apache.org/jira/browse/HIVE-14998
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch
>
>
> this test either prints an exception to stdout ... or not - in its 
> current form it isn't really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735429#comment-15735429
 ] 

Hive QA commented on HIVE-15161:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842551/HIVE-15161.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10783 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=93)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2520/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2520/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2520/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842551 - PreCommit-HIVE-Build

> migrate ColumnStats to use jackson
> --
>
> Key: HIVE-15161
> URL: https://issues.apache.org/jira/browse/HIVE-15161
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 2.2.0
>
> Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, 
> HIVE-15161.3.patch, HIVE-15161.4.patch, HIVE-15161.4.patch
>
>
> * json.org has license issues
> * jackson can provide a fully compatible alternative to it
> * there are a few flakiness issues caused by the order of the map entries of 
> the columns... this can be addressed; the org.json api was unfriendly in this 
> manner ;)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-15053:

Attachment: HIVE-15053.2.patch

rebased patch.. I hope nothing is broken ;)
patch #2 here is revision #1 on reviewboard ;)
https://reviews.apache.org/r/54585/

> Beeline#addlocaldriver - reduce classpath scanning
> --
>
> Key: HIVE-15053
> URL: https://issues.apache.org/jira/browse/HIVE-15053
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, 
> HIVE-15053.1.patch, HIVE-15053.2.patch
>
>
> There is classpath scanning machinery inside {{ClassNameCompleter}}.
> I think its sole purpose is to scan for JDBC drivers (but I'm not entirely 
> sure).
> If it is indeed looking for JDBC drivers, then it can probably be removed 
> without any issues, because modern JDBC drivers advertise themselves as 
> service-loadable classes for {{java.sql.Driver}}:
> http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html
> (Auto-Loading of JDBC Driver)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735303#comment-15735303
 ] 

Hive QA commented on HIVE-15405:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842517/HIVE-15405.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10792 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=132)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2519/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2519/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842517 - PreCommit-HIVE-Build

> Improve FileUtils.isPathWithinSubtree
> -
>
> Key: HIVE-15405
> URL: https://issues.apache.org/jira/browse/HIVE-15405
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, 
> HIVE-15405.2.patch
>
>
> When running single node LLAP with the following query multiple number of 
> times (flights table had 7000+ partitions) {{FileUtils.isPathWithinSubtree}} 
> became a hotpath. 
> {noformat}
> SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`,
> YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok`
> FROM `flights` as `flights`
> JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`)
> JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`)
> JOIN `airports` as `dest_airport` ON (`flights`.`dest` = 
> `dest_airport`.`iata`)
> GROUP BY YEAR(`flights`.`dateofflight`);
> {noformat}
> It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} 
> based on a path depth comparison.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14998:

Attachment: HIVE-14998.2.patch

rebased patch to current master

patch #2 uploaded to reviewboard as revision #1: 
https://reviews.apache.org/r/54584

> Fix and update test: TestPluggableHiveSessionImpl
> -
>
> Key: HIVE-14998
> URL: https://issues.apache.org/jira/browse/HIVE-14998
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch
>
>
> this test either prints an exception to stdout ... or not - in its 
> current form it isn't really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13306) Better Decimal vectorization

2016-12-09 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-13306:
--
Description: 
Decimal Vectorization Requirements

•   Today, the LongColumnVector, DoubleColumnVector, BytesColumnVector, 
TimestampColumnVector classes store the data as primitive Java data types long, 
double, or byte arrays for efficiency.
•   DecimalColumnVector is different - it has an array of Object references 
to HiveDecimal objects.
•   The HiveDecimal object uses an internal object BigDecimal for its 
implementation.  Further, BigDecimal itself uses an internal object BigInteger 
for its implementation, and BigInteger uses an int array.  4 objects total.
•   And, HiveDecimal is an immutable object, which means arithmetic and 
other operations produce a new HiveDecimal object with 3 new objects underneath.
•   A major reason vectorization is fast is that the ColumnVector classes, 
except DecimalColumnVector, do not have to allocate additional memory per row. 
This avoids the memory fragmentation and pressure on the Java garbage collector 
that DecimalColumnVector can generate.  It is very significant.
•   What can be done with DecimalColumnVector to make it much more 
efficient?
o   Design several new decimal classes that allow the caller to manage the 
decimal storage.
o   If it takes 2 long values to store a decimal then a new 
DecimalColumnVector would have a long[] of length 2*1024 (where 1024 is the 
default column vector size).
o   Why store a decimal in separate long values?
•   Java does not support 128 bit integers.
•   Java does not support unsigned integers.
•   Int array representation uses smaller memory, but long array 
representation covers wider value range for fast primitive operations.
•   But since Java does not have unsigned types, multiplications can only use 
N-1 bits, i.e. 63 bits per long.
•   So, 2 longs are needed for decimal storage of 38 digits.

Future works
o   It makes sense to have just one algorithm for decimals rather than one 
for HiveDecimal and another for DecimalColumnVector.  So, make HiveDecimal 
store 2 long values, too.
o   A lower level primitive decimal class would accept decimals stored as 
long arrays and produces results into long arrays.  It would be used by 
HiveDecimal and DecimalColumnVector.

  was:
Decimal Vectorization Requirements

•   Today, the LongColumnVector, DoubleColumnVector, BytesColumnVector, 
TimestampColumnVector classes store the data as primitive Java data types long, 
double, or byte arrays for efficiency.
•   DecimalColumnVector is different - it has an array of Object references 
to HiveDecimal objects.
•   The HiveDecimal object uses an internal object BigDecimal for its 
implementation.  Further, BigDecimal itself uses an internal object BigInteger 
for its implementation, and BigInteger uses an int array.  4 objects total.
•   And, HiveDecimal is an immutable object which means arithmetic and 
other operations produce new HiveDecimal object with 3 new objects underneath.
•   A major reason Vectorization is fast is the ColumnVector classes except 
DecimalColumnVector do not have to allocate additional memory per row.   This 
avoids memory fragmentation and pressure on the Java Garbage Collector that 
DecimalColumnVector can generate.  It is very significant.
•   What can be done with DecimalColumnVector to make it much more 
efficient?
o   Design several new decimal classes that allow the caller to manage the 
decimal storage.
o   If it takes N int values to store a decimal (e.g. N=1..5), then a new 
DecimalColumnVector would have an int[] of length N*1024 (where 1024 is the 
default column vector size).
o   Why store a decimal in separate int values?
•   Java does not support 128 bit integers.
•   Java does not support unsigned integers.
•   In order to do multiplication of a decimal represented in a long you 
need twice the storage (i.e. 128 bits).  So you need to represent parts in 32 
bit integers.
•   But really since we do not have unsigned, really you can only do 
multiplications on N-1 bits or 31 bits.
•   So, 5 ints are needed for decimal storage... of 38 digits.
o   It makes sense to have just one algorithm for decimals rather than one 
for HiveDecimal and another for DecimalColumnVector.  So, make HiveDecimal 
store N int values, too.
o   A lower level primitive decimal class would accept decimals stored as 
int arrays and produces results into int arrays.  It would be used by 
HiveDecimal and DecimalColumnVector.
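The flat two-longs-per-row layout from the updated description above could look roughly like this (a hypothetical class for illustration, not Hive's actual DecimalColumnVector): one {{long[]}} holds the whole batch, so no per-row objects are allocated and garbage-collector pressure stays flat.

```java
// Hypothetical sketch of a column vector storing each 128-bit decimal value
// as two adjacent 64-bit words in a single flat array.
public class Decimal128ColumnVector {
    public static final int DEFAULT_SIZE = 1024; // default column vector size

    // Row i occupies slots [2*i] (high 64 bits) and [2*i + 1] (low 64 bits).
    public final long[] words;

    public Decimal128ColumnVector(int size) {
        this.words = new long[2 * size]; // one allocation for the whole batch
    }

    public void set(int row, long high, long low) {
        words[2 * row] = high;
        words[2 * row + 1] = low;
    }

    public long high(int row) { return words[2 * row]; }
    public long low(int row)  { return words[2 * row + 1]; }
}
```

Arithmetic kernels would then read and write these word pairs in place, instead of allocating a fresh HiveDecimal (plus its BigDecimal/BigInteger internals) per row.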



> Better Decimal vectorization
> 
>
> Key: HIVE-13306
> URL: https://issues.apache.org/jira/browse/HIVE-13306
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>  

[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-15161:

Attachment: HIVE-15161.4.patch

re-uploaded patch (a whole month has passed)

> migrate ColumnStats to use jackson
> --
>
> Key: HIVE-15161
> URL: https://issues.apache.org/jira/browse/HIVE-15161
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 2.2.0
>
> Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, 
> HIVE-15161.3.patch, HIVE-15161.4.patch, HIVE-15161.4.patch
>
>
> * json.org has license issues
> * jackson can provide a fully compatible alternative to it
> * there are a few flakiness issues caused by the order of the map entries of 
> the columns... this can be addressed; the org.json api was unfriendly in this 
> manner ;)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15407) add distcp to classpath by default, because hive depends on it.

2016-12-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735210#comment-15735210
 ] 

Hive QA commented on HIVE-15407:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12842515/HIVE-15407.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10792 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=91)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=91)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2518/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2518/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12842515 - PreCommit-HIVE-Build

> add distcp to classpath by default, because hive depends on it. 
> 
>
> Key: HIVE-15407
> URL: https://issues.apache.org/jira/browse/HIVE-15407
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, CLI
>Affects Versions: 2.2.0
>Reporter: Fei Hui
>Assignee: Fei Hui
> Attachments: HIVE-15407.1.patch
>
>
> When I run Hive queries, I get errors such as:
> java.lang.NoClassDefFoundError: org/apache/hadoop/tools/DistCpOptions
> ...
> I dug into the code and found that Hive depends on DistCp, but DistCp is not 
> on the classpath by default.
> I considered adding DistCp to the Hadoop classpath by default in the Hadoop 
> project, but the Hadoop committers will not do that (see the discussion in 
> HADOOP-13865); they propose resolving this problem in Hive.
> So this patch adds DistCp to the classpath in Hive.
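As a rough illustration of the workaround the issue describes (paths and the jar version below are assumptions for the sketch; the actual patch changes Hive's own launch scripts instead), the DistCp jar that Hadoop ships under its tools directory can be appended to the classpath Hive picks up:

```shell
# Hypothetical sketch: put Hadoop's DistCp jar on HADOOP_CLASSPATH so Hive
# can load org.apache.hadoop.tools.DistCpOptions. Location and version are
# assumed, not taken from the patch.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
DISTCP_JAR="${HADOOP_HOME}/share/hadoop/tools/lib/hadoop-distcp-2.7.3.jar"
# Append, preserving any classpath entries that are already set.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}${DISTCP_JAR}"
echo "${HADOOP_CLASSPATH}"
```

The `${VAR:+...}` expansion only emits the separating colon when `HADOOP_CLASSPATH` was already non-empty, avoiding a stray leading `:` in the classpath.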



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-15182:

Release Note:   (was: After I've started working on this it turned out, 
that the problem can't be addressed like this...

The reason behind these "code too large" issues are that antlr generates a 
bunch of things to try to prevail in "hard-to-decide which rule will match this 
one" situations. )

> Move 'clause' rules from IdentifierParser to a different file
> -
>
> Key: HIVE-15182
> URL: https://issues.apache.org/jira/browse/HIVE-15182
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> I'm hitting ANTLR "code too large" errors, and these rules belong to a 
> different class than the others.
> Moving them to a separate file greatly reduces the generated IdentifierParser 
> size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file

2016-12-09 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735205#comment-15735205
 ] 

Zoltan Haindrich commented on HIVE-15182:
-

After I started working on this, it turned out that the problem can't be 
addressed like this...

The reason behind these "code too large" issues is that ANTLR generates a 
bunch of extra code to handle "hard-to-decide which rule will match this 
one" situations. 

> Move 'clause' rules from IdentifierParser to a different file
> -
>
> Key: HIVE-15182
> URL: https://issues.apache.org/jira/browse/HIVE-15182
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> I'm hitting ANTLR "code too large" errors, and these rules belong to a 
> different class than the others.
> Moving them to a separate file greatly reduces the generated IdentifierParser 
> size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-15182.
-
  Resolution: Fixed
Release Note: 
After I started working on this, it turned out that the problem can't be 
addressed like this...

The reason behind these "code too large" issues is that ANTLR generates a 
bunch of extra code to handle "hard-to-decide which rule will match this 
one" situations. 

> Move 'clause' rules from IdentifierParser to a different file
> -
>
> Key: HIVE-15182
> URL: https://issues.apache.org/jira/browse/HIVE-15182
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> I'm hitting ANTLR "code too large" errors, and these rules belong to a 
> different class than the others.
> Moving them to a separate file greatly reduces the generated IdentifierParser 
> size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-1478) Non-boolean expression in WHERE should be rejected

2016-12-09 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-1478:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Non-boolean expression in WHERE should be rejected
> --
>
> Key: HIVE-1478
> URL: https://issues.apache.org/jira/browse/HIVE-1478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Paul Yang
>Assignee: Zoltan Haindrich
>Priority: Minor
> Attachments: HIVE-1478.1.patch, HIVE-1478.2.patch
>
>
> Automatically casting strings or other types to boolean may confuse the 
> user - and it doesn't always work (HIVE-15089).
> SQL:2011 states that "where expression" should accept a boolean expression.
> Original reported problem:
> If the expression in the where clause does not evaluate to a boolean, the job 
> will fail with the following exception in the task logs:
> Query:
> SELECT key FROM src WHERE 1;
> Exception in mapper:
> 2010-07-21 17:00:31,460 FATAL ExecMapper: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"key":"238","value":"val_238"}
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:417)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:180)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>   at org.apache.hadoop.mapred.Child.main(Child.java:159)
> Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> java.lang.Boolean
>   at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:45)
>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:400)
>   ... 5 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

