[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries
[ https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15192: --- Attachment: HIVE-15192.6.patch Updated subquery remove rule and de-correlation to support NOT IN queries and fixed a bunch of other issues. > Use Calcite to de-correlate and plan subqueries > --- > > Key: HIVE-15192 > URL: https://issues.apache.org/jira/browse/HIVE-15192 > Project: Hive > Issue Type: Task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, > HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch > > > Currently support of subqueries is limited [Link to original spec | > https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf]. > Using Calcite to plan and de-correlate subqueries will help Hive get rid of > these limitations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
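The tricky part of the NOT IN support mentioned in the patch note above is SQL's three-valued logic. A minimal Python sketch of the required semantics (illustrative only, not Hive or Calcite code):

```python
# Illustrative sketch: if the subquery result contains a NULL,
# "x NOT IN (subquery)" is never true for any outer row, which is why
# the rewrite cannot be a plain anti-join.
def not_in(outer_rows, subquery_values):
    if any(v is None for v in subquery_values):
        return []  # a NULL in the subquery makes every comparison unknown
    keep = set(subquery_values)
    return [r for r in outer_rows if r not in keep]
```

For example, `not_in([1, 2, 3], [2])` keeps `[1, 3]`, while `not_in([1, 2, 3], [2, None])` keeps nothing; a null-aware plan has to preserve exactly this distinction.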
[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries
[ https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15192: --- Status: Patch Available (was: Open) > Use Calcite to de-correlate and plan subqueries > --- > > Key: HIVE-15192 > URL: https://issues.apache.org/jira/browse/HIVE-15192 > Project: Hive > Issue Type: Task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, > HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch > > > Currently support of subqueries is limited [Link to original spec | > https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf]. > Using Calcite to plan and de-correlate subqueries will help Hive get rid of > these limitations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15192) Use Calcite to de-correlate and plan subqueries
[ https://issues.apache.org/jira/browse/HIVE-15192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-15192: --- Status: Open (was: Patch Available) > Use Calcite to de-correlate and plan subqueries > --- > > Key: HIVE-15192 > URL: https://issues.apache.org/jira/browse/HIVE-15192 > Project: Hive > Issue Type: Task > Components: Logical Optimizer >Reporter: Vineet Garg >Assignee: Vineet Garg > Labels: sub-query > Attachments: HIVE-15192.2.patch, HIVE-15192.3.patch, > HIVE-15192.4.patch, HIVE-15192.5.patch, HIVE-15192.6.patch, HIVE-15192.patch > > > Currently support of subqueries is limited [Link to original spec | > https://issues.apache.org/jira/secure/attachment/12614003/SubQuerySpec.pdf]. > Using Calcite to plan and de-correlate subqueries will help Hive get rid of > these limitations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15413) Primary key constraints forced to be unique across database and table names
[ https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737109#comment-15737109 ] Sergey Shelukhin commented on HIVE-15413: - I'm pretty sure most RDBMSes require a unique constraint name within one database. > Primary key constraints forced to be unique across database and table names > --- > > Key: HIVE-15413 > URL: https://issues.apache.org/jira/browse/HIVE-15413 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0 >Reporter: Alan Gates >Priority: Critical > > In the RDBMS underlying the metastore the table that stores primary and > foreign keys has its own primary key (at the RDBMS level) of > (constraint_name, position). This means that a constraint name must be > unique across all tables and databases in a system. This is not reasonable. > Database and table name should be included in the RDBMS primary key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
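The fix the description asks for can be sketched with sqlite3. The table and column names below are hypothetical, not the actual metastore schema:

```python
# Illustrative sketch: with a primary key of (constraint_name, position)
# alone, two Hive tables could not reuse the same constraint name.
# Widening the key to include db/table names (as the issue proposes)
# makes the name unique only per table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE key_constraints (
    db_name TEXT, tbl_name TEXT, constraint_name TEXT, position INTEGER,
    PRIMARY KEY (db_name, tbl_name, constraint_name, position))""")
conn.execute("INSERT INTO key_constraints VALUES ('db1', 't1', 'pk', 0)")
# The same constraint name on a different table no longer collides:
conn.execute("INSERT INTO key_constraints VALUES ('db1', 't2', 'pk', 0)")
```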
[jira] [Updated] (HIVE-15147) LLAP: use LLAP cache for non-columnar formats in a somewhat general way
[ https://issues.apache.org/jira/browse/HIVE-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15147: Attachment: HIVE-15147.01.WIP.noout.patch 2nd WIP patch with almost-proper cache support. Not sure if the cache actually works (need to check), but it does pass the test. Also rebased on top of HIVE-14453 that was committed recently. The cleanup thread for cache-evicted buffers is not implemented, but it's a simple change. Next steps: 0) Implementing cache cleanup of evicted buffers. 1) Separating the file into slices instead of caching it at split granularity (the read-time support for slices is already there, need write-time support, so to speak). 2) The above, minus the hack of LlapTextInputFormat claiming to be vectorized. 3) Cleanup, logging, more testing, etc. > LLAP: use LLAP cache for non-columnar formats in a somewhat general way > --- > > Key: HIVE-15147 > URL: https://issues.apache.org/jira/browse/HIVE-15147 > Project: Hive > Issue Type: New Feature >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15147.01.WIP.noout.patch, HIVE-15147.WIP.noout.patch > > > The primary goal for the first pass is caching text files. Nothing would > prevent other formats from using the same path, in principle, although, as > was originally done with ORC, it may be better to have native caching support > optimized for each particular format. > Given that caching pure text is not smart, and we already have ORC-encoded > cache that is columnar due to ORC file structure, we will transform data into > columnar ORC. > The general idea is to treat all the data in the world as merely ORC that was > compressed with some poor compression codec, such as csv. 
Using the original > IF and serde, as well as an ORC writer (with some heavyweight optimizations > disabled, potentially), we can "uncompress" the csv/whatever data into its > "original" ORC representation, then cache it efficiently, by column, and also > reuse a lot of the existing code. > Various other points: > 1) Caching granularity will have to be somehow determined (i.e. how do we > slice the file horizontally, to avoid caching entire columns). As with ORC > uncompressed files, the specific offsets don't really matter as long as they > are consistent between reads. The problem is that the file offsets will > actually need to be propagated to the new reader from the original > inputformat. Row counts are easier to use but there's a problem of how to > actually map them to missing ranges to read from disk. > 2) Obviously, for row-based formats, if any one column that is to be read has > been evicted or is otherwise missing, "all the columns" have to be read for > the corresponding slice to cache and read that one column. The vague plan is > to handle this implicitly, similarly to how the ORC reader handles CB-RG overlaps > - it will just so happen that a missing column in the disk range list to retrieve > will expand the disk-range-to-read into the whole horizontal slice of the > file. > 3) Granularity/etc. won't work for gzipped text. If anything at all is > evicted, the entire file has to be re-read. Gzipped text is a ridiculous > feature, so this is by design. > 4) In the future, it would be possible to also build some form of > metadata/indexes for this cached data to do PPD, etc. This is out of > scope for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
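The core idea above, "uncompressing" row-based text into its columnar representation before caching, can be sketched in a few lines of Python (illustrative only, not the LLAP code; the real path goes through an ORC writer):

```python
# Illustrative sketch: decode row-based CSV-ish data once into per-column
# arrays (the shape a columnar ORC-encoded cache wants), so later reads
# can fetch individual columns of a cached slice.
def rows_to_columns(csv_lines):
    rows = [line.split(",") for line in csv_lines]
    return {i: [r[i] for r in rows] for i in range(len(rows[0]))}

cols = rows_to_columns(["1,a", "2,b", "3,c"])
# cols[0] holds the first column, cols[1] the second, and each slice of
# such columns is what would be cached by (column, slice) granularity.
```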
[jira] [Commented] (HIVE-15401) Import constraints into HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737066#comment-15737066 ] Hive QA commented on HIVE-15401: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842638/HIVE-15401.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10769 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=143) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2532/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/2532/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2532/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12842638 - PreCommit-HIVE-Build > Import constraints into HBase metastore > --- > > Key: HIVE-15401 > URL: https://issues.apache.org/jira/browse/HIVE-15401 > Project: Hive > Issue Type: Sub-task > Components: HBase Metastore >Affects Versions: 2.1.1 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-15401.patch > > > Since HIVE-15342 added support for primary and foreign keys in the HBase > metastore we should support them in HBaseImport as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-15112: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to upstream. Thanks [~csun] for the review. > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > Fix For: 2.2.0 > > Attachments: HIVE-15112.1.patch, HIVE-15112.2.patch, HIVE-15112.patch > > > Like HIVE-14815, we need to support the Parquet vectorized reader for struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15112) Implement Parquet vectorization reader for Struct type
[ https://issues.apache.org/jira/browse/HIVE-15112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737061#comment-15737061 ] ASF GitHub Bot commented on HIVE-15112: --- Github user asfgit closed the pull request at: https://github.com/apache/hive/pull/116 > Implement Parquet vectorization reader for Struct type > -- > > Key: HIVE-15112 > URL: https://issues.apache.org/jira/browse/HIVE-15112 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > Attachments: HIVE-15112.1.patch, HIVE-15112.2.patch, HIVE-15112.patch > > > Like HIVE-14815, we need to support the Parquet vectorized reader for struct type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree
[ https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736990#comment-15736990 ] Sergey Shelukhin commented on HIVE-15405: - Nm, it looks like a newly broken test today in all jiras. +1 > Improve FileUtils.isPathWithinSubtree > - > > Key: HIVE-15405 > URL: https://issues.apache.org/jira/browse/HIVE-15405 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, > HIVE-15405.2.patch > > > When running single-node LLAP with the following query multiple > times (the flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} > became a hot path. > {noformat} > SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`, > YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok` > FROM `flights` as `flights` > JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`) > JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`) > JOIN `airports` as `dest_airport` ON (`flights`.`dest` = > `dest_airport`.`iata`) > GROUP BY YEAR(`flights`.`dateofflight`); > {noformat} > It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} > based on path depth comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
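The depth-comparison early exit proposed in the description can be sketched as follows (illustrative Python only, not the actual `FileUtils` code):

```python
# Illustrative sketch: a parent path never has more components than its
# child, so comparing depths first lets most mismatches bail out before
# any per-component string comparison.
def is_path_within_subtree(path, parent):
    path_parts = [p for p in path.split("/") if p]
    parent_parts = [p for p in parent.split("/") if p]
    if len(parent_parts) > len(path_parts):  # early exit on depth
        return False
    return path_parts[:len(parent_parts)] == parent_parts
```

With 7000+ partition paths checked against the same table location, the cheap length check filters out the common negative case without touching the strings.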
[jira] [Commented] (HIVE-15414) Fix batchSize for TestNegativeCliDriver
[ https://issues.apache.org/jira/browse/HIVE-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736979#comment-15736979 ] Vihang Karajgaonkar commented on HIVE-15414: [~sseth] [~spena] Do you know if batchSize is kept as 1000 intentionally? > Fix batchSize for TestNegativeCliDriver > --- > > Key: HIVE-15414 > URL: https://issues.apache.org/jira/browse/HIVE-15414 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > While analyzing the pre-commit console logs, I noticed that > TestNegativeCliDriver batches ~770 qfiles, which doesn't look right. > 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 > PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, > queryFilesProperty=qfile, > name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more.. > > I think {{qFileTest.clientNegative.batchSize = 1000}} in > {{test-configuration2.properties}} is probably the reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15414) Fix batchSize for TestNegativeCliDriver
[ https://issues.apache.org/jira/browse/HIVE-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-15414: --- Description: While analyzing the console output of pre-commit console logs, I noticed that TestNegativeCliDriver batchSize ~770 qfiles which doesn't look right. 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, queryFilesProperty=qfile, name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more.. I think {{qFileTest.clientNegative.batchSize = 1000}} in {{test-configuration2.properties}} is probably the reason. was: While analyzing the console output of pre-commit console logs, I noticed that TestNegativeCliDriver batchSize ~770 qfiles which doesn't look right. 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, queryFilesProperty=qfile, name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more.. I think {{qFileTest.clientNegative.batchSize = 1000}} in {{test-configuration2.properties}} is probably the batchSize is the reason. > Fix batchSize for TestNegativeCliDriver > --- > > Key: HIVE-15414 > URL: https://issues.apache.org/jira/browse/HIVE-15414 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > While analyzing the console output of pre-commit console logs, I noticed that > TestNegativeCliDriver batchSize ~770 qfiles which doesn't look right. > 2016-12-09 22:23:58,945 DEBUG [TestExecutor] ExecutionPhase.execute:96 > PBatch: QFileTestBatch [batchId=84, size=774, driver=TestNegativeCliDriver, > queryFilesProperty=qfile, > name=84-TestNegativeCliDriver-nopart_insert.q-input41.q-having1.q-and-771-more.. 
> > I think {{qFileTest.clientNegative.batchSize = 1000}} in > {{test-configuration2.properties}} is probably the reason. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
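What a smaller `batchSize` buys can be sketched quickly (illustrative Python, not the ptest framework code):

```python
# Illustrative sketch: split the qfile list into batches of at most
# batch_size, so batches can run in parallel instead of one ~770-file
# batch swallowing everything when batch_size is 1000.
def make_batches(qfiles, batch_size):
    return [qfiles[i:i + batch_size] for i in range(0, len(qfiles), batch_size)]

batches = make_batches([f"t{i}.q" for i in range(774)], 100)
# 774 qfiles at batch_size=100 -> 8 batches (seven of 100 and one of 74),
# versus a single batch of 774 at batch_size=1000.
```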
[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree
[ https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736970#comment-15736970 ] Rajesh Balamohan commented on HIVE-15405: - Thanks [~sershe]. TestMiniLlapLocalCliDriver#stats_based_fetch_decision did not fail in my env. Will check and update. > Improve FileUtils.isPathWithinSubtree > - > > Key: HIVE-15405 > URL: https://issues.apache.org/jira/browse/HIVE-15405 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, > HIVE-15405.2.patch > > > When running single-node LLAP with the following query multiple > times (the flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} > became a hot path. > {noformat} > SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`, > YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok` > FROM `flights` as `flights` > JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`) > JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`) > JOIN `airports` as `dest_airport` ON (`flights`.`dest` = > `dest_airport`.`iata`) > GROUP BY YEAR(`flights`.`dateofflight`); > {noformat} > It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} > based on path depth comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree
[ https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736962#comment-15736962 ] Hive QA commented on HIVE-15405: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842517/HIVE-15405.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10768 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=143) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2531/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/2531/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2531/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12842517 - PreCommit-HIVE-Build > Improve FileUtils.isPathWithinSubtree > - > > Key: HIVE-15405 > URL: https://issues.apache.org/jira/browse/HIVE-15405 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, > HIVE-15405.2.patch > > > When running single-node LLAP with the following query multiple > times (the flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} > became a hot path. > {noformat} > SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`, > YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok` > FROM `flights` as `flights` > JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`) > JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`) > JOIN `airports` as `dest_airport` ON (`flights`.`dest` = > `dest_airport`.`iata`) > GROUP BY YEAR(`flights`.`dateofflight`); > {noformat} > It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} > based on path depth comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release
[ https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736916#comment-15736916 ] Sergey Shelukhin commented on HIVE-14007: - Another consideration is whether users' configs will break. Do the configs all have the same name? Perhaps it's possible to add a fallback option to the old names in OrcConf otherwise. > Replace ORC module with ORC release > --- > > Key: HIVE-14007 > URL: https://issues.apache.org/jira/browse/HIVE-14007 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.2.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, > HIVE-14007.patch > > > This completes moving the core ORC reader & writer to the ORC project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
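The fallback idea suggested in the comment above can be sketched like this (illustrative Python, not OrcConf; the property names in the map are only placeholders for a moved config):

```python
# Illustrative sketch: look a property up under its new (post-move) name
# first, then fall back to a deprecated pre-move name, so existing user
# configs keep working. The names here are hypothetical examples.
DEPRECATED_NAMES = {"orc.compress.size": "hive.exec.orc.compress.size"}

def get_conf(conf, key, default=None):
    if key in conf:
        return conf[key]
    old_key = DEPRECATED_NAMES.get(key)
    if old_key is not None and old_key in conf:
        return conf[old_key]  # fall back to the pre-move name
    return default
```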
[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736905#comment-15736905 ] Sergey Shelukhin commented on HIVE-15397: - Btw, I am too lazy to rerun it again now, but I think the current master is inconsistent, because the out-file changes on the first run removed the rows only because I disabled metadata-only by default. So, the non-optimized path on master doesn't return such rows, but metadata-only does return them. After the patch, neither does. > metadata-only queries may return incorrect results with empty tables > > > Key: HIVE-15397 > URL: https://issues.apache.org/jira/browse/HIVE-15397 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15397.01.patch, HIVE-15397.patch > > > Queries like select 1=1 from t group by 1=1 may return rows, based on > OneNullRowInputFormat, even if the source table is empty. For now, add some > basic detection of empty tables and turn this off by default (since we can't > know whether a table is empty or not based on there being some files, without > reading them). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736908#comment-15736908 ] Sergey Shelukhin commented on HIVE-15397: - Now all I need is a +1 :P > metadata-only queries may return incorrect results with empty tables > > > Key: HIVE-15397 > URL: https://issues.apache.org/jira/browse/HIVE-15397 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15397.01.patch, HIVE-15397.patch > > > Queries like select 1=1 from t group by 1=1 may return rows, based on > OneNullRowInputFormat, even if the source table is empty. For now, add some > basic detection of empty tables and turn this off by default (since we can't > know whether a table is empty or not based on there being some files, without > reading them). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6365) Alter a partition to be of a different fileformat than the Table's fileformat. Use insert overwrite to write data to this partition. The partition fileformat is converted
[ https://issues.apache.org/jira/browse/HIVE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6365: -- Summary: Alter a partition to be of a different fileformat than the Table's fileformat. Use insert overwrite to write data to this partition. The partition fileformat is converted back to table's fileformat after the insert operation. (was: Alter a partition to be of a different fileformat than the Table's fileformat. Use insert overwrite to write data to this partition. The partition fileformat is coverted back to table's fileformat after the insert operation. ) > Alter a partition to be of a different fileformat than the Table's > fileformat. Use insert overwrite to write data to this partition. The > partition fileformat is converted back to table's fileformat after the insert > operation. > -- > > Key: HIVE-6365 > URL: https://issues.apache.org/jira/browse/HIVE-6365 > Project: Hive > Issue Type: Bug > Environment: emr >Reporter: Pavan Srinivas > > Let's say there is a partitioned table like > Step1: > >> CREATE TABLE srcpart (key STRING, value STRING) > PARTITIONED BY (ds STRING, hr STRING) > STORED AS TEXTFILE; > Step2: > Alter the fileformat for a specific available partition. > >> alter table srcpart partition(ds="2008-04-08", hr="12") set fileformat > >> orc; > Step3: > Describe the partition. > >> desc formatted srcpart partition(ds="2008-04-08", hr="12") > . > # Storage Information > SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde > InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format1 > Step4: > Write the data to this partition using insert overwrite. > >>insert overwrite table srcpart partition(ds="2008-04-08",hr="12") select > >>key, value from ... > Step5: > Describe the partition again. 
> >> desc formatted srcpart partition(ds="2008-04-08", hr="12") > . > # Storage Information > SerDe Library:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > InputFormat: org.apache.hadoop.mapred.TextInputFormat > OutputFormat: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > Compressed: No > Num Buckets: -1 > Bucket Columns: [] > Sort Columns: [] > Storage Desc Params: > serialization.format1 > The fileformat of the partition is converted back to the table's original > fileformat. It should have retained and written the data in the modified > fileformat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
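The expected metadata behavior from the steps above can be modelled in a toy sketch (hypothetical model, not Hive metastore code):

```python
# Illustrative sketch: a partition with its own storage descriptor
# override should keep it across INSERT OVERWRITE; only partitions
# without an override inherit the table's fileformat.
def effective_format(table_sd, partition_sd):
    return partition_sd if partition_sd is not None else table_sd

table_sd = "TEXTFILE"
partition_sd = "ORC"  # set by ALTER TABLE ... SET FILEFORMAT ORC (step 2)
# After INSERT OVERWRITE (step 4) the partition-level descriptor should
# survive, i.e. the effective format stays "ORC", not "TEXTFILE" -- the
# reported bug is that Hive resets it to the table's format instead.
```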
[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release
[ https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736863#comment-15736863 ] Owen O'Malley commented on HIVE-14007: -- Ok, I've updated the pull request https://github.com/apache/hive/pull/81 and will post a new patch once ORC 1.2.3 is released. The patch ensures that the moved config variables don't create an error by adding them to the exception map. https://github.com/omalley/hive/commit/d078aea84ecb1fe7d3e9b95cc845dfbdee63587c#diff-7f16e0de4170e5b6c031990da80f5643 As to why the patch is changing test files, it is because the ORC project has a couple of fixes and features that hadn't been backported to Hive. In particular, * ORC-101 fixes the bloom filters to use utf-8 rather than the jvm default encoding. This changes the size of the ORC files and their write version. There is an option to write the old broken bloom filters in addition to the new ones. * ORC-54 makes the default for schema evolution to be by name instead of position if the ORC file has real column names. Some of the tests required the old behavior and so I changed the names to match for the intended matches. Note that Hive 1.x never got the fix to encode the real column names in the file metadata, so all files written by Hive 1.x will use positional matching. > Replace ORC module with ORC release > --- > > Key: HIVE-14007 > URL: https://issues.apache.org/jira/browse/HIVE-14007 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.2.0 >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-14007.patch, HIVE-14007.patch, HIVE-14007.patch, > HIVE-14007.patch > > > This completes moving the core ORC reader & writer to the ORC project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14870) OracleStore: RawStore implementation optimized for Oracle
[ https://issues.apache.org/jira/browse/HIVE-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736848#comment-15736848 ] Alan Gates commented on HIVE-14870: --- Do you have code you could post showing the OracleStore? Obviously it won't be ready for inclusion but it would be helpful to see it. Did you consider using array types to store things like columns and parameters? I know this will decrease portability since every database does collection types differently (and some don't do them at all), but it would also remove additional calls or joins from a number of operations. Did you do any experimentation on the trade offs of duplicating data versus performance? For example, one could play with storing a storage descriptor in the partition or table object. Or serdes inside of a storage descriptor, etc. This would obviously grow the amount of data in the metastore but again reduce the number of calls or joins to fetch data. > OracleStore: RawStore implementation optimized for Oracle > - > > Key: HIVE-14870 > URL: https://issues.apache.org/jira/browse/HIVE-14870 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Chris Drome >Assignee: Chris Drome > Attachments: OracleStoreDesignProposal.pdf > > > The attached document is a proposal for a RawStore implementation which is > optimized for Oracle and replaces DataNucleus. The document outlines schema > changes, OracleStore implementation details, and performance tests against > ObjectStore, ObjectStore+DirectSQL, and OracleStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
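The denormalization trade-off Alan asks about can be sketched abstractly (hypothetical shapes, not the real metastore schema or the proposed OracleStore):

```python
# Illustrative sketch: normalized storage needs an extra lookup (a join
# in SQL) to resolve a partition's storage descriptor, while embedding
# the descriptor duplicates data but makes the fetch a single step.
sds = {1: {"format": "ORC", "location": "/warehouse/t/p1"}}  # SD table

normalized_part = {"name": "p=1", "sd_id": 1}
embedded_part = {"name": "p=1",
                 "sd": {"format": "ORC", "location": "/warehouse/t/p1"}}

def fetch_sd_normalized(part):
    return sds[part["sd_id"]]  # second lookup / join

def fetch_sd_embedded(part):
    return part["sd"]          # no join, at the cost of duplication
```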
[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736821#comment-15736821 ] Ashutosh Chauhan commented on HIVE-15397: - This is interesting, because Hive allows you to create partitions without any data. That will result in a partitioning column having a value. So, shall we assume that the table has row(s) with the partitioning column taking the supplied value and the other columns being null? I think not. This was the case earlier, and I think it's wrong. I think the behavior we are getting now is correct. If a partition exists but is empty, we should consider the partition to have 0 rows, so the value of the partitioning column should not matter during query evaluation. So, max(partCol) from empty_table should be null even when there is a partition which has partcol = 1. So, I think the behavior we are getting after the patch is correct and desired. > metadata-only queries may return incorrect results with empty tables > > > Key: HIVE-15397 > URL: https://issues.apache.org/jira/browse/HIVE-15397 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15397.01.patch, HIVE-15397.patch > > > Queries like select 1=1 from t group by 1=1 may return rows, based on > OneNullRowInputFormat, even if the source table is empty. For now, add some > basic detection of empty tables and turn this off by default (since we can't > know whether a table is empty or not based on there being some files, without > reading them). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
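The semantics argued for here — an empty partition contributes zero rows, so an aggregate like max(partCol) has nothing to aggregate and yields SQL NULL — mirror how a max over an empty collection has no value. A minimal sketch (not Hive code; the helper name is illustrative):

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class EmptyAggSketch {
    /** SQL-style MAX: returns null (SQL NULL) when there are no input rows. */
    static Integer sqlMax(List<Integer> partColValues) {
        Optional<Integer> max = partColValues.stream().max(Integer::compare);
        return max.orElse(null);
    }

    public static void main(String[] args) {
        // A partition with partcol = 1 exists, but it holds zero rows,
        // so no partcol value ever appears in an actual scanned row.
        List<Integer> rowsScanned = Collections.emptyList();
        System.out.println(sqlMax(rowsScanned));  // null, not 1
    }
}
```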
[jira] [Updated] (HIVE-15401) Import constraints into HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-15401: -- Status: Patch Available (was: Open) > Import constraints into HBase metastore > --- > > Key: HIVE-15401 > URL: https://issues.apache.org/jira/browse/HIVE-15401 > Project: Hive > Issue Type: Sub-task > Components: HBase Metastore >Affects Versions: 2.1.1 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-15401.patch > > > Since HIVE-15342 added support for primary and foreign keys in the HBase > metastore we should support them in HBaseImport as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15401) Import constraints into HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-15401: -- Attachment: HIVE-15401.patch In addition to adding the import of constraints to HBaseImport, this patch fixes a bug in HBaseStore: HBaseStore.getForeignKeys() was reversing the primary key and foreign key fields, causing it to return wrong results.
[jira] [Updated] (HIVE-15413) Primary key constraints forced to be unique across database and table names
[ https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-15413: - Target Version/s: 2.2.0 > Primary key constraints forced to be unique across database and table names > --- > > Key: HIVE-15413 > URL: https://issues.apache.org/jira/browse/HIVE-15413 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0 >Reporter: Alan Gates >Priority: Critical > > In the RDBMS underlying the metastore the table that stores primary and > foreign keys has its own primary key (at the RDBMS level) of > (constraint_name, position). This means that a constraint name must be > unique across all tables and databases in a system. This is not reasonable. > Database and table name should be included in the RDBMS primary key. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
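The proposed fix — include database and table name in the key — can be illustrated with map keys; the key layouts below are hypothetical, not the actual metastore schema:

```java
import java.util.HashMap;
import java.util.Map;

public class ConstraintKeySketch {
    // Keying rows only by (constraint_name, position), as the current schema
    // does, forces names to be globally unique: "pk" in two tables collides.
    static String currentKey(String db, String tbl, String constraint, int pos) {
        return constraint + "#" + pos;
    }

    // Including db and table in the key scopes the constraint name per table.
    static String proposedKey(String db, String tbl, String constraint, int pos) {
        return db + "." + tbl + "#" + constraint + "#" + pos;
    }

    public static void main(String[] args) {
        Map<String, String> rows = new HashMap<>();
        rows.put(currentKey("db1", "t1", "pk", 0), "t1 primary key");
        rows.put(currentKey("db2", "t2", "pk", 0), "t2 primary key");  // silently overwrites!
        System.out.println(rows.size());  // 1 -- the collision described above

        rows.clear();
        rows.put(proposedKey("db1", "t1", "pk", 0), "t1 primary key");
        rows.put(proposedKey("db2", "t2", "pk", 0), "t2 primary key");
        System.out.println(rows.size());  // 2 -- both constraints survive
    }
}
```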
[jira] [Commented] (HIVE-15413) Primary key constraints forced to be unique across database and table names
[ https://issues.apache.org/jira/browse/HIVE-15413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736687#comment-15736687 ] Alan Gates commented on HIVE-15413: --- This applies to foreign keys as well, meaning even a foreign key and primary key cannot share a name.
[jira] [Commented] (HIVE-15342) Add support for primary/foreign keys in HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-15342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736677#comment-15736677 ] Lefty Leverenz commented on HIVE-15342: --- Okay, thanks Alan. > Add support for primary/foreign keys in HBase metastore > --- > > Key: HIVE-15342 > URL: https://issues.apache.org/jira/browse/HIVE-15342 > Project: Hive > Issue Type: Improvement > Components: HBase Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 2.2.0 > > Attachments: HIVE-15342.patch > > > When HIVE-13076 was committed the calls into the HBase metastore were stubbed > out. We need to implement support for constraints in the HBase metastore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736659#comment-15736659 ] Lefty Leverenz commented on HIVE-15403: --- Okay, thanks Prasanth. > LLAP: Login with kerberos before starting the daemon > > > Key: HIVE-15403 > URL: https://issues.apache.org/jira/browse/HIVE-15403 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch > > > In an LLAP cluster, if some of the nodes are kinit'ed with some user (other than > the default hive user) and some nodes are kinit'ed with the hive user, the two will end > up in different paths under the zk registry and may not be reported by the llap > status tool. The reason is that when creating zk paths we use > UGI.getCurrentUser(), but the current user may not be the same across all nodes > (someone would have to do a global kinit). Before bringing up the daemon, if security > is enabled, each daemon should log in based on the specified kerberos principal > and keytab for the llap daemon service and update the currently logged-in user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
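Why mixed kinit users fragment the registry: if the registry parent path is derived from the current logged-in user, daemons running as different users register under different parents and cannot see each other. The path scheme below is purely hypothetical, not LLAP's actual ZooKeeper layout:

```java
public class ZkPathSketch {
    /** Hypothetical registry path derived from the current logged-in user. */
    static String registryPath(String currentUser, String cluster) {
        return "/llap/" + currentUser + "/" + cluster;
    }

    public static void main(String[] args) {
        // Two daemons in the same cluster, kinit'ed as different users,
        // end up under different parents -- the llap status tool, looking
        // under one parent, misses the other daemon.
        System.out.println(registryPath("hive", "llap0"));   // /llap/hive/llap0
        System.out.println(registryPath("alice", "llap0"));  // /llap/alice/llap0
        // Logging in from the daemon's own keytab before registration
        // makes the user (and hence the path) uniform across nodes.
    }
}
```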
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736653#comment-15736653 ] Prasanth Jayachandran commented on HIVE-15403: -- I don't think it is required. This is mostly a bug fix.
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736648#comment-15736648 ] Lefty Leverenz commented on HIVE-15403: --- Does this need to be documented in the wiki?
[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736608#comment-15736608 ] Hive QA commented on HIVE-15397: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842617/HIVE-15397.01.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10793 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=91) org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=213) org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery (batchId=215) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2529/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2529/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2529/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase 
Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12842617 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736604#comment-15736604 ] Prasanth Jayachandran commented on HIVE-15403: -- Test failures are unrelated to this patch. Already tracked in HIVE-15058
[jira] [Updated] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15403: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master.
[jira] [Updated] (HIVE-15329) NullPointerException might occur when create table
[ https://issues.apache.org/jira/browse/HIVE-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15329: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master! Thanks [~winningalong] for the contribution! > NullPointerException might occur when create table > -- > > Key: HIVE-15329 > URL: https://issues.apache.org/jira/browse/HIVE-15329 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.0.0, 2.1.0 >Reporter: Meilong Huang >Assignee: Meilong Huang > Labels: metastore > Fix For: 2.2.0 > > Attachments: HIVE-15329.1.patch > > > NullPointerException might occur if table.getParameters() returns null when > method isNonNativeTable is invoked in class MetaStoreUtils. > {code} > public static boolean isNonNativeTable(Table table) { > if (table == null) { > return false; > } > return > (table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != > null); > } > {code} > This will cause a stack trace without any helpful information at the client: > {code} > org.apache.hadoop.hive.metastore.api.MetaException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_with_environment_context_result$create_table_with_environment_context_resultStandardScheme.read... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
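A null-safe variant of the check above, sketched against a plain Map rather than the Thrift Table class (the constant value below is a stand-in, not the real hive_metastoreConstants value):

```java
import java.util.Map;

public class NonNativeCheckSketch {
    // Stand-in for hive_metastoreConstants.META_TABLE_STORAGE.
    static final String META_TABLE_STORAGE = "storage_handler";

    /** Null-safe version: a null parameters map means "no storage handler". */
    static boolean isNonNativeTable(Map<String, String> parameters) {
        if (parameters == null) {
            return false;  // avoids the NPE the ticket describes
        }
        return parameters.get(META_TABLE_STORAGE) != null;
    }

    public static void main(String[] args) {
        System.out.println(isNonNativeTable(null));  // false, no NPE
        System.out.println(isNonNativeTable(
            java.util.Collections.singletonMap(META_TABLE_STORAGE,
                                               "org.example.SomeHandler")));  // true
    }
}
```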
[jira] [Commented] (HIVE-15329) NullPointerException might occur when create table
[ https://issues.apache.org/jira/browse/HIVE-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736584#comment-15736584 ] Prasanth Jayachandran commented on HIVE-15329: -- The test failures are not related to this patch and are already failing in master. Failures are tracked in HIVE-15058
[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time
[ https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736556#comment-15736556 ] Zoltan Haindrich commented on HIVE-14735: - Thank you for the command [~stakiar], I've added it to the patch. I've uploaded #3; I hope I didn't break anything - the ptest execution will shed light on this. [~spena] I've addressed most of your comments (however, I still use fixed versions for the maven plugins - I forgot to fix that). I also missed your previous question about "where the downloaded file is": it's inside the local maven repository. I've changed the following: * added a project under dev-support to repack the spark artifact, with a readme describing the procedure * {{itests/thirdparty}} is now a module - this way these maven "tricks" are isolated, and other modules can rely on thirdparty having already finished - this also makes it possible to support multiple spark versions, which may come in handy for people who switch between branches that pull different spark versions * it now unpacks the spark assembly to only 1 place [~spena] what do you think about the new changes? > Build Infra: Spark artifacts download takes a long time > --- > > Key: HIVE-14735 > URL: https://issues.apache.org/jira/browse/HIVE-14735 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Vaibhav Gumashta >Assignee: Zoltan Haindrich > Attachments: HIVE-14735.1.patch, HIVE-14735.1.patch, > HIVE-14735.1.patch, HIVE-14735.1.patch, HIVE-14735.2.patch, HIVE-14735.3.patch > > > In particular this command: > {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz > http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release
[ https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736547#comment-15736547 ] Gunther Hagleitner commented on HIVE-14007: --- Moving the config vars over to ORC and removing them from HiveConf will still break existing apps, because Hive, as a precaution, throws an error when it doesn't recognize a config var that starts with hive (afaik). Why is this patch changing test files? It seems this is changing behavior in addition to removing the files?
[jira] [Updated] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-15385: --- Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Thanks [~stakiar]. I committed this to master. > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Fix For: 2.2.0 > > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether or not permission inheritance should fail queries or > not, and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
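One way to make the behavior consistent in the direction the wiki documents — permission inheritance is best-effort and never fails the query — is to wrap each FileSystem call so failures are logged rather than propagated. This is a hypothetical helper, not the actual HdfsUtils API:

```java
public class InheritPermsSketch {
    @FunctionalInterface
    interface PermissionAction {
        // Stands in for a call like fs.setOwner / fs.setAcl / fs.setPermission.
        void run() throws Exception;
    }

    /**
     * Best-effort wrapper: mirrors the old shell-out behavior, where a failed
     * chgrp/chmod was logged as a warning and the query continued.
     */
    static boolean tryInherit(PermissionAction action) {
        try {
            action.run();
            return true;
        } catch (Exception e) {
            System.err.println("WARN: failed to inherit permissions: " + e.getMessage());
            return false;  // warn, don't fail the query
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInherit(() -> { /* succeeds */ }));                        // true
        System.out.println(tryInherit(() -> { throw new Exception("no permission"); })); // false
    }
}
```

Routing both the recursive (shell-out) and non-recursive (FileSystem API) paths through a wrapper like this would give one behavior regardless of the {{recursive}} flag.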
[jira] [Updated] (HIVE-14735) Build Infra: Spark artifacts download takes a long time
[ https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-14735: Attachment: HIVE-14735.3.patch
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736495#comment-15736495 ] Sergey Shelukhin commented on HIVE-15403: - +1
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736491#comment-15736491 ] Hive QA commented on HIVE-15403: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842611/HIVE-15403.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10792 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2528/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2528/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2528/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842611 - PreCommit-HIVE-Build
[jira] [Comment Edited] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736456#comment-15736456 ] Sergey Shelukhin edited comment on HIVE-15397 at 12/9/16 9:59 PM: -- Updated the out files, expanded one test to run with and without metadataonly enabled to make sure results are consistent, fixed the typo. was (Author: sershe): Updated the out files, expanded one test to run with and without metadataonly enable to make sure results are consistent, fixed the typo.
[jira] [Updated] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15397: Attachment: HIVE-15397.01.patch Updated the out files, expanded one test to run with and without metadataonly enabled to make sure results are consistent, fixed the typo.
[jira] [Commented] (HIVE-15397) metadata-only queries may return incorrect results with empty tables
[ https://issues.apache.org/jira/browse/HIVE-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736432#comment-15736432 ] Sergey Shelukhin commented on HIVE-15397: - Interesting q file changes... according to our take on 1=1 group by 1=1, they are correct. E.g. a table has 3 partitions, part=a, part=b, and part=c. Only a and c have data. select distinct part from t used to return "a, b, c". However, there are no rows in the table that actually have value b. So, the result has changed to "a, c". [~ashutoshc] [~jcamachorodriguez] would you say this is the correct change and the previous result was incorrect? Same for max(partcol) from an empty table - should it be null? Because there are no rows in the table to derive max from, similar to how there are no rows in gby 1=1 to group by.
[jira] [Commented] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736388#comment-15736388 ] Hive QA commented on HIVE-15118: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842601/HIVE-15118.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10792 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=91) org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade (batchId=209) org.apache.hive.hcatalog.api.TestHCatClientNotification.createTable (batchId=219) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2527/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2527/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2527/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically 
generated. ATTACHMENT ID: 12842601 - PreCommit-HIVE-Build > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch > > > The COLUMNS table is no longer used. Other databases already removed it. Remove > it from derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15411) ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES
[ https://issues.apache.org/jira/browse/HIVE-15411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736385#comment-15736385 ] Anthony Hsu commented on HIVE-15411: Proposal is to extend the ADD PARTITION grammar to support the following: {noformat} ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION (part_col='part_value', ...) [FILEFORMAT ] -- new [SERDEPROPERTIES ('key1'='val', ...)] -- new [LOCATION 'location1'] PARTITION (part_col='part_value', ...) [FILEFORMAT ] -- new [SERDEPROPERTIES ('key1'='val', ...)] -- new [LOCATION 'location2'] ...; {noformat} > ADD PARTITION should support setting FILEFORMAT and SERDEPROPERTIES > --- > > Key: HIVE-15411 > URL: https://issues.apache.org/jira/browse/HIVE-15411 > Project: Hive > Issue Type: Improvement >Reporter: Anthony Hsu >Assignee: Anthony Hsu > > Currently, {{ALTER TABLE ... ADD PARTITION}} only lets you set the > partition's LOCATION but not its FILEFORMAT or SERDEPROPERTIES. In order to > change the FILEFORMAT or SERDEPROPERTIES, you have to issue two additional > calls to {{ALTER TABLE ... PARTITION ... SET FILEFORMAT}} and {{ALTER TABLE > ... PARTITION ... SET SERDEPROPERTIES}}. This is not atomic, and queries that > interleave the ALTER TABLE commands may fail. > We should extend the grammar to support setting FILEFORMAT and > SERDEPROPERTIES atomically as part of the ADD PARTITION command. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
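A concrete statement under the proposed grammar might look like the following sketch; the table name, partition values, file format, serde property, and locations are illustrative assumptions, not part of the proposal text (which elides the FILEFORMAT placeholder):

```sql
ALTER TABLE page_views ADD IF NOT EXISTS
  PARTITION (dt='2016-12-01')
    FILEFORMAT ORC                                       -- new clause
    SERDEPROPERTIES ('serialization.encoding'='UTF-8')   -- new clause
    LOCATION '/data/page_views/2016-12-01'
  PARTITION (dt='2016-12-02')
    FILEFORMAT ORC                                       -- new clause
    LOCATION '/data/page_views/2016-12-02';
```

This would set the format and serde properties atomically at partition creation, instead of following ADD PARTITION with separate SET FILEFORMAT and SET SERDEPROPERTIES calls.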
[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736347#comment-15736347 ] Thejas M Nair commented on HIVE-15410: -- [~ctang.ma] What is the complete valid set of property values? Where/how do we restrict the acceptable values in Hive SQL? Can it end with a "." or "-"? Also, can you include a unit test for the validate function? > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties can have a period (.) or hyphen (-) in their names; > auto.purge is one example. But the WebHCat APIs support neither setting nor > getting these properties, and they throw the error message "Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch adds support for property names containing a > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15338) Wrong result from non-vectorized DATEDIFF with scalar parameter of type DATE/TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736296#comment-15736296 ] Jason Dere commented on HIVE-15338: --- Changes look good, +1. Just see about the diff in vector_between_in > Wrong result from non-vectorized DATEDIFF with scalar parameter of type > DATE/TIMESTAMP > -- > > Key: HIVE-15338 > URL: https://issues.apache.org/jira/browse/HIVE-15338 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-15338.01.patch, HIVE-15338.02.patch, > HIVE-15338.03.patch, HIVE-15338.04.patch > > > Vectorized DATEDIFF accidentally treated a scalar > parameter of type DATE (e.g. CURRENT_DATE) as 0. > The current Q file test vectorized_date_funcs.q DOES NOT test the DATE/TIMESTAMP > scalar type case. > And non-vectorized cases of DATEDIFF are using UTF and returning the wrong > results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
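The affected query shape, per the description, is DATEDIFF with a scalar DATE/TIMESTAMP argument. A minimal sketch (the table and column names are hypothetical):

```sql
-- The vectorized and non-vectorized paths should agree here; the bug
-- treated the DATE-typed scalar (CURRENT_DATE) as 0 in the vectorized case.
SELECT datediff(CURRENT_DATE, dt_col) FROM dates_table;
```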
[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views
[ https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736292#comment-15736292 ] Hive QA commented on HIVE-14496: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842593/HIVE-14496.04.patch {color:green}SUCCESS:{color} +1 due to 18 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 121 failed/errored test(s), 10784 tests executed *Failed tests:* {noformat} TestHBaseImport - did not produce a TEST-*.xml file (likely timed out) (batchId=193) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cteViews] (batchId=70) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_unionDistinct_2] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_views] (batchId=139) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[unionDistinct_2] (batchId=139) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[unionDistinct_2] (batchId=91) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.alterInvalidation (batchId=199) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.altersInvalidation (batchId=199) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.hit (batchId=199) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.invalidation (batchId=199) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.someWithStats (batchId=199) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.allWithStats (batchId=186) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.noneWithStats (batchId=186) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.nonexistentPartitions (batchId=186) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions (batchId=186) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCacheWithBitVector.allPartitions (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.MiddleOfPartitionsHaveBitVectorStatus (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDouble (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusLong (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.TwoEndsOfPartitionsHaveBitVectorStatus (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusDecimal (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusDouble 
(batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusLong (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.allPartitionsHaveBitVectorStatusString (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsExtrapolation.noPartitionsHaveBitVectorStatus (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.MiddleOfPartitionsHaveBitVectorStatus (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDecimal (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusDouble (batchId=187) org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsNDVUniformDist.TwoEndsAndMiddleOfPartitionsHaveBitVectorStatusLong (batchId=187)
[jira] [Updated] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15403: - Attachment: HIVE-15403.2.patch updated comment > LLAP: Login with kerberos before starting the daemon > > > Key: HIVE-15403 > URL: https://issues.apache.org/jira/browse/HIVE-15403 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15403.1.patch, HIVE-15403.2.patch > > > In an LLAP cluster, if some of the nodes are kinit'ed with some user (other than > the default hive user) and some nodes are kinit'ed with the hive user, they will end > up in different paths under the zk registry and may not be reported by the llap > status tool. The reason is that when creating zk paths we use > UGI.getCurrentUser(), but the current user may not be the same across all nodes > (someone has to do a global kinit). Before bringing up the daemon, if security > is enabled, each daemon should log in using the specified kerberos principal > and keytab for the llap daemon service and update the currently logged-in user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved
[ https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736240#comment-15736240 ] Eugene Koifman edited comment on HIVE-15048 at 12/9/16 8:26 PM: WRT dynamic partitioning, that is also not new. Update/delete statements have always run with dynamic partitioning regardless of which WriteEntity objects are present. we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the lock management logic aware of it. HIVE-15032 tracks improving this was (Author: ekoifman): WRT dynamic partitioning, that is also not new. Update/delete statements have always run with dynamic partitioning regardless of which WriteEntity objects are present. we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the lock management logic aware of it. > Update/Delete statement using wrong WriteEntity when subqueries are involved > > > Key: HIVE-15048 > URL: https://issues.apache.org/jira/browse/HIVE-15048 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, > HIVE-15048.03.patch, HIVE-15048.04.patch > > > See TestDbTxnManager2 for referenced methods > {noformat} > checkCmdOnDriver(driver.run("create table target (a int, b int) " + > "partitioned by (p int, q int) clustered by (a) into 2 buckets " + > "stored as orc TBLPROPERTIES ('transactional'='true')")); > checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, > q1 int) clustered by (a1) into 2 buckets stored as orc TBLPROPERTIES > ('transactional'='true')")); > checkCmdOnDriver(driver.run("insert into target partition(p,q) values > (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)")); > checkCmdOnDriver(driver.run( > "update source set b1 = 1 where p1 in (select t.q from target t where > t.p=2)")); > {noformat} > The last Update stmt creates the following Entity objects in the QueryPlan 
> inputs: [default@source, default@target, default@target@p=2/q=2] > outputs: [default@target@p=2/q=2] > Which is clearly wrong for outputs - the target table is not even > partitioned(or called 'target'). > This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze() > I suspect > update T ... where T.p IN (select d from T where ...) > type query would also get messed up (but not necessarily fail) if T is > partitioned and the subquery filters out some partitions but that does not > mean that the same partitions are filtered out in the parent query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15403) LLAP: Login with kerberos before starting the daemon
[ https://issues.apache.org/jira/browse/HIVE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736243#comment-15736243 ] Sergey Shelukhin commented on HIVE-15403: - I don't think the "This is to avoid ... " comment is correct - we don't need to kinit on every node; rather, we have problems if someone kinits ;) Otherwise looks good. > LLAP: Login with kerberos before starting the daemon > > > Key: HIVE-15403 > URL: https://issues.apache.org/jira/browse/HIVE-15403 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15403.1.patch > > > In an LLAP cluster, if some of the nodes are kinit'ed with some user (other than > the default hive user) and some nodes are kinit'ed with the hive user, they will end > up in different paths under the zk registry and may not be reported by the llap > status tool. The reason is that when creating zk paths we use > UGI.getCurrentUser(), but the current user may not be the same across all nodes > (someone has to do a global kinit). Before bringing up the daemon, if security > is enabled, each daemon should log in using the specified kerberos principal > and keytab for the llap daemon service and update the currently logged-in user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved
[ https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736240#comment-15736240 ] Eugene Koifman commented on HIVE-15048: --- WRT dynamic partitioning, that is also not new. Update/delete statements have always run with dynamic partitioning regardless of which WriteEntity objects are present. we.setDynamicPartitionWrite(original.isDynamicPartitionWrite()); just makes the lock management logic aware of it. > Update/Delete statement using wrong WriteEntity when subqueries are involved > > > Key: HIVE-15048 > URL: https://issues.apache.org/jira/browse/HIVE-15048 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, > HIVE-15048.03.patch, HIVE-15048.04.patch > > > See TestDbTxnManager2 for referenced methods > {noformat} > checkCmdOnDriver(driver.run("create table target (a int, b int) " + > "partitioned by (p int, q int) clustered by (a) into 2 buckets " + > "stored as orc TBLPROPERTIES ('transactional'='true')")); > checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, > q1 int) clustered by (a1) into 2 buckets stored as orc TBLPROPERTIES > ('transactional'='true')")); > checkCmdOnDriver(driver.run("insert into target partition(p,q) values > (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)")); > checkCmdOnDriver(driver.run( > "update source set b1 = 1 where p1 in (select t.q from target t where > t.p=2)")); > {noformat} > The last Update stmt creates the following Entity objects in the QueryPlan > inputs: [default@source, default@target, default@target@p=2/q=2] > outputs: [default@target@p=2/q=2] > Which is clearly wrong for outputs - the target table is not even > partitioned (or called 'target'). > This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze() > I suspect > update T ... where T.p IN (select d from T where ...) 
> type query would also get messed up (but not necessarily fail) if T is > partitioned and the subquery filters out some partitions but that does not > mean that the same partitions are filtered out in the parent query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved
[ https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736227#comment-15736227 ] Eugene Koifman commented on HIVE-15048: --- That is not what it does. The code removes the table WriteEntity for the target table and replaces it with some number of partition WriteEntity objects for that table. So conceptually it does the same thing as before. If you look at the new .q.out, the output shows the set of inputs/outputs that it ends up with (not clearly highlighted, but they are there) > Update/Delete statement using wrong WriteEntity when subqueries are involved > > > Key: HIVE-15048 > URL: https://issues.apache.org/jira/browse/HIVE-15048 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, > HIVE-15048.03.patch, HIVE-15048.04.patch > > > See TestDbTxnManager2 for referenced methods > {noformat} > checkCmdOnDriver(driver.run("create table target (a int, b int) " + > "partitioned by (p int, q int) clustered by (a) into 2 buckets " + > "stored as orc TBLPROPERTIES ('transactional'='true')")); > checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, > q1 int) clustered by (a1) into 2 buckets stored as orc TBLPROPERTIES > ('transactional'='true')")); > checkCmdOnDriver(driver.run("insert into target partition(p,q) values > (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)")); > checkCmdOnDriver(driver.run( > "update source set b1 = 1 where p1 in (select t.q from target t where > t.p=2)")); > {noformat} > The last Update stmt creates the following Entity objects in the QueryPlan > inputs: [default@source, default@target, default@target@p=2/q=2] > outputs: [default@target@p=2/q=2] > Which is clearly wrong for outputs - the target table is not even > partitioned (or called 'target'). 
> This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze() > I suspect > update T ... where T.p IN (select d from T where ...) > type query would also get messed up (but not necessarily fail) if T is > partitioned and the subquery filters out some partitions but that does not > mean that the same partitions are filtered out in the parent query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15342) Add support for primary/foreign keys in HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-15342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736104#comment-15736104 ] Alan Gates commented on HIVE-15342: --- I don't think so, as this just keeps the HBase metastore up to date with the RDBMS based one. > Add support for primary/foreign keys in HBase metastore > --- > > Key: HIVE-15342 > URL: https://issues.apache.org/jira/browse/HIVE-15342 > Project: Hive > Issue Type: Improvement > Components: HBase Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 2.2.0 > > Attachments: HIVE-15342.patch > > > When HIVE-13076 was committed the calls into the HBase metastore were stubbed > out. We need to implement support for constraints in the HBase metastore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15048) Update/Delete statement using wrong WriteEntity when subqueries are involved
[ https://issues.apache.org/jira/browse/HIVE-15048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736090#comment-15736090 ] Alan Gates commented on HIVE-15048: --- I'm not sure I understand the change here. The previous code looks like it was trying to avoid locking the whole table by figuring out which partitions would be read and only locking those partitions. It looks like this goes wrong when there's a subquery involved, but in general should be sound. If I understand your changes you're just moving it to always use dynamic partitioning. But that locks the whole table, which we don't want. > Update/Delete statement using wrong WriteEntity when subqueries are involved > > > Key: HIVE-15048 > URL: https://issues.apache.org/jira/browse/HIVE-15048 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-15048.01.patch, HIVE-15048.02.patch, > HIVE-15048.03.patch, HIVE-15048.04.patch > > > See TestDbTxnManager2 for referenced methods > {noformat} > checkCmdOnDriver(driver.run("create table target (a int, b int) " + > "partitioned by (p int, q int) clustered by (a) into 2 buckets " + > "stored as orc TBLPROPERTIES ('transactional'='true')")); > checkCmdOnDriver(driver.run("create table source (a1 int, b1 int, p1 int, > q1 int) clustered by (a1) into 2 buckets stored as orc TBLPROPERTIES > ('transactional'='true')")); > checkCmdOnDriver(driver.run("insert into target partition(p,q) values > (1,2,1,2), (3,4,1,2), (5,6,1,3), (7,8,2,2)")); > checkCmdOnDriver(driver.run( > "update source set b1 = 1 where p1 in (select t.q from target t where > t.p=2)")); > {noformat} > The last Update stmt creates the following Entity objects in the QueryPlan > inputs: [default@source, default@target, default@target@p=2/q=2] > outputs: [default@target@p=2/q=2] > Which is clearly wrong for outputs - the target table is not even > partitioned(or 
called 'target'). > This happens in UpdateDeleteSemanticAnalyzer.reparseAndSuperAnalyze() > I suspect > update T ... where T.p IN (select d from T where ...) > type query would also get messed up (but not necessarily fail) if T is > partitioned and the subquery filters out some partitions but that does not > mean that the same partitions are filtered out in the parent query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)
[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-14731: Status: Open (was: Patch Available) Resubmit patch for jenkins test. > Use Tez cartesian product edge in Hive (unpartitioned case only) > > > Key: HIVE-14731 > URL: https://issues.apache.org/jira/browse/HIVE-14731 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, > HIVE-14731.2.patch, HIVE-14731.3.patch, HIVE-14731.4.patch, > HIVE-14731.5.patch, HIVE-14731.6.patch, HIVE-14731.7.patch, > HIVE-14731.8.patch, HIVE-14731.9.patch > > > Given cartesian product edge is available in Tez now (see TEZ-3230), let's > integrate it into Hive on Tez. This allows us to have more than one reducer > in cross product queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)
[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiyuan Yang updated HIVE-14731: Status: Patch Available (was: Open) > Use Tez cartesian product edge in Hive (unpartitioned case only) > > > Key: HIVE-14731 > URL: https://issues.apache.org/jira/browse/HIVE-14731 > Project: Hive > Issue Type: Bug >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, > HIVE-14731.2.patch, HIVE-14731.3.patch, HIVE-14731.4.patch, > HIVE-14731.5.patch, HIVE-14731.6.patch, HIVE-14731.7.patch, > HIVE-14731.8.patch, HIVE-14731.9.patch > > > Given cartesian product edge is available in Tez now (see TEZ-3230), let's > integrate it into Hive on Tez. This allows us to have more than one reducer > in cross product queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736071#comment-15736071 ] Sahil Takiar commented on HIVE-15385: - The test failures seem unrelated, and they are failing in other QA runs. > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether or not permission inheritance should fail queries or > not, and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736061#comment-15736061 ] Hive QA commented on HIVE-15385: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842388/HIVE-15385.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10798 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2525/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2525/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2525/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842388 - PreCommit-HIVE-Build > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether or not permission inheritance should fail queries or > not, and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15118: Attachment: (was: HIVE-15118.2.patch) > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch > > > The COLUMNS table is no longer used; other databases have already removed it. Remove > it from Derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15118: Comment: was deleted (was: Patch-2: address comment to drop table for upgrade.) > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch > > > The COLUMNS table is no longer used; other databases have already removed it. Remove > it from Derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15118: Attachment: HIVE-15118.2.patch Patch-2: addressed comments to remove the table during upgrade. > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch > > > The COLUMNS table is no longer used; other databases have already removed it. Remove > it from Derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15118: Attachment: HIVE-15118.2.patch Patch-2: addressed the comment to drop the table during upgrade. > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch, HIVE-15118.2.patch > > > The COLUMNS table is no longer used; other databases have already removed it. Remove > it from Derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14798) MSCK REPAIR TABLE throws null pointer exception
[ https://issues.apache.org/jira/browse/HIVE-14798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736022#comment-15736022 ] Carl Laird commented on HIVE-14798: --- This appears to be fixed in 2.1.1. > MSCK REPAIR TABLE throws null pointer exception > --- > > Key: HIVE-14798 > URL: https://issues.apache.org/jira/browse/HIVE-14798 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Anbu Cheeralan > > MSCK REPAIR TABLE statement throws null pointer exception in Hive 2.1 > I have tested the same against external/internal tables created both in HDFS > and in Google Cloud. > The error shown in beeline/sql client > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1) > Hive Logs: > 2016-09-20T17:28:00,717 ERROR [HiveServer2-Background-Pool: Thread-92]: > metadata.HiveMetaStoreChecker (:()) - java.lang.NullPointerException > 2016-09-20T17:28:00,717 WARN [HiveServer2-Background-Pool: Thread-92]: > exec.DDLTask (:()) - Failed to run metacheck: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:444) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:388) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:309) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:285) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:230) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:109) > at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1814) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:403) > at 
org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011) > at > java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:432) > at > org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:418) > ... 
4 more > Here are the steps to recreate this issue: > use default; > DROP TABLE IF EXISTS repairtable; > CREATE TABLE repairtable(col STRING) PARTITIONED BY (p1 STRING, p2 STRING); > MSCK REPAIR TABLE default.repairtable; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
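The trace above bottoms out in {{java.lang.NullPointerException}} inside {{ConcurrentHashMap.putVal}}: unlike {{HashMap}}, {{ConcurrentHashMap}} rejects null keys and null values. A minimal standalone sketch of that failure mode (the map contents here are illustrative, not Hive's actual partition data):

```java
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentMapNullDemo {
    // Returns true if putting a null value throws, mirroring the NPE seen in
    // HiveMetaStoreChecker's stack trace (ConcurrentHashMap.putVal).
    static boolean putNullValueThrows() {
        ConcurrentHashMap<String, String> partitions = new ConcurrentHashMap<>();
        partitions.put("p1=a/p2=b", "ok"); // a normal entry is fine
        try {
            partitions.put("p1=c/p2=d", null); // null value is rejected
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null value throws NPE: " + putNullValueThrows());
    }
}
```

Any code path that feeds a null key or value into such a map fails the same way, which is consistent with the checker hitting the NPE on a freshly created partitioned table with no partitions yet.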
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736021#comment-15736021 ] Sahil Takiar commented on HIVE-15385: - Thanks Ashutosh! > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoked the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFulFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether or not permission inheritance should fail queries or > not, and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736015#comment-15736015 ] Ashutosh Chauhan commented on HIVE-15385: - yeah.. that was not a conscious choice to alter behavior. Sounds good to restore the documented behavior. Thanks for fixing this up. +1 > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735990#comment-15735990 ] Sahil Takiar commented on HIVE-15385: - Thanks Sergio. I believe [~ashutoshc] was working on HIVE-12988, HIVE-13716, and HIVE-13933 - any chance you could comment on this JIRA? To summarize, Hive documentation claims that when {{hive.warehouse.subdir.inherit.perms}} is {{true}}, any failure to inherit permissions will not cause queries to fail; only a warning will be logged. It looks like the aforementioned JIRAs changed that by not catching exceptions thrown by {{HdfsUtils.setFullFileStatus}}, just wondering if that was intentional or not. > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. 
> We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
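The direction the discussion converges on (warn and continue, never fail the query) can be sketched as below. This is only an illustrative sketch: the interface, method, and logging are hypothetical, not Hive's actual {{HdfsUtils.setFullFileStatus}} code.

```java
public class PermissionInheritance {
    // Stand-in for the underlying FileSystem calls (fs.setOwner / fs.setAcl /
    // fs.setPermission); the name is illustrative, not a real Hive type.
    interface StatusSetter {
        void apply() throws Exception;
    }

    // Attempt to inherit permissions; on failure, log a warning and carry on,
    // matching the documented "inheritance failures never fail the query" behavior.
    static boolean trySetFullFileStatus(StatusSetter setter) {
        try {
            setter.apply();
            return true;
        } catch (Exception e) {
            // Warning only -- no exception escapes to the caller.
            System.err.println("WARN: unable to inherit permissions: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = trySetFullFileStatus(() -> {
            throw new Exception("chgrp: changing group: Permission denied");
        });
        System.out.println("inherited: " + ok); // reports false, query proceeds
    }
}
```

Swallowing the exception at this single choke point also makes the {{recursive=true}} (shell-out) and {{recursive=false}} (API) paths behave consistently.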
[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl
[ https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735989#comment-15735989 ] Ashutosh Chauhan commented on HIVE-14998: - +1 > Fix and update test: TestPluggableHiveSessionImpl > - > > Key: HIVE-14998 > URL: https://issues.apache.org/jira/browse/HIVE-14998 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch > > > this test either prints an exception to stdout ... or not - in its > current form it isn't really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-14948) properly handle special characters in identifiers
[ https://issues.apache.org/jira/browse/HIVE-14948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-14948 started by Eugene Koifman. - > properly handle special characters in identifiers > - > > Key: HIVE-14948 > URL: https://issues.apache.org/jira/browse/HIVE-14948 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14948.01.patch, HIVE-14948.02.patch > > > The treatment of quoted identifiers in HIVE-14943 is inconsistent. Need to > clean this up and if possible only quote those identifiers that need to be > quoted in the generated SQL statement -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735983#comment-15735983 ] Chaoyu Tang commented on HIVE-15410: The failed tests are not related. > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties can have a period (.) or hyphen (-) in their names; > auto.purge is one example. But the WebHCat API does not support either > setting or getting these properties, and it throws the error message "Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch adds support for property names containing a > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735970#comment-15735970 ] Hive QA commented on HIVE-15410: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842591/HIVE-15410.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10792 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=92) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2524/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2524/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2524/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842591 - PreCommit-HIVE-Build > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties could have period (.) or hyphen (-) in their names, > auto.purge is one of the examples. But WebHCat APIs does not support either > set or get these properties, and they throw out the error msg ""Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch is going to add the supports to the property name containing > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
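The "Invalid DDL identifier" error comes from a server-side check on the property name. The sketch below only illustrates the intended change (extending the allowed character set with '.' and '-'); the class, method, and pattern are hypothetical and are not WebHCat's actual validation code.

```java
import java.util.regex.Pattern;

public class PropNameCheck {
    // Illustrative identifier pattern: word characters plus '.' and '-'.
    // Inside a character class, '.' is literal and '-' is escaped.
    private static final Pattern PROP_NAME = Pattern.compile("[\\w.\\-]+");

    static boolean isValidPropertyName(String name) {
        return name != null && PROP_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPropertyName("auto.purge")); // accepted once '.' is allowed
        System.out.println(isValidPropertyName("prop-key2"));  // accepted once '-' is allowed
        System.out.println(isValidPropertyName("bad name"));   // rejected: space not allowed
    }
}
```

With a check like this, the two curl examples in the description ({{prop.key1}}, {{prop.key2}}) would pass the identifier validation instead of returning the error payload.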
[jira] [Commented] (HIVE-15376) Improve heartbeater scheduling for transactions
[ https://issues.apache.org/jira/browse/HIVE-15376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735968#comment-15735968 ] Eugene Koifman commented on HIVE-15376: --- How is the heartbeater going to be started for read only queries? > Improve heartbeater scheduling for transactions > --- > > Key: HIVE-15376 > URL: https://issues.apache.org/jira/browse/HIVE-15376 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-15376.1.patch, HIVE-15376.2.patch, > HIVE-15376.3.patch, HIVE-15376.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views
[ https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735938#comment-15735938 ] Jesus Camacho Rodriguez commented on HIVE-14496: [~ashutoshc], let's try to check this one in. I have addressed the most important comments in the last patch. In particular: - Loading all materialized view definitions for all users when HS2 starts (instead of per session). - Adding just an additional field for rewrite enabled (instead of creating a 'view descriptor'). This greatly simplified the changes in the metastore upgrade scripts. I left for a follow-up: - Extending the rules to match new patterns. > Enable Calcite rewriting with materialized views > > > Key: HIVE-14496 > URL: https://issues.apache.org/jira/browse/HIVE-14496 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, > HIVE-14496.03.patch, HIVE-14496.04.patch, HIVE-14496.patch > > > Calcite already supports query rewriting using materialized views. We will > use it to support this feature in Hive. > In order to do that, we need to register the existing materialized views with > the Calcite view service and enable the materialized view rewriting rules. > We should include a HiveConf flag to completely disable query rewriting using > materialized views if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14496) Enable Calcite rewriting with materialized views
[ https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14496: --- Attachment: HIVE-14496.04.patch > Enable Calcite rewriting with materialized views > > > Key: HIVE-14496 > URL: https://issues.apache.org/jira/browse/HIVE-14496 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, > HIVE-14496.03.patch, HIVE-14496.04.patch, HIVE-14496.patch > > > Calcite already supports query rewriting using materialized views. We will > use it to support this feature in Hive. > In order to do that, we need to register the existing materialized views with > Calcite view service and enable the materialized views rewriting rules. > We should include a HiveConf flag to completely disable query rewriting using > materialized views if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree
[ https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735927#comment-15735927 ] Sergey Shelukhin commented on HIVE-15405: - +1 except for tests; is stats_based_fetch_decision new? > Improve FileUtils.isPathWithinSubtree > - > > Key: HIVE-15405 > URL: https://issues.apache.org/jira/browse/HIVE-15405 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, > HIVE-15405.2.patch > > > When running single node LLAP with the following query multiple times (the flights table had 7000+ partitions), {{FileUtils.isPathWithinSubtree}} > became a hot path. > {noformat} > SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`, > YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok` > FROM `flights` as `flights` > JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`) > JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`) > JOIN `airports` as `dest_airport` ON (`flights`.`dest` = > `dest_airport`.`iata`) > GROUP BY YEAR(`flights`.`dateofflight`); > {noformat} > It would be good to have an early exit in {{FileUtils.isPathWithinSubtree}} > based on path depth comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
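The proposed early exit rests on a simple invariant: a path can only lie within a subtree whose root is at most as deep as the path itself, so comparing component counts is a cheap pre-filter before any string comparison. A standalone sketch of the idea (plain strings here rather than Hadoop {{Path}} objects; this is not Hive's actual {{FileUtils}} code):

```java
public class PathDepth {
    // Depth = number of '/' separators in a normalized absolute path.
    static int depth(String path) {
        int d = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') d++;
        }
        return d;
    }

    // True if 'path' equals 'subtree' or lies beneath it.
    static boolean isPathWithinSubtree(String path, String subtree) {
        // Early exit: a shallower path cannot be inside a deeper subtree.
        if (depth(path) < depth(subtree)) return false;
        if (path.equals(subtree)) return true;
        String prefix = subtree.endsWith("/") ? subtree : subtree + "/";
        return path.startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(isPathWithinSubtree("/warehouse/flights/year=2008/part-0", "/warehouse/flights"));
        System.out.println(isPathWithinSubtree("/warehouse", "/warehouse/flights"));
    }
}
```

When a table has thousands of partition paths checked against the same ancestors, the integer depth comparison rejects most non-matches without touching the characters of either path, which is where the profiled savings would come from.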
[jira] [Commented] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735923#comment-15735923 ] Chaoyu Tang commented on HIVE-15410: [~daijy], [~thejas] you have done quite some work on WebHCat; could you review the patch? Thanks > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties can have a period (.) or hyphen (-) in their names; > auto.purge is one example. But the WebHCat API does not support either > setting or getting these properties, and it throws the error message "Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch adds support for property names containing a > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735908#comment-15735908 ] Sergio Peña commented on HIVE-15385: I deleted the HiveQA comment to allow retriggering the Jenkins job. > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-15385: --- Comment: was deleted (was: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842388/HIVE-15385.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10745 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=143) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=126) [ppd_transform.q,auto_join9.q,auto_join1.q,vector_data_types.q,input14.q,union30.q,input12.q,union_remove_22.q,vectorization_3.q,groupby1_map_nomap.q,cbo_union.q,disable_merge_for_bucketing.q,reduce_deduplicate_exclude_join.q,filter_join_breaktask2.q,join30.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) 
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2498/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2498/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2498/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12842388 - PreCommit-HIVE-Build) > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. 
> However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
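The warn-and-continue behavior the issue argues for can be sketched in a few lines of plain Java. This is an illustrative stand-in, not Hive's actual HdfsUtils code; the FsAction interface is invented here to represent calls like fs.setOwner, fs.setAcl, and fs.setPermission:

```java
import java.io.IOException;
import java.util.logging.Logger;

public class PermissionInheritance {
    private static final Logger LOG = Logger.getLogger("HdfsUtils");

    /** Stand-in for a FileSystem call such as setOwner/setAcl/setPermission. */
    public interface FsAction {
        void run() throws IOException;
    }

    /**
     * The behavior the wiki documents: a failure to inherit permissions is
     * logged as a warning and swallowed, so the query itself still succeeds.
     * Returns true when the underlying call worked.
     */
    public static boolean trySetStatus(FsAction action) {
        try {
            action.run();
            return true;
        } catch (IOException e) {
            LOG.warning("Unable to inherit permissions: " + e.getMessage());
            return false; // caller proceeds; the query is not failed
        }
    }
}
```

With this shape, both the recursive (shell-out) and non-recursive (FileSystem API) paths degrade to a warning, which is the consistent behavior the issue asks for.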
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735896#comment-15735896 ] Sergio Peña commented on HIVE-15385: Thanks [~stakiar] for the explanation. I spent some time investigating the history of this permission issue, and as you mentioned, the places where the IOException is not ignored could have been an accident. Given the history that permission failures should not throw an exception and make the query fail, I agree on just sending a warning to the log inside setFullFileStatus() instead of throwing an exception. This is confusing for people who want to use such a method. +1 I don't think the test failures are related, but could you re-attach the patch to see if the others stop failing? > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API.
So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15410: --- Attachment: HIVE-15410.patch > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties can have a period (.) or hyphen (-) in their names; > auto.purge is one example. But the WebHCat APIs support neither setting nor > getting these properties, and they throw the error message "Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch adds support for property names containing a > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
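A plausible shape for such a fix is to relax the identifier validation so period and hyphen are accepted in property names. Hedged sketch: the pattern names and exact regexes below are illustrative, not WebHCat's actual constants:

```java
import java.util.regex.Pattern;

public class DdlIdentifier {
    // Strict word-character check of the kind that rejects "prop.key1" (illustrative).
    public static final Pattern STRICT = Pattern.compile("^\\w+$");

    // Relaxed check that also admits '.' and '-' after the first character.
    public static final Pattern RELAXED = Pattern.compile("^\\w[\\w.\\-]*$");

    public static boolean isValidPropertyName(String name) {
        return name != null && RELAXED.matcher(name).matches();
    }
}
```

Under this check, `auto.purge` and `prop-key1` pass, while a strict `\w+`-only check rejects them, which reproduces the "Invalid DDL identifier" symptom.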
[jira] [Updated] (HIVE-15410) WebHCat supports get/set table property with its name containing period and hyphen
[ https://issues.apache.org/jira/browse/HIVE-15410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-15410: --- Status: Patch Available (was: Open) > WebHCat supports get/set table property with its name containing period and > hyphen > -- > > Key: HIVE-15410 > URL: https://issues.apache.org/jira/browse/HIVE-15410 > Project: Hive > Issue Type: Improvement >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-15410.patch > > > Hive table properties can have a period (.) or hyphen (-) in their names; > auto.purge is one example. But the WebHCat APIs support neither setting nor > getting these properties, and they throw the error message "Invalid DDL > identifier :property". For example: > {code} > [root@ctang-1 ~]# curl -s > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key1?user.name=hiveuser' > {"error":"Invalid DDL identifier :property"} > [root@ctang-1 ~]# curl -s -X PUT -HContent-type:application/json -d '{ > "value": "true" }' > 'http://ctang-1.gce.cloudera.com:7272/templeton/v1/ddl/database/default/table/sample_07/property/prop.key2?user.name=hiveuser/' > {"error":"Invalid DDL identifier :property"} > {code} > This patch adds support for property names containing a > period and/or hyphen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14496) Enable Calcite rewriting with materialized views
[ https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735842#comment-15735842 ] Hive QA commented on HIVE-14496: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842587/HIVE-14496.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2523/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2523/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2523/ Messages: {noformat} This message was trimmed, see log for full details [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/util/Tool.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-common/2.7.2/hadoop-common-2.7.2.jar(org/apache/hadoop/conf/Configurable.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/ClassNotFoundException.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/curator/curator-framework/2.7.1/curator-framework-2.7.1.jar(org/apache/curator/framework/CuratorFrameworkFactory.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/curator/curator-client/2.7.1/curator-client-2.7.1.jar(org/apache/curator/retry/ExponentialBackoffRetry.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.2/hadoop-mapreduce-client-core-2.7.2.jar(org/apache/hadoop/mapreduce/Mapper.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Iterator.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/LinkedList.class)]] [loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/ExecutorService.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/Executors.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/concurrent/TimeUnit.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.2/hadoop-mapreduce-client-core-2.7.2.jar(org/apache/hadoop/mapreduce/Mapper$Context.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/net/URLDecoder.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Enumeration.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/util/Properties.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/core/UriBuilder.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/apache-github-source-source/common/target/hive-common-2.2.0-SNAPSHOT.jar(org/apache/hadoop/hive/common/LogUtils.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Class.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Annotation.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-annotations/2.7.2/hadoop-annotations-2.7.2.jar(org/apache/hadoop/classification/InterfaceAudience.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/org/apache/hadoop/hadoop-annotations/2.7.2/hadoop-annotations-2.7.2.jar(org/apache/hadoop/classification/InterfaceAudience$LimitedPrivate.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Retention.class)]] [loading 
ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/RetentionPolicy.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/Target.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/annotation/ElementType.class)]] [loading ZipFileIndexFileObject[/data/hiveptest/working/maven/com/sun/jersey/jersey-core/1.14/jersey-core-1.14.jar(javax/ws/rs/HttpMethod.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/SuppressWarnings.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/Override.class)]] [loading ZipFileIndexFileObject[/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(sun/misc/Contended.class)]] [loading
[jira] [Updated] (HIVE-14496) Enable Calcite rewriting with materialized views
[ https://issues.apache.org/jira/browse/HIVE-14496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14496: --- Attachment: HIVE-14496.03.patch > Enable Calcite rewriting with materialized views > > > Key: HIVE-14496 > URL: https://issues.apache.org/jira/browse/HIVE-14496 > Project: Hive > Issue Type: Sub-task > Components: Materialized views >Affects Versions: 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-14496.01.patch, HIVE-14496.02.patch, > HIVE-14496.03.patch, HIVE-14496.patch > > > Calcite already supports query rewriting using materialized views. We will > use it to support this feature in Hive. > In order to do that, we need to register the existing materialized views with > Calcite view service and enable the materialized views rewriting rules. > We should include a HiveConf flag to completely disable query rewriting using > materialized views if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15353) Metastore throws NPE if StorageDescriptor.cols is null
[ https://issues.apache.org/jira/browse/HIVE-15353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735806#comment-15735806 ] Anthony Hsu commented on HIVE-15353: HIVE-15353.3.patch seems to have been tested; the results just weren't auto-posted to this JIRA: https://builds.apache.org/job/PreCommit-HIVE-Build/2502/console. Looks like the PreCommit build is currently failing due to: {noformat} [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] No compiler is provided in this environment. Perhaps you are running on a JRE rather than a JDK? {noformat} > Metastore throws NPE if StorageDescriptor.cols is null > -- > > Key: HIVE-15353 > URL: https://issues.apache.org/jira/browse/HIVE-15353 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0, 2.2.0 >Reporter: Anthony Hsu >Assignee: Anthony Hsu > Attachments: HIVE-15353.1.patch, HIVE-15353.2.patch, > HIVE-15353.3.patch > > > When using the HiveMetaStoreClient API directly to talk to the metastore, you > get NullPointerExceptions when StorageDescriptor.cols is null in the > Table/Partition object in the following calls: > * create_table > * alter_table > * alter_partition > Calling add_partition with StorageDescriptor.cols set to null causes null to > be stored in the metastore database and subsequent calls to alter_partition > for that partition to fail with an NPE. > Null checks should be added to eliminate the NPEs in the metastore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
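The null checks HIVE-15353 asks for amount to guarding every use of {{cols}} before iterating it. A minimal sketch: the StorageDescriptor below is a trimmed stand-in for the Thrift-generated class, not the real one:

```java
import java.util.Collections;
import java.util.List;

public class SdNullGuard {
    /** Trimmed stand-in for the Thrift StorageDescriptor; cols may be null
     *  when a client builds the object without ever setting the field. */
    public static class StorageDescriptor {
        public List<String> cols;
    }

    /** Null-safe accessor of the kind the patch adds before the metastore
     *  iterates or persists the column list. */
    public static List<String> safeCols(StorageDescriptor sd) {
        return (sd == null || sd.cols == null) ? Collections.emptyList() : sd.cols;
    }
}
```

Routing create_table, alter_table, and alter_partition through such an accessor would eliminate the NPE regardless of whether null was passed in the call or read back from the database.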
[jira] [Commented] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning
[ https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735685#comment-15735685 ] Hive QA commented on HIVE-15053: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842555/HIVE-15053.2.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10760 tests executed *Failed tests:* {noformat} TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=143) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,cbo_rp_subq_not_in.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=92) {noformat} 
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2522/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2522/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2522/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12842555 - PreCommit-HIVE-Build > Beeline#addlocaldriver - reduce classpath scanning > -- > > Key: HIVE-15053 > URL: https://issues.apache.org/jira/browse/HIVE-15053 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, > HIVE-15053.1.patch, HIVE-15053.2.patch > > > There is a classpath scanning machinery inside {{ClassNameCompleter}}. > I think the sole purpose of these things is to scan for jdbc drivers...(but > not entirely sure) > if it is indeed looking for jdbc drivers..then possibly this can be removed > without any issues because modern jdbc drivers usually advertise their driver > as a service-loadable class for {{java.sql.Driver}} > http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html > Auto-Loading of JDBC Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
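The JDBC 4 mechanism the HIVE-15053 description refers to is the standard ServiceLoader lookup: any driver jar that ships a META-INF/services/java.sql.Driver entry is discovered without scanning the classpath for candidate class names. A small demonstration (output depends on which driver jars are on the classpath):

```java
import java.sql.Driver;
import java.util.ServiceLoader;

public class DriverDiscovery {
    /** Prints every driver advertised through the java.sql.Driver service
     *  entry - the same mechanism DriverManager uses to auto-load drivers,
     *  which is what makes a classpath-wide scan largely redundant. */
    public static void main(String[] args) {
        for (Driver d : ServiceLoader.load(Driver.class)) {
            System.out.println(d.getClass().getName());
        }
    }
}
```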
[jira] [Commented] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
[ https://issues.apache.org/jira/browse/HIVE-15118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735657#comment-15735657 ] Naveen Gangam commented on HIVE-15118: -- [~aihuaxu] The main schema file changes look good, but they will not work for upgrade scenarios. When you upgrade from Hive 2.1, this table will still exist. Can you also add changes for the upgrade scenario? Thanks > Remove unused 'COLUMNS' table from derby schema > --- > > Key: HIVE-15118 > URL: https://issues.apache.org/jira/browse/HIVE-15118 > Project: Hive > Issue Type: Sub-task > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Attachments: HIVE-15118.1.patch > > > The COLUMNS table is no longer used. Other databases have already removed it. > Remove it from Derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent
[ https://issues.apache.org/jira/browse/HIVE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15392: Attachment: HIVE-15392.2.patch patch-2: minor change (a string output) during the patch submission. > Refactoring the validate function of HiveSchemaTool to make the output > consistent > - > > Key: HIVE-15392 > URL: https://issues.apache.org/jira/browse/HIVE-15392 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15392.1.patch, HIVE-15392.2.patch > > > The validate output is not consistent. Make it more consistent. > {noformat} > Starting metastore validationValidating schema version > Succeeded in schema version validation. > Validating sequence number for SEQUENCE_TABLE > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Validating tables in the schema for version 2.2.0 > Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 > tables > Schema table validation successful > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Validating columns for incorrect NULL values > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > 
Done with metastore validationschemaTool completed > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent
[ https://issues.apache.org/jira/browse/HIVE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-15392: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks Chaoyu for reviewing. > Refactoring the validate function of HiveSchemaTool to make the output > consistent > - > > Key: HIVE-15392 > URL: https://issues.apache.org/jira/browse/HIVE-15392 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-15392.1.patch > > > The validate output is not consistent. Make it more consistent. > {noformat} > Starting metastore validationValidating schema version > Succeeded in schema version validation. > Validating sequence number for SEQUENCE_TABLE > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Validating tables in the schema for version 2.2.0 > Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 > tables > Schema table validation successful > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP > Validating columns for incorrect NULL values > Metastore connection URL: > jdbc:derby:;databaseName=metastore_db;create=true > Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver > Metastore connection User: APP 
> Done with metastore validationschemaTool completed > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15391) Location validation for table should ignore the values for view.
[ https://issues.apache.org/jira/browse/HIVE-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735536#comment-15735536 ] Yongzhi Chen commented on HIVE-15391: - The failures are not related. > Location validation for table should ignore the values for view. > > > Key: HIVE-15391 > URL: https://issues.apache.org/jira/browse/HIVE-15391 > Project: Hive > Issue Type: Sub-task > Components: Beeline >Affects Versions: 2.2.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Minor > Attachments: HIVE-15206.1.patch > > > When using schematool to do location validation, we get error messages for > views, for example: > {noformat} > In DB with Name: viewa > NULL Location for TABLE with Name: viewa > In DB with Name: viewa > NULL Location for TABLE with Name: viewb > In DB with Name: viewa > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl
[ https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735538#comment-15735538 ] Hive QA commented on HIVE-14998: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842552/HIVE-14998.2.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10789 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] (batchId=91) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=228) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2521/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2521/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2521/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842552 - PreCommit-HIVE-Build > Fix and update test: TestPluggableHiveSessionImpl > - > > Key: HIVE-14998 > URL: https://issues.apache.org/jira/browse/HIVE-14998 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch > > > this test either prints an exception to the stdout ... or not - in its > current form it isn't really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735429#comment-15735429 ] Hive QA commented on HIVE-15161: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842551/HIVE-15161.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10783 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2520/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2520/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2520/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842551 - PreCommit-HIVE-Build > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns... this can be addressed; the org.json API was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
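The map-ordering flakiness in the last bullet of HIVE-15161 comes down to emitting JSON with a stable key order; Jackson offers this via its ORDER_MAP_ENTRIES_BY_KEYS serialization feature. The stdlib-only sketch below shows the same idea (class and method names are illustrative, not Hive's actual ColumnStats code):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class StableStatsJson {
    /** Serializes per-column stats with keys sorted, so repeated runs emit
     *  byte-identical JSON - the determinism that flaky q-file tests need. */
    public static String toJson(Map<String, Long> stats) {
        return new TreeMap<>(stats).entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
            .collect(Collectors.joining(",", "{", "}"));
    }
}
```

Insertion order no longer matters: any two maps with the same entries serialize to the same string.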
[jira] [Updated] (HIVE-15053) Beeline#addlocaldriver - reduce classpath scanning
[ https://issues.apache.org/jira/browse/HIVE-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15053: Attachment: HIVE-15053.2.patch rebased patch..i hope nothing is broken ;) patch#2 here is #1 on reviewboard ;) https://reviews.apache.org/r/54585/ > Beeline#addlocaldriver - reduce classpath scanning > -- > > Key: HIVE-15053 > URL: https://issues.apache.org/jira/browse/HIVE-15053 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-15053.1.patch, HIVE-15053.1.patch, > HIVE-15053.1.patch, HIVE-15053.2.patch > > > There is a classpath scanning machinery inside {{ClassNameCompleter}}. > I think the sole purpose of these things is to scan for jdbc drivers...(but > not entirely sure) > if it is indeed looking for jdbc drivers..then possibly this can be removed > without any issues because modern jdbc drivers usually advertise their driver > as a service-loadable class for {{java.sql.Driver}} > http://www.onjava.com/2006/08/02/jjdbc-4-enhancements-in-java-se-6.html > Auto-Loading of JDBC Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15405) Improve FileUtils.isPathWithinSubtree
[ https://issues.apache.org/jira/browse/HIVE-15405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735303#comment-15735303 ] Hive QA commented on HIVE-15405: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842517/HIVE-15405.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10792 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=132) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2519/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2519/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2519/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842517 - PreCommit-HIVE-Build > Improve FileUtils.isPathWithinSubtree > - > > Key: HIVE-15405 > URL: https://issues.apache.org/jira/browse/HIVE-15405 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-15405-profiler.view.png, HIVE-15405.1.patch, > HIVE-15405.2.patch > > > When running single node LLAP with the following query multiple number of > times (flights table had 7000+ partitions) {{FileUtils.isPathWithinSubtree}} > became a hotpath. > {noformat} > SELECT COUNT(`flightnum`) AS `cnt_flightnum_ok`, > YEAR(`flights`.`dateofflight`) AS `yr_flightdate_ok` > FROM `flights` as `flights` > JOIN `airlines` ON (`uniquecarrier` = `airlines`.`code`) > JOIN `airports` as `source_airport` ON (`origin` = `source_airport`.`iata`) > JOIN `airports` as `dest_airport` ON (`flights`.`dest` = > `dest_airport`.`iata`) > GROUP BY YEAR(`flights`.`dateofflight`); > {noformat} > It would be good to have early exit in {{FileUtils.isPathWithinSubtree}} > based on path depth comparison. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
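The early exit HIVE-15405 proposes is cheap to state: a path with fewer components than the candidate parent can never lie within it, so a depth comparison can short-circuit before any component matching. A simplified sketch over plain strings (the real FileUtils method operates on org.apache.hadoop.fs.Path objects):

```java
public class PathWithin {
    /** Counts '/' separators - a stand-in for Path depth. */
    public static int depth(String p) {
        int d = 0;
        for (int i = 0; i < p.length(); i++) {
            if (p.charAt(i) == '/') d++;
        }
        return d;
    }

    /** Early exit on depth before doing any string comparison. */
    public static boolean isPathWithinSubtree(String path, String parent) {
        if (depth(path) < depth(parent)) {
            return false; // the proposed fast path: shallower can't be inside
        }
        String prefix = parent.endsWith("/") ? parent : parent + "/";
        return path.equals(parent) || path.startsWith(prefix);
    }
}
```

With 7000+ partitions each checked against a handful of authorized parents, the depth test rejects most candidates in O(path length) without allocating or comparing components.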
[jira] [Updated] (HIVE-14998) Fix and update test: TestPluggableHiveSessionImpl
[ https://issues.apache.org/jira/browse/HIVE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-14998: Attachment: HIVE-14998.2.patch rebased patch to current master; patch (#2) uploaded to reviewboard (#1): https://reviews.apache.org/r/54584 > Fix and update test: TestPluggableHiveSessionImpl > - > > Key: HIVE-14998 > URL: https://issues.apache.org/jira/browse/HIVE-14998 > Project: Hive > Issue Type: Bug > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-14998.1.patch, HIVE-14998.2.patch > > > this test either prints an exception to stdout ... or not - in its > current form it isn't really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13306) Better Decimal vectorization
[ https://issues.apache.org/jira/browse/HIVE-13306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-13306: -- Description: Decimal Vectorization Requirements • Today, the LongColumnVector, DoubleColumnVector, BytesColumnVector, TimestampColumnVector classes store the data as primitive Java data types long, double, or byte arrays for efficiency. • DecimalColumnVector is different - it has an array of Object references to HiveDecimal objects. • The HiveDecimal object uses an internal object BigDecimal for its implementation. Further, BigDecimal itself uses an internal object BigInteger for its implementation, and BigInteger uses an int array. 4 objects total. • And, HiveDecimal is an immutable object which means arithmetic and other operations produce new HiveDecimal object with 3 new objects underneath. • A major reason Vectorization is fast is the ColumnVector classes except DecimalColumnVector do not have to allocate additional memory per row. This avoids memory fragmentation and pressure on the Java Garbage Collector that DecimalColumnVector can generate. It is very significant. • What can be done with DecimalColumnVector to make it much more efficient? o Design several new decimal classes that allow the caller to manage the decimal storage. o If it takes 2 long values to store a decimal then a new DecimalColumnVector would have a long[] of length 2*1024 (where 1024 is the default column vector size). o Why store a decimal in separate long values? • Java does not support 128 bit integers. • Java does not support unsigned integers. • Int array representation uses smaller memory, but long array representation covers wider value range for fast primitive operations. • But really since we do not have unsigned, really you can only do multiplications on N-1 bits or 63 bits. • So, 2 longs are needed for decimal storage of 38 digits. 
Future works o It makes sense to have just one algorithm for decimals rather than one for HiveDecimal and another for DecimalColumnVector. So, make HiveDecimal store 2 long values, too. o A lower level primitive decimal class would accept decimals stored as long arrays and produces results into long arrays. It would be used by HiveDecimal and DecimalColumnVector. was: Decimal Vectorization Requirements • Today, the LongColumnVector, DoubleColumnVector, BytesColumnVector, TimestampColumnVector classes store the data as primitive Java data types long, double, or byte arrays for efficiency. • DecimalColumnVector is different - it has an array of Object references to HiveDecimal objects. • The HiveDecimal object uses an internal object BigDecimal for its implementation. Further, BigDecimal itself uses an internal object BigInteger for its implementation, and BigInteger uses an int array. 4 objects total. • And, HiveDecimal is an immutable object which means arithmetic and other operations produce new HiveDecimal object with 3 new objects underneath. • A major reason Vectorization is fast is the ColumnVector classes except DecimalColumnVector do not have to allocate additional memory per row. This avoids memory fragmentation and pressure on the Java Garbage Collector that DecimalColumnVector can generate. It is very significant. • What can be done with DecimalColumnVector to make it much more efficient? o Design several new decimal classes that allow the caller to manage the decimal storage. o If it takes N int values to store a decimal (e.g. N=1..5), then a new DecimalColumnVector would have an int[] of length N*1024 (where 1024 is the default column vector size). o Why store a decimal in separate int values? • Java does not support 128 bit integers. • Java does not support unsigned integers. • In order to do multiplication of a decimal represented in a long you need twice the storage (i.e. 128 bits). So you need to represent parts in 32 bit integers. 
• But really since we do not have unsigned, really you can only do multiplications on N-1 bits or 31 bits. • So, 5 ints are needed for decimal storage... of 38 digits. o It makes sense to have just one algorithm for decimals rather than one for HiveDecimal and another for DecimalColumnVector. So, make HiveDecimal store N int values, too. o A lower level primitive decimal class would accept decimals stored as int arrays and produces results into int arrays. It would be used by HiveDecimal and DecimalColumnVector. > Better Decimal vectorization > > > Key: HIVE-13306 > URL: https://issues.apache.org/jira/browse/HIVE-13306 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >
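The two-longs-per-decimal storage described above can be sketched with a hypothetical class (the name, field layout, and method are illustrative, not the actual Hive patch): one flat {{long[]}} of length 2*1024 holds the whole column batch, with the high 64 bits of row {{i}} at index {{2*i}} and the low 64 bits at {{2*i+1}}, so arithmetic never allocates per-row objects.

```java
// Hypothetical sketch of a DecimalColumnVector-style flat layout with two
// longs per 128-bit unscaled value; illustrative only, not the real class.
final class Decimal128Sketch {
    static final int BATCH = 1024; // default column vector size

    // Row i: high half at vector[2*i], low half at vector[2*i+1].
    final long[] vector = new long[2 * BATCH];

    // Add the 128-bit values at rows i and j, writing the result to row k.
    void add(int i, int j, int k) {
        long lo = vector[2 * i + 1] + vector[2 * j + 1];
        // Java has no unsigned 64-bit type, so detect unsigned overflow of
        // the low half explicitly: the sum wrapped iff it is (unsigned)
        // smaller than one of the operands. That overflow carries into the
        // high half.
        long carry = Long.compareUnsigned(lo, vector[2 * i + 1]) < 0 ? 1L : 0L;
        vector[2 * k + 1] = lo;
        vector[2 * k] = vector[2 * i] + vector[2 * j] + carry;
    }
}
```

This is exactly the trade the description is weighing: with {{long}} halves the carry handling is manual (no unsigned types, no 128-bit multiply), but the whole batch lives in one array and the GC pressure from per-row HiveDecimal/BigDecimal/BigInteger objects disappears.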
[jira] [Updated] (HIVE-15161) migrate ColumnStats to use jackson
[ https://issues.apache.org/jira/browse/HIVE-15161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15161: Attachment: HIVE-15161.4.patch re-uploaded patch (a whole month has passed) > migrate ColumnStats to use jackson > -- > > Key: HIVE-15161 > URL: https://issues.apache.org/jira/browse/HIVE-15161 > Project: Hive > Issue Type: Sub-task >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.2.0 > > Attachments: HIVE-15161.1.patch, HIVE-15161.2.patch, > HIVE-15161.3.patch, HIVE-15161.4.patch, HIVE-15161.4.patch > > > * json.org has license issues > * jackson can provide a fully compatible alternative to it > * there are a few flakiness issues caused by the order of the map entries of > the columns...this can be addressed; the org.json api was unfriendly in this > manner ;) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
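The ordering flakiness mentioned in that ticket can be illustrated without either JSON library: if serialization iterates an unordered map, the emitted key order varies between runs and golden-file tests flake. The hypothetical serializer below (class and method names are illustrative, and sorting keys is just one way to get determinism, not necessarily how the patch does it) makes the output stable by construction.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: deterministic serialization of column-stats-like
// entries by copying into a TreeMap, which iterates keys in sorted order.
final class DeterministicStatsSketch {
    static String toJson(Map<String, String> columns) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        // TreeMap iteration order is the natural ordering of the keys,
        // so two runs over equal maps always emit identical strings.
        for (Map.Entry<String, String> e : new TreeMap<>(columns).entrySet()) {
            if (!first) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":\"")
              .append(e.getValue()).append("\"");
            first = false;
        }
        return sb.append("}").toString();
    }
}
```

The point of the migration is that jackson lets the caller control this kind of behavior (ordered containers, configurable serialization features), where the org.json API made it awkward.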
[jira] [Commented] (HIVE-15407) add distcp to classpath by default, because hive depends on it.
[ https://issues.apache.org/jira/browse/HIVE-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735210#comment-15735210 ] Hive QA commented on HIVE-15407: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12842515/HIVE-15407.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10792 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample2] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample4] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample6] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample7] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample9] (batchId=38) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_based_fetch_decision] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=91) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=91) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2518/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2518/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2518/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12842515 - PreCommit-HIVE-Build > add distcp to classpath by default, because hive depends on it. > > > Key: HIVE-15407 > URL: https://issues.apache.org/jira/browse/HIVE-15407 > Project: Hive > Issue Type: Bug > Components: Beeline, CLI >Affects Versions: 2.2.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-15407.1.patch > > > When I run hive queries, I get errors like the following: > java.lang.NoClassDefFoundError: org/apache/hadoop/tools/DistCpOptions > ... > I dug into the code and found that hive depends on distcp, but distcp is not on the > classpath by default. > Adding distcp to the hadoop classpath by default would belong in the hadoop project, > but the hadoop committers will not do that; see the discussion in HADOOP-13865. They > propose resolving this problem in HIVE. > So I add distcp to the classpath in HIVE -- This message was sent by Atlassian JIRA (v6.3.4#6332)
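The NoClassDefFoundError in that report comes down to whether the distcp jar is visible to the JVM that runs Hive. A hypothetical probe (illustrative only, not part of the patch; the fix itself just puts the jar on the classpath Hive starts with) shows the dependency:

```java
// Hypothetical sketch: check whether a class is loadable from the current
// classpath, which is exactly what decides between working code and the
// NoClassDefFoundError quoted in the ticket.
final class ClasspathProbe {
    static boolean onClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Run against "org.apache.hadoop.tools.DistCpOptions" on a stock Hive install, this returns false, which is why the failure only surfaces at runtime, on the first query that reaches the distcp-dependent code path.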
[jira] [Updated] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file
[ https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-15182: Release Note: (was: After I've started working on this it turned out, that the problem can't be addressed like this... The reason behind these "code too large" issues are that antlr generates a bunch of things to try to prevail in "hard-to-decide which rule will match this one" situations. ) > Move 'clause' rules from IdentifierParser to a different file > - > > Key: HIVE-15182 > URL: https://issues.apache.org/jira/browse/HIVE-15182 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > > I'm hitting antlr / code too large errors ; and these rules belong to a > different class than the other. > Moving them to a separate file greatly reduces generated IdentifierParser > size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file
[ https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15735205#comment-15735205 ] Zoltan Haindrich commented on HIVE-15182: - After I started working on this, it turned out that the problem can't be addressed like this... The reason behind these "code too large" issues is that antlr generates a bunch of things to try to prevail in "hard-to-decide which rule will match this one" situations. > Move 'clause' rules from IdentifierParser to a different file > - > > Key: HIVE-15182 > URL: https://issues.apache.org/jira/browse/HIVE-15182 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > > I'm hitting antlr / code too large errors; and these rules belong to a > different class than the others. > Moving them to a separate file greatly reduces the generated IdentifierParser > size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-15182) Move 'clause' rules from IdentifierParser to a different file
[ https://issues.apache.org/jira/browse/HIVE-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-15182. - Resolution: Fixed Release Note: After I started working on this, it turned out that the problem can't be addressed like this... The reason behind these "code too large" issues is that antlr generates a bunch of things to try to prevail in "hard-to-decide which rule will match this one" situations. > Move 'clause' rules from IdentifierParser to a different file > - > > Key: HIVE-15182 > URL: https://issues.apache.org/jira/browse/HIVE-15182 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > > I'm hitting antlr / code too large errors; and these rules belong to a > different class than the others. > Moving them to a separate file greatly reduces the generated IdentifierParser > size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-1478) Non-boolean expression in WHERE should be rejected
[ https://issues.apache.org/jira/browse/HIVE-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-1478: --- Resolution: Won't Fix Status: Resolved (was: Patch Available) > Non-boolean expression in WHERE should be rejected > -- > > Key: HIVE-1478 > URL: https://issues.apache.org/jira/browse/HIVE-1478 > Project: Hive > Issue Type: Bug >Affects Versions: 0.7.0 >Reporter: Paul Yang >Assignee: Zoltan Haindrich >Priority: Minor > Attachments: HIVE-1478.1.patch, HIVE-1478.2.patch > > > Automatically casting strings or other types into boolean may confuse even > the user - and somehow it doesn't always work (HIVE-15089) > sql2011 states that "where expression" should accept a boolean expression. > Original reported problem: > If the expression in the where clause does not evaluate to a boolean, the job > will fail with the following exception in the task logs: > Query: > SELECT key FROM src WHERE 1; > Exception in mapper: > 2010-07-21 17:00:31,460 FATAL ExecMapper: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row {"key":"238","value":"val_238"} > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:417) > at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:180) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) > at org.apache.hadoop.mapred.Child.main(Child.java:159) > Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to > java.lang.Boolean > at > org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:84) > at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:45) > at 
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697) > at > org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:400) > ... 5 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
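The tail of the quoted stack trace is a plain ClassCastException raised when the filter casts whatever the WHERE expression evaluated to into a Boolean. A standalone illustration of that failure mode (the class name is hypothetical; the real cast happens inside FilterOperator.processOp):

```java
// Hypothetical sketch of the FilterOperator failure mode: the operator
// assumes the predicate evaluated to a Boolean, so an Integer-valued
// predicate like "WHERE 1" blows up at the cast, per row, at runtime --
// rather than being rejected at compile time as the ticket proposes.
final class NonBooleanWhereSketch {
    static boolean castFails(Object predicateResult) {
        try {
            Boolean ignored = (Boolean) predicateResult;
            return false;
        } catch (ClassCastException e) {
            return true; // java.lang.Integer cannot be cast to java.lang.Boolean
        }
    }
}
```

Rejecting non-boolean WHERE expressions during semantic analysis, as SQL:2011 requires, turns this per-row runtime crash into a clear compile-time error.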