[jira] [Assigned] (HIVE-21564) Load data into a bucketed table is ignoring partitions specs and loading data into default partition.
[ https://issues.apache.org/jira/browse/HIVE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reassigned HIVE-21564:
---
> Load data into a bucketed table is ignoring partitions specs and loading data
> into default partition.
>
> Key: HIVE-21564
> URL: https://issues.apache.org/jira/browse/HIVE-21564
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 4.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
>
> When running the command below to load data into a bucketed table, the data
> is not loaded into the specified partition; it lands in the default
> partition instead.
> {code}
> LOAD DATA INPATH '/tmp/files/00_0' OVERWRITE INTO TABLE call
> PARTITION(year_partition=2012, month=12);
> SELECT * FROM call WHERE year_partition=2012 AND month=12; --> returns 0 rows.
> {code}
> The table is defined as:
> {code}
> CREATE TABLE call(
>   date_time_date date,
>   ssn string,
>   name string,
>   location string)
> PARTITIONED BY (
>   year_partition int,
>   month int)
> CLUSTERED BY (date_time_date)
> SORTED BY (date_time_date ASC)
> INTO 1 BUCKETS
> STORED AS ORC;
> {code}
> If hive.exec.dynamic.partition is set to false, the load fails with the
> error below.
> {code}
> Error: Error while compiling statement: FAILED: SemanticException 1:18
> Dynamic partition is disabled. Either enable it by setting
> hive.exec.dynamic.partition=true or specify partition column values. Error
> encountered near token 'month' (state=42000,code=4)
> {code}
> With "set hive.strict.checks.bucketing=false;", the load works fine.
> This behaviour was imposed by HIVE-15148 to avoid incorrectly named data
> files being loaded into bucketed tables. In the customer's use case, if the
> files are named properly with a bucket id (0_0, 0_1, etc.), then it is safe
> to set this flag to false.
> However, the current behaviour of loading into the default partition when
> hive.strict.checks.bucketing=true and partitions are specified is a bug
> introduced by HIVE-19311, where the given query is rewritten into an INSERT
> query (to handle incorrect file names and ORC versions) but the rewrite
> fails to carry the partition spec over.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
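[Editor's sketch] For readers reproducing this report, a minimal sketch of the workaround described above, assuming the `call` table from the issue and a data file whose name already follows the bucket-id convention (0_0 etc.); this relaxes the check rather than fixing the rewrite bug:

```sql
-- Workaround sketch, not the fix: with the strict bucketing check relaxed,
-- the LOAD is no longer rewritten into the INSERT path that drops the
-- partition spec.
SET hive.strict.checks.bucketing=false;

LOAD DATA INPATH '/tmp/files/00_0' OVERWRITE INTO TABLE call
PARTITION (year_partition=2012, month=12);

-- The rows should now land in the requested partition:
SELECT * FROM call WHERE year_partition=2012 AND month=12;
```

Per the report, this is only safe when the input files are correctly bucket-named; otherwise the HIVE-15148 check exists precisely to reject them.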
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539:
---
Attachment: HIVE-21539.2.patch

> GroupBy + where clause on same column results in incorrect query rewrite
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 4.0.0
> Reporter: anishek
> Assignee: Vineet Garg
> Priority: Major
> Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch
>
> {code}
> create table a (i int, j string);
> insert into a values (1, 'a'), (2, 'b');
> explain extended select min(j) from a where j='a' group by j;
> {code}
> {noformat}
> OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`
> FROM `default`.`a`
> WHERE `j` = 'a'
> GROUP BY TRUE
>
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       DagId: anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>       DagName: anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11
>       Vertices:
>         Map 1
>           Map Operator Tree:
>             TableScan
>               alias: a
>               filterExpr: (j = 'a') (type: boolean)
>               Statistics: Num rows: 2 Data size: 170 Basic stats: COMPLETE Column stats: COMPLETE
>               GatherStats: false
>               Filter Operator
>                 isSamplingPred: false
>                 predicate: (j = 'a') (type: boolean)
>                 Statistics: Num rows: 1 Data size: 85 Basic stats: COMPLETE Column stats: COMPLETE
>                 Select Operator
>                   Statistics: Num rows: 1 Data size: 85 Basic stats: COMPLETE Column stats: COMPLETE
>                   Group By Operator
>                     aggregations: min(true)
>                     keys: true (type: boolean)
>                     mode: hash
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: boolean)
>                       null sort order: a
>                       sort order: +
>                       Map-reduce partition columns: _col0 (type: boolean)
>                       Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                       tag: -1
>                       value expressions: _col1 (type: boolean)
>                       auto parallelism: true
>           Path -> Alias:
>             hdfs://localhost:9000/tmp/hive/warehouse/a [a]
>           Path -> Partition:
>             hdfs://localhost:9000/tmp/hive/warehouse/a
>               Partition
>                 base file name: a
>                 input format: org.apache.hadoop.mapred.TextInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                 properties:
>                   COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}}
>                   bucket_count -1
>                   bucketing_version 2
>                   column.name.delimiter ,
>                   columns i,j
>                   columns.comments
> {noformat}
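[Editor's sketch] To make the miscompilation concrete, here is the expected behaviour versus what the rewrite produces, on the two-row table from the report (the second statement is a user-SQL rendering of the internal rewritten form, written out only for illustration; expected results are stated from the data, not from a cluster run):

```sql
-- Expected: the filter j = 'a' leaves one row, one group, so min(j)
-- should be the string 'a'.
select min(j) from a where j = 'a' group by j;

-- What the CBO rewrite compiles to: both the grouping key and the
-- aggregated column are folded to the constant TRUE, so the query
-- aggregates a boolean instead of the column j.
select min(true) from a where j = 'a' group by true;
```

The constant-folding is only valid for the grouping key (inside the group, j is provably 'a'); folding the argument of min() changes the result type and value.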
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539:
---
Status: Patch Available (was: Open)

> GroupBy + where clause on same column results in incorrect query rewrite
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539:
---
Status: Open (was: Patch Available)

> GroupBy + where clause on same column results in incorrect query rewrite
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch
[jira] [Assigned] (HIVE-21563) Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
[ https://issues.apache.org/jira/browse/HIVE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned HIVE-21563:
---
> Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
>
> Key: HIVE-21563
> URL: https://issues.apache.org/jira/browse/HIVE-21563
> Project: Hive
> Issue Type: Improvement
> Reporter: Yuming Wang
> Assignee: Yuming Wang
> Priority: Major
>
> We do not need to run registerAllFunctionsOnce for {{Table#getEmptyTable}}.
> The stack trace:
> {noformat}
> at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177)
> at org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170)
> at org.apache.hadoop.hive.ql.exec.FunctionRegistry.<clinit>(FunctionRegistry.java:209)
> at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:247)
> at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
> at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:388)
> at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
> at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
> at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
> at org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:913)
> at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:877)
> at org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1479)
> at org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1150)
> at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:180)
> {noformat}
[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807409#comment-16807409 ] Hive QA commented on HIVE-9995:
---
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964487/HIVE-9995.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 15890 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16811/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16811/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16811/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}
This message is automatically generated.

ATTACHMENT ID: 12964487 - PreCommit-HIVE-Build

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch,
> HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.06.patch,
> HIVE-9995.07.patch, HIVE-9995.08.patch, HIVE-9995.09.patch,
> HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
> Consider TestWorker.minorWithOpenInMiddle().
> Since there is an open txnId=23, this doesn't have any meaningful minor
> compaction work to do. The system still tries to compact a single delta file
> for the 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r-- 1 ekoifman staff 602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r-- 1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r-- 1 ekoifman staff 514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r-- 1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r-- 1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r-- 1 ekoifman staff 500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x - ekoifman staff   0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r-- 1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_
[jira] [Commented] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807348#comment-16807348 ] Hive QA commented on HIVE-21539:
---
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 33s | master passed |
| +1 | compile | 1m 14s | master passed |
| +1 | checkstyle | 0m 44s | master passed |
| 0 | findbugs | 4m 28s | ql in master has 2258 extant Findbugs warnings. |
| +1 | javadoc | 1m 5s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 38s | the patch passed |
| +1 | compile | 1m 13s | the patch passed |
| +1 | javac | 1m 13s | the patch passed |
| -1 | checkstyle | 0m 45s | ql: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 42s | the patch passed |
| +1 | javadoc | 1m 9s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 27m 19s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16807/dev-support/hive-personality.sh |
| git revision | master / 2111c01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16807/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16807/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> GroupBy + where clause on same column results in incorrect query rewrite
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Attachments: HIVE-21539.1.patch
[jira] [Commented] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807370#comment-16807370 ] Hive QA commented on HIVE-21560:
---
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964482/HIVE-21560.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16808/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16808/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16808/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:09.551
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-16808/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:09.554
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 2111c01 HIVE-21537: Scalar query rewrite could be improved to not generate an extra join if subquery is guaranteed to produce atmost one row (Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 2111c01 HIVE-21537: Scalar query rewrite could be improved to not generate an extra join if subquery is guaranteed to produce atmost one row (Vineet Garg, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-04-02 02:40:10.213
+ rm -rf ../yetus_PreCommit-HIVE-Build-16808
+ mkdir ../yetus_PreCommit-HIVE-Build-16808
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-16808
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16808/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/metastore/scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not exist in index
error: a/standalone-metastore/metastore-server/src/main/resources/package.jdo: does not exist in index
error: a/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-3.2.0.derby.sql: does not exist in index
error: a/standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql: does not exist in index
error: a/standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.1.0-to-3.2.0.derby.sql: does not exist in index
error: metastore/scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not exist in index
error: scripts/upgrade/derby/058-HIVE-21560.derby.sql: does not exist in index
error: metastore-server/src/main/resources/package.jdo: does not exist in index
error: metastore-server/src/main/sql/derby/hive-schema-3.2.0.derby.sql: does not exist in index
error: metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql: does not exist in index
error: metastore-server/src/main/sql/derby/upgrade-3.1.0-to-3.2.0.derby.sql: does not exist in index
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-16808
+ exit 1
'
{noformat}
This message is automatically generated.

ATTACHMENT ID: 12964482 - PreCommit-HIVE-Build

> Update Derby DDL to use CLOB instead of LONG VARCHAR
>
> Key: HIVE-21560
> URL: https://issues.apache.org/jira/browse/HIVE-21560
> Project: Hive
> Issue Type: Bug
> Reporter: Shawn Weeks
> Assignee: Rajkumar Singh
> Priority: Minor
> Attachments: HIVE-21560.patch
>
> in the
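[Editor's sketch] For context on what "CLOB instead of LONG VARCHAR" means in Derby DDL, here is an illustrative migration of the kind the patch targets. The table/column names below (TABLE_PARAMS, PARAM_VALUE, and the temporary column) are placeholders for illustration, not taken from the patch; Derby's LONG VARCHAR is capped at 32,700 characters, while CLOB can hold up to 2 GB and can be defined with an explicit length:

```sql
-- Illustrative Derby migration from LONG VARCHAR to CLOB
-- (assumed, placeholder names; Derby has no ALTER COLUMN ... SET DATA TYPE
-- for this conversion, so copy through a new column):
ALTER TABLE "TABLE_PARAMS" ADD COLUMN "PARAM_VALUE_NEW" CLOB;
UPDATE "TABLE_PARAMS" SET "PARAM_VALUE_NEW" = "PARAM_VALUE";
ALTER TABLE "TABLE_PARAMS" DROP COLUMN "PARAM_VALUE";
RENAME COLUMN "TABLE_PARAMS"."PARAM_VALUE_NEW" TO "PARAM_VALUE";
```

The new-schema scripts would instead simply declare the affected columns as CLOB from the start.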
[jira] [Commented] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807369#comment-16807369 ] Hive QA commented on HIVE-21539:
---
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12964480/HIVE-21539.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15861 tests executed

*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=171)
[authorization_view_8.q,load_dyn_part5.q,vector_groupby_grouping_sets5.q,vector_complex_join.q,orc_llap.q,vectorization_7.q,cbo_gby.q,vectorized_dynamic_semijoin_reduction2.q,bucket_num_reducers_acid2.q,schema_evol_orc_vec_table.q,auto_sortmerge_join_1.q,results_cache_empty_result.q,lineage3.q,materialized_view_rewrite_empty.q,q93_with_constraints.q,vector_struct_in.q,bucketmapjoin3.q,vectorization_16.q,orc_ppd_schema_evol_2a.q,partition_ctas.q,vector_windowing_multipartitioning.q,vectorized_join46.q,orc_ppd_date.q,create_merge_compressed.q,vector_outer_join1.q,dynpart_sort_optimization_acid.q,vectorization_not.q,having.q,vector_topnkey.q,special_character_in_tabnames_1.q]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16807/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16807/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16807/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}
This message is automatically generated.

ATTACHMENT ID: 12964480 - PreCommit-HIVE-Build

> GroupBy + where clause on same column results in incorrect query rewrite
>
> Key: HIVE-21539
> URL: https://issues.apache.org/jira/browse/HIVE-21539
> Attachments: HIVE-21539.1.patch
[jira] [Comment Edited] (HIVE-21166) Keyword as column name in DBS table of Hive metastore
[ https://issues.apache.org/jira/browse/HIVE-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807359#comment-16807359 ] Jianguo Tian edited comment on HIVE-21166 at 4/2/19 2:22 AM:
---
You can query the DESC column like this:
{code:sql}
select `DESC` from DBS limit 10;
{code}

was (Author: jonnyr):
You can query DESC column like this:
{code:java}
// select `DESC` from DBS limit 10;
{code}

> Keyword as column name in DBS table of Hive metastore
>
> Key: HIVE-21166
> URL: https://issues.apache.org/jira/browse/HIVE-21166
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Vamsi UCSS
> Priority: Blocker
>
> The table "DBS" in the Hive schema (metastore) has a column called "DESC",
> which is a Hive keyword. This causes any query on this table to fail with a
> syntax error.
[jira] [Commented] (HIVE-21166) Keyword as column name in DBS table of Hive metastore
[ https://issues.apache.org/jira/browse/HIVE-21166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807359#comment-16807359 ] Jianguo Tian commented on HIVE-21166:
---
You can query DESC column like this:
{code:java}
// select `DESC` from DBS limit 10;
{code}

> Keyword as column name in DBS table of Hive metastore
>
> Key: HIVE-21166
> URL: https://issues.apache.org/jira/browse/HIVE-21166
[jira] [Commented] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild
[ https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807382#comment-16807382 ] Hive QA commented on HIVE-20382:
---
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 2m 5s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 30s | master passed |
| +1 | compile | 1m 34s | master passed |
| +1 | checkstyle | 1m 2s | master passed |
| 0 | findbugs | 0m 38s | common in master has 63 extant Findbugs warnings. |
| 0 | findbugs | 4m 26s | ql in master has 2258 extant Findbugs warnings. |
| +1 | javadoc | 1m 23s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 29s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 4s | the patch passed |
| +1 | compile | 1m 37s | the patch passed |
| +1 | javac | 1m 37s | the patch passed |
| -1 | checkstyle | 0m 46s | ql: The patch generated 6 new + 147 unchanged - 2 fixed = 153 total (was 149) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | findbugs | 4m 43s | ql generated 1 new + 2258 unchanged - 0 fixed = 2259 total (was 2258) |
| +1 | javadoc | 1m 21s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 31m 46s | |

|| Reason || Tests ||
| FindBugs | module:ql |
| | org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewRule.MATERIALIZED_VIEW_REWRITING_RULES is a mutable array At HiveMaterializedViewRule.java:[line 105] |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16810/dev-support/hive-personality.sh |
| git revision | master / 2111c01 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus/new-findbugs-ql.html |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16810/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Materialized views: Introduce heuristic to favour incremental rebuild
>
> Key: HIVE-20382
> URL: https://issues.apache.org/jira/browse/HIVE-20382
> Project: Hive
> Issue Type: Improvement
> Components: Materialized views
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-20382.01.patch, HIVE-20382.patch, HIVE-20382.patch
>
> Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer
> (this should be fixed by HIVE-20313). Even
[jira] [Updated] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21560: -- Attachment: HIVE-21560.01.patch Status: Patch Available (was: Open) > Update Derby DDL to use CLOB instead of LONG VARCHAR > > > Key: HIVE-21560 > URL: https://issues.apache.org/jira/browse/HIVE-21560 > Project: Hive > Issue Type: Bug >Reporter: Shawn Weeks >Assignee: Rajkumar Singh >Priority: Minor > Attachments: HIVE-21560.01.patch, HIVE-21560.patch > > > In the Hive 1.x and 2.x metastore schema for Derby, there are two columns in > "TBLS" that are set to LONG VARCHAR. This causes larger CREATE VIEW > statements to fail when using the embedded metastore for testing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21560) Update Derby DDL to use CLOB instead of LONG VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-21560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh updated HIVE-21560: -- Status: Open (was: Patch Available) > Update Derby DDL to use CLOB instead of LONG VARCHAR > > > Key: HIVE-21560 > URL: https://issues.apache.org/jira/browse/HIVE-21560 > Project: Hive > Issue Type: Bug >Reporter: Shawn Weeks >Assignee: Rajkumar Singh >Priority: Minor > Attachments: HIVE-21560.01.patch, HIVE-21560.patch > > > In the Hive 1.x and 2.x metastore schema for Derby, there are two columns in > "TBLS" that are set to LONG VARCHAR. This causes larger CREATE VIEW > statements to fail when using the embedded metastore for testing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21499) should not remove the function from registry if create command failed with AlreadyExistsException
[ https://issues.apache.org/jira/browse/HIVE-21499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-21499: Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Rajkumar! > should not remove the function from registry if create command failed with > AlreadyExistsException > - > > Key: HIVE-21499 > URL: https://issues.apache.org/jira/browse/HIVE-21499 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.0 > Environment: Hive-3.1 >Reporter: Rajkumar Singh >Assignee: Rajkumar Singh >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-21499.01.patch, HIVE-21499.02.patch, > HIVE-21499.patch > > > As part of HIVE-20953, we remove the function from the registry if its creation > fails for any reason, which yields the following situation: > 1. CREATE FUNCTION fails because the function already exists > 2. on the failure in #1, Hive clears the permanent function from the registry > 3. the function is then unusable until HiveServer2 is restarted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
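The three-step failure mode above can be sketched in plain Java. This is a hypothetical in-memory stand-in, not Hive's real Registry or metastore API; the class and method names are invented for illustration. The point is the fix: roll back the registry entry only when creation fails for a reason other than the function already existing.

```java
import java.util.HashMap;
import java.util.Map;

public class RegistryRollbackDemo {
    static class AlreadyExistsException extends RuntimeException {}

    // Hypothetical in-memory stand-in for HiveServer2's function registry.
    static final Map<String, String> registry = new HashMap<>();
    // Hypothetical stand-in for the metastore's persistent function store.
    static final Map<String, String> metastore = new HashMap<>();

    static void createFunction(String name, String className) {
        registry.put(name, className); // registered up front, as HIVE-20953 does
        try {
            if (metastore.containsKey(name)) {
                throw new AlreadyExistsException();
            }
            metastore.put(name, className);
        } catch (AlreadyExistsException e) {
            // HIVE-21499: the function genuinely exists, so the registry entry
            // is still valid -- keep it and just report the failure.
            throw e;
        } catch (RuntimeException e) {
            registry.remove(name); // roll back only on real creation failures
            throw e;
        }
    }

    public static void main(String[] args) {
        // A function that already exists in both the metastore and the registry.
        metastore.put("my_udf", "com.example.MyUdf");
        registry.put("my_udf", "com.example.MyUdf");
        try {
            createFunction("my_udf", "com.example.MyUdf");
        } catch (AlreadyExistsException expected) {
            // Before the fix, the registry entry would already be gone here.
        }
        System.out.println(registry.containsKey("my_udf")); // still usable
    }
}
```

With the pre-fix behavior (removing on any failure), the final check would print false and the function would stay broken until restart.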
[jira] [Updated] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild
[ https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-20382: --- Attachment: HIVE-20382.02.patch > Materialized views: Introduce heuristic to favour incremental rebuild > - > > Key: HIVE-20382 > URL: https://issues.apache.org/jira/browse/HIVE-20382 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20382.01.patch, HIVE-20382.02.patch, > HIVE-20382.patch, HIVE-20382.patch > > > Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer > (this should be fixed by HIVE-20313). Even if we did, we always assume > uniform distribution of the column values, which can easily lead to > overestimations on the number of rows read when we filter on > ROW\_\_ID.writeId for materialized views (think about a large transaction for > MV creation and then small ones for incremental maintenance). This > overestimation can lead to incremental view maintenance not being triggered > as cost of the incremental plan is overestimated (we think we will read more > rows than we actually do). This could be fixed by introducing histograms that > reflect better the column values distribution. > Till both fixes are implemented, we will use a config variable that will > multiply the estimated cost of the rebuild plan and hence will be able to > favour incremental rebuild over full rebuild. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807402#comment-16807402 ] Hive QA commented on HIVE-9995: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 9s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 28s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green} ql: The patch generated 0 new + 823 unchanged - 11 fixed = 823 total (was 834) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16811/dev-support/hive-personality.sh | | git revision | master / 5bf5d14 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16811/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > ACID compaction tries to compact a single file > -- > > Key: HIVE-9995 > URL: https://issues.apache.org/jira/browse/HIVE-9995 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Denys Kuzmenko >Priority: Major > Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, > HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, > HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, > HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch > > > Consider TestWorker.minorWithOpenInMiddle() > since there is an open txnId=23, this doesn't have any meaningful minor > compaction work to do. 
The system still tries to compact a single delta file > for 21-22 id range, and effectively copies the file onto itself. > This is 1. inefficient and 2. can potentially affect a reader. > (from a real cluster) > Suppose we start with > {noformat} > drwxr-xr-x - ekoifman staff 0 2016-06-09 16:03 > /user/hive/warehouse/t/base_016 > -rw-r--r-- 1 ekoifman staff602 2016-06-09 16:03 > /user/hive/warehouse/t/base_016/bucket_0 > drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 > /user/hive/warehouse/t/base_017 > -rw-r--r-- 1 ekoifman staff588 2016-06-09 16:07 > /user/hive/warehouse/t/base_017/bucket_0 > drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 > /user/hive/warehouse/t/delta_017_017_ > -rw-r--r-- 1 ekoifman staff514
[jira] [Updated] (HIVE-21563) Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce
[ https://issues.apache.org/jira/browse/HIVE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated HIVE-21563: --- Attachment: HIVE-21563.001.patch Status: Patch Available (was: Open) > Improve Table#getEmptyTable performance by disable registerAllFunctionsOnce > --- > > Key: HIVE-21563 > URL: https://issues.apache.org/jira/browse/HIVE-21563 > Project: Hive > Issue Type: Improvement >Reporter: Yuming Wang >Assignee: Yuming Wang >Priority: Major > Attachments: HIVE-21563.001.patch > > > We do not need registerAllFunctionsOnce when {{Table#getEmptyTable}}. The > stack trace: > {noformat} > at > org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:177) > at > org.apache.hadoop.hive.ql.exec.Registry.registerGenericUDF(Registry.java:170) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.(FunctionRegistry.java:209) > at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:247) > at > org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231) > at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:388) > at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332) > at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312) > at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288) > at > org.apache.hadoop.hive.ql.session.SessionState.setAuthorizerV2Config(SessionState.java:913) > at > org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:877) > at > org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:1479) > at > org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:1150) > at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:180) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum
[ https://issues.apache.org/jira/browse/HIVE-21561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807415#comment-16807415 ] Hive QA commented on HIVE-21561: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 2m 48s{color} | {color:blue} standalone-metastore/metastore-common in master has 29 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 19m 54s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16813/dev-support/hive-personality.sh | | git revision | master / 5bf5d14 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: standalone-metastore/metastore-common U: standalone-metastore/metastore-common | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16813/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Revert removal of TableType.INDEX_TABLE enum > > > Key: HIVE-21561 > URL: https://issues.apache.org/jira/browse/HIVE-21561 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Attachments: HIVE-21561.1.patch, HIVE-21561.2.patch > > > Index tables have been removed from Hive as of HIVE-18715. > However, in case users still have index tables defined in the metastore, we > should keep the TableType.INDEX_TABLE enum around so that users can drop > these tables. Without the enum defined Hive cannot do anything with them as > it fails with IllegalArgumentException errors when trying to call > TableType.valueOf() on INDEX_TABLE. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
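The HIVE-21561 failure mode is easy to reproduce with a self-contained sketch. The enum below is a hypothetical mirror of the metastore's TableType, not the real class; it shows why the constant must stay: Enum.valueOf throws IllegalArgumentException for any stored string with no matching constant, so legacy INDEX_TABLE rows become untouchable once the constant is removed.

```java
public class TableTypeDemo {
    // Hypothetical mirror of the metastore TableType enum (not Hive's real class).
    enum TableType { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW, INDEX_TABLE }

    static boolean resolvable(String stored) {
        try {
            TableType.valueOf(stored);
            return true;
        } catch (IllegalArgumentException e) {
            // This is what happens to every operation on a table whose stored
            // type string no longer has a matching enum constant.
            return false;
        }
    }

    public static void main(String[] args) {
        // With INDEX_TABLE kept in the enum, legacy metastore rows resolve
        // and the tables can at least be dropped.
        System.out.println(resolvable("INDEX_TABLE"));
        // A string with no matching constant reproduces the reported failure.
        System.out.println(resolvable("SOME_REMOVED_TYPE"));
    }
}
```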
[jira] [Commented] (HIVE-20382) Materialized views: Introduce heuristic to favour incremental rebuild
[ https://issues.apache.org/jira/browse/HIVE-20382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807392#comment-16807392 ] Hive QA commented on HIVE-20382: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964486/HIVE-20382.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15890 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv] (batchId=195) org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMetastoreTablesCleanup (batchId=327) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16810/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16810/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16810/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12964486 - PreCommit-HIVE-Build > Materialized views: Introduce heuristic to favour incremental rebuild > - > > Key: HIVE-20382 > URL: https://issues.apache.org/jira/browse/HIVE-20382 > Project: Hive > Issue Type: Improvement > Components: Materialized views >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-20382.01.patch, HIVE-20382.patch, HIVE-20382.patch > > > Currently, we do not expose stats over ROW\_\_ID.writeId to the optimizer > (this should be fixed by HIVE-20313). 
Even if we did, we always assume > uniform distribution of the column values, which can easily lead to > overestimations on the number of rows read when we filter on > ROW\_\_ID.writeId for materialized views (think about a large transaction for > MV creation and then small ones for incremental maintenance). This > overestimation can lead to incremental view maintenance not being triggered > as cost of the incremental plan is overestimated (we think we will read more > rows than we actually do). This could be fixed by introducing histograms that > reflect better the column values distribution. > Till both fixes are implemented, we will use a config variable that will > multiply the estimated cost of the rebuild plan and hence will be able to > favour incremental rebuild over full rebuild. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum
[ https://issues.apache.org/jira/browse/HIVE-21561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-21561: -- Attachment: HIVE-21561.2.patch > Revert removal of TableType.INDEX_TABLE enum > > > Key: HIVE-21561 > URL: https://issues.apache.org/jira/browse/HIVE-21561 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere >Priority: Major > Attachments: HIVE-21561.1.patch, HIVE-21561.2.patch > > > Index tables have been removed from Hive as of HIVE-18715. > However, in case users still have index tables defined in the metastore, we > should keep the TableType.INDEX_TABLE enum around so that users can drop > these tables. Without the enum defined Hive cannot do anything with them as > it fails with IllegalArgumentException errors when trying to call > TableType.valueOf() on INDEX_TABLE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path
[ https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807321#comment-16807321 ] Xinli Shang commented on HIVE-14836: It seems there are merge errors. Is this something you are going to fix? > Test the predicate pushing down support for Parquet vectorization read path > --- > > Key: HIVE-14836 > URL: https://issues.apache.org/jira/browse/HIVE-14836 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu >Priority: Major > Labels: pull-request-available > Attachments: HIVE-14836.patch > > > We should add more unit tests for predicate pushdown support on the Parquet > vectorized read path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21558) Query based compaction fails if the temporary FS is different than the table FS
[ https://issues.apache.org/jira/browse/HIVE-21558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807322#comment-16807322 ] Hive QA commented on HIVE-21558: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 33s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 30s{color} | {color:blue} ql in master has 2258 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16806/dev-support/hive-personality.sh | | git revision | master / 2111c01 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16806/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Query based compaction fails if the temporary FS is different than the table > FS > --- > > Key: HIVE-21558 > URL: https://issues.apache.org/jira/browse/HIVE-21558 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-21558.02.patch, HIVE-21558.patch > > > The Exception I got is like this: > {code:java} > 2019-04-01T13:45:44,035 ERROR [PeterVary-MBP15.local-33] compactor.Worker: > Caught exception while trying to compact > id:24,dbname:default,tableName:acid,partName:null,state:,type:MAJOR,properties:null,runAs:petervary,tooManyAborts:false,highestWriteId:9. 
> Marking failed to avoid repeated failures, > java.lang.IllegalArgumentException: Wrong FS: > pfile:/Users/petervary/data/apache/hive/warehouse/acid/base_009_v284/bucket_0, > expected: file:/// > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) > at > org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86) > at > org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631) > at > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454) > at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1768) > at >
[jira] [Commented] (HIVE-21558) Query based compaction fails if the temporary FS is different than the table FS
[ https://issues.apache.org/jira/browse/HIVE-21558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807335#comment-16807335 ] Hive QA commented on HIVE-21558: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964479/HIVE-21558.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15890 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16806/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16806/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16806/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12964479 - PreCommit-HIVE-Build > Query based compaction fails if the temporary FS is different than the table > FS > --- > > Key: HIVE-21558 > URL: https://issues.apache.org/jira/browse/HIVE-21558 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-21558.02.patch, HIVE-21558.patch > > > The Exception I got is like this: > {code:java} > 2019-04-01T13:45:44,035 ERROR [PeterVary-MBP15.local-33] compactor.Worker: > Caught exception while trying to compact > id:24,dbname:default,tableName:acid,partName:null,state:,type:MAJOR,properties:null,runAs:petervary,tooManyAborts:false,highestWriteId:9. 
> Marking failed to avoid repeated failures, > java.lang.IllegalArgumentException: Wrong FS: > pfile:/Users/petervary/data/apache/hive/warehouse/acid/base_009_v284/bucket_0, > expected: file:/// > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) > at > org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86) > at > org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930) > at > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631) > at > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454) > at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1768) > at > org.apache.hadoop.hive.ql.io.ProxyLocalFileSystem.rename(ProxyLocalFileSystem.java:34) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.commitCrudMajorCompaction(CompactorMR.java:583) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:401) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:248) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:195){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
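The "Wrong FS" check in the stack trace above can be sketched without Hadoop dependencies, using plain java.net.URI (the checkPath helper below is hypothetical, modeled on the behavior of Hadoop's FileSystem.checkPath): a FileSystem handle bound to one scheme rejects paths from another, so the fix is to resolve the FileSystem from the path's own scheme (in Hadoop terms, path.getFileSystem(conf) rather than a default FileSystem.get(conf)).

```java
import java.net.URI;

public class WrongFsDemo {
    // Sketch of the scheme check behind Hadoop's "Wrong FS ... expected" error.
    static void checkPath(String fsScheme, URI path) {
        String scheme = path.getScheme();
        if (scheme != null && !scheme.equals(fsScheme)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsScheme + ":///");
        }
    }

    public static void main(String[] args) {
        URI tablePath = URI.create("pfile:/warehouse/acid/bucket_0");
        // A handle bound to file:// rejects a pfile:// path...
        try {
            checkPath("file", tablePath);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // mirrors the exception above
        }
        // ...while a handle resolved from the path's own scheme accepts it.
        checkPath(tablePath.getScheme(), tablePath);
    }
}
```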
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Description: We have upgraded to jetty 9 in HIVE-16049, so the `org.mortbay` configuration in log4j.properties for the old version of jetty is useless. (was: We have upgraded to jetty 9 in [HIVE-16049](https://issues.apache.org/jira/browse/HIVE-16049), so the `org.mortbay` configuration in log4j.properties for the old version of jetty is useless. ) > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > > We have upgraded to jetty 9 in HIVE-16049, so the `org.mortbay` configuration in > log4j.properties for the old version of jetty is useless. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Labels: patch-available (was: ) > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > Labels: patch-available > Attachments: HIVE-21556.1.patch > > > > {code:java} > logger.Mortbay.name = org.mortbay > logger.Mortbay.level = INFO > {code} > The logger `Mortbay` in log4j.properties is used to control the logging > activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, > the package name has changed to `org.eclipse.jetty`, and we have added a new > logger to control jetty. `Mortbay` is now useless, so we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
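Given the description above, the cleaned-up log4j2-style properties would presumably drop the stale `Mortbay` logger and keep only the one for the new package. A sketch, assuming the replacement logger is named `Jetty` as the discussion suggests (the exact logger name in the patch is not shown here):

```properties
# Stale jetty 6.x logger removed:
#   logger.Mortbay.name = org.mortbay
#   logger.Mortbay.level = INFO

# Logger for jetty 9.x (package org.eclipse.jetty); name assumed
logger.Jetty.name = org.eclipse.jetty
logger.Jetty.level = INFO
```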
[jira] [Assigned] (HIVE-21404) MSSQL upgrade script alters the wrong column
[ https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-21404: --- Assignee: David Lavati (was: Zoltan Haindrich) > MSSQL upgrade script alters the wrong column > > > Key: HIVE-21404 > URL: https://issues.apache.org/jira/browse/HIVE-21404 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.2.0 >Reporter: David Lavati >Assignee: David Lavati >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, > HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying > the wrong table: > {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}} > https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
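Based on the description, the corrected upgrade statement would presumably target PARTITION_PARAMS rather than SERDE_PARAMS. A sketch of the fix, not the exact content of the attached patch:

```sql
-- HIVE-20221 widened PARTITION_PARAMS, so the 3.1.0-to-3.2.0 MSSQL upgrade
-- script should alter that table; the SERDE_PARAMS statement quoted above
-- modifies the wrong one.
ALTER TABLE "PARTITION_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);
```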
[jira] [Updated] (HIVE-21404) MSSQL upgrade script alters the wrong column
[ https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-21404: Attachment: HIVE-21404.4.patch > MSSQL upgrade script alters the wrong column > > > Key: HIVE-21404 > URL: https://issues.apache.org/jira/browse/HIVE-21404 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.2.0 >Reporter: David Lavati >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, > HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying > the wrong table: > {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}} > https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-21407) Parquet predicate pushdown is not working correctly for char column types
[ https://issues.apache.org/jira/browse/HIVE-21407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora reassigned HIVE-21407: Assignee: Marta Kuczora > Parquet predicate pushdown is not working correctly for char column types > - > > Key: HIVE-21407 > URL: https://issues.apache.org/jira/browse/HIVE-21407 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Major > > If the 'hive.optimize.index.filter' parameter is false, the filter predicate > is not pushed to parquet, so the filtering only happens within Hive. If the > parameter is true, the filter is pushed to parquet, but for a char type, the > value which is pushed to Parquet will be padded with spaces: > {noformat} > @Override > public void setValue(String val, int len) { > super.setValue(HiveBaseChar.getPaddedValue(val, len), -1); > } > {noformat} > So if we have a char(10) column which contains the value "apple" and the > where condition looks like 'where c='apple'', the value pushed to Paquet will > be 'apple' followed by 5 spaces. But the stored values are not padded, so no > rows will be returned from Parquet. 
> How to reproduce:
> {noformat}
> $ create table ppd (c char(10), v varchar(10), i int) stored as parquet;
> $ insert into ppd values ('apple', 'bee', 1),('apple', 'tree', 2),('hello', 'world', 1),('hello','vilag',3);
> $ set hive.optimize.ppd.storage=true;
> $ set hive.vectorized.execution.enabled=true;
> $ set hive.vectorized.execution.enabled=false;
> $ set hive.optimize.ppd=true;
> $ set hive.optimize.index.filter=true;
> $ set hive.parquet.timestamp.skip.conversion=false;
> $ select * from ppd where c='apple';
> +--------+--------+--------+
> | ppd.c  | ppd.v  | ppd.i  |
> +--------+--------+--------+
> +--------+--------+--------+
> $ set hive.optimize.index.filter=false; or set hive.optimize.ppd.storage=false;
> $ select * from ppd where c='apple';
> +--------+--------+--------+
> | ppd.c  | ppd.v  | ppd.i  |
> +--------+--------+--------+
> | apple  | bee    | 1      |
> | apple  | tree   | 2      |
> +--------+--------+--------+
> {noformat}
> The issue surfaced after the fix for > [HIVE-21327|https://issues.apache.org/jira/browse/HIVE-21327] was uploaded > upstream. Before the HIVE-21327 fix, setting the parameter > 'hive.parquet.timestamp.skip.conversion' to true in the parquet_ppd_char.q > test hid this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
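The padding mismatch described in the report can be illustrated with a small stand-alone sketch. `pad` below is a hypothetical stand-in for `HiveBaseChar.getPaddedValue`, not Hive's actual implementation:

```java
// Minimal illustration of why a padded char(10) literal never matches the
// unpadded value stored in the Parquet file.
public class CharPadDemo {
    // Stand-in for HiveBaseChar.getPaddedValue(val, len) (assumption)
    static String pad(String val, int len) {
        StringBuilder sb = new StringBuilder(val);
        while (sb.length() < len) {
            sb.append(' ');   // right-pad with spaces up to the char length
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "apple";          // value as written to the Parquet file
        String pushed = pad("apple", 10); // literal pushed down for a char(10) column
        // The pushed-down predicate compares "apple     " to "apple": no match,
        // so Parquet returns zero rows even though the data is there.
        System.out.println(pushed.equals(stored)); // prints false
    }
}
```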
[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.
[ https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806425#comment-16806425 ] Hive QA commented on HIVE-21540: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 22s{color} | {color:blue} ql in master has 2257 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 26m 39s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16790/dev-support/hive-personality.sh | | git revision | master / dc0b16a | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16790/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Query with join condition having date literal throws SemanticException. > --- > > Key: HIVE-21540 > URL: https://issues.apache.org/jira/browse/HIVE-21540 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.0, 3.1.1 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Analyzer, DateField, pull-request-available > Attachments: HIVE-21540.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children* > {code} > create table date_1 (key int, dd date); > create table date_2 (key int, dd date); > select d1.key, d2.dd from( > select key, dd as start_dd, current_date as end_dd from date_1) d1 > join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and > end_dd; > {code} > When the WHERE condition below is commented out, the query completes > successfully. > where d2.dd between start_dd and end_dd > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Description: {code:java} logger.Mortbay.name = org.mortbay logger.Mortbay.level = INFO {code} The logger `Mortbay` in log4j.properties is used to control the logging activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty`, and we have added a new logger to control jetty. `Mortbay` is now useless, so we can remove it. was: {{The logger `Mortbay`}} {code:java} // code placeholder {code} {{in log4j.properties is used to control the logging activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty`, and we have added the `Jetty` logger to control the new version. `Mortbay` is now useless, so we can remove it.}} > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > > > {code:java} > logger.Mortbay.name = org.mortbay > logger.Mortbay.level = INFO > {code} > The logger `Mortbay` in log4j.properties is used to control the logging > activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, > the package name has changed to `org.eclipse.jetty`, and we have added a new > logger to control jetty. `Mortbay` is now useless, so we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221121 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:49 Start Date: 01/Apr/19 06:49 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270732089 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map partSpec) throws int size = addPartitionDesc.getPartitionCount(); List in = new ArrayList(size); -AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, tbl, true); long writeId; String validWriteIdList; -if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) { - writeId = tableSnapshot.getWriteId(); - validWriteIdList = tableSnapshot.getValidWriteIdList(); + +// In case of replication, get the writeId from the source and use valid write Id list +// for replication. +if (addPartitionDesc.getReplicationSpec().isInReplicationScope() && +addPartitionDesc.getPartition(0).getWriteId() > 0) { Review comment: writeId will be 0 for non-transactional tables. Also this is createPartitions code, which may get executed for partitions created when writeId 0 for transactional tables as well. The condition, which I borrowed from the old code is required so that we don't create a valid writeId list or try to get a table snapshot when writeId is zero. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221121) Time Spent: 9.5h (was: 9h 20m) > Stats replication for ACID tables. 
> -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9.5h > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221119 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:48 Start Date: 01/Apr/19 06:48 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270732089 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map partSpec) throws int size = addPartitionDesc.getPartitionCount(); List in = new ArrayList(size); -AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, tbl, true); long writeId; String validWriteIdList; -if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) { - writeId = tableSnapshot.getWriteId(); - validWriteIdList = tableSnapshot.getValidWriteIdList(); + +// In case of replication, get the writeId from the source and use valid write Id list +// for replication. +if (addPartitionDesc.getReplicationSpec().isInReplicationScope() && +addPartitionDesc.getPartition(0).getWriteId() > 0) { Review comment: writeId will be 0 for non-transactional tables. Also this is createPartitions code, which may get executed for partitions created when writeId 0 for transactional tables as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221119) Time Spent: 9h 20m (was: 9h 10m) > Stats replication for ACID tables. 
> -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9h 20m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.
[ https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806457#comment-16806457 ] Hive QA commented on HIVE-21540: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964397/HIVE-21540.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15884 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16790/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16790/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16790/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12964397 - PreCommit-HIVE-Build > Query with join condition having date literal throws SemanticException. > --- > > Key: HIVE-21540 > URL: https://issues.apache.org/jira/browse/HIVE-21540 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.0, 3.1.1 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Analyzer, DateField, pull-request-available > Attachments: HIVE-21540.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This semantic exception is thrown for the following query. 
> *SemanticException '2019-03-20' encountered with 0 children* > {code} > create table date_1 (key int, dd date); > create table date_2 (key int, dd date); > select d1.key, d2.dd from( > select key, dd as start_dd, current_date as end_dd from date_1) d1 > join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and > end_dd; > {code} > When the WHERE condition below is commented out, the query completes > successfully. > where d2.dd between start_dd and end_dd > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21392) Misconfigurations of DataNucleus log in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21392: Attachment: HIVE-21392.08.patch > Misconfigurations of DataNucleus log in log4j.properties > > > Key: HIVE-21392 > URL: https://issues.apache.org/jira/browse/HIVE-21392 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.0.0 >Reporter: Chen Zhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21392.02.patch, HIVE-21392.03.patch, > HIVE-21392.04.patch, HIVE-21392.05.patch, HIVE-21392.06.patch, > HIVE-21392.07.patch, HIVE-21392.08.patch, HIVE-21392.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > In the patch of > [HIVE-12020|https://issues.apache.org/jira/browse/HIVE-12020], we changed the > DataNucleus related logging configuration from nine fine-grained loggers with > three coarse-grained loggers (DataNucleus, Datastore and JPOX). As Prasanth > Jayachandran > [explain|https://issues.apache.org/jira/browse/HIVE-12020?focusedCommentId=15025612=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15025612], > these three loggers are the top-level logger in DataNucleus, so that we > don't need to specify other loggers for DataNucleus. However, according to > the > [documents|http://www.datanucleus.org/products/accessplatform/logging.html] > and [source > codes|https://github.com/datanucleus/datanucleus-core/blob/master/src/main/java/org/datanucleus/util/NucleusLogger.java#L108] > of DataNucleus, the top-level logger in DataNucleus is `DataNucleus`. > Therefore, we just need to keep the right one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221122 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:50 Start Date: 01/Apr/19 06:50 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270732089 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map partSpec) throws int size = addPartitionDesc.getPartitionCount(); List in = new ArrayList(size); -AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, tbl, true); long writeId; String validWriteIdList; -if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) { - writeId = tableSnapshot.getWriteId(); - validWriteIdList = tableSnapshot.getValidWriteIdList(); + +// In case of replication, get the writeId from the source and use valid write Id list +// for replication. +if (addPartitionDesc.getReplicationSpec().isInReplicationScope() && +addPartitionDesc.getPartition(0).getWriteId() > 0) { Review comment: writeId will be 0 for non-transactional tables. Also this is createPartitions code, which may get executed with writeId = 0 for non-transactional modifications to partitions of transactional tables as well. The condition, which I borrowed from the old code, is required so that we don't create a valid writeId list or try to get a table snapshot when writeId is zero. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221122) Time Spent: 9h 40m (was: 9.5h) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9h 40m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21230) LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side (HiveJoinAddNotNullRule bails out for outer joins)
[ https://issues.apache.org/jira/browse/HIVE-21230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-21230: Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) pushed to master. Thank you Vineet ! > LEFT OUTER JOIN does not generate transitive IS NOT NULL filter on right side > (HiveJoinAddNotNullRule bails out for outer joins) > > > Key: HIVE-21230 > URL: https://issues.apache.org/jira/browse/HIVE-21230 > Project: Hive > Issue Type: Improvement > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Vineet Garg >Priority: Major > Labels: newbie > Fix For: 4.0.0 > > Attachments: HIVE-21230.1.patch, HIVE-21230.2.patch, > HIVE-21230.3.patch, HIVE-21230.4.patch, HIVE-21230.5.patch, > HIVE-21230.6.patch, HIVE-21230.7.patch, HIVE-21230.8.patch > > > For instance, given the following query: > {code:sql} > SELECT t0.col0, t0.col1 > FROM > ( > SELECT col0, col1 FROM tab > ) AS t0 > LEFT JOIN > ( > SELECT col0, col1 FROM tab > ) AS t1 > ON t0.col0 = t1.col0 AND t0.col1 = t1.col1 > {code} > we could still infer that col0 and col1 cannot be null in the right input and > introduce the corresponding filter predicate. Currently, the rule just bails > out if it is not an inner join. > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinAddNotNullRule.java#L79 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
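Concretely, the transitive filter the issue asks for would amount to the rewrite sketched below. This is an illustration of the inferred predicate, not output from the actual HiveJoinAddNotNullRule:

```sql
-- The join condition t0.col0 = t1.col0 AND t0.col1 = t1.col1 can only
-- match non-NULL values, so the right input of the LEFT JOIN can safely
-- be pre-filtered without changing the result:
SELECT col0, col1 FROM tab
WHERE col0 IS NOT NULL AND col1 IS NOT NULL
```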
[jira] [Updated] (HIVE-21402) Compaction state remains 'working' when major compaction fails
[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-21402: -- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks for the review [~vgumashta]! > Compaction state remains 'working' when major compaction fails > -- > > Key: HIVE-21402 > URL: https://issues.apache.org/jira/browse/HIVE-21402 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 4.0.0 >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-21402.patch > > > When calcite is not on the HMS classpath, and query based compaction is > enabled then the compaction fails with NoClassDefFound error. Since the catch > block only catches Exceptions the following code block is not executed: > {code:java} > } catch (Exception e) { > LOG.error("Caught exception while trying to compact " + ci + > ". Marking failed to avoid repeated failures, " + > StringUtils.stringifyException(e)); > msc.markFailed(CompactionInfo.compactionInfoToStruct(ci)); > msc.abortTxns(Collections.singletonList(compactorTxnId)); > } > {code} > So the compaction is not set to failed. > Would be better to catch Throwable instead of Exception -- This message was sent by Atlassian JIRA (v7.6.3#76005)
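The fix the report suggests — widening the catch from Exception to Throwable — can be sketched in isolation. `markFailed` below is a hypothetical stand-in for the metastore client call (`msc.markFailed(...)`), not Hive's API:

```java
// Sketch of the suggested fix: catch Throwable so Errors such as
// NoClassDefFoundError also mark the compaction as failed.
public class CompactionGuard {
    static boolean markedFailed = false;

    static void markFailed() {
        markedFailed = true;  // stand-in for marking the compaction failed in the metastore
    }

    static void runCompaction(Runnable compactor) {
        try {
            compactor.run();
        } catch (Throwable t) {  // 'catch (Exception e)' would miss Errors entirely
            markFailed();
        }
    }

    public static void main(String[] args) {
        // NoClassDefFoundError extends Error, not Exception, so the original
        // catch block never ran and the compaction stayed in 'working' state.
        runCompaction(() -> { throw new NoClassDefFoundError("org/apache/calcite/..."); });
        System.out.println(markedFailed); // prints true
    }
}
```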
[jira] [Commented] (HIVE-21404) MSSQL upgrade script alters the wrong column
[ https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806551#comment-16806551 ] Hive QA commented on HIVE-21404: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964409/HIVE-21404.4.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 15885 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDeprecatedConfigIsOverwritten (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testMasterKeyOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty (batchId=230) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16792/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16792/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16792/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12964409 - PreCommit-HIVE-Build > MSSQL upgrade script alters the wrong column > > > Key: HIVE-21404 > URL: https://issues.apache.org/jira/browse/HIVE-21404 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.2.0 >Reporter: David Lavati >Assignee: David Lavati >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, > HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying > the wrong table: > {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}} > https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221179=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221179 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 09:38 Start Date: 01/Apr/19 09:38 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270786715 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java ## @@ -269,11 +294,23 @@ private String dumpLoadVerify(List tableNames, String lastReplicationId, WarehouseInstance.Tuple dumpTuple = primary.run("use " + primaryDbName) .dump(primaryDbName, lastReplicationId, withClauseList); + // Load, if necessary changing configuration. if (parallelLoad) { replica.hiveConf.setBoolVar(HiveConf.ConfVars.EXECPARALLEL, true); } +// Fail load if for testing failure and retry scenario. Fail the load while setting +// checkpoint for a table in the middle of list of tables. +if (failRetry) { + if (lastReplicationId == null) { +failBootstrapLoad(dumpTuple, tableNames.size()/2); + } else { +failIncrementalLoad(dumpTuple, tableNames.size()/2); Review comment: We are counting UpdateTableStats or UpdatePartStats events and not every event. So, we will fail only after encountering no of tables/2 events of those types. So it can not fail before applying update stats events. But to be on the safer side, I have changed the code to fail after second event so that we have at least one successful application before we fail. Since we are performing multiple insert events per table, we can be sure that there are at least 2 events of each type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221179) Time Spent: 9h 50m (was: 9h 40m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9h 50m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
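The fail-after-the-second-stats-event rule described in the review comment above can be sketched as a small counter: only stats-update events are counted, and the injected failure fires once one such event has already succeeded. The class and event-type names below are hypothetical illustrations, not Hive's actual replication API.

```java
// Sketch of the fail-after-Nth-stats-event idea from the review comment.
// Class and event-type names are hypothetical, not Hive's real event types.
class StatsEventFailer {
    private final int succeedFirst;   // number of stats events to let through
    private int statsEventsSeen = 0;

    StatsEventFailer(int succeedFirst) {
        this.succeedFirst = succeedFirst;
    }

    // Only stats-update events are counted; every other event type never fails.
    boolean shouldFail(String eventType) {
        if (!"UPDATE_TABLE_STATS".equals(eventType)
                && !"UPDATE_PART_STATS".equals(eventType)) {
            return false;
        }
        statsEventsSeen++;
        return statsEventsSeen > succeedFirst;
    }
}
```

With `succeedFirst = 1`, the first stats event is applied successfully and the second one triggers the injected failure, which is exactly the "at least one successful application before we fail" property the comment argues for.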
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221112=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221112 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:36 Start Date: 01/Apr/19 06:36 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270729852 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -694,7 +695,9 @@ public void alterTable(String catName, String dbName, String tblName, Table newT AcidUtils.TableSnapshot tableSnapshot = null; if (transactional) { if (replWriteId > 0) { - ValidWriteIdList writeIds = AcidUtils.getTableValidWriteIdListWithTxnList(conf, dbName, tblName); + ValidWriteIdList writeIds = new ValidReaderWriteIdList(TableName.getDbTable(dbName, tblName), Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221112) Time Spent: 8h 50m (was: 8h 40m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 8h 50m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. 
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Description: The configuration `org.mortbay` in log4j.properties is used to control the logging activity of Jetty. However, we upgraded to Jetty 9 in HIVE-16049 and the package name has changed to `org.eclipse.jetty`, so this configuration is useless. I guess we can remove it. (was: We has upgrade to jetty 9 in HIVE-16049, the configuration `org.mortbay` in log4j.properties for old version of jetty is useless. ) > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > > The configuration `org.mortbay` in log4j.properties is used to control > the logging activity of Jetty. However, we upgraded to Jetty 9 in > HIVE-16049 and the package name has changed to `org.eclipse.jetty`, so this > configuration is useless. I guess we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?focusedWorklogId=221137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221137 ] ASF GitHub Bot logged work on HIVE-21556: - Author: ASF GitHub Bot Created on: 01/Apr/19 07:29 Start Date: 01/Apr/19 07:29 Worklog Time Spent: 10m Work Description: coder-chenzhi commented on pull request #586: HIVE-21556 remove configuration for old version of jetty (6.x) URL: https://github.com/apache/hive/pull/586 ``` logger.Mortbay.name = org.mortbay logger.Mortbay.level = INFO ``` The logger `Mortbay` in log4j.properties is used to control the logging activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049; the package name has changed to `org.eclipse.jetty` and we have added a new logger to control Jetty, so `Mortbay` is useless. I guess we can remove it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221137) Time Spent: 10m Remaining Estimate: 0h > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > Labels: patch-available, pull-request-available > Attachments: HIVE-21556.1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > > {code:java} > logger.Mortbay.name = org.mortbay > logger.Mortbay.level = INFO > {code} > The logger `Mortbay` in log4j.properties is used to control the logging > activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049; > the package name has changed to `org.eclipse.jetty` and we have added a new > logger to control Jetty, so `Mortbay` is useless. I guess we can remove it. 
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-21556: -- Labels: patch-available pull-request-available (was: patch-available) > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > Labels: patch-available, pull-request-available > Attachments: HIVE-21556.1.patch > > > > {code:java} > logger.Mortbay.name = org.mortbay > logger.Mortbay.level = INFO > {code} > The logger `Mortbay` in log4j.properties is used to control the logging > activity of Jetty (6.x). However, we upgraded to Jetty 9 in HIVE-16049; > the package name has changed to `org.eclipse.jetty` and we have added a new > logger to control Jetty, so `Mortbay` is useless. I guess we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
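The HIVE-21556 change amounts to deleting the stale `Mortbay` logger; a log4j2 properties sketch of the intended end state might look like the following. The `Jetty` logger shown here is an assumption based on the PR description ("we have added the new logger"); treat the exact logger name as illustrative.

```
# Stale Jetty 6.x logger to be removed (package org.mortbay is no longer on the classpath):
# logger.Mortbay.name = org.mortbay
# logger.Mortbay.level = INFO

# Jetty 9.x logger added with HIVE-16049; this is the one that actually takes effect.
logger.Jetty.name = org.eclipse.jetty
logger.Jetty.level = INFO
```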
[jira] [Assigned] (HIVE-21404) MSSQL upgrade script alters the wrong column
[ https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-21404: --- Assignee: Zoltan Haindrich (was: David Lavati) > MSSQL upgrade script alters the wrong column > > > Key: HIVE-21404 > URL: https://issues.apache.org/jira/browse/HIVE-21404 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.2.0 >Reporter: David Lavati >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, > HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying > the wrong table: > {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}} > https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221113=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221113 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:38 Start Date: 01/Apr/19 06:38 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270730194 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map partSpec) throws int size = addPartitionDesc.getPartitionCount(); List in = new ArrayList(size); -AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, tbl, true); long writeId; String validWriteIdList; -if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) { - writeId = tableSnapshot.getWriteId(); - validWriteIdList = tableSnapshot.getValidWriteIdList(); + +// In case of replication, get the writeId from the source and use valid write Id list +// for replication. +if (addPartitionDesc.getReplicationSpec().isInReplicationScope() && +addPartitionDesc.getPartition(0).getWriteId() > 0) { + writeId = addPartitionDesc.getPartition(0).getWriteId(); + validWriteIdList = new ValidReaderWriteIdList(TableName.getDbTable(tbl.getDbName(), Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221113) Time Spent: 9h (was: 8h 50m) > Stats replication for ACID tables. 
> -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9h > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
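The `createPartition` change in the diff above boils down to a precedence rule for the writeId. As a minimal sketch, with hypothetical names rather than the actual `Hive.java` signatures:

```java
// Sketch of the writeId precedence rule from the HIVE-21109 diff above:
// in replication scope, prefer the writeId replicated from the source;
// otherwise fall back to the writeId from the target-side table snapshot.
// Names here are illustrative, not the real Hive API.
class WriteIdSelector {
    static long chooseWriteId(boolean inReplicationScope,
                              long replicatedWriteId,
                              long snapshotWriteId) {
        if (inReplicationScope && replicatedWriteId > 0) {
            return replicatedWriteId;
        }
        return snapshotWriteId;
    }
}
```

This mirrors why the writeId "needs to be in sync with the writeId on the source": during replication the target must not allocate its own id, or the replicated statistics would be associated with the wrong write.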
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Description: The configuration `org.mortbay` in log4j.properties is used to control logging activities of jetty (6.x). However, we have upgrade to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty`. This configuration is useless. I guess we can remove it. (was: The configuration `org.mortbay` in log4j.properties is used to control logging activities of jetty . However, we have upgrade to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty`, this configuration is useless. I guess we can remove it.) > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > > The configuration `org.mortbay` in log4j.properties is used to control > logging activities of jetty (6.x). However, we have upgrade to jetty 9 in > HIVE-16049, the package name has changed to `org.eclipse.jetty`. This > configuration is useless. I guess we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21392) Misconfigurations of DataNucleus log in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806487#comment-16806487 ] Hive QA commented on HIVE-21392: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964407/HIVE-21392.08.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16791/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16791/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16791/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2019-04-01 07:43:32.550 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-16791/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2019-04-01 07:43:32.553 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at dc0b16a HIVE-21001: Upgrade to calcite-1.19 (Zoltan Haindrich reviewed by Jesus Camacho Rodriguez) + git clean -f -d Removing standalone-metastore/metastore-server/src/gen/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at dc0b16a HIVE-21001: Upgrade to calcite-1.19 (Zoltan Haindrich reviewed by Jesus Camacho Rodriguez) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2019-04-01 07:43:33.315 + rm -rf ../yetus_PreCommit-HIVE-Build-16791 + mkdir ../yetus_PreCommit-HIVE-Build-16791 + git gc + cp -R . ../yetus_PreCommit-HIVE-Build-16791 + mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-16791/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: common/src/main/resources/hive-log4j2.properties:51 error: repository lacks the necessary blob to fall back on 3-way merge. error: common/src/main/resources/hive-log4j2.properties: patch does not apply error: patch failed: common/src/test/resources/hive-exec-log4j2-test.properties:42 error: repository lacks the necessary blob to fall back on 3-way merge. error: common/src/test/resources/hive-exec-log4j2-test.properties: patch does not apply error: patch failed: common/src/test/resources/hive-log4j2-test.properties:49 error: repository lacks the necessary blob to fall back on 3-way merge. 
error: common/src/test/resources/hive-log4j2-test.properties: patch does not apply error: patch failed: data/conf/hive-log4j2.properties:50 error: repository lacks the necessary blob to fall back on 3-way merge. error: data/conf/hive-log4j2.properties: patch does not apply error: patch failed: hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-log4j2.properties:49 error: repository lacks the necessary blob to fall back on 3-way merge. error: hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-log4j2.properties: patch does not apply error: patch failed: llap-server/src/main/resources/llap-cli-log4j2.properties:58 error: repository lacks the necessary blob to fall back on 3-way merge. error: llap-server/src/main/resources/llap-cli-log4j2.properties: patch does not apply error: patch failed: llap-server/src/main/resources/llap-daemon-log4j2.properties:100 error: repository lacks the necessary blob to fall back on 3-way merge. error: llap-server/src/main/resources/llap-daemon-log4j2.properties: patch does not apply error: patch failed: llap-server/src/test/resources/llap-daemon-log4j2.properties:64 error: repository lacks the necessary blob to fall back on 3-way merge. error: llap-server/src/test/resources/llap-daemon-log4j2.properties: patch does not apply error: patch failed:
[jira] [Comment Edited] (HIVE-21540) Query with join condition having date literal throws SemanticException.
[ https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806491#comment-16806491 ] Zoltan Haindrich edited comment on HIVE-21540 at 4/1/19 7:53 AM: - +1 I think we might possibly also miss: TOK_TIMESTAMPLITERAL and TOK_TIMESTAMPLOCALTZLITERAL from that switch statement was (Author: kgyrtkirk): +1 > Query with join condition having date literal throws SemanticException. > --- > > Key: HIVE-21540 > URL: https://issues.apache.org/jira/browse/HIVE-21540 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.0, 3.1.1 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Analyzer, DateField, pull-request-available > Attachments: HIVE-21540.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This semantic exception is thrown for the following query. > *SemanticException '2019-03-20' encountered with 0 children* > {code} > create table date_1 (key int, dd date); > create table date_2 (key int, dd date); > select d1.key, d2.dd from( > select key, dd as start_dd, current_date as end_dd from date_1) d1 > join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and > end_dd; > {code} > When the WHERE condition below is commented out, the query completes > successfully. > where d2.dd between start_dd and end_dd > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
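The comment above suggests the literal-handling switch should cover the timestamp tokens as well as `TOK_DATELITERAL`. A hedged sketch of such a predicate (token names are taken from the comment; the method itself is hypothetical, not Hive's analyzer code):

```java
// Sketch of the switch coverage suggested in the HIVE-21540 review comment.
// TOK_DATELITERAL is the case hit by the reported query; the two timestamp
// tokens are the ones the comment says may also be missing.
class TemporalLiteralCheck {
    static boolean isTemporalLiteral(String tokenType) {
        switch (tokenType) {
            case "TOK_DATELITERAL":
            case "TOK_TIMESTAMPLITERAL":
            case "TOK_TIMESTAMPLOCALTZLITERAL":
                return true;
            default:
                return false;
        }
    }
}
```

A token falling through such a switch unhandled is consistent with the reported symptom: the literal node `'2019-03-20'` reaches code that expects children and fails with "encountered with 0 children".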
[jira] [Updated] (HIVE-21316) Comparison of varchar column and string literal should happen in varchar
[ https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-21316: Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) pushed to master. Thank you Vineet for reviewing the changes! > Comparison of varchar column and string literal should happen in varchar > - > > Key: HIVE-21316 > URL: https://issues.apache.org/jira/browse/HIVE-21316 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-21316.01.patch, HIVE-21316.02.patch, > HIVE-21316.03.patch, HIVE-21316.04.patch, HIVE-21316.05.patch, > HIVE-21316.06.patch, HIVE-21316.06.patch, HIVE-21316.07.patch, > HIVE-21316.07.patch, HIVE-21316.07.patch, HIVE-21316.08.patch, > HIVE-21316.08.patch > > > this is most probably the root cause behind HIVE-21310 as well -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
[ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-21509: -- Status: Patch Available (was: In Progress) > LLAP may cache corrupted column vectors and return wrong query result > - > > Key: HIVE-21509 > URL: https://issues.apache.org/jira/browse/HIVE-21509 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, > HIVE-21509.2.patch, HIVE-21509.3.patch > > > In some scenarios, LLAP might store column vectors in the cache that are > reused and reset just before their original content would be written. > This is a concurrency issue and is therefore flaky. It is not easy to > reproduce, but the odds of surfacing this issue can be improved by setting > LLAP executor and IO thread counts this way: > * set hive.llap.daemon.num.executors=32; > * set hive.llap.io.threadpool.size=1; > * using TPC-DS input data for the store_sales table, have at least a few > hundred thousand rows, and use text format: > {code:java} > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( 'field.delim'='|', 'serialization.format'='|') > STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code} > * having more splits makes the issue more likely to show itself, so it is worth running > _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_ > * run this query on the table: select min(ss_sold_date_sk) from store_sales; > The first query result is correct (2450816 in my case). Repeating the query > will trigger reading from the LLAP cache and produce a wrong result: 0. > If one wants to make sure of running into this issue, place a > Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run(). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
[ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-21509: -- Attachment: HIVE-21509.3.patch > LLAP may cache corrupted column vectors and return wrong query result > - > > Key: HIVE-21509 > URL: https://issues.apache.org/jira/browse/HIVE-21509 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, > HIVE-21509.2.patch, HIVE-21509.3.patch > > > In some scenarios, LLAP might store column vectors in the cache that are > reused and reset just before their original content would be written. > This is a concurrency issue and is therefore flaky. It is not easy to > reproduce, but the odds of surfacing this issue can be improved by setting > LLAP executor and IO thread counts this way: > * set hive.llap.daemon.num.executors=32; > * set hive.llap.io.threadpool.size=1; > * using TPC-DS input data for the store_sales table, have at least a few > hundred thousand rows, and use text format: > {code:java} > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( 'field.delim'='|', 'serialization.format'='|') > STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code} > * having more splits makes the issue more likely to show itself, so it is worth running > _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_ > * run this query on the table: select min(ss_sold_date_sk) from store_sales; > The first query result is correct (2450816 in my case). Repeating the query > will trigger reading from the LLAP cache and produce a wrong result: 0. > If one wants to make sure of running into this issue, place a > Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run(). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
[ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-21509: -- Status: In Progress (was: Patch Available) > LLAP may cache corrupted column vectors and return wrong query result > - > > Key: HIVE-21509 > URL: https://issues.apache.org/jira/browse/HIVE-21509 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, > HIVE-21509.2.patch, HIVE-21509.3.patch > > > In some scenarios, LLAP might store column vectors in the cache that are > reused and reset just before their original content would be written. > This is a concurrency issue and is therefore flaky. It is not easy to > reproduce, but the odds of surfacing this issue can be improved by setting > LLAP executor and IO thread counts this way: > * set hive.llap.daemon.num.executors=32; > * set hive.llap.io.threadpool.size=1; > * using TPC-DS input data for the store_sales table, have at least a few > hundred thousand rows, and use text format: > {code:java} > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( 'field.delim'='|', 'serialization.format'='|') > STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code} > * having more splits makes the issue more likely to show itself, so it is worth running > _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_ > * run this query on the table: select min(ss_sold_date_sk) from store_sales; > The first query result is correct (2450816 in my case). Repeating the query > will trigger reading from the LLAP cache and produce a wrong result: 0. > If one wants to make sure of running into this issue, place a > Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run(). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
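The corruption pattern HIVE-21509 describes — a cached column vector whose backing buffer is recycled before its contents are consumed — can be illustrated with a deliberately simplified toy. This is only a model of the aliasing hazard under assumed names; the real fix lives in the LLAP IO path, not in a simple clone.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the HIVE-21509 bug class: if the cache stores a *reference*
// to a mutable vector that is later recycled (reset to zeros) before being
// consumed, the cache serves corrupted data. Copying on insert breaks the alias.
class VectorCacheHazard {
    static final Map<String, long[]> cache = new HashMap<>();

    // Unsafe: the cache entry aliases the caller's buffer.
    static void cacheByReference(String key, long[] vector) {
        cache.put(key, vector);
    }

    // Safe: snapshot the contents at insertion time.
    static void cacheByCopy(String key, long[] vector) {
        cache.put(key, vector.clone());
    }
}
```

Here the aliased entry would later read back 0 after the buffer is reset — matching the symptom in the report, where the second run of `select min(ss_sold_date_sk)` returns 0 instead of 2450816 — while the copied entry keeps the original value.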
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221109=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221109 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:30 Start Date: 01/Apr/19 06:30 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270728648 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ## @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String dbName, private int persistColumnStats(Hive db) throws HiveException, MetaException, IOException { ColumnStatistics colStats = constructColumnStatsFromInput(); -ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc(); -// We do not support stats replication for a transactional table yet. If we are converting -// a non-transactional table to a transactional table during replication, we might get -// column statistics but we shouldn't update those. -if (work.getColStats() != null && - AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(), - colStatsDesc.getTableName( { - LOG.debug("Skipped updating column stats for table " + -TableName.getDbTable(colStatsDesc.getDbName(), colStatsDesc.getTableName()) + -" because it is converted to a transactional table during replication."); - return 0; -} - SetPartitionsStatsRequest request = new SetPartitionsStatsRequest(Collections.singletonList(colStats)); + +// Set writeId and validWriteId list for replicated statistics. +if (work.getColStats() != null) { + String dbName = colStats.getStatsDesc().getDbName(); + String tblName = colStats.getStatsDesc().getTableName(); + Table tbl = db.getTable(dbName, tblName); + long writeId = work.getWriteId(); + // If it's a transactional table on source and target, we will get a valid writeId + // associated with it. 
Otherwise it's a non-transactional table on source migrated to a + // transactional table on target, we need to craft a valid writeId here. + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList writeIds; +if (writeId <= 0) { Review comment: We can not set writeId in the ColumnStatsUpdateWork because the writeId for migration is available only after a transaction is opened for migration, which doesn't happen at the load time (when the work is created). Going by the gist of your suggestion, I have set a flag in work to indicate that the writeId should be the one used for migration. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221109) Time Spent: 8h 40m (was: 8.5h) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 8h 40m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221117=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221117 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 06:44 Start Date: 01/Apr/19 06:44 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270731373 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ## @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String dbName, private int persistColumnStats(Hive db) throws HiveException, MetaException, IOException { ColumnStatistics colStats = constructColumnStatsFromInput(); -ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc(); -// We do not support stats replication for a transactional table yet. If we are converting -// a non-transactional table to a transactional table during replication, we might get -// column statistics but we shouldn't update those. -if (work.getColStats() != null && - AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(), - colStatsDesc.getTableName( { - LOG.debug("Skipped updating column stats for table " + -TableName.getDbTable(colStatsDesc.getDbName(), colStatsDesc.getTableName()) + -" because it is converted to a transactional table during replication."); - return 0; -} - SetPartitionsStatsRequest request = new SetPartitionsStatsRequest(Collections.singletonList(colStats)); + +// Set writeId and validWriteId list for replicated statistics. +if (work.getColStats() != null) { + String dbName = colStats.getStatsDesc().getDbName(); + String tblName = colStats.getStatsDesc().getTableName(); + Table tbl = db.getTable(dbName, tblName); + long writeId = work.getWriteId(); + // If it's a transactional table on source and target, we will get a valid writeId + // associated with it. 
Otherwise it's a non-transactional table on source migrated to a + // transactional table on target, we need to craft a valid writeId here. + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList writeIds; +if (writeId <= 0) { + Long tmpWriteId = ReplUtils.getMigrationCurrentTblWriteId(conf); + if (tmpWriteId == null) { +throw new HiveException("DDLTask : Write id is not set in the config by open txn task for migration"); + } + writeId = tmpWriteId; +} +writeIds = new ValidReaderWriteIdList(TableName.getDbTable(dbName, tblName), new long[0], Review comment: work.getColStats() returns non-null value only in case of replication flow. This block of code is under that condition. So, it executes only in repl flow. Added a comment to that effect. Also added a comment per your suggestion. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221117) Time Spent: 9h 10m (was: 9h) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 9h 10m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
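The control flow in the diff above (use the replicated writeId when it is valid, otherwise fall back to a migration writeId that the open-txn task recorded in the config) can be sketched in isolation. This is a hypothetical stand-in, not Hive's actual API: `resolveWriteId`, the config map, and the key name are illustrative only.

```java
import java.util.Map;

public class WriteIdResolverSketch {
    // Hypothetical stand-in for the logic in persistColumnStats(): a replicated
    // writeId <= 0 means the source table was non-transactional and was migrated,
    // so the writeId crafted by the open-txn migration task must be in the config.
    static long resolveWriteId(long replicatedWriteId, Map<String, Long> conf) {
        if (replicatedWriteId > 0) {
            return replicatedWriteId; // transactional on both source and target
        }
        Long migrationWriteId = conf.get("repl.migration.current.tbl.write.id");
        if (migrationWriteId == null) {
            throw new IllegalStateException(
                "Write id is not set in the config by open txn task for migration");
        }
        return migrationWriteId;
    }

    public static void main(String[] args) {
        assert resolveWriteId(7, Map.of()) == 7;
        assert resolveWriteId(0, Map.of("repl.migration.current.tbl.write.id", 42L)) == 42;
        System.out.println("ok");
    }
}
```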
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Description: {{The logger `Mortbay`}} {code:java} // code placeholder {code} {{in log4j.properties is used to control logging activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty` and we have added a `Jetty` logger to control the new version. `Mortbay` is useless. I guess we can remove it.}} was:The configuration `org.mortbay` in log4j.properties is used to control logging activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, the package name has changed to `org.eclipse.jetty`. This configuration is useless. I guess we can remove it. > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > > {{The logger `Mortbay`}} > {code:java} > // code placeholder > {code} > {{in log4j.properties is used to control logging activities of jetty (6.x). > However, we have upgraded to jetty 9 in HIVE-16049, the package name has > changed to `org.eclipse.jetty` and we have added a `Jetty` logger to control > the new version. `Mortbay` is useless. I guess we can remove it.}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21556) Useless configuration for old jetty in log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-21556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhi updated HIVE-21556: Attachment: HIVE-21556.1.patch > Useless configuration for old jetty in log4j.properties > --- > > Key: HIVE-21556 > URL: https://issues.apache.org/jira/browse/HIVE-21556 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Chen Zhi >Priority: Minor > Attachments: HIVE-21556.1.patch > > > > {code:java} > logger.Mortbay.name = org.mortbay > logger.Mortbay.level = INFO > {code} > The logger `Mortbay` in log4j.properties is used to control logging > activities of jetty (6.x). However, we have upgraded to jetty 9 in HIVE-16049, > the package name has changed to `org.eclipse.jetty` and we have added the new > logger to control jetty. `Mortbay` is useless. I guess we can remove it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
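For reference, the change presumably leaves only the Jetty 9 counterpart of the obsolete logger quoted above, along these lines (a sketch inferred from the ticket and HIVE-16049; the exact keys in Hive's shipped properties file may differ):

```properties
# Obsolete: controls Jetty 6.x, whose classes lived under org.mortbay — can be removed
logger.Mortbay.name = org.mortbay
logger.Mortbay.level = INFO

# Jetty 9 counterpart, covering the relocated org.eclipse.jetty packages
logger.Jetty.name = org.eclipse.jetty
logger.Jetty.level = INFO
```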
[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.
[ https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806491#comment-16806491 ] Zoltan Haindrich commented on HIVE-21540: - +1 > Query with join condition having date literal throws SemanticException. > --- > > Key: HIVE-21540 > URL: https://issues.apache.org/jira/browse/HIVE-21540 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 3.1.0, 3.1.1 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: Analyzer, DateField, pull-request-available > Attachments: HIVE-21540.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This semantic exception is thrown for the following query. > *SemanticException '2019-03-20' encountered with 0 children* > {code} > create table date_1 (key int, dd date); > create table date_2 (key int, dd date); > select d1.key, d2.dd from( > select key, dd as start_dd, current_date as end_dd from date_1) d1 > join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and > end_dd; > {code} > When the WHERE condition below is commented out, the query completes > successfully. > where d2.dd between start_dd and end_dd > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-15546) Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel
[ https://issues.apache.org/jira/browse/HIVE-15546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804873#comment-16804873 ] t oo edited comment on HIVE-15546 at 4/1/19 8:01 AM: - [~stakiar] Issue still faced with single threading - HIVE-21546 was (Author: toopt4): Did this ever make release 2.3? I can't see it in [https://github.com/apache/hive/blob/rel/release-2.3.0/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java] and issue still faced with single threading (https://stackoverflow.com/questions/55416703/hiveserver2-on-spark-mapred-fileinputformat-total-input-files-to-process) > Optimize Utilities.getInputPaths() so each listStatus of a partition is done > in parallel > > > Key: HIVE-15546 > URL: https://issues.apache.org/jira/browse/HIVE-15546 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Fix For: 2.3.0 > > Attachments: HIVE-15546.1.patch, HIVE-15546.2.patch, > HIVE-15546.3.patch, HIVE-15546.4.patch, HIVE-15546.5.patch, HIVE-15546.6.patch > > > When running on blobstores (like S3) where metadata operations (like > listStatus) are costly, Utilities.getInputPaths() can add significant > overhead when setting up the input paths for an MR / Spark / Tez job. > The method performs a listStatus on all input paths in order to check if the > path is empty. If the path is empty, a dummy file is created for the given > partition. This is all done sequentially. This can be really slow when there > are a lot of empty partitions. Even when all partitions have input data, this > can take a long time. > We should either: > (1) Just remove the logic to check if each input path is empty, and handle > any edge cases accordingly. > (2) Multi-thread the listStatus calls -- This message was sent by Atlassian JIRA (v7.6.3#76005)
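Option (2) from the issue body — multi-threading the listStatus calls — can be sketched with a plain `ExecutorService`. This is a minimal illustration, not Hive's implementation: `isEmptyPath` is a hypothetical stand-in for a `FileSystem.listStatus`-based emptiness check.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Predicate;

public class ParallelEmptyPathCheck {
    // Runs the per-path emptiness check concurrently instead of sequentially,
    // which is where the wins come from on high-latency stores like S3.
    static Set<String> findEmptyPaths(List<String> inputPaths,
                                      Predicate<String> isEmptyPath,
                                      int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Boolean>> results = new LinkedHashMap<>();
            for (String path : inputPaths) {
                results.put(path, pool.submit(() -> isEmptyPath.test(path)));
            }
            Set<String> empty = new LinkedHashSet<>();
            for (Map.Entry<String, Future<Boolean>> e : results.entrySet()) {
                try {
                    if (e.getValue().get()) {
                        empty.add(e.getKey());
                    }
                } catch (InterruptedException | ExecutionException ex) {
                    throw new RuntimeException("listing failed for " + e.getKey(), ex);
                }
            }
            return empty;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Set<String> empty = findEmptyPaths(
            List.of("/a", "/b", "/c"), p -> p.equals("/b"), 4);
        assert empty.equals(Set.of("/b"));
        System.out.println(empty);
    }
}
```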
[jira] [Commented] (HIVE-21404) MSSQL upgrade script alters the wrong column
[ https://issues.apache.org/jira/browse/HIVE-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806516#comment-16806516 ] Hive QA commented on HIVE-21404: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 20 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 11m 24s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16792/dev-support/hive-personality.sh | | git revision | master / f8a73a8 | | Default Java | 1.8.0_111 | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16792/yetus/whitespace-tabs.txt | | modules | C: standalone-metastore/metastore-server U: standalone-metastore/metastore-server | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16792/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > MSSQL upgrade script alters the wrong column > > > Key: HIVE-21404 > URL: https://issues.apache.org/jira/browse/HIVE-21404 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.2.0 >Reporter: David Lavati >Assignee: David Lavati >Priority: Major > Labels: pull-request-available > Fix For: 3.2.0 > > Attachments: HIVE-21404.1.patch, HIVE-21404.2.patch, > HIVE-21404.3.patch, HIVE-21404.4.patch, HIVE-21404.4.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-20221 changes PARTITION_PARAMS, so the following command is modifying > the wrong table: > {{ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);}} > https://github.com/apache/hive/blob/d3b036920acde7bb04840697eb13038103b062b4/standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.1.0-to-3.2.0.mssql.sql#L21 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
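Given the description, the intended statement presumably targets PARTITION_PARAMS (the table HIVE-20221 widened) rather than SERDE_PARAMS. The corrected form would be something like the following — an inference from the ticket, not a quote from the committed patch:

```sql
ALTER TABLE "PARTITION_PARAMS" ALTER COLUMN "PARAM_VALUE" nvarchar(MAX);
```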
[jira] [Commented] (HIVE-14836) Test the predicate pushing down support for Parquet vectorization read path
[ https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806928#comment-16806928 ] Xinli Shang commented on HIVE-14836: Hi [~Ferd] , is this task done? > Test the predicate pushing down support for Parquet vectorization read path > --- > > Key: HIVE-14836 > URL: https://issues.apache.org/jira/browse/HIVE-14836 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu >Priority: Major > Labels: pull-request-available > Attachments: HIVE-14836.patch > > > We should add more UT test for predicate pushing down support for Parquet > vectorization read path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18576) Support to read nested complex type with Parquet in vectorization mode
[ https://issues.apache.org/jira/browse/HIVE-18576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806930#comment-16806930 ] Xinli Shang commented on HIVE-18576: Hi [~jerrychenhf] , is this task done? > Support to read nested complex type with Parquet in vectorization mode > -- > > Key: HIVE-18576 > URL: https://issues.apache.org/jira/browse/HIVE-18576 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Haifeng Chen >Priority: Major > > Nested complex type is common used, eg: Struct, s2 > List>. Currently, nested complex type can't be parsed in vectorization > mode, this ticket is target to support it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
[ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221331 ] ASF GitHub Bot logged work on HIVE-21529: - Author: ASF GitHub Bot Created on: 01/Apr/19 15:33 Start Date: 01/Apr/19 15:33 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump. URL: https://github.com/apache/hive/pull/581#discussion_r270896979 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot); dmd.write(); -// If external tables are enabled for replication and -// - If bootstrap is enabled, then need to combine bootstrap dump of external tables. -// - If metadata-only dump is enabled, then shall skip dumping external tables data locations to -// _external_tables_info file. If not metadata-only, then dump the data locations. -if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES) -&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY) -|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) { +// Examine all the tables if required. +if (shouldExamineTablesToDump()) { Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true); + + // If we are bootstrapping ACID tables, stop all the concurrent transactions and take a + // snapshot to dump those tables. Record the last event id in case we are performing + // bootstrap of ACID tables. + String validTxnList = null; + long bootstrapLastReplId = 0; + if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) { +validTxnList = getValidTxnListForReplDump(hiveDb); +bootstrapLastReplId = hiveDb.getMSC().getCurrentNotificationEventId().getEventId(); Review comment: bootstrapLastReplId should be captured before open txn of REPL DUMP query. 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221331) Time Spent: 20m (was: 10m) > Hive support bootstrap of ACID/MM tables on an existing policy. > --- > > Key: HIVE-21529 > URL: https://issues.apache.org/jira/browse/HIVE-21529 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-21529.01.patch > > Time Spent: 20m > Remaining Estimate: 0h > > If ACID/MM tables to be enabled (hive.repl.dump.include.acid.tables) on an > existing repl policy, then need to combine bootstrap dump of these tables > along with the ongoing incremental dump. > Shall add a one time config "hive.repl.bootstrap.acid.tables" to include > bootstrap in the given dump. > The support for hive.repl.bootstrap.cleanup.type for ACID tables to clean-up > partially bootstrapped tables in case of retry is already in place, thanks to > the work done during external tables. Need to test that it actually works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
[ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221333 ] ASF GitHub Bot logged work on HIVE-21529: - Author: ASF GitHub Bot Created on: 01/Apr/19 15:33 Start Date: 01/Apr/19 15:33 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump. URL: https://github.com/apache/hive/pull/581#discussion_r270926635 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot); dmd.write(); -// If external tables are enabled for replication and -// - If bootstrap is enabled, then need to combine bootstrap dump of external tables. -// - If metadata-only dump is enabled, then shall skip dumping external tables data locations to -// _external_tables_info file. If not metadata-only, then dump the data locations. -if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES) -&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY) -|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) { +// Examine all the tables if required. +if (shouldExamineTablesToDump()) { Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true); + + // If we are bootstrapping ACID tables, stop all the concurrent transactions and take a + // snapshot to dump those tables. Record the last event id in case we are performing + // bootstrap of ACID tables. 
+ String validTxnList = null; + long bootstrapLastReplId = 0; + if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) { +validTxnList = getValidTxnListForReplDump(hiveDb); +bootstrapLastReplId = hiveDb.getMSC().getCurrentNotificationEventId().getEventId(); Review comment: There is a corner case where this logic can be a problem even for full bootstrap. Driver.java: L663 if ((queryState.getHiveOperation() != null) && queryState.getHiveOperation().equals(HiveOperation.REPLDUMP)) { setLastReplIdForDump(queryState.getConf()); } openTransaction(); Here we capture the last repl ID just before opening the txn from the REPL DUMP execution thread (let's say T1). If a concurrent thread (let's say T2) opens a txn after T1 gets the last repl ID but before openTransaction() in T1, and T2 writes into the ACID table, then REPL DUMP would wait for T2 to commit its txn before dumping the ACID table. Now, the snapshot of the bootstrap-dumped table includes the data written by thread T2. But those events will be part of the subsequent incremental dump. When we apply these OpenTxn, allocateWriteId and commitTxn events, it may create duplicate data. We need some idempotent logic on the replica side to handle this. Please check if my theory makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221333) Time Spent: 40m (was: 0.5h) > Hive support bootstrap of ACID/MM tables on an existing policy.
> --- > > Key: HIVE-21529 > URL: https://issues.apache.org/jira/browse/HIVE-21529 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-21529.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > > If ACID/MM tables to be enabled (hive.repl.dump.include.acid.tables) on an > existing repl policy, then need to combine bootstrap dump of these tables > along with the ongoing incremental dump. > Shall add a one time config "hive.repl.bootstrap.acid.tables" to include > bootstrap in the given dump. > The support for hive.repl.bootstrap.cleanup.type for ACID tables to clean-up > partially bootstrapped tables in case of retry is already in place, thanks to > the work done during external tables. Need to test that it actually works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
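The interleaving described in the review comment above can be simulated in a few lines. The event log, snapshot, and replay here are toy stand-ins (not Hive's notification events or bootstrap dump), used only to show how replaying a commit that the snapshot already covers duplicates a row when the replica applies events with no idempotence check.

```java
import java.util.*;

public class ReplRaceDemo {
    // A commit event in the toy event log: its notification id and the row it wrote.
    static final class Commit {
        final long eventId;
        final String row;
        Commit(long eventId, String row) { this.eventId = eventId; this.row = row; }
    }

    // Replica state = bootstrap snapshot + replay of all events newer than the
    // last repl id captured at dump time, with no idempotence check.
    static List<String> replicate(long capturedLastReplId,
                                  Set<String> bootstrapSnapshot,
                                  List<Commit> eventLog) {
        List<String> replica = new ArrayList<>(bootstrapSnapshot);
        for (Commit c : eventLog) {
            if (c.eventId > capturedLastReplId) {
                replica.add(c.row); // blindly re-applies the commit
            }
        }
        return replica;
    }

    public static void main(String[] args) {
        // T1 (REPL DUMP) captures last event id = 10, then opens its txn.
        // T2 opened a txn in between, wrote "r1", and commits as event 11.
        // The dump waits for T2, so the bootstrap snapshot already contains "r1" —
        // yet event 11 is newer than the captured id and gets replayed too.
        List<String> replica = replicate(10, Set.of("r1"), List.of(new Commit(11, "r1")));
        assert replica.stream().filter("r1"::equals).count() == 2; // duplicated
        System.out.println(replica);
    }
}
```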
[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
[ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221335=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221335 ] ASF GitHub Bot logged work on HIVE-21529: - Author: ASF GitHub Bot Created on: 01/Apr/19 15:33 Start Date: 01/Apr/19 15:33 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump. URL: https://github.com/apache/hive/pull/581#discussion_r270897339 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot); Review comment: Events dump should be limited until last repl ID that was captured before open txn. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221335) Time Spent: 1h (was: 50m) > Hive support bootstrap of ACID/MM tables on an existing policy. > --- > > Key: HIVE-21529 > URL: https://issues.apache.org/jira/browse/HIVE-21529 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-21529.01.patch > > Time Spent: 1h > Remaining Estimate: 0h > > If ACID/MM tables to be enabled (hive.repl.dump.include.acid.tables) on an > existing repl policy, then need to combine bootstrap dump of these tables > along with the ongoing incremental dump. > Shall add a one time config "hive.repl.bootstrap.acid.tables" to include > bootstrap in the given dump. 
> The support for hive.repl.bootstrap.cleanup.type for ACID tables to clean-up > partially bootstrapped tables in case of retry is already in place, thanks to > the work done during external tables. Need to test that it actually works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
[ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221332 ] ASF GitHub Bot logged work on HIVE-21529: - Author: ASF GitHub Bot Created on: 01/Apr/19 15:33 Start Date: 01/Apr/19 15:33 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump. URL: https://github.com/apache/hive/pull/581#discussion_r270898935 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -193,27 +226,39 @@ private Long incrementalDump(Path dumpRoot, DumpMetaData dmd, Path cmRoot, Hive dmd.setDump(DumpType.INCREMENTAL, work.eventFrom, lastReplId, cmRoot); dmd.write(); -// If external tables are enabled for replication and -// - If bootstrap is enabled, then need to combine bootstrap dump of external tables. -// - If metadata-only dump is enabled, then shall skip dumping external tables data locations to -// _external_tables_info file. If not metadata-only, then dump the data locations. -if (conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_EXTERNAL_TABLES) -&& (!conf.getBoolVar(HiveConf.ConfVars.REPL_DUMP_METADATA_ONLY) -|| conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_EXTERNAL_TABLES))) { +// Examine all the tables if required. +if (shouldExamineTablesToDump()) { Path dbRoot = getBootstrapDbRoot(dumpRoot, dbName, true); + + // If we are bootstrapping ACID tables, stop all the concurrent transactions and take a + // snapshot to dump those tables. Record the last event id in case we are performing + // bootstrap of ACID tables. 
+ String validTxnList = null; + long bootstrapLastReplId = 0; + if (conf.getBoolVar(HiveConf.ConfVars.REPL_BOOTSTRAP_ACID_TABLES)) { +validTxnList = getValidTxnListForReplDump(hiveDb); +bootstrapLastReplId = hiveDb.getMSC().getCurrentNotificationEventId().getEventId(); + } + try (Writer writer = new Writer(dumpRoot, conf)) { Review comment: Shall we create the _external_info file only if shouldDumpExternalTableLocation? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221332) Time Spent: 0.5h (was: 20m) > Hive support bootstrap of ACID/MM tables on an existing policy. > --- > > Key: HIVE-21529 > URL: https://issues.apache.org/jira/browse/HIVE-21529 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-21529.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > If ACID/MM tables to be enabled (hive.repl.dump.include.acid.tables) on an > existing repl policy, then need to combine bootstrap dump of these tables > along with the ongoing incremental dump. > Shall add a one time config "hive.repl.bootstrap.acid.tables" to include > bootstrap in the given dump. > The support for hive.repl.bootstrap.cleanup.type for ACID tables to clean-up > partially bootstrapped tables in case of retry is already in place, thanks to > the work done during external tables. Need to test that it actually works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
[ https://issues.apache.org/jira/browse/HIVE-21529?focusedWorklogId=221334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221334 ] ASF GitHub Bot logged work on HIVE-21529: - Author: ASF GitHub Bot Created on: 01/Apr/19 15:33 Start Date: 01/Apr/19 15:33 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump. URL: https://github.com/apache/hive/pull/581#discussion_r270911338 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/Utils.java ## @@ -196,6 +196,11 @@ public static boolean shouldReplicate(ReplicationSpec replicationSpec, Table tab } return shouldReplicateExternalTables; } + + // Skip dumping events related to ACID tables if bootstrap is enabled on it Review comment: shouldReplicate is not checked by events such as AllocateWriteIdEvent and CommitTxnEvent. CommitTxnEvent cannot be skipped but need to remove all AcidWriteEvents which are packed along with it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221334) Time Spent: 50m (was: 40m) > Hive support bootstrap of ACID/MM tables on an existing policy. > --- > > Key: HIVE-21529 > URL: https://issues.apache.org/jira/browse/HIVE-21529 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 4.0.0 >Reporter: Sankar Hariappan >Assignee: Ashutosh Bapat >Priority: Major > Labels: DR, pull-request-available, replication > Attachments: HIVE-21529.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > > If ACID/MM tables to be enabled (hive.repl.dump.include.acid.tables) on an > existing repl policy, then need to combine bootstrap dump of these tables > along with the ongoing incremental dump. 
> Shall add a one time config "hive.repl.bootstrap.acid.tables" to include > bootstrap in the given dump. > The support for hive.repl.bootstrap.cleanup.type for ACID tables to clean-up > partially bootstrapped tables in case of retry is already in place, thanks to > the work done during external tables. Need to test that it actually works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16795) Measure Performance for Parquet Vectorization Reader
[ https://issues.apache.org/jira/browse/HIVE-16795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806927#comment-16806927 ] Xinli Shang commented on HIVE-16795: Hi [~Ferd] , is this effort done? Any numbers you can share? Sorry If I missed other channels for this task since I just joined recently. > Measure Performance for Parquet Vectorization Reader > > > Key: HIVE-16795 > URL: https://issues.apache.org/jira/browse/HIVE-16795 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Colin Ma >Priority: Major > > We need to measure the performance of Parquet Vectorization reader feature > using TPCx-BB or TPC-DS to see how much performance gain we can archive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null
[ https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-21557: -- Attachment: HIVE-21557.02.patch > Query based compaction fails with NullPointerException: Non-local session > path expected to be non-null > -- > > Key: HIVE-21557 > URL: https://issues.apache.org/jira/browse/HIVE-21557 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-21557.02.patch, HIVE-21557.patch > > > {code:java} > 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 > db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" > level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] > org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if > exists default_tmp_compactor_asd_1553864659196 > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57) > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194) > Caused by: java.lang.NullPointerException: Non-local session path expected to > be non-null > at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228) > at > org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838) > at org.apache.hadoop.hive.ql.Context.(Context.java:319) > at org.apache.hadoop.hive.ql.Context.(Context.java:305) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) > at > 
org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null
[ https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806917#comment-16806917 ] Ashutosh Chauhan commented on HIVE-21557: - +1 > Query based compaction fails with NullPointerException: Non-local session > path expected to be non-null > -- > > Key: HIVE-21557 > URL: https://issues.apache.org/jira/browse/HIVE-21557 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Attachments: HIVE-21557.02.patch, HIVE-21557.patch > > > {code:java} > 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 > db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" > level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] > org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if > exists default_tmp_compactor_asd_1553864659196 > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57) > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194) > Caused by: java.lang.NullPointerException: Non-local session path expected to > be non-null > at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228) > at > org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838) > at org.apache.hadoop.hive.ql.Context.(Context.java:319) > at org.apache.hadoop.hive.ql.Context.(Context.java:305) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764) > at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) > at > org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221203 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 09:58 Start Date: 01/Apr/19 09:58 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270794735 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java ## @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, String lastReplicationId, return dumpTuple.lastReplicationId; } + /** + * Run a bootstrap that will fail. + * @param tuple the location of bootstrap dump + */ + private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable { +// fail setting ckpt directory property for the second table so that we test the case when +// bootstrap load fails after some but not all tables are loaded. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntTables = 0; + @Nullable + @Override + public Boolean apply(@Nullable CallerArguments args) { +cntTables++; +if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) { + injectionPathCalled = true; + LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName); + return false; +} +return true; + } +}; + +InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier); +try { + replica.loadFailure(replicatedDbName, tuple.dumpLocation); + callerVerifier.assertInjectionsPerformed(true, false); +} finally { + InjectableBehaviourObjectStore.resetAlterTableModifier(); +} Review comment: I don't think we need to be really hard and fast about the exact number of tables loaded. All we are testing is whether there was a failure and the retry loaded the stats successfully. 
Current set of checks is enough for that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221203) Time Spent: 10h (was: 9h 50m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 10h > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
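The injection discussed in this review is a counting verifier: it lets the first N calls through and then returns false to force a failure partway through the load. The pattern can be sketched in plain Java (names here are illustrative — this is not Hive's InjectableBehaviourObjectStore/BehaviourInjection API):

```java
import java.util.function.Function;

public class FailAfterN {
    // Returns a verifier that allows the first n calls and fails afterwards,
    // mirroring the "fail after some but not all tables are loaded" setup.
    static Function<String, Boolean> failAfter(int n) {
        return new Function<>() {
            int calls = 0;
            @Override
            public Boolean apply(String tableName) {
                return ++calls <= n; // false => injected failure
            }
        };
    }

    public static void main(String[] args) {
        Function<String, Boolean> verifier = failAfter(2);
        System.out.println(verifier.apply("t1")); // allowed
        System.out.println(verifier.apply("t2")); // allowed
        System.out.println(verifier.apply("t3")); // injected failure
    }
}
```

As the comment notes, the test only needs to establish that a failure happened mid-load and that the retry completed; it does not depend on exactly which table triggered it.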
[jira] [Assigned] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksandr Polishchuk reassigned HIVE-21532: --- Assignee: Oleksiy Sayankin (was: Oleksandr Polishchuk) > RuntimeException due to AccessControlException during creating > hive-staging-dir > --- > > Key: HIVE-21532 > URL: https://issues.apache.org/jira/browse/HIVE-21532 > Project: Hive > Issue Type: Bug >Reporter: Oleksandr Polishchuk >Assignee: Oleksiy Sayankin >Priority: Minor > Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, > HIVE-21532.2.patch > > > The bug was found with environment - Hive-2.3. > Steps lead to an exception: > 1) Create user without root permissions on your node. > 2) The {{hive-site.xml}} file has to contain the next properties: > {code:java} > > hive.security.authorization.enabled > true > > > hive.security.authorization.manager > > org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory > > {code} > 3) Open Hive CLI and do next query: > {code:java} > insert overwrite local directory '/tmp/test_dir' row format delimited fields > terminated by ',' select * from temp.test; > {code} > The previous query will fails with the next exception: > {code:java} > FAILED: RuntimeException Cannot create staging directory > 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': > User testuser(user id 3456) has been denied access to create > .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1 > {code} > The investigation shows that if delete the mentioned above properties from > {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in > the {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}} as was in the > Hive-2.1. everything will be fine. 
The current method is used in the > {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc > = ctx.getTempDirForPath(dest_path).toString();}}
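The access failure above comes down to where the staging directory is rooted: under the user-supplied destination directory (dest_path) the restricted user may not be allowed to create subdirectories, while the session scratch directory (queryTmpdir) is user-writable. A hypothetical sketch of the difference (paths and the class name are invented for illustration; this is not Hive's Context#getTempDirForPath):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class StagingDirChoice {
    // The staging directory is simply created under the chosen base directory.
    static Path stagingDirUnder(Path base, String stagingName) {
        return base.resolve(stagingName);
    }

    public static void main(String[] args) {
        String staging = ".hive-staging_hive_example-1";
        // dest_path: the query's output dir; creating a subdirectory here can
        // be denied for a restricted user (the AccessControlException above).
        Path destPath = Paths.get("/tmp/test_dir");
        // queryTmpdir: the session scratch dir, writable by the session user.
        Path queryTmpdir = Paths.get("/user/testuser/scratch");
        System.out.println(stagingDirUnder(destPath, staging));
        System.out.println(stagingDirUnder(queryTmpdir, staging));
    }
}
```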
[jira] [Commented] (HIVE-21511) beeline -f report no such file if file is not on local fs
[ https://issues.apache.org/jira/browse/HIVE-21511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806598#comment-16806598 ] Zoltan Haindrich commented on HIVE-21511: - +1 patch 1 had a green run; patch 2 only added a new test (which is also passing) > beeline -f report no such file if file is not on local fs > - > > Key: HIVE-21511 > URL: https://issues.apache.org/jira/browse/HIVE-21511 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Bruno Pusztahazi >Assignee: Bruno Pusztahazi >Priority: Blocker > Labels: patch > Attachments: HIVE-21511.1.patch, HIVE-21511.2.patch, > HIVE-21511.3.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > I test like this > HQL=hdfs://hacluster/tmp/ff.hql > if hadoop fs -test -f ${HQL} > then > beeline -f ${HQL} > fi > test ${HQL} ok, but beeline report ${HQL} no such file or directory -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21034) Add option to schematool to drop Hive databases
[ https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-21034: Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) pushed to master. Thank you [~dvoros]! > Add option to schematool to drop Hive databases > --- > > Key: HIVE-21034 > URL: https://issues.apache.org/jira/browse/HIVE-21034 > Project: Hive > Issue Type: Improvement >Reporter: Daniel Voros >Assignee: Daniel Voros >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, > HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, > HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch > > > An option to remove all Hive managed data could be a useful addition to > {{schematool}}. > I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all > databases with CASCADE* to remove all data of managed tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221241 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 11:13 Start Date: 01/Apr/19 11:13 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270818464 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java ## @@ -297,21 +303,34 @@ private ColumnStatisticsDesc getColumnStatsDesc(String dbName, private int persistColumnStats(Hive db) throws HiveException, MetaException, IOException { ColumnStatistics colStats = constructColumnStatsFromInput(); -ColumnStatisticsDesc colStatsDesc = colStats.getStatsDesc(); -// We do not support stats replication for a transactional table yet. If we are converting -// a non-transactional table to a transactional table during replication, we might get -// column statistics but we shouldn't update those. -if (work.getColStats() != null && - AcidUtils.isTransactionalTable(getHive().getTable(colStatsDesc.getDbName(), - colStatsDesc.getTableName( { - LOG.debug("Skipped updating column stats for table " + -TableName.getDbTable(colStatsDesc.getDbName(), colStatsDesc.getTableName()) + -" because it is converted to a transactional table during replication."); - return 0; -} - SetPartitionsStatsRequest request = new SetPartitionsStatsRequest(Collections.singletonList(colStats)); + +// Set writeId and validWriteId list for replicated statistics. +if (work.getColStats() != null) { + String dbName = colStats.getStatsDesc().getDbName(); + String tblName = colStats.getStatsDesc().getTableName(); + Table tbl = db.getTable(dbName, tblName); + long writeId = work.getWriteId(); + // If it's a transactional table on source and target, we will get a valid writeId + // associated with it. 
Otherwise it's a non-transactional table on source migrated to a + // transactional table on target, we need to craft a valid writeId here. + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList writeIds; +if (writeId <= 0) { + Long tmpWriteId = ReplUtils.getMigrationCurrentTblWriteId(conf); + if (tmpWriteId == null) { +throw new HiveException("DDLTask : Write id is not set in the config by open txn task for migration"); + } + writeId = tmpWriteId; +} +writeIds = new ValidReaderWriteIdList(TableName.getDbTable(dbName, tblName), new long[0], Review comment: I think, this assumption can change in future if someone uses this task to update stats even in non-repl flow. I suggest to add explicit check for repl scope. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221241) Time Spent: 11h (was: 10h 50m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 11h > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
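The review comment above hinges on crafting a valid write-id list for the replicated statistics. Conceptually, such a list is a high watermark plus a set of exceptional (open or aborted) write ids; a write id is valid iff it is at or below the watermark and not in the exception set. An illustrative reimplementation (not Hive's actual ValidReaderWriteIdList):

```java
import java.util.Set;

public class SimpleWriteIdList {
    private final long highWatermark;
    private final Set<Long> exceptions; // open or aborted write ids

    SimpleWriteIdList(long highWatermark, Set<Long> exceptions) {
        this.highWatermark = highWatermark;
        this.exceptions = exceptions;
    }

    boolean isWriteIdValid(long writeId) {
        return writeId <= highWatermark && !exceptions.contains(writeId);
    }

    public static void main(String[] args) {
        // Watermark 10, with write id 7 still open.
        SimpleWriteIdList ids = new SimpleWridList(10, Set.of(7L));
        System.out.println(ids.isWriteIdValid(5));  // committed, visible
        System.out.println(ids.isWriteIdValid(7));  // open, not visible
        System.out.println(ids.isWriteIdValid(11)); // above watermark
    }
}
```

The code in the diff builds the degenerate case of this (an empty exception array), which is what the reviewer flags as safe only within the replication flow.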
[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806701#comment-16806701 ] Hive QA commented on HIVE-21532: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964427/HIVE-21532.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 15890 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230) org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty (batchId=230) org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMultiInsert (batchId=327) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16794/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16794/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16794/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: 
TestsFailedException: 11 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12964427 - PreCommit-HIVE-Build > RuntimeException due to AccessControlException during creating > hive-staging-dir > --- > > Key: HIVE-21532 > URL: https://issues.apache.org/jira/browse/HIVE-21532 > Project: Hive > Issue Type: Bug >Reporter: Oleksandr Polishchuk >Assignee: Oleksiy Sayankin >Priority: Minor > Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, > HIVE-21532.2.patch, HIVE-21532.3.patch > > > The bug was found with environment - Hive-2.3. > Steps lead to an exception: > 1) Create user without root permissions on your node. > 2) The {{hive-site.xml}} file has to contain the next properties: > {code:java} > > hive.security.authorization.enabled > true > > > hive.security.authorization.manager > > org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory > > {code} > 3) Open Hive CLI and do next query: > {code:java} > insert overwrite local directory '/tmp/test_dir' row format delimited fields > terminated by ',' select * from temp.test; > {code} > The previous query will fails with the next exception: > {code:java} > FAILED: RuntimeException Cannot create staging directory > 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': > User testuser(user id 3456) has been denied access to create > .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1 > {code} > The investigation shows that if delete the mentioned above properties from > {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in > the {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}} as was in the > Hive-2.1. everything will be fine. The current method is using in the > {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc > = ctx.getTempDirForPath(dest_path).toString();}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221207 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 10:02 Start Date: 01/Apr/19 10:02 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270795998 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java ## @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, String lastReplicationId, return dumpTuple.lastReplicationId; } + /** + * Run a bootstrap that will fail. + * @param tuple the location of bootstrap dump + */ + private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable { +// fail setting ckpt directory property for the second table so that we test the case when +// bootstrap load fails after some but not all tables are loaded. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntTables = 0; + @Nullable + @Override + public Boolean apply(@Nullable CallerArguments args) { +cntTables++; +if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) { + injectionPathCalled = true; + LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName); + return false; +} +return true; + } +}; + +InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier); +try { + replica.loadFailure(replicatedDbName, tuple.dumpLocation); + callerVerifier.assertInjectionsPerformed(true, false); +} finally { + InjectableBehaviourObjectStore.resetAlterTableModifier(); +} + } + + private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int failAfterNumEvents) throws Throwable { +// fail add notification when updating table stats after given number of such events. 
Thus we +// test successful application as well as failed application of this event. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntEvents = 0; + @Override + public Boolean apply(NotificationEvent entry) { +cntEvents++; +if (entry.getEventType().equalsIgnoreCase(EventMessage.EventType.UPDATE_TABLE_COLUMN_STAT.toString()) && +cntEvents > failAfterNumEvents) { + injectionPathCalled = true; + LOG.warn("Verifier - DB: " + entry.getDbName() + + " Table: " + entry.getTableName() + + " Event: " + entry.getEventType()); + return false; +} +return true; + } +}; + +InjectableBehaviourObjectStore.setAddNotificationModifier(callerVerifier); +try { + replica.loadFailure(replicatedDbName, dumpTuple.dumpLocation); +} finally { + InjectableBehaviourObjectStore.resetAddNotificationModifier(); +} +callerVerifier.assertInjectionsPerformed(true, false); + +// fail add notification when updating partition stats for for the second time. Thus we test +// successful application as well as failed application of this event. +callerVerifier = new BehaviourInjection() { + int cntEvents = 1; + + @Override + public Boolean apply(NotificationEvent entry) { +cntEvents++; +if (entry.getEventType().equalsIgnoreCase(EventMessage.EventType.UPDATE_PARTITION_COLUMN_STAT.toString()) && +cntEvents > failAfterNumEvents) { + injectionPathCalled = true; + LOG.warn("Verifier - DB: " + entry.getDbName() + + " Table: " + entry.getTableName() + + " Event: " + entry.getEventType()); + return false; +} +return true; + } +}; + +InjectableBehaviourObjectStore.setAddNotificationModifier(callerVerifier); +try { + replica.loadFailure(replicatedDbName, dumpTuple.dumpLocation); +} finally { + InjectableBehaviourObjectStore.resetAddNotificationModifier(); +} Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221207) Time Spent: 10h 20m (was: 10h 10m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 >
[jira] [Updated] (HIVE-19034) hadoop fs test can check srcipt ok, but beeline -f report no such file
[ https://issues.apache.org/jira/browse/HIVE-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-19034: Resolution: Duplicate Status: Resolved (was: Patch Available) duplicate of HIVE-21511 > hadoop fs test can check srcipt ok, but beeline -f report no such file > -- > > Key: HIVE-19034 > URL: https://issues.apache.org/jira/browse/HIVE-19034 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.3.0, beeline-cli-branch > Environment: java version: 1.8.0_112-b15 > hadoop version: 2.7.2 > hive version:1.3.0 > hive JDBS version: 1.3.0 > beeline version: 1.3.0 >Reporter: fengxianghui >Assignee: Bruno Pusztahazi >Priority: Blocker > Labels: patch, todoc4.0 > Fix For: 1.3.0 > > Attachments: HIVE-19034.1.patch > > Original Estimate: 0.05h > Remaining Estimate: 0.05h > > I test like this > HQL=hdfs://hacluster/tmp/ff.hql > if hadoop fs -test -f ${HQL} > then > beeline -f ${HQL} > fi > test ${HQL} ok, but beeline report ${HQL} no such file or directory -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806622#comment-16806622 ] Oleksiy Sayankin commented on HIVE-21532: - Rebased the patch against master. Let's see if there any conflicts. > RuntimeException due to AccessControlException during creating > hive-staging-dir > --- > > Key: HIVE-21532 > URL: https://issues.apache.org/jira/browse/HIVE-21532 > Project: Hive > Issue Type: Bug >Reporter: Oleksandr Polishchuk >Assignee: Oleksiy Sayankin >Priority: Minor > Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, > HIVE-21532.2.patch, HIVE-21532.3.patch > > > The bug was found with environment - Hive-2.3. > Steps lead to an exception: > 1) Create user without root permissions on your node. > 2) The {{hive-site.xml}} file has to contain the next properties: > {code:java} > > hive.security.authorization.enabled > true > > > hive.security.authorization.manager > > org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory > > {code} > 3) Open Hive CLI and do next query: > {code:java} > insert overwrite local directory '/tmp/test_dir' row format delimited fields > terminated by ',' select * from temp.test; > {code} > The previous query will fails with the next exception: > {code:java} > FAILED: RuntimeException Cannot create staging directory > 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': > User testuser(user id 3456) has been denied access to create > .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1 > {code} > The investigation shows that if delete the mentioned above properties from > {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in > the {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}} as was in the > Hive-2.1. everything will be fine. 
The current method is used in the > {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc > = ctx.getTempDirForPath(dest_path).toString();}}
[jira] [Comment Edited] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806622#comment-16806622 ] Oleksiy Sayankin edited comment on HIVE-21532 at 4/1/19 11:34 AM: -- Rebased the patch against master. Let's see if there are any conflicts. was (Author: osayankin): Rebased the patch against master. Let's see if there any conflicts. > RuntimeException due to AccessControlException during creating > hive-staging-dir > --- > > Key: HIVE-21532 > URL: https://issues.apache.org/jira/browse/HIVE-21532 > Project: Hive > Issue Type: Bug >Reporter: Oleksandr Polishchuk >Assignee: Oleksiy Sayankin >Priority: Minor > Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, > HIVE-21532.2.patch, HIVE-21532.3.patch > > > The bug was found with environment - Hive-2.3. > Steps lead to an exception: > 1) Create user without root permissions on your node. > 2) The {{hive-site.xml}} file has to contain the next properties: > {code:java} > > hive.security.authorization.enabled > true > > > hive.security.authorization.manager > > org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory > > {code} > 3) Open Hive CLI and do next query: > {code:java} > insert overwrite local directory '/tmp/test_dir' row format delimited fields > terminated by ',' select * from temp.test; > {code} > The previous query will fails with the next exception: > {code:java} > FAILED: RuntimeException Cannot create staging directory > 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': > User testuser(user id 3456) has been denied access to create > .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1 > {code} > The investigation shows that if delete the mentioned above properties from > {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in > the {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}} as was in the > Hive-2.1. everything will be fine. 
The current method is used in the > {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc > = ctx.getTempDirForPath(dest_path).toString();}}
[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null
[ https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-21557: -- Attachment: HIVE-21557.patch > Query based compaction fails with NullPointerException: Non-local session > path expected to be non-null > -- > > Key: HIVE-21557 > URL: https://issues.apache.org/jira/browse/HIVE-21557 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Priority: Major > Attachments: HIVE-21557.patch > > > {code:java} > 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 > db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" > level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] > org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if > exists default_tmp_compactor_asd_1553864659196 > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57) > at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408) > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250) > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194) > Caused by: java.lang.NullPointerException: Non-local session path expected to > be non-null > at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228) > at > org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838) > at org.apache.hadoop.hive.ql.Context.(Context.java:319) > at org.apache.hadoop.hive.ql.Context.(Context.java:305) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753) > at > 
org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}
[jira] [Updated] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null
[ https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-21557: -- Assignee: Peter Vary Status: Patch Available (was: Open)

> Query based compaction fails with NullPointerException: Non-local session path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
> Issue Type: Bug
> Reporter: Peter Vary
> Assignee: Peter Vary
> Priority: Major
> Attachments: HIVE-21557.patch
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806592#comment-16806592 ] Peter Vary commented on HIVE-9995: -- [~dkuzmenko]: Rebase please :( > ACID compaction tries to compact a single file > -- > > Key: HIVE-9995 > URL: https://issues.apache.org/jira/browse/HIVE-9995 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Denys Kuzmenko >Priority: Major > Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, > HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, > HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch, > HIVE-9995.WIP.patch > > > Consider TestWorker.minorWithOpenInMiddle() > since there is an open txnId=23, this doesn't have any meaningful minor > compaction work to do. The system still tries to compact a single delta file > for 21-22 id range, and effectively copies the file onto itself. > This is 1. inefficient and 2. can potentially affect a reader. 
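The no-op condition described above can be pictured with a small sketch. This is an illustration only, not Hive's actual compactor code: the class name `CompactionGuard`, the method `isMinorCompactionUseful`, and the `{minWriteId, maxWriteId}` pair representation are all invented for this example.

```java
import java.util.List;

// Sketch: minor compaction is only useful when there are at least two deltas
// to merge; compacting a single delta (e.g. the 21-22 range above) just
// copies the file onto itself.
public class CompactionGuard {
    // Each delta is represented here as a {minWriteId, maxWriteId} pair.
    public static boolean isMinorCompactionUseful(List<long[]> deltas) {
        return deltas.size() > 1;
    }

    public static void main(String[] args) {
        // Single delta covering the 21-22 id range: nothing meaningful to do.
        System.out.println(isMinorCompactionUseful(List.of(new long[]{21, 22})));
        // Two separate deltas: merging them is worthwhile.
        System.out.println(isMinorCompactionUseful(
                List.of(new long[]{17, 17}, new long[]{18, 18})));
    }
}
```

A guard like this, applied before scheduling the job, would avoid both the wasted I/O and the reader-visible churn mentioned above.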
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r-- 1 ekoifman staff 602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r-- 1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r-- 1 ekoifman staff 514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r-- 1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r-- 1 ekoifman staff 588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r-- 1 ekoifman staff 500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x - ekoifman staff 0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r-- 1 ekoifman staff 612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221209 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 10:05 Start Date: 01/Apr/19 10:05 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270797067 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenariosMigrationNoAutogather.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.parse; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder; +import org.junit.BeforeClass; +import org.junit.Rule; +import org.junit.rules.TestName; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +/** + * Tests statistics replication for ACID tables. 
+ */ +public class TestStatsReplicationScenariosMigrationNoAutogather extends TestStatsReplicationScenarios { + @Rule + public final TestName testName = new TestName(); + + protected static final Logger LOG = LoggerFactory.getLogger(TestReplicationScenarios.class); Review comment: Removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221209) Time Spent: 10.5h (was: 10h 20m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 10.5h > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
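The writeId requirement in the issue description can be sketched with a toy model. This is not Hive's metastore schema; `StatsEntry`, `isValidFor`, and the high-watermark argument are assumptions made up for illustration.

```java
// Sketch: replicated column stats carry the writeId they were computed at.
// A reader on the replica trusts the stats only if that writeId is visible
// in its snapshot, which is why the id must be replicated from the source
// rather than allocated independently on the target.
public class StatsEntry {
    final long writeId;   // writeId the stats were gathered under
    final long rowCount;  // the statistic itself

    StatsEntry(long writeId, long rowCount) {
        this.writeId = writeId;
        this.rowCount = rowCount;
    }

    // Valid only if the reader's snapshot has already seen this writeId.
    boolean isValidFor(long highWatermark) {
        return writeId <= highWatermark;
    }

    public static void main(String[] args) {
        StatsEntry stats = new StatsEntry(42, 100_000);
        System.out.println(stats.isValidFor(50)); // snapshot includes writeId 42
        System.out.println(stats.isValidFor(40)); // stats appear to come from the "future"
    }
}
```

If the replica allocated its own writeIds, the second case could occur for perfectly good stats, so keeping the ids in sync is the simpler invariant.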
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221210=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221210 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 10:05 Start Date: 01/Apr/19 10:05 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270797151 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java ## @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, String lastReplicationId, return dumpTuple.lastReplicationId; } + /** + * Run a bootstrap that will fail. + * @param tuple the location of bootstrap dump + */ + private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable { +// fail setting ckpt directory property for the second table so that we test the case when +// bootstrap load fails after some but not all tables are loaded. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntTables = 0; + @Nullable + @Override + public Boolean apply(@Nullable CallerArguments args) { +cntTables++; Review comment: Hmm. Thanks for catching this. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221210) Time Spent: 10h 40m (was: 10.5h) > Stats replication for ACID tables. 
> -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 10h 40m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
[ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806649#comment-16806649 ] Hive QA commented on HIVE-21509: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12964421/HIVE-21509.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15886 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16793/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16793/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16793/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12964421 - PreCommit-HIVE-Build > LLAP may cache corrupted column vectors and return wrong query result > - > > Key: HIVE-21509 > URL: https://issues.apache.org/jira/browse/HIVE-21509 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, > HIVE-21509.2.patch, HIVE-21509.3.patch > > > In some scenarios, LLAP might store column vectors in cache that are getting > reused and reset just before their original content would be written. > The issue is a concurrency issue and is thereby flaky. 
It is not easy to reproduce, but the odds of surfacing this issue can be improved by setting the LLAP executor and IO thread counts this way:
> * set hive.llap.daemon.num.executors=32;
> * set hive.llap.io.threadpool.size=1;
> * using TPCDS input data for the store_sales table, have at least a couple of hundred thousand rows, and use text format:
> {code:java}
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES ( 'field.delim'='|', 'serialization.format'='|')
> STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'{code}
> * having more splits makes the issue more likely to show itself, so it is worth setting _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
> * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query will trigger reading from the LLAP cache and produce a wrong result: 0.
> If one wants to make sure of running into this issue, place a Thread.sleep(250) at the beginning of VectorDeserializeOrcWriter#run().
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
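The race described above, where a cached vector is reset and reused before its original content is written, is an instance of unsafe publication. A minimal sketch of the safe ordering follows; this is illustrative only (it is not the actual LLAP fix, and the latch-based handoff is an assumption for the example):

```java
import java.util.concurrent.Arrays;
import java.util.concurrent.CountDownLatch;

// Sketch: a buffer must not be handed to a shared cache until its writer
// signals completion; otherwise a reader can observe the reset content
// (the wrong "0" result above) instead of the real column data.
public class SafePublish {
    static long fillAndRead() {
        long[] vector = new long[1024];
        CountDownLatch written = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            java.util.Arrays.fill(vector, 2450816L); // fill column data
            written.countDown();                     // publish only when done
        });
        writer.start();

        try {
            written.await(); // "cache insertion" must wait for the writer
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return vector[0];
    }

    public static void main(String[] args) {
        System.out.println(fillAndRead()); // always the written value, never 0
    }
}
```

The `CountDownLatch` gives the reader a happens-before edge on everything the writer did before `countDown()`, which is exactly the guarantee the cache path is missing in the buggy scenario.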
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221240=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221240 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 11:10 Start Date: 01/Apr/19 11:10 Worklog Time Spent: 10m Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270817825 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenariosMigration.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.parse; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder; +import org.junit.BeforeClass; +import org.junit.Rule; +import org.junit.rules.TestName; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +/** + * Tests statistics replication for ACID tables. 
+ */ +public class TestStatsReplicationScenariosMigration extends TestStatsReplicationScenarios { + @Rule + public final TestName testName = new TestName(); + + protected static final Logger LOG = LoggerFactory.getLogger(TestReplicationScenarios.class); + + @BeforeClass + public static void classLevelSetup() throws Exception { +Map overrides = new HashMap<>(); +overrides.put(MetastoreConf.ConfVars.EVENT_MESSAGE_FACTORY.getHiveName(), +GzipJSONMessageEncoder.class.getCanonicalName()); + +HashMap replicaConfigs = new HashMap() {{ + put("hive.support.concurrency", "true"); + put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager"); + put("hive.metastore.client.capability.check", "false"); + put("hive.repl.bootstrap.dump.open.txn.timeout", "1s"); + put("hive.exec.dynamic.partition.mode", "nonstrict"); + put("hive.strict.checks.bucketing", "false"); + put("hive.mapred.mode", "nonstrict"); + put("mapred.input.dir.recursive", "true"); + put("hive.metastore.disallow.incompatible.col.type.changes", "false"); + put("hive.strict.managed.tables", "true"); +}}; +replicaConfigs.putAll(overrides); + +HashMap primaryConfigs = new HashMap() {{ + put("hive.metastore.client.capability.check", "false"); + put("hive.repl.bootstrap.dump.open.txn.timeout", "1s"); + put("hive.exec.dynamic.partition.mode", "nonstrict"); + put("hive.strict.checks.bucketing", "false"); + put("hive.mapred.mode", "nonstrict"); + put("mapred.input.dir.recursive", "true"); + put("hive.metastore.disallow.incompatible.col.type.changes", "false"); + put("hive.support.concurrency", "false"); + put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager"); + put("hive.strict.managed.tables", "false"); +}}; +primaryConfigs.putAll(overrides); + +internalBeforeClassSetup(primaryConfigs, replicaConfigs, Review comment: In migration case, we shall validate if stats are associated with correct writeId. I think, in our tests, it should be pointing to last allocated writeId. 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221240) Time Spent: 10h 50m (was: 10h 40m) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority:
[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806681#comment-16806681 ] Zoltan Haindrich commented on HIVE-21532: - test? https://issues.apache.org/jira/browse/HIVE-21532?focusedCommentId=16806615=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16806615

> RuntimeException due to AccessControlException during creating
> hive-staging-dir
> ---
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
> Issue Type: Bug
> Reporter: Oleksandr Polishchuk
> Assignee: Oleksiy Sayankin
> Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, HIVE-21532.2.patch, HIVE-21532.3.patch
>
> The bug was found on a Hive-2.3 environment.
> Steps to reproduce the exception:
> 1) Create a user without root permissions on your node.
> 2) The {{hive-site.xml}} file has to contain the following properties:
> {code:java}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:java}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields terminated by ',' select * from temp.test;
> {code}
> The query fails with the following exception:
> {code:java}
> FAILED: RuntimeException Cannot create staging directory 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': User testuser(user id 3456) has been denied access to create .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if you delete the properties mentioned above from {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as in Hive-2.1, everything works fine.
> The current method is used in {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} - {{String statsTmpLoc = ctx.getTempDirForPath(dest_path).toString();}}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21540) Query with join condition having date literal throws SemanticException.
[ https://issues.apache.org/jira/browse/HIVE-21540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806591#comment-16806591 ] Sankar Hariappan commented on HIVE-21540: - Thanks for the review [~kgyrtkirk]! I will create another ticket to track the TOK_TIMESTAMPLITERAL and TOK_TIMESTAMPLOCALTZLITERAL cases.

> Query with join condition having date literal throws SemanticException.
> ---
>
> Key: HIVE-21540
> URL: https://issues.apache.org/jira/browse/HIVE-21540
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 3.1.0, 3.1.1
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: Analyzer, DateField, pull-request-available
> Attachments: HIVE-21540.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> This semantic exception is thrown for the following query.
> *SemanticException '2019-03-20' encountered with 0 children*
> {code}
> create table date_1 (key int, dd date);
> create table date_2 (key int, dd date);
> select d1.key, d2.dd from (
>   select key, dd as start_dd, current_date as end_dd from date_1) d1
> join date_2 as d2 on d1.key = d2.key where d2.dd between start_dd and end_dd;
> {code}
> When the WHERE condition below is commented out, the query completes successfully.
> where d2.dd between start_dd and end_dd

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221206=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221206 ] ASF GitHub Bot logged work on HIVE-21109: - Author: ASF GitHub Bot Created on: 01/Apr/19 10:00 Start Date: 01/Apr/19 10:00 Worklog Time Spent: 10m Work Description: ashutosh-bapat commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables. URL: https://github.com/apache/hive/pull/579#discussion_r270795354 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java ## @@ -301,12 +338,106 @@ private String dumpLoadVerify(List tableNames, String lastReplicationId, return dumpTuple.lastReplicationId; } + /** + * Run a bootstrap that will fail. + * @param tuple the location of bootstrap dump + */ + private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable { +// fail setting ckpt directory property for the second table so that we test the case when +// bootstrap load fails after some but not all tables are loaded. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntTables = 0; + @Nullable + @Override + public Boolean apply(@Nullable CallerArguments args) { +cntTables++; +if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) { + injectionPathCalled = true; + LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName); + return false; +} +return true; + } +}; + +InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier); +try { + replica.loadFailure(replicatedDbName, tuple.dumpLocation); + callerVerifier.assertInjectionsPerformed(true, false); +} finally { + InjectableBehaviourObjectStore.resetAlterTableModifier(); +} + } + + private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int failAfterNumEvents) throws Throwable { +// fail add notification when updating table stats after given number of such events. 
Thus we +// test successful application as well as failed application of this event. +BehaviourInjection callerVerifier += new BehaviourInjection() { + int cntEvents = 0; + @Override + public Boolean apply(NotificationEvent entry) { +cntEvents++; Review comment: This code has changed while working on another related comment. Again we don't need to count exact number of events. We need at least one successful event and other one unsuccessful event. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 221206) Time Spent: 10h 10m (was: 10h) > Stats replication for ACID tables. > -- > > Key: HIVE-21109 > URL: https://issues.apache.org/jira/browse/HIVE-21109 > Project: Hive > Issue Type: Sub-task >Reporter: Ashutosh Bapat >Assignee: Ashutosh Bapat >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, > HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, > HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch > > Time Spent: 10h 10m > Remaining Estimate: 0h > > Transactional tables require a writeID associated with the stats update. This > writeId needs to be in sync with the writeId on the source and hence needs to > be replicated from the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
[ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806601#comment-16806601 ] Hive QA commented on HIVE-21509: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 57s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 59s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 33s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 49s{color} | {color:blue} llap-server in master has 81 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 23s{color} | {color:red} llap-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 23s{color} | {color:red} llap-server in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 23s{color} | {color:red} llap-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s{color} | {color:red} llap-server: The patch generated 4 new + 29 unchanged - 1 fixed = 33 total (was 30) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 22s{color} | {color:red} llap-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 17m 15s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16793/dev-support/hive-personality.sh | | git revision | master / 7bbd93f | | Default Java | 1.8.0_111 | | findbugs | v3.0.1 | | mvninstall | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-mvninstall-llap-server.txt | | compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt | | javac | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/diff-checkstyle-llap-server.txt | | findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-findbugs-llap-server.txt | | modules | C: storage-api llap-server U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > LLAP may cache corrupted column vectors and return wrong query result > - > > Key: HIVE-21509 > URL: https://issues.apache.org/jira/browse/HIVE-21509 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Adam Szita >Assignee: Adam Szita >Priority: Major > Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, > HIVE-21509.2.patch, HIVE-21509.3.patch > > > In some scenarios, LLAP might store column
[jira] [Updated] (HIVE-21511) beeline -f report no such file if file is not on local fs
[ https://issues.apache.org/jira/browse/HIVE-21511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-21511: Labels: patch todoc4.0 (was: patch)

> beeline -f report no such file if file is not on local fs
> -
>
> Key: HIVE-21511
> URL: https://issues.apache.org/jira/browse/HIVE-21511
> Project: Hive
> Issue Type: Bug
> Components: Beeline
> Reporter: Bruno Pusztahazi
> Assignee: Bruno Pusztahazi
> Priority: Blocker
> Labels: patch, todoc4.0
> Attachments: HIVE-21511.1.patch, HIVE-21511.2.patch, HIVE-21511.3.patch
>
> Original Estimate: 0.05h
> Remaining Estimate: 0.05h
>
> I test like this:
> HQL=hdfs://hacluster/tmp/ff.hql
> if hadoop fs -test -f ${HQL}
> then
> beeline -f ${HQL}
> fi
> The hadoop fs -test check on ${HQL} succeeds, but beeline reports "${HQL}: no such file or directory".

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
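The mismatch in the report is that `hadoop fs -test` resolves the `hdfs://` URI while beeline reads the `-f` argument with a local file reader. A client can detect the remote scheme itself before handing the path over; the helper below is a sketch (class and method names are invented for this example, not beeline's API):

```java
import java.net.URI;

// Sketch: distinguish a remote script URI (hdfs://, s3a://, ...) from a
// plain local path, so the caller can first copy the script to a local
// temp file instead of passing the URI straight to beeline -f.
public class ScriptPath {
    public static boolean isRemote(String path) {
        String scheme = URI.create(path).getScheme();
        // No scheme (e.g. /tmp/ff.hql) or an explicit file: scheme is local.
        return scheme != null && !scheme.equalsIgnoreCase("file");
    }

    public static void main(String[] args) {
        System.out.println(isRemote("hdfs://hacluster/tmp/ff.hql")); // true
        System.out.println(isRemote("/tmp/ff.hql"));                 // false
    }
}
```

Until the fix is available, a workaround along these lines is to `hadoop fs -cat` the remote script into a local temp file and pass that to `beeline -f`.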
[jira] [Commented] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806615#comment-16806615 ] Zoltan Haindrich commented on HIVE-21532: - Please submit the patch against master - as it also might be affected (at first glance the patch seems to apply). Could you please write a qtest for this issue? ({{git grep FallbackHiveAuthorizerFactory|grep q:}} to get some examples)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-21532: Status: In Progress (was: Patch Available) > RuntimeException due to AccessControlException during creating > hive-staging-dir > --- > > Key: HIVE-21532 > URL: https://issues.apache.org/jira/browse/HIVE-21532 > Project: Hive > Issue Type: Bug >Reporter: Oleksandr Polishchuk >Assignee: Oleksiy Sayankin >Priority: Minor > Attachments: HIVE-21532.1.patch, HIVE-21532.1.patch, > HIVE-21532.2.patch, HIVE-21532.3.patch > > > The bug was found with environment - Hive-2.3. > Steps lead to an exception: > 1) Create user without root permissions on your node. > 2) The {{hive-site.xml}} file has to contain the next properties: > {code:java} > > hive.security.authorization.enabled > true > > > hive.security.authorization.manager > > org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory > > {code} > 3) Open Hive CLI and do next query: > {code:java} > insert overwrite local directory '/tmp/test_dir' row format delimited fields > terminated by ',' select * from temp.test; > {code} > The previous query will fails with the next exception: > {code:java} > FAILED: RuntimeException Cannot create staging directory > 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': > User testuser(user id 3456) has been denied access to create > .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1 > {code} > The investigation shows that if delete the mentioned above properties from > {{hive-site.xml}} and pass {{`queryTmpdir`}} instead of {{`dest_path`}} in > the {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}} as was in the > Hive-2.1. everything will be fine. 
> The current method is used in {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc = ctx.getTempDirForPath(dest_path).toString();}}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
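The difference between the two call sites can be illustrated with a small self-contained sketch. Note this is a simplified model, not Hive's real implementation: `tempDirForPath` here is an illustrative stand-in for `Context#getTempDirForPath()`, and the two paths are made-up examples.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class StagingDirSketch {
    // Simplified stand-in for Context#getTempDirForPath(): Hive resolves the
    // .hive-staging directory under whatever path it is handed.
    static Path tempDirForPath(Path p) {
        return p.resolve(".hive-staging");
    }

    public static void main(String[] args) {
        // User-supplied destination of INSERT OVERWRITE LOCAL DIRECTORY.
        Path destPath = Paths.get("/tmp/test_dir");
        // Query scratch directory, owned by the querying user (illustrative path).
        Path queryTmpdir = Paths.get("/user/hive/scratch/tmp");

        // Passing dest_path puts the staging dir under the destination,
        // which a restricted user may not be allowed to create files in.
        System.out.println(tempDirForPath(destPath));     // /tmp/test_dir/.hive-staging

        // Passing queryTmpdir (the Hive 2.1 behaviour) keeps the staging
        // dir inside the scratch area, avoiding the AccessControlException.
        System.out.println(tempDirForPath(queryTmpdir));  // /user/hive/scratch/tmp/.hive-staging
    }
}
```

The sketch only shows why the choice of argument moves the staging directory into (or out of) a location the restricted user can write.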
[jira] [Updated] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir
[ https://issues.apache.org/jira/browse/HIVE-21532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-21532: Attachment: HIVE-21532.3.patch
> RuntimeException due to AccessControlException during creating hive-staging-dir
> -------------------------------------------------------------------------------
>
> Key: HIVE-21532
> URL: https://issues.apache.org/jira/browse/HIVE-21532
> Project: Hive
> Issue Type: Bug
> Reporter: Oleksandr Polishchuk
> Assignee: Oleksiy Sayankin
> Priority: Minor
> Attachments: HIVE-21532.1.patch, HIVE-21532.2.patch, HIVE-21532.3.patch
>
> The bug was found on Hive 2.3.
> Steps to reproduce the exception:
> 1) Create a user without root permissions on your node.
> 2) Add the following properties to {{hive-site.xml}}:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
> </property>
> {code}
> 3) Open the Hive CLI and run the following query:
> {code:sql}
> insert overwrite local directory '/tmp/test_dir' row format delimited fields terminated by ',' select * from temp.test;
> {code}
> The query fails with the following exception:
> {code}
> FAILED: RuntimeException Cannot create staging directory 'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1': User testuser(user id 3456) has been denied access to create .hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1
> {code}
> The investigation shows that if we delete the above-mentioned properties from {{hive-site.xml}} and pass {{queryTmpdir}} instead of {{dest_path}} to {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, as was done in Hive 2.1, everything works fine.
> The current method is used in {{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}}: {{String statsTmpLoc = ctx.getTempDirForPath(dest_path).toString();}}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21553) Upgrade derby version in standalone-metastore
[ https://issues.apache.org/jira/browse/HIVE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806650#comment-16806650 ] Laszlo Bodor commented on HIVE-21553:
Thanks for the review [~kgyrtkirk]! As this is just a derby version change, I haven't tried integration tests with other db vendors, and the basic unit tests should have run with the precommit ptest; I hope that's enough.
> Upgrade derby version in standalone-metastore
> ---------------------------------------------
>
> Key: HIVE-21553
> URL: https://issues.apache.org/jira/browse/HIVE-21553
> Project: Hive
> Issue Type: Bug
> Reporter: Laszlo Bodor
> Assignee: Laszlo Bodor
> Priority: Major
> Attachments: HIVE-21553.01.patch
>
> HIVE-17506 decoupled the standalone-metastore poms from hive, then HIVE-18586 upgraded derby to 10.14.1.0 in hive's top-level pom.xml. It could be useful to synchronize them.
> Current versions:
> hive: 10.14.1.0
> standalone-metastore: 10.10.2.0
> This idea came up while investigating some derby lock issues in HIVE-21550.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
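The synchronization the issue proposes would amount to a one-line version bump in the standalone-metastore module. A sketch of what that could look like, assuming the derby dependency is driven by a version property (the property name below is illustrative, not necessarily the one the Hive poms actually use):

```xml
<!-- standalone-metastore/pom.xml: align derby with the top-level pom (HIVE-18586) -->
<properties>
  <!-- was 10.10.2.0; hive's top-level pom.xml already uses 10.14.1.0 -->
  <derby.version>10.14.1.0</derby.version>
</properties>
```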
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221247 ] ASF GitHub Bot logged work on HIVE-21109:
Author: ASF GitHub Bot
Created on: 01/Apr/19 11:18
Start Date: 01/Apr/19 11:18
Worklog Time Spent: 10m
Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270820010
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
{code:java}
@@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
     return dumpTuple.lastReplicationId;
   }

+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable {
+    // fail setting ckpt directory property for the second table so that we test the case when
+    // bootstrap load fails after some but not all tables are loaded.
+    BehaviourInjection<CallerArguments, Boolean> callerVerifier
+        = new BehaviourInjection<CallerArguments, Boolean>() {
+      int cntTables = 0;
+      @Nullable
+      @Override
+      public Boolean apply(@Nullable CallerArguments args) {
+        cntTables++;
+        if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) {
+          injectionPathCalled = true;
+          LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName);
+          return false;
+        }
+        return true;
+      }
+    };
+
+    InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+    try {
+      replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+      callerVerifier.assertInjectionsPerformed(true, false);
+    } finally {
+      InjectableBehaviourObjectStore.resetAlterTableModifier();
+    }
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int failAfterNumEvents) throws Throwable {
+    // fail add notification when updating table stats after given number of such events. Thus we
+    // test successful application as well as failed application of this event.
+    BehaviourInjection<NotificationEvent, Boolean> callerVerifier
+        = new BehaviourInjection<NotificationEvent, Boolean>() {
+      int cntEvents = 0;
+      @Override
+      public Boolean apply(NotificationEvent entry) {
+        cntEvents++;
{code}
Review comment: OK. Pls update the test to fail for 2nd update stats event.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 221247)
Time Spent: 11h 40m (was: 11.5h)
> Stats replication for ACID tables.
> ----------------------------------
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
> Time Spent: 11h 40m
> Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This writeId needs to be in sync with the writeId on the source and hence needs to be replicated from the source.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
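The injection pattern used in the test above, a counting callback that succeeds for the first N calls and then starts failing, can be reduced to a minimal self-contained sketch. The class and method names here are illustrative; Hive's actual `BehaviourInjection`/`InjectableBehaviourObjectStore` API is richer than this.

```java
import java.util.function.Function;

public class FailAfterN {
    // Returns a callback that succeeds for the first n calls and fails afterwards,
    // mirroring the cntTables/cntEvents counters in the test snippet above.
    static Function<String, Boolean> failAfter(int n) {
        return new Function<String, Boolean>() {
            int count = 0;
            @Override
            public Boolean apply(String arg) {
                count++;
                // Succeed until we have seen n calls, then start failing.
                return count <= n;
            }
        };
    }

    public static void main(String[] args) {
        Function<String, Boolean> verifier = failAfter(2);
        System.out.println(verifier.apply("table1")); // true  (1st call succeeds)
        System.out.println(verifier.apply("table2")); // true  (2nd call succeeds)
        System.out.println(verifier.apply("table3")); // false (3rd call fails)
    }
}
```

Installing such a callback before a load and asserting it fired afterwards is what lets the test exercise the "some but not all tables loaded" recovery path.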
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221246 ] ASF GitHub Bot logged work on HIVE-21109:
Author: ASF GitHub Bot
Created on: 01/Apr/19 11:18
Start Date: 01/Apr/19 11:18
Worklog Time Spent: 10m
Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270820010
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
{code:java}
@@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
     return dumpTuple.lastReplicationId;
   }

+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable {
+    // fail setting ckpt directory property for the second table so that we test the case when
+    // bootstrap load fails after some but not all tables are loaded.
+    BehaviourInjection<CallerArguments, Boolean> callerVerifier
+        = new BehaviourInjection<CallerArguments, Boolean>() {
+      int cntTables = 0;
+      @Nullable
+      @Override
+      public Boolean apply(@Nullable CallerArguments args) {
+        cntTables++;
+        if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) {
+          injectionPathCalled = true;
+          LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName);
+          return false;
+        }
+        return true;
+      }
+    };
+
+    InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+    try {
+      replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+      callerVerifier.assertInjectionsPerformed(true, false);
+    } finally {
+      InjectableBehaviourObjectStore.resetAlterTableModifier();
+    }
+  }
+
+  private void failIncrementalLoad(WarehouseInstance.Tuple dumpTuple, int failAfterNumEvents) throws Throwable {
+    // fail add notification when updating table stats after given number of such events. Thus we
+    // test successful application as well as failed application of this event.
+    BehaviourInjection<NotificationEvent, Boolean> callerVerifier
+        = new BehaviourInjection<NotificationEvent, Boolean>() {
+      int cntEvents = 0;
+      @Override
+      public Boolean apply(NotificationEvent entry) {
+        cntEvents++;
{code}
Review comment: OK
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 221246)
Time Spent: 11.5h (was: 11h 20m)
> Stats replication for ACID tables.
> ----------------------------------
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
> Time Spent: 11.5h
> Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This writeId needs to be in sync with the writeId on the source and hence needs to be replicated from the source.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221244 ] ASF GitHub Bot logged work on HIVE-21109:
Author: ASF GitHub Bot
Created on: 01/Apr/19 11:17
Start Date: 01/Apr/19 11:17
Worklog Time Spent: 10m
Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270819437
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
{code:java}
@@ -2950,21 +2957,32 @@ public Partition createPartition(Table tbl, Map<String, String> partSpec) throws
     int size = addPartitionDesc.getPartitionCount();
     List<Partition> in = new ArrayList<>(size);
-    AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, tbl, true);
     long writeId;
     String validWriteIdList;
-    if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-      writeId = tableSnapshot.getWriteId();
-      validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+    // In case of replication, get the writeId from the source and use valid write Id list
+    // for replication.
+    if (addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+        addPartitionDesc.getPartition(0).getWriteId() > 0) {
{code}
Review comment: I think this assumption that the table is non-transactional (based on writeId=0), while ignoring the failure case, is not right. We can explicitly check whether it is a transactional table and then do the necessary checks. If writeId comes as 0 for a transactional table, then it is an error flow.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 221244)
Time Spent: 11h 10m (was: 11h)
> Stats replication for ACID tables.
> ----------------------------------
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
> Time Spent: 11h 10m
> Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This writeId needs to be in sync with the writeId on the source and hence needs to be replicated from the source.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
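The explicit check the reviewer asks for can be sketched as a small decision function. This is a simplified model with made-up names: `inReplicationScope`, `sourceWriteId`, and `isTransactional` stand in for `addPartitionDesc.getReplicationSpec().isInReplicationScope()`, the writeId shipped from the source, and the table's transactional flag; the real logic lives in Hive.java.

```java
public class WriteIdSelection {
    // Simplified decision: where should the writeId for a new partition come from?
    static String chooseWriteIdSource(boolean inReplicationScope, long sourceWriteId,
                                      boolean isTransactional) {
        if (inReplicationScope && sourceWriteId > 0) {
            return "replicated"; // use the writeId shipped from the source
        }
        if (isTransactional) {
            if (inReplicationScope) {
                // Reviewer's point: writeId=0 on a transactional table under
                // replication is an error flow, not a fall-through case.
                return "error";
            }
            return "snapshot";   // allocate via the table snapshot (AcidUtils)
        }
        return "none";           // non-transactional table: no writeId needed
    }

    public static void main(String[] args) {
        System.out.println(chooseWriteIdSource(true, 42, true));   // replicated
        System.out.println(chooseWriteIdSource(true, 0, true));    // error
        System.out.println(chooseWriteIdSource(false, 0, true));   // snapshot
        System.out.println(chooseWriteIdSource(false, 0, false));  // none
    }
}
```

Making the transactional check explicit, as sketched, turns the silent "writeId=0 means non-transactional" assumption into a detectable error case.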
[jira] [Work logged] (HIVE-21109) Stats replication for ACID tables.
[ https://issues.apache.org/jira/browse/HIVE-21109?focusedWorklogId=221245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-221245 ] ASF GitHub Bot logged work on HIVE-21109:
Author: ASF GitHub Bot
Created on: 01/Apr/19 11:17
Start Date: 01/Apr/19 11:17
Worklog Time Spent: 10m
Work Description: sankarh commented on pull request #579: HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r270819769
File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
{code:java}
@@ -301,12 +338,106 @@ private String dumpLoadVerify(List<String> tableNames, String lastReplicationId,
     return dumpTuple.lastReplicationId;
   }

+  /**
+   * Run a bootstrap that will fail.
+   * @param tuple the location of bootstrap dump
+   */
+  private void failBootstrapLoad(WarehouseInstance.Tuple tuple, int failAfterNumTables) throws Throwable {
+    // fail setting ckpt directory property for the second table so that we test the case when
+    // bootstrap load fails after some but not all tables are loaded.
+    BehaviourInjection<CallerArguments, Boolean> callerVerifier
+        = new BehaviourInjection<CallerArguments, Boolean>() {
+      int cntTables = 0;
+      @Nullable
+      @Override
+      public Boolean apply(@Nullable CallerArguments args) {
+        cntTables++;
+        if (args.dbName.equalsIgnoreCase(replicatedDbName) && cntTables > failAfterNumTables) {
+          injectionPathCalled = true;
+          LOG.warn("Verifier - DB : " + args.dbName + " TABLE : " + args.tblName);
+          return false;
+        }
+        return true;
+      }
+    };
+
+    InjectableBehaviourObjectStore.setAlterTableModifier(callerVerifier);
+    try {
+      replica.loadFailure(replicatedDbName, tuple.dumpLocation);
+      callerVerifier.assertInjectionsPerformed(true, false);
+    } finally {
+      InjectableBehaviourObjectStore.resetAlterTableModifier();
+    }
{code}
Review comment: Ok
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 221245)
Time Spent: 11h 20m (was: 11h 10m)
> Stats replication for ACID tables.
> ----------------------------------
>
> Key: HIVE-21109
> URL: https://issues.apache.org/jira/browse/HIVE-21109
> Project: Hive
> Issue Type: Sub-task
> Reporter: Ashutosh Bapat
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21109.01.patch, HIVE-21109.02.patch, HIVE-21109.03.patch, HIVE-21109.04.patch, HIVE-21109.05.patch, HIVE-21109.06.patch, HIVE-21109.07.patch, HIVE-21109.08.patch
>
> Time Spent: 11h 20m
> Remaining Estimate: 0h
>
> Transactional tables require a writeID associated with the stats update. This writeId needs to be in sync with the writeId on the source and hence needs to be replicated from the source.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)