[jira] [Work logged] (HIVE-24241) Enable SharedWorkOptimizer to merge downstream operators after an optimization step
[ https://issues.apache.org/jira/browse/HIVE-24241?focusedWorklogId=504632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504632 ] ASF GitHub Bot logged work on HIVE-24241: - Author: ASF GitHub Bot Created on: 26/Oct/20 05:54 Start Date: 26/Oct/20 05:54 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1562: URL: https://github.com/apache/hive/pull/1562#discussion_r511717213 ## File path: ql/src/test/results/clientpositive/perf/tez/constraints/query32.q.out ## @@ -160,7 +160,7 @@ Stage-0 Select Operator [SEL_115] (rows=286549727 width=119) Output:["_col0","_col1","_col2"] Filter Operator [FIL_113] (rows=286549727 width=119) -predicate:(cs_sold_date_sk is not null and cs_item_sk BETWEEN DynamicValue(RS_28_item_i_item_sk_min) AND DynamicValue(RS_28_item_i_item_sk_max) and in_bloom_filter(cs_item_sk, DynamicValue(RS_28_item_i_item_sk_bloom_filter))) Review comment: SJ is gone. Is this expected? ## File path: ql/src/test/results/clientpositive/perf/tez/constraints/query92.q.out ## @@ -164,7 +164,7 @@ Stage-0 Select Operator [SEL_115] (rows=143966864 width=119) Output:["_col0","_col1","_col2"] Filter Operator [FIL_113] (rows=143966864 width=119) -predicate:(ws_sold_date_sk is not null and ws_item_sk BETWEEN DynamicValue(RS_28_item_i_item_sk_min) AND DynamicValue(RS_28_item_i_item_sk_max) and in_bloom_filter(ws_item_sk, DynamicValue(RS_28_item_i_item_sk_bloom_filter))) Review comment: SJ got removed. Is this expected? ## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java ## @@ -1136,4 +1173,30 @@ public static boolean isOr(ExprNodeDesc expr) { return false; } + public static boolean isAnd(ExprNodeDesc expr) { +if (expr instanceof ExprNodeGenericFuncDesc) { Review comment: I think you could use `ExprNodeDescExprFactory.isANDFuncCallExpr` or `FunctionRegistry.isOpAnd(expr)`? 
## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -2595,6 +2595,8 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal HIVE_SHARED_WORK_DPPUNION_OPTIMIZATION("hive.optimize.shared.work.dppunion", true, "Enables dppops unioning. This optimization will enable to merge multiple tablescans with different " + "dynamic filters into a single one (with a more complex filter)"), + HIVE_SHARED_WORK_DOWNSTREAM_MERGE("hive.optimize.shared.work.downstream.merge", true, +"Analyzes and merges equiv downstream operators after a successfull shared work optimization step."), Review comment: nit. typo 'successfull' ## File path: ql/src/test/results/clientpositive/perf/tez/constraints/query1b.q.out ## @@ -176,7 +176,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: store_returns - filterExpr: (((sr_customer_sk is not null and sr_store_sk is not null and sr_returned_date_sk is not null) or (sr_store_sk is not null and sr_returned_date_sk is not null)) and sr_store_sk BETWEEN DynamicValue(RS_40_store_s_store_sk_min) AND DynamicValue(RS_40_store_s_store_sk_max) and in_bloom_filter(sr_store_sk, DynamicValue(RS_40_store_s_store_sk_bloom_filter))) (type: boolean) + filterExpr: (sr_store_sk BETWEEN DynamicValue(RS_40_store_s_store_sk_min) AND DynamicValue(RS_40_store_s_store_sk_max) and in_bloom_filter(sr_store_sk, DynamicValue(RS_40_store_s_store_sk_bloom_filter)) and ((sr_customer_sk is not null and sr_store_sk is not null and sr_returned_date_sk is not null) or (sr_store_sk is not null and sr_returned_date_sk is not null))) (type: boolean) Review comment: Same as above. 
Filter exprs order ## File path: ql/src/test/results/clientpositive/perf/tez/constraints/query54.q.out ## @@ -202,156 +202,154 @@ Stage-0 predicate:(_col1 <= _col3) Merge Join Operator [MERGEJOIN_294] (rows=15218525 width=12) Conds:(Inner),Output:["_col0","_col1","_col3"] - <-Reducer 15 [CUSTOM_SIMPLE_EDGE] + <-Reducer 20 [CUSTOM_SIMPLE_EDGE] PARTITION_ONLY_SHUFFLE [RS_99] Filter Operator [FIL_98] (rows=608741 width=12)
[jira] [Comment Edited] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception
[ https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220493#comment-17220493 ] Chao Gao edited comment on HIVE-24066 at 10/26/20, 5:52 AM: I could reproduce this issue using the given Parquet file. The following query returns NULL. {code:java} hive> select context.os from sample_parquet_table; OK NULL NULL NULL NULL NULL{code} The following query throws an exception, although it is expected to return NULL as well. {code:java} hive> select context.os.name from sample_parquet_table; OK Failed with exception java.io.IOException:java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] {code} was (Author: chaoga): I could reproduce this issue using the given PARQUET file. When running the following query, it shows NULL. hive> select context.os from sample_parquet_table; OK NULL NULL NULL NULL NULL When running the following query, it throws the exception, which is expected showing NULL as well. 
hive> select context.os.name from sample_parquet_table; OK Failed with exception java.io.IOException:java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] > Hive query on parquet data should identify if column is not present in file > schema and show NULL value instead of Exception > --- > > Key: HIVE-24066 > URL: https://issues.apache.org/jira/browse/HIVE-24066 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2, 2.3.5 >Reporter: Jainik Vora >Priority: Major > Attachments: day_01.snappy.parquet > > > I created a hive table containing columns with struct data type > > {code:java} > CREATE EXTERNAL TABLE test_dwh.sample_parquet_table ( > `context` struct< > `app`: struct< > `build`: string, > `name`: string, > `namespace`: string, > `version`: string > >, > `device`: struct< > `adtrackingenabled`: boolean, > `advertisingid`: string, > `id`: string, > `manufacturer`: string, > `model`: string, > `type`: string > >, > `locale`: string, > `library`: struct< > `name`: string, > `version`: string > >, > `os`: struct< > `name`: string, > `version`: string > >, > `screen`: struct< > `height`: bigint, > `width`: bigint > >, > `network`: struct< > `carrier`: string, > `cellular`: boolean, > `wifi`: boolean > >, > `timezone`: string, > `userAgent`: string > > > ) PARTITIONED BY (day string) > STORED as PARQUET > LOCATION 's3://xyz/events'{code} > > All columns are nullable hence the parquet files read by the table don't > always contain all columns. If any file in a partition doesn't have > "context.os" struct and if "context.os.name" is queried, Hive throws an > exception as below. Same for "context.screen" as well. 
> > {code:java} > 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 > main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with > exception java.io.IOException:java.lang.RuntimeException: Primitive type > osshould not doesn't match typeos[name] > 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 > main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with > exception java.io.IOException:java.lang.RuntimeException: Primitive type > osshould not doesn't match typeos[name]java.io.IOException: > java.lang.RuntimeException: Primitive type osshould not doesn't match > typeos[name] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
[jira] [Commented] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception
[ https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220493#comment-17220493 ] Chao Gao commented on HIVE-24066: - I could reproduce this issue using the given PARQUET file. When running the following query, it shows NULL. hive> select context.os from sample_parquet_table; OK NULL NULL NULL NULL NULL When running the following query, it throws the exception, which is expected showing NULL as well. hive> select context.os.name from sample_parquet_table; OK Failed with exception java.io.IOException:java.lang.RuntimeException: Primitive type osshould not doesn't match typeos[name] > Hive query on parquet data should identify if column is not present in file > schema and show NULL value instead of Exception > --- > > Key: HIVE-24066 > URL: https://issues.apache.org/jira/browse/HIVE-24066 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2, 2.3.5 >Reporter: Jainik Vora >Priority: Major > Attachments: day_01.snappy.parquet > > > I created a hive table containing columns with struct data type > > {code:java} > CREATE EXTERNAL TABLE test_dwh.sample_parquet_table ( > `context` struct< > `app`: struct< > `build`: string, > `name`: string, > `namespace`: string, > `version`: string > >, > `device`: struct< > `adtrackingenabled`: boolean, > `advertisingid`: string, > `id`: string, > `manufacturer`: string, > `model`: string, > `type`: string > >, > `locale`: string, > `library`: struct< > `name`: string, > `version`: string > >, > `os`: struct< > `name`: string, > `version`: string > >, > `screen`: struct< > `height`: bigint, > `width`: bigint > >, > `network`: struct< > `carrier`: string, > `cellular`: boolean, > `wifi`: boolean > >, > `timezone`: string, > `userAgent`: string > > > ) PARTITIONED BY (day string) > STORED as PARQUET > LOCATION 's3://xyz/events'{code} > > All columns are nullable hence the parquet files read by the table don't > always contain all columns. 
If any file in a partition doesn't have > "context.os" struct and if "context.os.name" is queried, Hive throws an > exception as below. Same for "context.screen" as well. > > {code:java} > 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 > main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with > exception java.io.IOException:java.lang.RuntimeException: Primitive type > osshould not doesn't match typeos[name] > 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 > main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with > exception java.io.IOException:java.lang.RuntimeException: Primitive type > osshould not doesn't match typeos[name]java.io.IOException: > java.lang.RuntimeException: Primitive type osshould not doesn't match > typeos[name] > at > org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521) > at > org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at 
org.apache.hadoop.util.RunJar.main(RunJar.java:153) > Caused by: java.lang.RuntimeException: Primitive type osshould not doesn't > match typeos[name] > at > org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.projectLeafTypes(DataWritableReadSupport.java:330) > > at >
[jira] [Work logged] (HIVE-23387) Flip the Warehouse.getDefaultTablePath() to return path from ext warehouse
[ https://issues.apache.org/jira/browse/HIVE-23387?focusedWorklogId=504609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504609 ] ASF GitHub Bot logged work on HIVE-23387: - Author: ASF GitHub Bot Created on: 26/Oct/20 01:01 Start Date: 26/Oct/20 01:01 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #1116: URL: https://github.com/apache/hive/pull/1116#issuecomment-716247660 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504609) Time Spent: 40m (was: 0.5h) > Flip the Warehouse.getDefaultTablePath() to return path from ext warehouse > -- > > Key: HIVE-23387 > URL: https://issues.apache.org/jira/browse/HIVE-23387 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23387.patch, HIVE-23387.patch, HIVE-23387.patch > > Time Spent: 40m > Remaining Estimate: 0h > > For backward compatibility, initial fix returned path that was set on db. It > could have been either from managed warehouse or external depending on what > was set. There were tests relying on certain paths to be returned. This fix > is to address the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24051) Hive lineage information exposed in ExecuteWithHookContext
[ https://issues.apache.org/jira/browse/HIVE-24051?focusedWorklogId=504610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504610 ] ASF GitHub Bot logged work on HIVE-24051: - Author: ASF GitHub Bot Created on: 26/Oct/20 01:01 Start Date: 26/Oct/20 01:01 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1413: URL: https://github.com/apache/hive/pull/1413 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504610) Time Spent: 0.5h (was: 20m) > Hive lineage information exposed in ExecuteWithHookContext > -- > > Key: HIVE-24051 > URL: https://issues.apache.org/jira/browse/HIVE-24051 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24051.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > The lineage information is not populated unless certain hooks are enabled. > However, this is a bit fragile, and no way for another hook that we write to > get this information. This proposes a flag to enable this instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0
[ https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220351#comment-17220351 ] Chao Sun commented on HIVE-21737: - [~fokko] I'm not proposing to restore the API. Instead, I'm proposing to replace the API {{JsonProperties#getJsonProp}} with {{JsonProperties#getObjectProp}} (which has been available since Avro 1.8) and then cast the returned object to the desired type in Hive. There are only 7 usages of {{getJsonProp}} in Hive, and they are only used to retrieve scale/precision/maxLength for Decimal/Char types. > Upgrade Avro to version 1.10.0 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Avro >= 1.9.x brings a lot of fixes including a leaner version of Avro without > Jackson in the public API and Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
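The replacement proposed in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class {{ObjectPropSketch}} and the helper {{asInt}} are hypothetical names that do not exist in Hive or Avro, and a plain map stands in for the schema properties so that the sketch runs without Avro on the classpath. With Avro available, the values would presumably come from {{getObjectProp}} instead.

```java
import java.util.HashMap;
import java.util.Map;

public class ObjectPropSketch {

    // Hypothetical helper (not part of Hive or Avro): narrows a property value
    // that was returned as a plain Object, such as an Integer or a Long, to an
    // int, so that no Jackson type appears in the calling code.
    public static int asInt(Object prop, int defaultValue) {
        if (prop == null) {
            return defaultValue; // property is absent
        }
        if (prop instanceof Number) {
            return ((Number) prop).intValue();
        }
        throw new IllegalArgumentException("Expected a numeric property but got: " + prop);
    }

    public static void main(String[] args) {
        // Stand-in for the properties of a decimal schema. This map is an
        // assumption made to keep the sketch self-contained.
        Map<String, Object> props = new HashMap<>();
        props.put("precision", 10);
        props.put("scale", 2L);

        int precision = asInt(props.get("precision"), 0);
        int scale = asInt(props.get("scale"), 0);
        int maxLength = asInt(props.get("maxLength"), -1);

        System.out.println(precision + " " + scale + " " + maxLength);
    }
}
```

Accepting any {{Number}} rather than casting straight to {{Integer}} is a design choice of this sketch, made so that a value parsed as a long does not cause a {{ClassCastException}}.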
[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0
[ https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220294#comment-17220294 ] Fokko Driesprong commented on HIVE-21737: - Hi Chao, unfortunately, that's not possible. The getJsonProps would return a JsonNode: [https://github.com/apache/avro/pull/135/files#diff-e86ec7c2ab127130c9faf2786059caad4b257aecbee571c3f9ad0b136935c43cR151] This JsonNode is from Jackson 1.x: org.codehaus.jackson.JsonNode. That library has been replaced by Jackson 2.x, so we can't restore the function. The API shouldn't expose third-party classes in the first place. > Upgrade Avro to version 1.10.0 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Avro >= 1.9.x brings a lot of fixes including a leaner version of Avro without > Jackson in the public API and Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=504550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504550 ] ASF GitHub Bot logged work on HIVE-24259: - Author: ASF GitHub Bot Created on: 25/Oct/20 09:20 Start Date: 25/Oct/20 09:20 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma opened a new pull request #1610: URL: https://github.com/apache/hive/pull/1610 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504550) Remaining Estimate: 0h Time Spent: 10m > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
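The idea in the HIVE-24259 description can be sketched roughly as below. This is a minimal illustration and not the code from pull request #1610: the names {{AllConstraintsSketch}}, {{TableConstraints}} and {{lookupCount}} are hypothetical, and plain strings stand in for the metastore constraint objects. It assumes the six kinds are primary key, foreign key, unique, not null, default and check constraints.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AllConstraintsSketch {

    // Hypothetical holder that groups the six constraint kinds of one table.
    // Strings are stand-ins for the real constraint objects.
    public static final class TableConstraints {
        public final List<String> primaryKeys = new ArrayList<>();
        public final List<String> foreignKeys = new ArrayList<>();
        public final List<String> uniqueConstraints = new ArrayList<>();
        public final List<String> notNullConstraints = new ArrayList<>();
        public final List<String> defaultConstraints = new ArrayList<>();
        public final List<String> checkConstraints = new ArrayList<>();
    }

    private final Map<String, TableConstraints> cache = new HashMap<>();
    private int lookups = 0;

    public void put(String table, TableConstraints constraints) {
        cache.put(table, constraints);
    }

    // A single cache lookup returns every constraint kind together, instead of
    // one lookup per kind.
    public TableConstraints getAllTableConstraints(String table) {
        lookups++;
        return cache.get(table);
    }

    // Number of cache lookups performed so far; only here to make the saving visible.
    public int lookupCount() {
        return lookups;
    }

    public static void main(String[] args) {
        AllConstraintsSketch store = new AllConstraintsSketch();
        TableConstraints constraints = new TableConstraints();
        constraints.primaryKeys.add("pk1");
        constraints.notNullConstraints.add("nn1");
        store.put("tbl", constraints);

        TableConstraints all = store.getAllTableConstraints("tbl");
        System.out.println(all.primaryKeys + " " + all.notNullConstraints
            + " lookups=" + store.lookupCount());
    }
}
```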
[jira] [Updated] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24259: -- Labels: pull-request-available (was: ) > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24258) [CachedStore] Data mismatch between CachedStore and ObjectStore for constraints
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-24258: Summary: [CachedStore] Data mismatch between CachedStore and ObjectStore for constraints (was: [CachedStore] Data mismatch between CachedStore and ObjectStore) > [CachedStore] Data mismatch between CachedStore and ObjectStore for > constraints > --- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
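One way to avoid the kind of mismatch described in HIVE-24258 is to normalize identifiers identically on every cache write and read. The sketch below illustrates that general technique only; it is not the fix merged in pull request #1587, and the class {{CaseInsensitiveCacheSketch}} with its methods is a hypothetical stand-in for the real cache.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CaseInsensitiveCacheSketch {

    // Maps a normalized table name to the normalized name of its primary key.
    private final Map<String, String> primaryKeyNameByTable = new HashMap<>();

    // Lower-cases an identifier with a fixed locale so that the result does not
    // depend on the default locale of the JVM.
    public static String normalize(String identifier) {
        return identifier == null ? null : identifier.toLowerCase(Locale.ROOT);
    }

    // Normalizes both the key and the stored name before caching.
    public void putPrimaryKeyName(String tableName, String pkName) {
        primaryKeyNameByTable.put(normalize(tableName), normalize(pkName));
    }

    // Normalizes the lookup key the same way, so "TBL" and "tbl" hit the same entry.
    public String primaryKeyName(String tableName) {
        return primaryKeyNameByTable.get(normalize(tableName));
    }

    public static void main(String[] args) {
        CaseInsensitiveCacheSketch cache = new CaseInsensitiveCacheSketch();
        cache.putPrimaryKeyName("TBL", "Pk1");
        System.out.println(cache.primaryKeyName("tbl"));
    }
}
```

With the name stored in normalized form, a lookup returns {{pk1}} rather than {{Pk1}}, which matches the expected value in the example quoted in the issue.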
[jira] [Resolved] (HIVE-24258) [CachedStore] Data mismatch between CachedStore and ObjectStore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-24258. - Fix Version/s: 4.0.0 Target Version/s: 4.0.0 Resolution: Fixed Merged to master. Thanks [~ashish-kumar-sharma] for the patch! > [CachedStore] Data mismatch between CachedStore and ObjectStore > --- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24258) [CachedStore] Data mismatch between CachedStore and ObjectStore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-24258: Summary: [CachedStore] Data mismatch between CachedStore and ObjectStore (was: [CachedStore] Data miss match between cachedstore and rawstore) > [CachedStore] Data mismatch between CachedStore and ObjectStore > --- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24258) [CachedStore] Data mismatch between CachedStore and ObjectStore
[ https://issues.apache.org/jira/browse/HIVE-24258?focusedWorklogId=504549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504549 ] ASF GitHub Bot logged work on HIVE-24258: - Author: ASF GitHub Bot Created on: 25/Oct/20 09:01 Start Date: 25/Oct/20 09:01 Worklog Time Spent: 10m Work Description: sankarh merged pull request #1587: URL: https://github.com/apache/hive/pull/1587 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 504549) Time Spent: 1h 20m (was: 1h 10m) > [CachedStore] Data mismatch between CachedStore and ObjectStore > --- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24259: - Parent: HIVE-21637 Issue Type: Sub-task (was: Task) > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24259: - Priority: Minor (was: Major) > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24259 started by Ashish Sharma. > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
[ https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24259: - Issue Type: Task (was: Improvement) > [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call > -- > > Key: HIVE-24259 > URL: https://issues.apache.org/jira/browse/HIVE-24259 > Project: Hive > Issue Type: Task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > > Description - > Currently, in order to get all constraints from the cachedstore, 6 different > calls are made to the store. Instead, combine those 6 calls into 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24258: - Parent: HIVE-21637 Issue Type: Sub-task (was: Task) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Sub-task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24258: - Issue Type: Task (was: Improvement) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Task >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24258) [CachedStore] Data miss match between cachedstore and rawstore
[ https://issues.apache.org/jira/browse/HIVE-24258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma reassigned HIVE-24258: Assignee: Ashish Sharma (was: Sankar Hariappan) > [CachedStore] Data miss match between cachedstore and rawstore > -- > > Key: HIVE-24258 > URL: https://issues.apache.org/jira/browse/HIVE-24258 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Description > Objects like table name, db name, column name, etc. are case insensitive as per > the Hive contract, but the standalone metastore cachedstore is case sensitive. As > a result, there is a mismatch between the rawstore output and the cachedstore output. > Example - > expected:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> > but was:<[SQLPrimaryKey(table_db:test_table_ops, table_name:tbl, > column_name:col1, key_seq:1, pk_name:Pk1, enable_cstr:false, > validate_cstr:false, rely_cstr:false, catName:hive)]> -- This message was sent by Atlassian Jira (v8.3.4#803005)