[
https://issues.apache.org/jira/browse/HIVE-21172?focusedWorklogId=695610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695610
]
ASF GitHub Bot logged work on HIVE-21172:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Dec/21 09:56
Start Date: 14/Dec/21 09:56
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on a change in pull request #2857:
URL: https://github.com/apache/hive/pull/2857#discussion_r768476035
##########
File path: ql/src/test/results/clientpositive/llap/masking_acid_no_masking.q.out
##########
@@ -54,8 +53,9 @@ POSTHOOK: Input: default@masking_acid_no_masking
POSTHOOK: Input: default@nonacid_n0
POSTHOOK: Output: default@masking_acid_no_masking
POSTHOOK: Output: default@masking_acid_no_masking
-POSTHOOK: Output: default@masking_acid_no_masking
POSTHOOK: Output: default@merge_tmp_table
POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]
+POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]
Review comment:
is this a duplicate?
##########
File path:
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
##########
@@ -1711,13 +1713,13 @@ public void
testMajorCompactionAfterTwoMergeStatements() throws Exception {
// Verify contents of bucket files.
List<String> expectedRsBucket0 =
Arrays.asList("{\"writeid\":1,\"bucketid\":536870912,\"rowid\":3}\t4\tvalue_4",
- "{\"writeid\":2,\"bucketid\":536870912,\"rowid\":0}\t6\tvalue_6",
- "{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",
- "{\"writeid\":3,\"bucketid\":536870912,\"rowid\":0}\t8\tvalue_8",
- "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":0}\t5\tnewestvalue_5",
- "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":1}\t7\tnewestvalue_7",
- "{\"writeid\":3,\"bucketid\":536870913,\"rowid\":2}\t1\tnewestvalue_1",
-
"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":3}\t2\tnewestvalue_2");
+ "{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",
Review comment:
seeing a change like this I wonder how much value this test adds...
##########
File path:
ql/src/java/org/apache/hadoop/hive/ql/parse/MergeSemanticAnalyzer.java
##########
@@ -441,6 +442,11 @@ private String handleUpdate(ASTNode
whenMatchedUpdateClause, StringBuilder rewri
default:
//do nothing
}
+
+ if ("`default`".equalsIgnoreCase(rhsExp.trim())) {
+ rhsExp = MapUtils.getString(colNameToDefaultConstraint, name,
"null");
Review comment:
iiuc this makes changes the column value to the default if we see the
"default" in the query
I see that this also work for a plain insert:
```
create table q2(a string default 'asd');
insert into q2 values(`default`)
select * from q2;
```
however I think the standard suggest to use the `DEFAULT` keyword and not as
a string literal; in Hive we seem to also "support" the "default" as a string
literal to be interpreted as default.
I think because of various reasons - the default keyword becomes the string
default at some point and it works like that right now.
could you open a follow-up to fix the `default` literal's handling?
##########
File path: ql/src/test/results/clientpositive/llap/explain_locks.q.out
##########
@@ -233,20 +221,14 @@ POSTHOOK: Input: default@target@p=2/q=2
POSTHOOK: Output: default@merge_tmp_table
POSTHOOK: Output: default@target
POSTHOOK: Output: default@target@p=1/q=2
-POSTHOOK: Output: default@target@p=1/q=2
-POSTHOOK: Output: default@target@p=1/q=3
POSTHOOK: Output: default@target@p=1/q=3
POSTHOOK: Output: default@target@p=2/q=2
-POSTHOOK: Output: default@target@p=2/q=2
LOCK INFORMATION:
default.source -> SHARED_READ
default.target.p=1/q=2 -> SHARED_READ
default.target.p=1/q=3 -> SHARED_READ
default.target.p=2/q=2 -> SHARED_READ
default.target.p=2/q=2 -> SHARED_WRITE
-default.target.p=2/q=2 -> SHARED_WRITE
Review comment:
I wonder why were these lock duplicated?
##########
File path:
ql/src/test/results/clientpositive/llap/acid_direct_update_delete_with_merge.q.out
##########
@@ -112,11 +110,13 @@ POSTHOOK: Input: default@transactions@tran_date=20170413
POSTHOOK: Output: default@merge_tmp_table
POSTHOOK: Output: default@transactions
POSTHOOK: Output: default@transactions@tran_date=20170410
-POSTHOOK: Output: default@transactions@tran_date=20170410
POSTHOOK: Output: default@transactions@tran_date=20170413
POSTHOOK: Output: default@transactions@tran_date=20170413
POSTHOOK: Output: default@transactions@tran_date=20170415
POSTHOOK: Lineage: merge_tmp_table.val EXPRESSION
[(transactions)t.FieldSchema(name:ROW__ID,
type:struct<writeId:bigint,bucketId:int,rowId:bigint>, comment:),
(transactions)t.FieldSchema(name:tran_date, type:string, comment:null), ]
+POSTHOOK: Lineage: transactions PARTITION(tran_date=20170413).id SIMPLE
[(merge_source)s.FieldSchema(name:id, type:int, comment:null), ]
Review comment:
is this a duplicate lineage entry?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 695610)
Time Spent: 20m (was: 10m)
> DEFAULT keyword handling in MERGE UPDATE clause issues
> ------------------------------------------------------
>
> Key: HIVE-21172
> URL: https://issues.apache.org/jira/browse/HIVE-21172
> Project: Hive
> Issue Type: Sub-task
> Components: SQL, Transactions
> Affects Versions: 4.0.0
> Reporter: Eugene Koifman
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> once HIVE-21159 lands, enable {{HiveConf.MERGE_SPLIT_UPDATE}} and run these
> tests.
> TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
> mvn test -Dtest=TestMiniLlapLocalCliDriver
> -Dqfile=insert_into_default_keyword.q
> Merge is rewritten as a multi-insert. When Update clause has DEFAULT, it's
> not properly replaced with a value in the muli-insert - it's treated as a
> literal
> {noformat}
> INSERT INTO `default`.`acidTable` -- update clause(insert part)
> SELECT `t`.`key`, `DEFAULT`, `t`.`value`
> WHERE `t`.`key` = `s`.`key` AND `s`.`key` > 3 AND NOT(`s`.`key` < 3)
> {noformat}
> See {{LOG.info("Going to reparse <" + originalQuery + "> as \n<" +
> rewrittenQueryStr.toString() + ">");}} in hive.log
> {{MergeSemanticAnalyzer.replaceDefaultKeywordForMerge()}} is only called in
> {{handleInsert}} but not {{handleUpdate()}}. Why does issue only show up with
> {{MERGE_SPLIT_UPDATE}}?
> Once this is fixed, HiveConf.MERGE_SPLIT_UPDATE should be true by default
--
This message was sent by Atlassian Jira
(v8.20.1#820001)