kasakrisz commented on code in PR #6413:
URL: https://github.com/apache/hive/pull/6413#discussion_r3468167776
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/rewrite/MergeRewriter.java:
##########
@@ -238,20 +239,29 @@ public void
appendWhenMatchedUpdateClause(MergeStatement.UpdateClause updateClau
protected void addValues(Table targetTable, String targetAlias,
Map<String, String> newValues,
List<String> values) {
- UnaryOperator<String> formatter = name -> String.format("%s.%s",
targetAlias,
+ UnaryOperator<String> formatter = name -> String.format("%s.%s",
targetAlias,
HiveUtils.unparseIdentifier(name, conf));
-
+ List<String> valuesToBeAdded = new
ArrayList<>(Collections.nCopies(targetTable.getAllCols().size(), null));
for (FieldSchema fieldSchema : targetTable.getCols()) {
- if (newValues.containsKey(fieldSchema.getName())) {
- String rhsExp = newValues.get(fieldSchema.getName());
- values.add(getRhsExpValue(rhsExp,
formatter.apply(fieldSchema.getName())));
- } else {
- values.add(formatter.apply(fieldSchema.getName()));
- }
+ setColumnValue(targetTable, valuesToBeAdded, newValues, formatter,
fieldSchema.getName(), true);
}
-
- targetTable.getPartCols().forEach(fieldSchema -> values.add(
- formatter.apply(fieldSchema.getName())));
+
+ for (FieldSchema partCol : targetTable.getPartCols()) {
+ setColumnValue(targetTable, valuesToBeAdded, newValues, formatter,
partCol.getName(),
+ targetTable.hasNonNativePartitionSupport());
+ }
+ values.addAll(valuesToBeAdded);
+ }
+
+ protected void setColumnValue(Table targetTable, List<String>
valuesToBeAdded,
+ Map<String, String> newValues, UnaryOperator<String> formatter, String
columnName,
+ boolean applyNewValues) {
+ int index = targetTable.getColumnIndexByName(columnName);
+ String formattedColumn = formatter.apply(columnName);
+ String value = applyNewValues && newValues.containsKey(columnName)
+ ? getRhsExpValue(newValues.get(columnName), formattedColumn)
+ : formattedColumn;
+ valuesToBeAdded.set(index, value);
Review Comment:
1. Extracting `setColumnValue` is a good approach. However, it violates the
single responsibility principle. In `MergeRewriter`, please implement
`setColumnValue` so that it always sets the desired value of a column when a
new value is present in the `newValues` map, and does not depend on magic
flags like `applyNewValues`.
2. In `MergeRewriter`, the `addValues` method should set only non-partition
columns; the partition columns are just referenced, as in the original
implementation. They cannot be updated.
3. Add a new static inner class to `SplitMergeRewriter`. You can extend
`SplitMergeWhenClauseSqlGenerator`, since only `addValues` should be
overridden: the new implementation should handle setting all columns, both
partition and non-partition. This new subclass is used for tables with
`hasNonNativePartitionSupport`.
4. `SplitMergeRewriter.createMergeSqlGenerator` is a factory method: this is
where we decide which `WhenClauseSqlGenerator` to use based on the target
table. With this approach, every part of the code is moved to where it belongs.
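
The structure proposed in points 1 to 4 could look roughly like the sketch
below. It is a simplified, self-contained illustration, not the actual Hive
code: `SimpleTable`, `BaseGenerator`, `NonNativePartitionGenerator` and
`createGenerator` are hypothetical stand-ins for `Table`, the existing
when-clause generator, the new subclass and the factory method, and the
identifier quoting and `getRhsExpValue` handling are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class WhenClauseGeneratorSketch {

  // Hypothetical stand-in for Hive's Table; holds only what the sketch needs.
  static final class SimpleTable {
    final List<String> cols;
    final List<String> partCols;
    final boolean nonNativePartitionSupport;

    SimpleTable(List<String> cols, List<String> partCols, boolean nonNativePartitionSupport) {
      this.cols = cols;
      this.partCols = partCols;
      this.nonNativePartitionSupport = nonNativePartitionSupport;
    }
  }

  // Default behaviour (points 1 and 2): update regular columns, reference partition columns.
  static class BaseGenerator {
    protected final String targetAlias;

    BaseGenerator(String targetAlias) {
      this.targetAlias = targetAlias;
    }

    // Plain reference to the existing column, e.g. "t.col".
    protected String reference(String columnName) {
      return targetAlias + "." + columnName;
    }

    // No flag: use the new value if one is present, otherwise the column reference.
    protected String columnValue(Map<String, String> newValues, String columnName) {
      return newValues.getOrDefault(columnName, reference(columnName));
    }

    public void addValues(SimpleTable table, Map<String, String> newValues, List<String> values) {
      for (String col : table.cols) {
        values.add(columnValue(newValues, col));
      }
      // Partition columns cannot be updated here, so they are only referenced.
      for (String partCol : table.partCols) {
        values.add(reference(partCol));
      }
    }
  }

  // Point 3: subclass that overrides only addValues and may set every column.
  static class NonNativePartitionGenerator extends BaseGenerator {
    NonNativePartitionGenerator(String targetAlias) {
      super(targetAlias);
    }

    @Override
    public void addValues(SimpleTable table, Map<String, String> newValues, List<String> values) {
      for (String col : table.cols) {
        values.add(columnValue(newValues, col));
      }
      for (String partCol : table.partCols) {
        values.add(columnValue(newValues, partCol));
      }
    }
  }

  // Point 4: the factory method is the only place that inspects the target table.
  static BaseGenerator createGenerator(SimpleTable table, String targetAlias) {
    return table.nonNativePartitionSupport
        ? new NonNativePartitionGenerator(targetAlias)
        : new BaseGenerator(targetAlias);
  }

  public static void main(String[] args) {
    Map<String, String> newValues = Map.of("a", "1", "p", "9");
    for (boolean nonNative : new boolean[] {false, true}) {
      SimpleTable table = new SimpleTable(List.of("a", "b"), List.of("p"), nonNative);
      List<String> values = new ArrayList<>();
      createGenerator(table, "t").addValues(table, newValues, values);
      System.out.println(values);
    }
  }
}
```

With this split, the base class needs no boolean parameter: whether partition
columns may receive new values is decided once, by the generator type chosen
in the factory method.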