[
https://issues.apache.org/jira/browse/BEAM-7070?focusedWorklogId=228729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-228729
]
ASF GitHub Bot logged work on BEAM-7070:
----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Apr/19 22:36
Start Date: 16/Apr/19 22:36
Worklog Time Spent: 10m
Work Description: akedin commented on pull request #8301: [BEAM-7070] JOIN condition should accept field access
URL: https://github.com/apache/beam/pull/8301#discussion_r276018587
##########
File path:
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/BeamJoinTransforms.java
##########
@@ -44,26 +46,48 @@
/** A {@code SimpleFunction} to extract join fields from the specified row. */
public static class ExtractJoinFields extends SimpleFunction<Row, KV<Row, Row>> {
- private final List<Integer> joinColumns;
+ private final List<SerializableRexNode> joinColumns;
private final Schema schema;
+ private int leftRowColumnCount;
public ExtractJoinFields(
- boolean isLeft, List<Pair<Integer, Integer>> joinColumns, Schema schema) {
+ boolean isLeft,
+ List<Pair<RexNode, RexNode>> joinColumns,
+ Schema schema,
+ int leftRowColumnCount) {
this.joinColumns =
- joinColumns.stream().map(pair -> isLeft ? pair.left : pair.right).collect(toList());
+ joinColumns.stream()
+ .map(pair -> SerializableRexNode.builder(isLeft ? pair.left : pair.right).build())
+ .collect(toList());
this.schema = schema;
+ this.leftRowColumnCount = leftRowColumnCount;
}
@Override
public KV<Row, Row> apply(Row input) {
- Row row = joinColumns.stream().map(input::getValue).collect(toRow(schema));
+ Row row =
+ joinColumns.stream()
+ .map(v -> getValue(v, input, leftRowColumnCount))
+ .collect(toRow(schema));
return KV.of(row, input);
}
+ @SuppressWarnings("unused")
private Schema.Field toField(Schema schema, Integer fieldIndex) {
Schema.Field original = schema.getField(fieldIndex);
return original.withName("c" + fieldIndex);
}
+
+ private Object getValue(
Review comment:
This supports only one level of nesting. How hard would it be to implement a
linked list of field indices? E.g. `serializableRexNode.isNested()`,
`serializableRexNode.getNestedSerializableRexNode()`. That way the logic can
probably be generalized a bit, and you probably won't need the class hierarchy:
it would be a linked list with an int payload, and you stop when you reach the
end of the list.
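
One possible shape for the suggestion above, as a minimal sketch rather than the code in the PR: `FieldPath` is a hypothetical class (it is not `SerializableRexNode` and not part of Beam), and a plain `List<?>` stands in for a Beam `Row` so the example has no dependencies. Each node carries one field index, and resolution descends one level per node until the list ends.

```java
import java.util.List;

/** Hypothetical linked list of field indices; an illustration only, not Beam code. */
final class FieldPath {
  private final int index;      // the int payload: field index at this level
  private final FieldPath next; // null marks the end of the list

  private FieldPath(int index, FieldPath next) {
    this.index = index;
    this.next = next;
  }

  /** Builds a path from the outermost index to the innermost, e.g. of(1, 0). */
  static FieldPath of(int first, int... rest) {
    FieldPath tail = null;
    for (int i = rest.length - 1; i >= 0; i--) {
      tail = new FieldPath(rest[i], tail);
    }
    return new FieldPath(first, tail);
  }

  boolean isNested() {
    return next != null;
  }

  FieldPath getNested() {
    return next;
  }

  /** Walks the list, descending one nesting level per node. */
  Object resolve(List<?> row) {
    Object value = row.get(index);
    for (FieldPath node = next; node != null; node = node.next) {
      // Assumption: every intermediate value is itself a nested row (a List here).
      value = ((List<?>) value).get(node.index);
    }
    return value;
  }
}
```

With this shape, `FieldPath.of(1, 0).resolve(row)` reads field 0 of the row stored in field 1, and deeper paths need no extra classes. A real implementation would presumably also have to be `Serializable` and apply the left-row column offset that `getValue` receives in the diff.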
Issue Time Tracking
-------------------
Worklog Id: (was: 228729)
Time Spent: 1h 20m (was: 1h 10m)
> JOIN condition should accept field access
> -----------------------------------------
>
> Key: BEAM-7070
> URL: https://issues.apache.org/jira/browse/BEAM-7070
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Rui Wang
> Assignee: Rui Wang
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>