[
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=228820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-228820
]
ASF GitHub Bot logged work on BEAM-4461:
----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Apr/19 05:21
Start Date: 17/Apr/19 05:21
Worklog Time Spent: 10m
Work Description: reuvenlax commented on issue #8273: [BEAM-4461] A
transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#issuecomment-483941125
@robinyqiu I think the fundamental difference between CoGroup and Join isn't
the cross product, it's that CoGroup is a general grouping/join on N inputs
while Join is something closer to a standard binary join. Moving the
cross-product into Join would mean that we would lose it for the N-input case
where N > 2. I don't think we want the Join transform to start dealing with
PCollectionTuples - the whole point of the binary join transform is simply to
be syntactic sugar that makes common use cases easy.
You are correct that standard CoGroup is closer to Group. The same is true in
standard Beam - CoGroupByKey is just the multi-input version of
GroupByKey. I think that it would make sense to allow similar aggregations in
CoGroup as we do in Group.
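To make the distinction concrete, here is a minimal plain-Java sketch of the idea. It is not Beam code and does not use the CoGroup or Join transforms from the PR: the class `CoGroupVsJoinSketch` and the helpers `coGroup`, `crossProduct`, and `innerJoin` are hypothetical names for illustration, and in-memory lists stand in for PCollections. It shows the cross product living on the N-input grouping, with the binary join as a thin wrapper over the two-input case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: in-memory stand-ins, not the Beam API.
class CoGroupVsJoinSketch {

    /** N-input co-grouping: for each key, one list of values per input. */
    static Map<String, List<List<String>>> coGroup(
            List<List<Map.Entry<String, String>>> inputs) {
        Map<String, List<List<String>>> grouped = new TreeMap<>();
        for (int i = 0; i < inputs.size(); i++) {
            for (Map.Entry<String, String> record : inputs.get(i)) {
                List<List<String>> perInput = grouped.computeIfAbsent(
                        record.getKey(), unused -> emptySlots(inputs.size()));
                perInput.get(i).add(record.getValue());
            }
        }
        return grouped;
    }

    /** Creates one empty value list per input. */
    static List<List<String>> emptySlots(int n) {
        List<List<String>> slots = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            slots.add(new ArrayList<>());
        }
        return slots;
    }

    /** Cross product over the per-input value lists of one key; works for any N. */
    static List<List<String>> crossProduct(List<List<String>> perInput) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> values : perInput) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String value : values) {
                    List<String> row = new ArrayList<>(prefix);
                    row.add(value);
                    next.add(row);
                }
            }
            // An input with no values for this key yields no rows (inner-join behavior).
            result = next;
        }
        return result;
    }

    /** Binary inner join as sugar: co-group two inputs, then expand the cross product. */
    static List<List<String>> innerJoin(
            List<Map.Entry<String, String>> left,
            List<Map.Entry<String, String>> right) {
        List<List<String>> rows = new ArrayList<>();
        for (List<List<String>> perInput : coGroup(List.of(left, right)).values()) {
            rows.addAll(crossProduct(perInput));
        }
        return rows;
    }
}
```

Because `crossProduct` operates on the grouped result rather than inside `innerJoin`, a three-input grouping can expand the same way without the binary helper knowing anything about N inputs.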
Issue Time Tracking
-------------------
Worklog Id: (was: 228820)
Time Spent: 27h 10m (was: 27h)
> Create a library of useful transforms that use schemas
> ------------------------------------------------------
>
> Key: BEAM-4461
> URL: https://issues.apache.org/jira/browse/BEAM-4461
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-core
> Reporter: Reuven Lax
> Assignee: Reuven Lax
> Priority: Major
> Labels: triaged
> Time Spent: 27h 10m
> Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.