[ 
https://issues.apache.org/jira/browse/BEAM-4461?focusedWorklogId=228820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-228820
 ]

ASF GitHub Bot logged work on BEAM-4461:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Apr/19 05:21
            Start Date: 17/Apr/19 05:21
    Worklog Time Spent: 10m 
      Work Description: reuvenlax commented on issue #8273: [BEAM-4461] A 
transform to perform binary joins of PCollections with schemas
URL: https://github.com/apache/beam/pull/8273#issuecomment-483941125
 
 
   @robinyqiu I think the fundamental difference between CoGroup and Join isn't 
the cross product, it's that CoGroup is a general grouping/join on N inputs 
while Join is something closer to a standard binary join. Moving the 
cross-product into Join would mean that we would lose it for the N-input case 
where N > 2. I don't think we want the Join transform to start dealing with 
PCollectionTuples - the whole point of the binary join transform is simply to 
be syntactic sugar that makes common use cases easy.
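
   To make the distinction above concrete, here is a minimal sketch in plain 
Python. It is not Beam code: co_group, cross_product and inner_join are 
hypothetical helper names, and rows are modeled as simple tuples keyed on 
their first element. It only illustrates why keeping the cross product as a 
step over the N-input grouping preserves it for N > 2, with the binary join 
as a thin wrapper.

```python
from collections import defaultdict
from itertools import product


def co_group(key_fn, *inputs):
    """Group N inputs by key. Returns key -> tuple of N lists of rows."""
    grouped = defaultdict(lambda: tuple([] for _ in inputs))
    for index, rows in enumerate(inputs):
        for row in rows:
            grouped[key_fn(row)][index].append(row)
    return dict(grouped)


def cross_product(grouped):
    """Expand each key's lists into joined tuples (inner-join semantics).

    A key with an empty list for any input yields no output rows.
    """
    return [combo for lists in grouped.values() for combo in product(*lists)]


def inner_join(key_fn, left, right):
    """Binary join expressed as sugar over the N-input primitives."""
    return cross_product(co_group(key_fn, left, right))


# Illustrative data only (assumed, not from the original discussion).
users = [("u1", "alice"), ("u2", "bob")]
orders = [("u1", "book"), ("u1", "pen"), ("u3", "ink")]
clicks = [("u1", "home")]


def first_field(row):
    return row[0]


two_way = inner_join(first_field, users, orders)
three_way = cross_product(co_group(first_field, users, orders, clicks))
```

   Because cross_product operates on the output of co_group, the three-input 
case needs no extra machinery, while inner_join stays a two-argument 
convenience.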
   
   You are correct that standard CoGroup is closer to Group. That's the same 
with standard Beam too - CoGroupByKey is just the multi-input version of 
GroupByKey. I think that it would make sense to allow similar aggregations in 
CoGroup as we do in Group.
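
   As a rough illustration of what "similar aggregations" could mean, the 
sketch below (plain Python, not the Beam API) applies one aggregation per 
input after a multi-input grouping, mirroring a single-input grouped 
aggregation. The names group_aggregate and co_group_aggregate are 
hypothetical, and the per-input aggregator list is an assumption about how 
such a feature might look, not a description of any existing transform.

```python
from collections import defaultdict


def group_aggregate(key_fn, agg_fn, rows):
    """Single-input case: group rows by key, then aggregate each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[key_fn(row)].append(row)
    return {key: agg_fn(group) for key, group in grouped.items()}


def co_group_aggregate(key_fn, agg_fns, *inputs):
    """Multi-input case: group N inputs by key, aggregate each input's group.

    agg_fns holds one aggregation function per input (an assumed design).
    """
    if len(agg_fns) != len(inputs):
        raise ValueError("need exactly one aggregation function per input")
    grouped = defaultdict(lambda: tuple([] for _ in inputs))
    for index, rows in enumerate(inputs):
        for row in rows:
            grouped[key_fn(row)][index].append(row)
    return {
        key: tuple(agg(group) for agg, group in zip(agg_fns, groups))
        for key, groups in grouped.items()
    }


# Illustrative data only (assumed, not from the original discussion).
purchases = [("u1", 5), ("u1", 7), ("u2", 1)]
refunds = [("u1", 2)]


def first_field(row):
    return row[0]


def total(rows):
    return sum(row[1] for row in rows)


single = group_aggregate(first_field, total, purchases)
multi = co_group_aggregate(first_field, [total, total], purchases, refunds)
```

   Here a key missing from one input simply aggregates an empty group for 
that input, which is one possible convention among several.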
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 228820)
    Time Spent: 27h 10m  (was: 27h)

> Create a library of useful transforms that use schemas
> ------------------------------------------------------
>
>                 Key: BEAM-4461
>                 URL: https://issues.apache.org/jira/browse/BEAM-4461
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-java-core
>            Reporter: Reuven Lax
>            Assignee: Reuven Lax
>            Priority: Major
>              Labels: triaged
>          Time Spent: 27h 10m
>  Remaining Estimate: 0h
>
> e.g. JoinBy(fields). Project, Filter, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
