[ 
https://issues.apache.org/jira/browse/BEAM-7629?focusedWorklogId=266749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-266749
 ]

ASF GitHub Bot logged work on BEAM-7629:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Jun/19 15:00
            Start Date: 25/Jun/19 15:00
    Worklog Time Spent: 10m 
      Work Description: lostluck commented on issue #8936: [BEAM-7629] Go SDK 
additional Validation for DoFns (1st impl) (DO NOT MERGE)
URL: https://github.com/apache/beam/pull/8936#issuecomment-505484499
 
 
   > R: @lostluck
   > 
   > Hoping I could get your feedback on what I have so far and help answering 
some questions.
   > 
   > 1. Some of my tests currently fail because the following is expected in 
signatures for all DoFn methods: `side input parameters must follow main input 
parameter`. My understanding however is that when including side inputs in the 
StartBundle and FinishBundle methods (if ProcessElement has some) only the side 
inputs should be included, with no main inputs. So theoretically it should 
work, but right now all methods are treated as if they're ProcessElement. Is 
this correct?
   They shouldn't be treated exactly like ProcessElement, since the main inputs 
can never be populated in StartBundle and FinishBundle. The two should be 
treated the same as each other though.
   
   
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/graph/fn.go#L134
 is probably the root call where the error is being triggered.
   
   The state machine is probably being a bit overzealous, and needs to be 
relaxed. It should be validating the order only not requiring the presence of 
anything. This PR is adding the necessary presence checking in a more aware 
place. This is my fault, sorry for the inconvenience.
     
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/funcx/fn.go#L311
   
   > 2. I had trouble finding info for the expected inputs and outputs of the 
DoFn methods, and had to resort to checking other SDKs. Is there a place where 
that's documented for the Go SDK? If not, could you clarify for StartBundle, 
FinishBundle, Setup, and Teardown what parameters and return values they cab 
have (in addition to the side inputs and emits that I already account for)? 
Right now I make an assumption that Setup/Teardown shouldn't have any 
parameters based on the Java SDK, but couldn't find any concrete info for 
return values for any of the methods.
   
   Re: Setup & Teardown: That's right they should only accept context.Context, 
and that's it.
   Re: StartBundle & FinishBundle, should only be returning error I think? They 
almost certainly should *not* have main returns other than error.
   
   > 3. Should I worry about the Go SDK "generics" when it comes to the side 
input and emit validation? Is it even valid for a user to use generics to 
replace emits or side inputs there, and if so should I be checking that they're 
consistent? For this first draft I just ignored the possibility of generics.
   
   That's totally reasonable for this pass. I'm going to assume you mean the 
Universal Types ( beam.T, beam.X, beam.Y ... ). Universal types may only be 
used to replace a single type at present, not composite types (like, KVs, 
CoGBKs, or Iterators in any of their flavours.) When validating, all that 
matters is that they're used consistently (eg. a Beam.T emit maps to a Beam.T 
emit). They may not be substituted within a given DoFn between methods (eg. 
ProcessElement has a func(beam.T), then so too must Start and Finish).
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 266749)
    Time Spent: 0.5h  (was: 20m)

> Improve DoFn method validation in core/graph/fn.go
> --------------------------------------------------
>
>                 Key: BEAM-7629
>                 URL: https://issues.apache.org/jira/browse/BEAM-7629
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Daniel Oliveira
>            Assignee: Daniel Oliveira
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Various improvements can be made to validating the signatures and type usages 
> in DoFns. Some things that should probably be checked:
>  * Check that StartBundle and FinishBundle contain any emit parameters and 
> side inputs present in ProcessElement
>  * Check that any side inputs/emits have correctly matching types between 
> Start/FinishBundle and ProcessElement
>  * Check that parameters and return values for the various methods are valid 
> (for ex. Teardown/Setup should have no params)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to