[jira] [Commented] (BEAM-6372) Direct Runner should marshal data in a similar way to Dataflow runner

2020-06-10 Thread Beam JIRA Bot (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17131476#comment-17131476
 ] 

Beam JIRA Bot commented on BEAM-6372:
-

This issue was marked "stale-assigned" and has not received a public comment in 
7 days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

> Direct Runner should marshal data in a similar way to Dataflow runner
> -
>
> Key: BEAM-6372
> URL: https://issues.apache.org/jira/browse/BEAM-6372
> Project: Beam
> Issue Type: Improvement
> Components: runner-direct, sdk-go
> Reporter: Andrew Brampton
> Priority: P2
>
> I would test my pipeline using the direct runner, and it would happily run on
> a sample. When I ran it on the Dataflow runner, it would run for an hour, then
> get to a stage that would crash like so:
>  
> {quote}java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> Error received from SDK harness for instruction -224: execute failed: panic: 
> reflect: Call using main.HistogramResult as type struct { Key string 
> "json:\"key\""; Files []string "json:\"files\""; Histogram 
> palette.ColorHistogram "json:\"histogram,omitempty\""; Palette []struct { R 
> uint8; G uint8; B uint8; A uint8 } "json:\"palette\"" } goroutine 70 
> [running]:{quote}
> This was because I forgot to register my HistogramResult type.
> It would be useful if the direct runner tried to marshal and unmarshal all 
> types, to help expose issues like this earlier.
> Also, when running on Dataflow, flags and captured variables would hold
> their empty/default values. It would be good if the direct runner also
> behaved this way. For example:
> {code}
> prefix := "X"
> s = s.Scope("Prefix " + prefix)
> // prefix is captured by the closure; its value is not shipped with the func.
> c = beam.ParDo(s, func(value string) string {
>   return prefix + value
> }, c)
> {code}
> This will prefix "X" on the direct runner, but will prefix "" on Dataflow.
> This is subtle behaviour, but I suspect the direct runner could expose it if
> it marshalled and unmarshalled the func like the Dataflow runner does.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-6372) Direct Runner should marshal data in a similar way to Dataflow runner

2020-06-01 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17122165#comment-17122165
 ] 

Kenneth Knowles commented on BEAM-6372:
---

This issue is assigned but has not received an update in 30 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Direct Runner should marshal data in a similar way to Dataflow runner
> -
>
> Key: BEAM-6372
> URL: https://issues.apache.org/jira/browse/BEAM-6372
> Project: Beam
> Issue Type: Improvement
> Components: runner-direct, sdk-go
> Reporter: Andrew Brampton
> Assignee: Robert Burke
> Priority: P2
> Labels: stale-assigned
>





[jira] [Commented] (BEAM-6372) Direct Runner should marshal data in a similar way to Dataflow runner

2019-01-07 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736201#comment-16736201
 ] 

Robert Burke commented on BEAM-6372:


Off the top of my head, I think I agree with the assessment. The error message
would ideally be very clear about what's missing and possibly why. One reason
it hasn't been done this way is that for testing with ptest and `go test`,
it's inconvenient to have the registration scaffolding everywhere. In
particular, we'd probably want to maintain that ease but still have the
semantics check everything correctly, likely with a --test flag that's used by
default in the testing harness.
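
To make the idea of a clear error concrete, here is a rough sketch of an early
registration check using only the Go standard library. The registry, register,
and checkRegistered names are hypothetical and stand in for whatever the SDK
actually does. HistogramResult is trimmed to two fields from the report.

```go
package main

import (
	"fmt"
	"reflect"
)

// registry maps a type key to its concrete type. It is a stand-in for an
// SDK type registry and is hypothetical.
var registry = map[string]reflect.Type{}

// typeKey builds a lookup key from the package path and type name.
func typeKey(t reflect.Type) string {
	return t.PkgPath() + "." + t.Name()
}

// register records the concrete type of v.
func register(v interface{}) {
	t := reflect.TypeOf(v)
	registry[typeKey(t)] = t
}

// checkRegistered returns an error naming the missing type and the reason
// it matters, or nil when the type of v is known.
func checkRegistered(v interface{}) error {
	t := reflect.TypeOf(v)
	if _, ok := registry[typeKey(t)]; !ok {
		return fmt.Errorf("type %v is not registered: a remote worker could not reconstruct it; register it before building the pipeline", t)
	}
	return nil
}

// HistogramResult is a trimmed version of the type from the report.
type HistogramResult struct {
	Key   string   `json:"key"`
	Files []string `json:"files"`
}

func main() {
	// Before registration the check reports what is missing and why.
	fmt.Println(checkRegistered(HistogramResult{}))

	// After registration the same check passes.
	register(HistogramResult{})
	fmt.Println(checkRegistered(HistogramResult{}))
}
```

A check like this could run at pipeline construction time in the direct
runner, and a test harness could choose to relax it so that `go test` stays
free of registration scaffolding.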

> Direct Runner should marshal data in a similar way to Dataflow runner
> -
>
> Key: BEAM-6372
> URL: https://issues.apache.org/jira/browse/BEAM-6372
> Project: Beam
> Issue Type: Improvement
> Components: runner-direct, sdk-go
> Reporter: Andrew Brampton
> Assignee: Robert Burke
> Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)