[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-16 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581093#comment-15581093
 ] 

Kenneth Knowles commented on BEAM-741:
--

Great investigation. I actually think the SDK should also always prefer the 
transform's coder. But, also, for input of type {{KV<K, V>}}, the expected 
behavior is for the registry to associate the type {{V}} with the value coder 
and thus, in this context, provide exactly the same coder. So I'm going to 
reopen and look into both of these.
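
To make the expected behavior concrete, here is a toy sketch in plain Java. It 
is not Beam code: {{KvCoderLike}}, {{bindTypeVariables}} and 
{{outputCoderForValues}} are hypothetical names, and coders are represented 
only by their names. It just shows the idea of binding {{V}} to the value 
component of the input coder so that a Values-like step ends up with exactly 
that coder.

```java
import java.util.HashMap;
import java.util.Map;

public class ValueCoderBindingSketch {

    /** Toy KV coder (hypothetical): remembers only the names of its component coders. */
    static final class KvCoderLike {
        final String keyCoder;
        final String valueCoder;

        KvCoderLike(String keyCoder, String valueCoder) {
            this.keyCoder = keyCoder;
            this.valueCoder = valueCoder;
        }
    }

    /** Binds the type variables K and V to the component coders of the input coder. */
    static Map<String, String> bindTypeVariables(KvCoderLike inputCoder) {
        Map<String, String> bindings = new HashMap<>();
        bindings.put("K", inputCoder.keyCoder);
        bindings.put("V", inputCoder.valueCoder);
        return bindings;
    }

    /** Output coder of a Values-like step: whatever V is bound to. */
    static String outputCoderForValues(KvCoderLike inputCoder) {
        return bindTypeVariables(inputCoder).get("V");
    }

    public static void main(String[] args) {
        KvCoderLike afterGroupByKey = new KvCoderLike("StringUtf8Coder", "IterableCoder");
        System.out.println(outputCoderForValues(afterGroupByKey));
    }
}
```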

> Values transform does not use the correct output coder when values is an 
> Iterable
> 
>
> Key: BEAM-741
> URL: https://issues.apache.org/jira/browse/BEAM-741
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Andrew Martin
>Assignee: Davor Bonaci
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-14 Thread Andrew Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575937#comment-15575937
 ] 

Andrew Martin commented on BEAM-741:


[~kenn] I dug into this more today and found the specific reason for the 
failure: the inference process in Beam first checks the coder registry and, if 
it finds nothing there, tries a fallback coder provider. Only if that also 
fails does it try to obtain the coder from the producing transform. In Scio we 
set our own fallback coder provider, so Beam never ends up using the output 
coder from the producing transform. So, in Scio we probably need to prefer the 
default output coder of the producing transform and use our fallback only as a 
last resort. I will close this because it is an issue in Scio, not in Beam.
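
The lookup order described above can be modeled in a few lines of plain Java. 
This is only a sketch of the ordering, not Beam's actual inference code: the 
class, the three sources and the coder names are all hypothetical, and a coder 
is again just a name. It shows why a catch-all fallback provider shadows the 
producing transform's coder, and how putting the transform first changes the 
result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class CoderLookupSketch {

    // Hypothetical sources. The registry has no entry because the type was seen as Object.
    static final Supplier<Optional<String>> REGISTRY = Optional::empty;
    // A catch-all fallback provider always has an answer.
    static final Supplier<Optional<String>> FALLBACK = () -> Optional.of("FallbackCoder");
    // The producing transform knows the precise coder.
    static final Supplier<Optional<String>> TRANSFORM = () -> Optional.of("IterableCoder");

    /** Returns the first coder that any source can supply, trying the sources in order. */
    static String firstAvailable(List<Supplier<Optional<String>>> sources) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> coder = source.get();
            if (coder.isPresent()) {
                return coder.get();
            }
        }
        throw new IllegalStateException("no coder could be inferred");
    }

    /** Order as described in this comment: registry, then fallback, then transform. */
    static String describedOrder() {
        return firstAvailable(Arrays.asList(REGISTRY, FALLBACK, TRANSFORM));
    }

    /** Alternative order: ask the producing transform before anything else. */
    static String transformFirstOrder() {
        return firstAvailable(Arrays.asList(TRANSFORM, REGISTRY, FALLBACK));
    }

    public static void main(String[] args) {
        System.out.println(describedOrder());
        System.out.println(transformFirstOrder());
    }
}
```

With the described order the transform is never consulted, because the 
fallback source always answers first.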

Thanks!



[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-13 Thread Andrew Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572443#comment-15572443
 ] 

Andrew Martin commented on BEAM-741:


[~kenn] After investigating further, it seems the coder for the output of the 
`Values` transform is not inferred correctly because of what appears to be a 
loss of type information in the type descriptor: the output of the `Values` 
transform should be of type Iterable, but the raw type is just Object during 
the inference process, so the default coder provider is used (which we set in 
our own code).

I'm part of a team at Spotify developing Scio (https://github.com/spotify/scio). 
We have a work-in-progress branch for porting to Beam, and some tests in that 
branch fail. I'd like to write a failing test against the pure Beam API so you 
can take a look. That being said, is it possible to invoke the 
@RunnableOnService tests locally using the direct runner? 
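
The kind of type information loss I mean can be shown with plain Java 
reflection. This is a sketch, not Beam code: {{ValuesLike}} is a hypothetical 
stand-in for a generic transform, and the point is only that a method 
returning a type variable {{V}} has Object as its raw (erased) return type.

```java
import java.lang.reflect.Method;
import java.lang.reflect.TypeVariable;
import java.util.Map;

public class ErasureSketch {

    /** Hypothetical stand-in for a transform that extracts the value of a key-value pair. */
    static class ValuesLike<V> {
        V extract(Map.Entry<?, V> kv) {
            return kv.getValue();
        }
    }

    private static Method extractMethod() {
        try {
            return ValuesLike.class.getDeclaredMethod("extract", Map.Entry.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The erased return type: all that is left of V at runtime. */
    static Class<?> rawOutputType() {
        return extractMethod().getReturnType();
    }

    /** True when the generic signature only says "V" rather than a concrete type. */
    static boolean outputTypeIsUnresolvedVariable() {
        return extractMethod().getGenericReturnType() instanceof TypeVariable;
    }

    public static void main(String[] args) {
        System.out.println(rawOutputType());
        System.out.println(outputTypeIsUnresolvedVariable());
    }
}
```

A lookup keyed on the raw type therefore sees Object unless the type variable 
is resolved from context first.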



[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-11 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566814#comment-15566814
 ] 

Kenneth Knowles commented on BEAM-741:
--

Actually, what you say makes me want to investigate further. The registry and 
the coder inference process are expected to propagate coders in a case like 
that. I'll leave it up to you whether you want to pursue it, but if you do feel 
like offering a snippet (or a pull request with a failing test :-) we'd 
definitely look into it.



[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-11 Thread Andrew Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1554#comment-1554
 ] 

Andrew Martin commented on BEAM-741:


I see, so explicitly setting the coder in Values is probably a duct-tape 
solution for a more fundamental problem on our side, perhaps something to do 
with the coder registry (we have some custom coders, so the problem is likely 
there). I will close this.



[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-11 Thread Andrew Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566259#comment-15566259
 ] 

Andrew Martin commented on BEAM-741:


I found this issue when using the Direct Runner for a test, which does dynamic 
re-sharding as part of the Write transform. It does a GroupByKey -> Values -> 
Flatten, and Flatten failed because it did not have an 'IterableLikeCoder' as 
the input coder, which should be the case after calling Values from a GBK.



[jira] [Commented] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2016-10-11 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566252#comment-15566252
 ] 

Kenneth Knowles commented on BEAM-741:
--

Can you provide a reproduction? This is fairly surprising, since the {{Values}} 
transform does not manipulate the coder. But it does infer it based on static 
types, so perhaps this is causing an unexpected coder to be inferred?
