Hi Jesse,

A Coder is meant for serialization, not for type conversion.

I use the same approach as you:

https://github.com/jbonofre/beam-samples/blob/master/EventsByLocation/src/main/java/org/apache/beam/samples/EventsByLocation.java#L111

with a SimpleFunction (which I can reuse across different MapElements calls).

I have had the same need in other situations (for example, having a PCollection<Foo> and wanting a PCollection<String> by just calling toString() on Foo). I think it would be helpful to have a TypeConverter like the one in Apache Camel. A list of (implicit) TypeConverters could be present in the Pipeline context, as something like:

Element Source Type -> Element Target Type -> TypeConverter

(of course, a user could add their own type converter for a given source/target type).

Implicitly, when we have a PCollection<Source> and want a PCollection<Target>, the matching type converter can be called.

A TypeConverter could basically be a PTransform.
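
To make the lookup idea concrete, here is a minimal sketch in plain Java, with no Beam or Camel dependency. Everything in it (TypeConverterRegistry, register, convert) is a hypothetical name for illustration, not an existing Beam or Camel API, and it only matches exact source classes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only: not part of Beam or Camel.
final class TypeConverterRegistry {

    // Source type -> (target type -> converter), mirroring the lookup described above.
    private final Map<Class<?>, Map<Class<?>, Function<?, ?>>> converters = new HashMap<>();

    // Registers a converter for an exact (source, target) pair,
    // replacing any converter previously registered for that pair.
    <S, T> void register(Class<S> source, Class<T> target,
                         Function<? super S, ? extends T> converter) {
        converters.computeIfAbsent(source, k -> new HashMap<>()).put(target, converter);
    }

    // Converts value to the target type using the converter registered
    // for value's exact runtime class (no superclass or interface lookup).
    // Throws IllegalArgumentException when no such converter is registered.
    @SuppressWarnings("unchecked")
    <S, T> T convert(S value, Class<T> target) {
        Map<Class<?>, Function<?, ?>> byTarget = converters.get(value.getClass());
        Function<?, ?> converter = byTarget == null ? null : byTarget.get(target);
        if (converter == null) {
            throw new IllegalArgumentException(
                "No converter from " + value.getClass().getName()
                    + " to " + target.getName());
        }
        return target.cast(((Function<S, ?>) converter).apply(value));
    }
}
```

For example, after registry.register(Long.class, String.class, l -> "count:" + l), a call to registry.convert(42L, String.class) looks up the Long -> String converter and applies it. In a pipeline, a lookup like this would presumably sit behind the PTransform mentioned above, so that it runs per element.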

Just thinking out loud ;)

Regards
JB

On 11/08/2016 12:56 AM, Jesse Anderson wrote:
Is there a way to directly take a PCollection<KV> and make it a
PCollection<String>? I need to make the PCollection a
PCollection<String> before writing it out with TextIO.Write.

I tried using:
withCoder(KvCoder.of(StringDelegateCoder.of(String.class),
StringDelegateCoder.of(Long.class)))

but that causes binary data to be written out by the KV coder.

The only way appears to be a manual transform with:
PCollection<String> stringCounts = counts.apply(MapElements
    .via((KV<String, Long> count) ->
    count.getKey() + ":" + count.getValue())
    .withOutputType(TypeDescriptors.strings()));

If this is missing, that manual step should be baked into the API. That
should be something either in StringDelegateCoder or a new String
transform. The new StringDelegateCoder method would take in any KV (or
list types) and insert a specific String delimiter. The new transform would
take in any type in a PCollection<T> and make it a PCollection<String>
using a specific String delimiter.

Thanks,

Jesse

--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
