Moving this thread to the dev mailing list for discussion.

On Tue, Nov 8, 2016 at 1:24 AM Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Jesse,
>
> Coder is not for type conversion, but for serialization.
>
> I'm using the same as you:
>
>
> https://github.com/jbonofre/beam-samples/blob/master/EventsByLocation/src/main/java/org/apache/beam/samples/EventsByLocation.java#L111
>
> with a SimpleFunction (that I can reuse in different MapElements calls).
>
> I had the same need as you in a different situation (for example, having
> a PCollection<Foo> and wanting a PCollection<String> just by calling
> toString() on Foo). I think it could be helpful to have a TypeConverter
> like the one we have in Apache Camel.
> A list of (implicit) TypeConverters could be present in the Pipeline
> context, as something like:
>
> Element Source Type -> Element Target Type -> TypeConverter
>
> (of course a user could add their own type converter with a source/target
> type).
>
> Implicitly, when we have a PCollection<Source> and want a
> PCollection<Target>, the type converter can be called.
>
> A TypeConverter could basically be a PTransform.
>
> Just thinking out loud ;)
>
> Regards
> JB
>
> On 11/08/2016 12:56 AM, Jesse Anderson wrote:
> > Is there a way to directly take a PCollection<KV> and make it a
> > PCollection<String>? I need to make the PCollection a
> > PCollection<String> before writing it out with TextIO.Write.
> >
> > I tried using:
> > withCoder(KvCoder.of(StringDelegateCoder.of(String.class),
> > StringDelegateCoder.of(Long.class)))
> >
> > but that causes binary data to be written out by the KV coder.
> >
> > The only way appears to be a manual transform with:
> > PCollection<String> stringCounts = counts.apply(MapElements
> >     .via((KV<String, Long> count) ->
> >     count.getKey() + ":" + count.getValue())
> >     .withOutputType(TypeDescriptors.strings()));
> >
> > If this is missing, that manual step should be baked into the API. That
> > should be something either in StringDelegateCoder or a new String
> > transform. The new StringDelegateCoder method would take in any KV (or
> > list types) and join the parts with a specific String delimiter. The new
> > transform would take in any type in a PCollection<T> and make it a
> > PCollection<String> using a specific String delimiter.
> >
> > Thanks,
> >
> > Jesse
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
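
To give the dev list something concrete to react to, here is a minimal plain-Java sketch of the "Element Source Type -> Element Target Type -> TypeConverter" lookup described in the quoted message. It deliberately avoids Beam and Camel classes: TypeConverterRegistry, register, convert, and WordCount are hypothetical names invented for illustration, and WordCount is only a stand-in for a KV<String, Long> element. How such a registry would attach to the Pipeline context, or be wrapped in a PTransform, is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry (not a Beam or Camel API): source type -> target type -> converter. */
final class TypeConverterRegistry {
    private final Map<Class<?>, Map<Class<?>, Function<?, ?>>> converters = new HashMap<>();

    /** Registers a converter for one (source, target) pair, replacing any earlier one. */
    <S, T> void register(Class<S> source, Class<T> target,
                         Function<? super S, ? extends T> converter) {
        converters.computeIfAbsent(source, k -> new HashMap<>()).put(target, converter);
    }

    /** Looks up the converter for (source, target) and applies it to the value. */
    @SuppressWarnings("unchecked")
    <S, T> T convert(Class<S> source, Class<T> target, S value) {
        Map<Class<?>, Function<?, ?>> byTarget = converters.get(source);
        Function<?, ?> converter = byTarget == null ? null : byTarget.get(target);
        if (converter == null) {
            throw new IllegalArgumentException(
                "No converter from " + source.getName() + " to " + target.getName());
        }
        // Safe in practice: register() only stores converters matching their key types.
        return target.cast(((Function<S, Object>) converter).apply(value));
    }

    public static void main(String[] args) {
        TypeConverterRegistry registry = new TypeConverterRegistry();
        // Same formatting as the MapElements lambda in the original question.
        registry.register(WordCount.class, String.class, wc -> wc.word + ":" + wc.count);
        // prints beam:3
        System.out.println(
            registry.convert(WordCount.class, String.class, new WordCount("beam", 3L)));
    }
}

/** Stand-in for a KV<String, Long> element; hypothetical, for illustration only. */
final class WordCount {
    final String word;
    final long count;

    WordCount(String word, long count) {
        this.word = word;
        this.count = count;
    }
}
```

In this sketch the lookup is by exact class, so a converter registered for Foo would not apply to subclasses of Foo; whether implicit conversion should walk the type hierarchy is one of the design questions worth discussing.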
