Good idea, Jesse!

On Nov 8, 2016, at 14:45, Jesse Anderson <[email protected]> wrote:
>Moving this thread to dev mailing list for discussion.
>
>On Tue, Nov 8, 2016 at 1:24 AM Jean-Baptiste Onofré <[email protected]>
>wrote:
>
>> Hi Jesse,
>>
>> Coder is not for type conversion, but for serialization.
>>
>> I'm using the same as you:
>>
>> https://github.com/jbonofre/beam-samples/blob/master/EventsByLocation/src/main/java/org/apache/beam/samples/EventsByLocation.java#L111
>>
>> with a SimpleFunction (that I can reuse in different MapElements
>> calls).
>>
>> I had the same need as you in a different situation (like having a
>> PCollection<Foo> and wanting a PCollection<String> just by calling
>> toString() on Foo). I think it could be helpful to have a
>> TypeConverter like the one we have in Apache Camel.
>> A list of (implicit) TypeConverters could be present in the Pipeline
>> context as something like:
>>
>> Element Source Type -> Element Target Type -> TypeConverter
>>
>> (of course, a user could add their own type converter for a given
>> source/target type).
>>
>> Implicitly, when we have a PCollection<Source> and want a
>> PCollection<Target>, the type converter can be called.
>>
>> A TypeConverter could be basically a PTransform.
>>
>> Just thinking out loud ;)
>>
>> Regards
>> JB
>>
>> On 11/08/2016 12:56 AM, Jesse Anderson wrote:
>> > Is there a way to directly take a PCollection<KV> and make it a
>> > PCollection<String>? I need to make the PCollection a
>> > PCollection<String> before writing it out with TextIO.Write.
>> >
>> > I tried using:
>> > withCoder(KvCoder.of(StringDelegateCoder.of(String.class),
>> > StringDelegateCoder.of(Long.class)))
>> >
>> > but that causes binary data to be written out by the KV coder.
>> >
>> > The only way appears to be a manual transform with:
>> > PCollection<String> stringCounts = counts.apply(MapElements
>> >     .via((KV<String, Long> count) ->
>> >     count.getKey() + ":" + count.getValue())
>> >     .withOutputType(TypeDescriptors.strings()));
>> >
>> > If this is missing, that manual step should be baked into the API.
>> > That should be something either in StringDelegateCoder or a new
>> > String transform. The new StringDelegateCoder method would take in
>> > any KV (or list types) and insert a specific String delimiter. The
>> > new transform would take any type in a PCollection<T> and make it a
>> > PCollection<String> using a specific String delimiter.
>> >
>> > Thanks,
>> >
>> > Jesse
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>

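As a rough illustration of the TypeConverter idea quoted above (source type -> target type -> converter), here is a minimal plain-Java sketch. It is not Beam API: TypeConverterSketch, Registry, Count, and defaultRegistry are hypothetical names, and Count only stands in for KV<String, Long> so the example has no dependencies. The delimiter parameter mirrors the "specific String delimiter" from the original question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only: none of these names exist in Apache Beam.
class TypeConverterSketch {

    /** Converters keyed by source type, then by target type. */
    static final class Registry {
        private final Map<Class<?>, Map<Class<?>, Function<?, ?>>> converters = new HashMap<>();

        /** Registers a converter for one (source, target) pair, replacing any earlier one. */
        <S, T> void register(Class<S> source, Class<T> target,
                             Function<? super S, ? extends T> fn) {
            converters.computeIfAbsent(source, k -> new HashMap<>()).put(target, fn);
        }

        /** Looks up a converter by the value's exact runtime class and applies it. */
        @SuppressWarnings("unchecked")
        <S, T> T convert(S value, Class<T> target) {
            Map<Class<?>, Function<?, ?>> byTarget = converters.get(value.getClass());
            Function<S, T> fn = byTarget == null ? null : (Function<S, T>) byTarget.get(target);
            if (fn == null) {
                throw new IllegalArgumentException(
                    "No converter from " + value.getClass().getName() + " to " + target.getName());
            }
            return fn.apply(value);
        }
    }

    /** Dependency-free stand-in for a KV<String, Long> element. */
    static final class Count {
        final String key;
        final long value;

        Count(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Builds a registry with two illustrative converters to String. */
    static Registry defaultRegistry(String delimiter) {
        Registry registry = new Registry();
        registry.register(Count.class, String.class, c -> c.key + delimiter + c.value);
        registry.register(Long.class, String.class, n -> Long.toString(n));
        return registry;
    }

    public static void main(String[] args) {
        Registry registry = defaultRegistry(":");
        // prints "beam:3"
        System.out.println(registry.convert(new Count("beam", 3L), String.class));
    }
}
```

In a pipeline, the function found by such a lookup would presumably be applied through a MapElements step like the one in the quoted message; how the registry would be attached to the Pipeline context is left open here, as it is in the thread.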