Nice. Of course for ultimate conciseness, you should have gone with Python :)

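import apache_beam as beam, re
with beam.Pipeline() as p:
  (p
   | beam.io.textio.ReadFromText("playing_cards.tsv")
   # FlatMap emits each token on its own, so words are counted rather than whole lines.
   | beam.FlatMap(lambda s: re.split(r"\W+", s))
   | beam.combiners.Count.PerElement()
   # Python 3 lambdas cannot unpack tuples, so format the (word, count) pair directly.
   | beam.Map(lambda wc: "%s: %d" % wc)
   | beam.io.textio.WriteToText("output/stringcounts"))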
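
To sanity-check what the pipeline should produce without starting a runner, the same split-and-count logic can be sketched in plain Python. This is only an illustration: count_words and format_counts are hypothetical helper names, the sample rows are made up rather than taken from playing_cards.tsv, and the sketch drops the empty strings that re.split can return at line edges, which the pipeline above does not do.

```python
import re
from collections import Counter

def count_words(lines):
    """Split each line on non-word characters and tally the tokens."""
    counts = Counter()
    for line in lines:
        # re.split can yield empty strings at line edges; skip them (an assumption).
        counts.update(tok for tok in re.split(r"\W+", line) if tok)
    return counts

def format_counts(counts):
    """Render each (word, count) pair in the pipeline's "word: count" format."""
    return ["%s: %d" % (w, c) for w, c in sorted(counts.items())]

# Made-up sample rows, not the real contents of playing_cards.tsv.
sample = ["Ace\tSpades", "King\tSpades", "Ace\tHearts"]
formatted = format_counts(count_words(sample))
```

Sorting is only there to make the result deterministic; the Beam output comes back in no particular order.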



On Wed, Dec 7, 2016 at 10:14 AM, Jean-Baptiste Onofré <[email protected]> wrote:
> Good idea Neelesh !
>
> Definitely something we can add to the beam-samples (great complement to
> what I have on my GitHub).
>
> Regards
> JB
>
> On 12/07/2016 07:10 PM, Neelesh Salian wrote:
>>
>> Perhaps we can add this to our examples.
>> Thank you Jesse. :)
>>
>> On Wed, Dec 7, 2016 at 10:07 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>>
>>     Awesome !
>>
>>     Thanks Jesse !
>>
>>     Regards
>>     JB
>>
>>     On 12/07/2016 06:22 PM, Jesse Anderson wrote:
>>
>>         I wrote a post on the smallest WordCount I could write:
>>         <http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/>
>>         I go through everything line by line and talk about some of the
>>         newest DoFns that allow you to easily run regular expressions in a
>>         distributed way.
>>
>>         Thanks,
>>
>>         Jesse
>>
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     [email protected]
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
>>
>> --
>> Neelesh Srinivas Salian
>> Customer Operations Engineer
>>
>
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
