I updated my Pico WordCount example <http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/> to show the new ToString class that was released in 0.5.0. You no longer have to convert objects to strings manually if the output of the object's toString is already in the format you want.
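To make the difference concrete, here is a minimal sketch in plain Python rather than Beam itself. It assumes nothing about the Beam API: the function names (`count_words`, `format_manually`, `format_with_str`) are hypothetical and exist only for illustration. It contrasts hand-formatting each word/count pair with relying on the element's default string representation, which is the idea behind ToString.

```python
import re
from collections import Counter


def count_words(lines):
    """Split each line on non-word characters and count the words."""
    counts = Counter()
    for line in lines:
        # re.split can yield empty strings at the edges, so filter them out.
        counts.update(word for word in re.split(r"\W+", line) if word)
    return counts


def format_manually(counts):
    """Hand-written conversion: one explicit format string per pair."""
    return ["%s: %d" % (word, count) for word, count in sorted(counts.items())]


def format_with_str(counts):
    """Default conversion: rely on each pair's own string representation."""
    return [str(pair) for pair in sorted(counts.items())]


if __name__ == "__main__":
    counts = count_words(["a b a", "b c"])
    print(format_manually(counts))
    print(format_with_str(counts))
```

The second formatter needs no per-type formatting code, but the output is whatever the element's default representation happens to be, so it only helps when that representation is the format you want.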

On Thu, Dec 8, 2016 at 11:18 AM Robert Bradshaw <[email protected]> wrote:

> No less typed than any other Python program :). To add our typechecks
> one would write
>
>     import apache_beam as beam, re
>     with beam.Pipeline() as p:
>       (p
>        | beam.io.textio.ReadFromText("playing_cards.tsv")
>        | beam.Map(lambda s: re.split("\\W+", s)).with_input_types(str).with_output_types(str)
>        | beam.combiners.Count.PerElement()
>        | beam.Map(lambda (w, c): "%s: %d" % (w, c))
>        | beam.io.textio.WriteToText("output/stringcounts"))
>
> and the rest is implicit.
>
>
> On Wed, Dec 7, 2016 at 4:13 PM, Dan Halperin <[email protected]> wrote:
> > Is the Python one actually fully type-checked, or could it fail at runtime
> > b/c of a typo?
> >
> > (If latter, what would the minimal type-checked Python WordCount look like?)
> >
> >
> > On Thu, Dec 8, 2016 at 4:32 AM, Robert Bradshaw <[email protected]> wrote:
> >>
> >> On Wed, Dec 7, 2016 at 12:19 PM, Jesse Anderson <[email protected]>
> >> wrote:
> >>
> >> > Only gets beaten on the KV to string conversion. JB is going to change
> >> > that.
> >>
> >> That and the imports/python creation boilerplate. But yes, very similar.
> >>
> >> > On Wed, Dec 7, 2016, 11:05 AM Robert Bradshaw <[email protected]>
> >> > wrote:
> >> >>
> >> >> Nice. Of course for ultimate conciseness, you should have gone with
> >> >> Python :)
> >> >>
> >> >>     import apache_beam as beam, re
> >> >>     with beam.Pipeline() as p:
> >> >>       (p
> >> >>        | beam.io.textio.ReadFromText("playing_cards.tsv")
> >> >>        | beam.Map(lambda s: re.split("\\W+", s))
> >> >>        | beam.combiners.Count.PerElement()
> >> >>        | beam.Map(lambda (w, c): "%s: %d" % (w, c))
> >> >>        | beam.io.textio.WriteToText("output/stringcounts"))
> >> >>
> >> >>
> >> >> On Wed, Dec 7, 2016 at 10:14 AM, Jean-Baptiste Onofré <[email protected]>
> >> >> wrote:
> >> >> > Good idea Neelesh !
> >> >> >
> >> >> > Definitely something we can add to the beam-samples (great
> >> >> > complement to what I have on my github).
> >> >> >
> >> >> > Regards
> >> >> > JB
> >> >> >
> >> >> > On 12/07/2016 07:10 PM, Neelesh Salian wrote:
> >> >> >>
> >> >> >> Perhaps we can add this to our examples.
> >> >> >> Thank you Jesse. :)
> >> >> >>
> >> >> >> On Wed, Dec 7, 2016 at 10:07 AM, Jean-Baptiste Onofré
> >> >> >> <[email protected] <mailto:[email protected]>> wrote:
> >> >> >>
> >> >> >>     Awesome !
> >> >> >>
> >> >> >>     Thanks Jesse !
> >> >> >>
> >> >> >>     Regards
> >> >> >>     JB
> >> >> >>
> >> >> >>     On 12/07/2016 06:22 PM, Jesse Anderson wrote:
> >> >> >>
> >> >> >>         I wrote a post on the smallest WordCount
> >> >> >>         <http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/>
> >> >> >>         I could write. I go through everything line by line and
> >> >> >>         talk about some of the newest DoFns that allow you to
> >> >> >>         easily run regular expressions in a distributed way.
> >> >> >>
> >> >> >>         Thanks,
> >> >> >>
> >> >> >>         Jesse
> >> >> >>
> >> >> >>     --
> >> >> >>     Jean-Baptiste Onofré
> >> >> >>     [email protected] <mailto:[email protected]>
> >> >> >>     http://blog.nanthrax.net
> >> >> >>     Talend - http://www.talend.com
> >> >> >>
> >> >> >> --
> >> >> >> Neelesh Srinivas Salian
> >> >> >> Customer Operations Engineer
> >> >> >
> >> >> > --
> >> >> > Jean-Baptiste Onofré
> >> >> > [email protected]
> >> >> > http://blog.nanthrax.net
> >> >> > Talend - http://www.talend.com
