Re: tf.Transform library for using TensorFlow with Beam

Amit Sela Fri, 24 Feb 2017 15:33:56 -0800

That's great! Many people have asked me about this, and I'm glad to see it
happening.
Does anyone know if there's something in the works for the Java SDK (assuming
I don't want to wait for Fn API support)?


On Fri, Feb 24, 2017 at 8:44 AM Jean-Baptiste Onofré <[email protected]>
wrote:

> Fantastic!
>
> That's a great addition, and it's awesome to see it with Beam!
>
> Regards
> JB
>
> On 02/24/2017 02:51 AM, Robert Bradshaw wrote:
> > One thing that really excites me about this library is that it lets you
> > more easily express transforms on columnar data (which is useful beyond
> > just ML). For example, if your input elements have two fields "x" and "y",
> > then you can write functions like
> >
> > import operator
> >
> > def preprocessing_fn(inputs):
> >     x_centered = tft.map(lambda x, mean: x - mean, inputs['x'],
> >                          tft.mean(inputs['x']))
> >     y_normalized = tft.scale_to_0_1(inputs['y'])
> >     return {
> >         'x_centered': x_centered,
> >         'y_normalized': y_normalized,
> >         'x_centered_times_y_normalized': tft.map(
> >             operator.mul, x_centered, y_normalized),
> >     }
> >
> > # Read PCollection of dicts with 'x' and 'y' keys and numeric values
> > input = p | Read(...)
> >
> > # output will contain dicts with 'x_centered', 'y_normalized', and
> > # 'x_centered_times_y_normalized' keys with the expected values, and fn
> > # can be used to transform other data using the statistics (mean, mins,
> > # and maxes) without re-analysis.
> > output, fn = (input, schema) | beam_impl.AnalyzeAndTransformDataset(preprocessing_fn)
> >
> > This automatically injects the relevant global aggregations (which can be
> > interleaved) and builds up TensorFlow graphs to apply the transformations
> > very efficiently.
> >
> >
> > On Thu, Feb 23, 2017 at 4:55 PM, Davor Bonaci <[email protected]> wrote:
> >
> >> Beam and TensorFlow coming together -- a big deal for us!
> >>
> >> On Thu, Feb 23, 2017 at 3:49 PM, Ahmet Altay <[email protected]>
> >> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Yesterday, the TensorFlow community announced the new tf.Transform
> >>> library [1]. It lets users define pre-processing pipelines and run them
> >>> on large-scale data processing frameworks, and it is specifically
> >>> designed to work with Apache Beam. It is great to see the Python SDK
> >>> getting a larger ecosystem and increased usage.
> >>>
> >>> Also worth mentioning: PMC member Robert Bradshaw was one of the
> >>> contributors to this new library.
> >>>
> >>> Thank you,
> >>> Ahmet
> >>>
> >>> [1] https://research.googleblog.com/2017/02/preprocessing-for-machine-learning-with.html
> >>>
> >>
> >
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
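
For anyone skimming the thread, here is a minimal plain-Python sketch of the analyze-then-transform idea Robert describes above. It is not tf.Transform's API and says nothing about how the library is implemented. The helper names analyze and make_transform_fn are made up for illustration, and it works on an in-memory list of dicts rather than a PCollection.

```python
def analyze(rows):
    """Full pass over the data: compute the global statistics that the
    per-row transform needs (mean of x, min and max of y)."""
    xs = [row['x'] for row in rows]
    ys = [row['y'] for row in rows]
    return {'x_mean': sum(xs) / len(xs), 'y_min': min(ys), 'y_max': max(ys)}


def make_transform_fn(stats):
    """Freeze the statistics into a per-row function, so new data can be
    transformed later without re-running the analysis pass."""
    y_range = stats['y_max'] - stats['y_min']

    def transform(row):
        x_centered = row['x'] - stats['x_mean']
        # Guard against a constant column, where max == min.
        y_normalized = (row['y'] - stats['y_min']) / y_range if y_range else 0.0
        return {
            'x_centered': x_centered,
            'y_normalized': y_normalized,
            'x_centered_times_y_normalized': x_centered * y_normalized,
        }

    return transform


rows = [{'x': 1.0, 'y': 10.0}, {'x': 2.0, 'y': 20.0}, {'x': 3.0, 'y': 30.0}]
fn = make_transform_fn(analyze(rows))   # analysis pass over the dataset
output = [fn(row) for row in rows]      # per-row transform pass
unseen = fn({'x': 4.0, 'y': 15.0})      # reuses the stats, no re-analysis
```

The point of splitting it this way is that the global aggregations happen once, and the returned fn carries the resulting statistics with it, which mirrors the "output, fn" pair in the quoted snippet.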
