Yes, you're right. I believe this is the use case that I'm after. So if I understand correctly, transforms that do aggregations just assume that the batch of data being aggregated is passed as part of a tensor column. Is it possible to hook up a lookup call to another TensorFlow Serving servable for a join in batch mode? Will a saved model, when loaded into a TensorFlow Serving model, actually have the definitions of the metadata when retrieved using the TensorFlow Serving metadata API?

Thanks,
Ron

On Tuesday, January 16, 2018, 6:16:01 PM PST, Charles Chen <c...@google.com> wrote:

This sounds similar to the use case for tf.Transform, a library that depends on Beam: https://github.com/tensorflow/transform

On Tue, Jan 16, 2018 at 5:51 PM Ron Gonzalez <zlgonza...@yahoo.com> wrote:
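To make the aggregation question concrete, here is a minimal plain-Python sketch (deliberately not the real tf.Transform API) of the "full-pass analyzer" idea: an aggregation such as a mean is computed once over the whole training batch, and the result is frozen into constants so the identical transform can then be applied to one record at a time at serving. The names `analyze` and `ScaledTransform` are hypothetical.

```python
class ScaledTransform:
    """Transform with aggregation results baked in as constants."""

    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, value):
        # Per-record application: no batch is needed at serving time.
        return (value - self.mean) / self.std


def analyze(batch):
    """Full pass over the training batch; returns a frozen transform."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return ScaledTransform(mean, var ** 0.5)


# "Training" phase: aggregate over the whole column at once.
transform = analyze([1.0, 2.0, 3.0, 4.0])

# "Serving" phase: score a single incoming record identically.
print(transform(2.5))  # mean is 2.5, so the scaled value is 0.0
```

In tf.Transform the analogous split is between analyzers (which run as a full Beam pass over the data) and the instance-level transform graph that gets attached to the saved model.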
Hi,

I was wondering if anyone has encountered or used Beam in the following manner:

1. During machine learning training, use Beam to create the event table. The flow may consist of some joins, aggregations, row-based transformations, etc.
2. Once the model is created, deploy the model to some scoring service via PMML (or some other scoring service).
3. Enable the SAME transformations used in #1 by using a separate engine, thereby guaranteeing that it will transform the data identically to the engine used in #1.

I think this is a pretty interesting use case, where Beam is used to guarantee portability across engines and deployment (batch to true streaming, not micro-batch). What's not clear to me is how batch joins would translate during one-by-one scoring (probably lookups), or how aggregations would translate given that some kind of history would need to be stored (and how much is kept should be configurable too). Thoughts?

Thanks,
Ron
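The join question in #1 can be sketched the same way: a batch join against a side table can be replayed at serving time as a keyed lookup against that same table. Below is an illustrative plain-Python sketch, where the table is just a dict standing in for whatever backing store (a key-value service, a side input, or another servable) would be used in practice; all names are made up for illustration.

```python
def batch_join(events, side_table):
    """Training-time path: enrich a whole batch of events with a join."""
    return [{**e, **side_table.get(e["user_id"], {})} for e in events]


def score_one(event, side_table):
    """Serving-time path: the same enrichment as a single-record lookup."""
    return {**event, **side_table.get(event["user_id"], {})}


side = {"u1": {"segment": "gold"}, "u2": {"segment": "silver"}}
events = [{"user_id": "u1", "amount": 10}, {"user_id": "u2", "amount": 5}]

batch_result = batch_join(events, side)
single_result = score_one({"user_id": "u1", "amount": 10}, side)

# The per-record path must agree with the batch path for the same input,
# which is exactly the training/serving-consistency guarantee in #3.
assert single_result == batch_result[0]
print(single_result)
```

Aggregations are the harder half, since the per-record path needs access to precomputed or incrementally maintained state (the "history" mentioned above) rather than a static table.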