Is using Python from Java via ExternalTransform working and tested?

On Tue, Jan 21, 2020 at 6:50 AM Reza Rokni <[email protected]> wrote:
> +1 for using cross language transforms.
>
> On Thu, 16 Jan 2020 at 01:23, Ahmet Altay <[email protected]> wrote:
>
>> On Wed, Jan 15, 2020 at 8:12 AM Kamil Wasilewski <[email protected]> wrote:
>>
>>> Based on your feedback, I think it'd be fine to deal with the problem as
>>> follows:
>>> * for Python: put the transforms into `sdks/python/apache_beam/io/gcp/ai`
>>> * for Java: create a `google-cloud-platform-ai` module in
>>>   `sdks/java/extensions` folder
>>>
>>> As for cross language, we expect those transforms to be quite simple, so
>>> the cost of implementing them twice is not that high.
>>>
>>
>> One option would be to implement inference in a library like tfx_bsl [1].
>> It comes with a generalized Beam transform that can do inference either
>> from a saved model file or by using a service endpoint. The service
>> endpoint API option is there and could support cloud AI APIs. If we utilize
>> tfx_bsl, we will leverage the existing TFX integration and would avoid
>> creating a parallel set of transforms. Then for Java, we could enable the
>> same interface with cross language transform and offer a unified inference
>> API for both languages.
>>
>> [1]
>> https://github.com/tensorflow/tfx-bsl/blob/a9f5b6128309595570cc6212f8076e7a20063ac2/tfx_bsl/beam/run_inference.py#L78
>>
>>>
>>> Thanks for your input,
>>> Kamil
>>>
>>> On Wed, Jan 15, 2020 at 7:58 AM Alex Van Boxel <[email protected]> wrote:
>>>
>>>> If it's in Java also be careful to align with the current google cloud
>>>> IO's, certainly its dependencies. The google IO's are not depending on the
>>>> newest client libraries and that's something we're sometimes struggling
>>>> with when we depend on our own client libraries. So make sure to align
>>>> them.
>>>>
>>>> Also note that although gRPC is vendored, the google IO's do still have
>>>> their own dependency on gRPC and this is the biggest reason for trouble.
>>>>
>>>> _/
>>>> _/ Alex Van Boxel
>>>>
>>>>
>>>> On Wed, Jan 15, 2020 at 1:18 AM Luke Cwik <[email protected]> wrote:
>>>>
>>>>> It depends on what language the client libraries are exposed in. For
>>>>> example, if the client libraries are in Java, sdks/java/extensions makes
>>>>> sense while if it's Python then integrating it within the gcp extension
>>>>> within sdks/python/apache_beam makes sense.
>>>>>
>>>>> Adding additional dependencies is ok depending on the licensing and
>>>>> the process is slightly different for each language.
>>>>>
>>>>> For transforms that are complicated, there is a cross language effort
>>>>> going on so that one can execute one language's transforms within another
>>>>> language's pipeline which may remove the need to write the transforms more
>>>>> than once.
>>>>>
>>>>> On Tue, Jan 14, 2020 at 7:43 AM Ismaël Mejía <[email protected]> wrote:
>>>>>
>>>>>> Nice idea, IO looks like a good place for them but there is another
>>>>>> path that could fit this case: `sdks/java/extensions`, some module like
>>>>>> `google-cloud-platform-ai` in that folder or something like that, no?
>>>>>>
>>>>>> In any case great initiative. +1
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 14, 2020 at 4:22 PM Kamil Wasilewski <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We’d like to implement a set of PTransforms that would allow users
>>>>>>> to use some of the Google Cloud AI services in Beam pipelines.
>>>>>>>
>>>>>>> Here's the full list of services and functionalities we’d like to
>>>>>>> integrate Beam with:
>>>>>>>
>>>>>>> * Video Intelligence [1]
>>>>>>> * Cloud Natural Language [2]
>>>>>>> * Cloud AI Platform Prediction [3]
>>>>>>> * Data Masking/Tokenization [4]
>>>>>>> * Inspecting image data for sensitive information using Cloud Vision [5]
>>>>>>>
>>>>>>> However, we're not sure whether to put those transforms directly
>>>>>>> into Beam, because they would require some additional GCP dependencies.
>>>>>>> One of our ideas is a separate library, that depends on Beam and that
>>>>>>> can be installed optionally, stored somewhere in the beam repository
>>>>>>> (e.g. in the BEAM_ROOT/extras directory). Do you think it is a
>>>>>>> reasonable approach? Or maybe it is totally fine to put them into SDKs,
>>>>>>> just like other IOs?
>>>>>>>
>>>>>>> If you have any other thoughts, do not hesitate to let us know.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Kamil
>>>>>>>
>>>>>>> [1] https://cloud.google.com/video-intelligence/
>>>>>>> [2] https://cloud.google.com/natural-language/
>>>>>>> [3] https://cloud.google.com/ml-engine/docs/prediction-overview
>>>>>>> [4] https://cloud.google.com/dataflow/docs/guides/templates/provided-streaming#dlptexttobigquerystreaming
>>>>>>> [5] https://cloud.google.com/vision/
>>>>>>>

--
Michał Walenia
Polidea <https://www.polidea.com/> | Software Engineer
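
On the optional-dependency question from the original proposal (transforms that need extra GCP client libraries which not every user will have installed): one common Python pattern is to guard the import and fail only when the transform is actually constructed. The sketch below is a minimal, standard-library-only illustration of that idea, not Beam's actual code. `optional_import`, `OptionalClientTransform`, and the module names passed to them are hypothetical names chosen for this example.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


class OptionalClientTransform:
    """Hypothetical stand-in for a transform that wraps a client library."""

    def __init__(self, client_module_name):
        # Defining this class never touches the optional dependency;
        # the check runs only when a user constructs the transform.
        self._client = optional_import(client_module_name)
        if self._client is None:
            raise ImportError(
                f"{client_module_name} is not installed; install the "
                "optional extra that provides it to use this transform.")
```

With this shape, importing the module that defines the transform succeeds without the extra dependencies, and only users who construct the transform see the error.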
