Is using Python from Java via ExternalTransform working and tested?

On Tue, Jan 21, 2020 at 6:50 AM Reza Rokni <[email protected]> wrote:

> +1 for using cross language transforms.
>
> On Thu, 16 Jan 2020 at 01:23, Ahmet Altay <[email protected]> wrote:
>
>>
>>
>> On Wed, Jan 15, 2020 at 8:12 AM Kamil Wasilewski <
>> [email protected]> wrote:
>>
>>> Based on your feedback, I think it'd be fine to deal with the problem as
>>> follows:
>>> * for Python: put the transforms into `sdks/python/apache_beam/io/gcp/ai`
>>> * for Java: create a `google-cloud-platform-ai` module in
>>> `sdks/java/extensions` folder
>>>
>>> As for cross language, we expect those transforms to be quite simple, so
>>> the cost of implementing them twice is not that high.
>>>
>>
>> One option would be to implement inference in a library like tfx_bsl [1].
>> It comes with a generalized Beam transform that can do inference either
>> from a saved model file or by using a service endpoint. The service
>> endpoint API option is there and could support cloud AI APIs. If we utilize
>> tfx_bsl, we will leverage the existing TFX integration and would avoid
>> creating a parallel set of transforms. Then for Java, we could enable the
>> same interface with cross language transform and offer a unified inference
>> API for both languages.
>>
>> [1]
>> https://github.com/tensorflow/tfx-bsl/blob/a9f5b6128309595570cc6212f8076e7a20063ac2/tfx_bsl/beam/run_inference.py#L78
>>
>>
>>
>>>
>>> Thanks for your input,
>>> Kamil
>>>
>>> On Wed, Jan 15, 2020 at 7:58 AM Alex Van Boxel <[email protected]> wrote:
>>>
>>>> If it's in Java, also be careful to align with the current Google Cloud
>>>> IOs, and certainly with their dependencies. The Google IOs do not depend on
>>>> the newest client libraries, and that's something we sometimes struggle
>>>> with when we depend on our own client libraries. So make sure to align
>>>> them.
>>>>
>>>> Also note that although gRPC is vendored, the Google IOs still have
>>>> their own dependency on gRPC, and this is the biggest source of trouble.
>>>>
>>>>  _/
>>>> _/ Alex Van Boxel
>>>>
>>>>
>>>> On Wed, Jan 15, 2020 at 1:18 AM Luke Cwik <[email protected]> wrote:
>>>>
>>>>> It depends on what language the client libraries are exposed in. For
>>>>> example, if the client libraries are in Java, sdks/java/extensions makes
>>>>> sense, while if it's Python, then integrating it within the gcp extension
>>>>> within sdks/python/apache_beam makes sense.
>>>>>
>>>>> Adding additional dependencies is ok depending on the licensing and
>>>>> the process is slightly different for each language.
>>>>>
>>>>> For transforms that are complicated, there is a cross-language effort
>>>>> going on so that one can execute one language's transforms within another
>>>>> language's pipeline, which may remove the need to write the transforms more
>>>>> than once.
>>>>>
>>>>> On Tue, Jan 14, 2020 at 7:43 AM Ismaël Mejía <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Nice idea, IO looks like a good place for them but there is another
>>>>>> path that could fit this case: `sdks/java/extensions`, some module like
>>>>>> `google-cloud-platform-ai` in that folder or something like that, no?
>>>>>>
>>>>>> In any case great initiative. +1
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 14, 2020 at 4:22 PM Kamil Wasilewski <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We’d like to implement a set of PTransforms that would allow users
>>>>>>> to use some of the Google Cloud AI services in Beam pipelines.
>>>>>>>
>>>>>>> Here's the full list of services and functionalities we’d like to
>>>>>>> integrate Beam with:
>>>>>>>
>>>>>>> * Video Intelligence [1]
>>>>>>>
>>>>>>> * Cloud Natural Language [2]
>>>>>>>
>>>>>>> * Cloud AI Platform Prediction [3]
>>>>>>>
>>>>>>> * Data Masking/Tokenization [4]
>>>>>>>
>>>>>>> * Inspecting image data for sensitive information using Cloud Vision
>>>>>>> [5]
>>>>>>>
>>>>>>> However, we're not sure whether to put those transforms directly
>>>>>>> into Beam, because they would require some additional GCP dependencies.
>>>>>>> One of our ideas is a separate library that depends on Beam, can be
>>>>>>> installed optionally, and is stored somewhere in the Beam repository
>>>>>>> (e.g. in the BEAM_ROOT/extras directory). Do you think this is a
>>>>>>> reasonable approach? Or maybe it is totally fine to put them into the
>>>>>>> SDKs, just like other IOs?
>>>>>>>
>>>>>>> If you have any other thoughts, do not hesitate to let us know.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Kamil
>>>>>>>
>>>>>>> [1] https://cloud.google.com/video-intelligence/
>>>>>>>
>>>>>>> [2] https://cloud.google.com/natural-language/
>>>>>>>
>>>>>>> [3] https://cloud.google.com/ml-engine/docs/prediction-overview
>>>>>>>
>>>>>>> [4]
>>>>>>> https://cloud.google.com/dataflow/docs/guides/templates/provided-streaming#dlptexttobigquerystreaming
>>>>>>>
>>>>>>> [5] https://cloud.google.com/vision/
>>>>>>>
>>>>>>
>


-- 

Michał Walenia
Polidea <https://www.polidea.com/> | Software Engineer

M: +48 791 432 002 <+48791432002>
E: [email protected]
