Hello,
 my 2 cents (and not sure if it makes sense for your use case):
What about having the Python process read from BigTable and store the data
in a bucket as CSV? Then you could read the CSV from Java?
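
A minimal sketch of the CSV step, in case it helps. It assumes the rows have
already been read from BigTable into plain Python tuples of
(row_key, {"family:qualifier": value}); that shape, the name rows_to_csv and
the sample data are made up for illustration. Reading from BigTable and
uploading the resulting text to the bucket are left out.

```python
import csv
import io
from typing import Dict, Iterable, Tuple

# Hypothetical in-memory shape for a row already read from BigTable:
# (row_key, {"family:qualifier": value}). Not a real client type.
Row = Tuple[str, Dict[str, str]]


def rows_to_csv(rows: Iterable[Row]) -> str:
    """Serialize rows to CSV text: a row_key column plus sorted data columns."""
    rows = list(rows)
    # Rows can be sparse, so take the union of all column names seen.
    columns = sorted({name for _, cells in rows for name in cells})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["row_key", *columns],
        restval="",  # columns missing from a row become empty fields
        lineterminator="\n",
    )
    writer.writeheader()
    for row_key, cells in rows:
        writer.writerow({"row_key": row_key, **cells})
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        ("user#1", {"stats:clicks": "3", "stats:views": "10"}),
        ("user#2", {"stats:views": "7"}),
    ]
    print(rows_to_csv(sample))
```

Writing an explicit header and filling missing columns with empty fields keeps
every line the same width, which should make the file easier to parse on the
Java side.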

hth
 marco

On Fri, Jan 7, 2022 at 7:31 AM Chamikara Jayalath <[email protected]>
wrote:

> Irrespective of whether the Java transform is defined by a user or
> available in Beam Java SDK, the APIs for using such a transform from Python
> are the same.
> In other words, there's no special support for using arbitrary Java
> transforms in Beam from Python pipelines. We have to use the API mentioned
> in the documentation I linked above to use Java transforms from Python in
> either case.
>
> To set expectations correctly, using a complex Java IO connector transform
> such as BigTableIO.Read from Python can be a bit involved. For example,
> (1) We have to make sure that options needed to instantiate the transform
> (for example, BigTableOptions) can be correctly instantiated on the Python
> side.
> (2) Seems like Bigtable read transform currently has output type
> "com.google.bigtable.v2.Row". This has to be mapped to a cross-language
> compatible type so that Python can understand it (for example, Beam Rows).
>
> Thanks,
> Cham
>
> On Thu, Jan 6, 2022 at 10:32 PM Sayak Paul <[email protected]> wrote:
>
>> My question still remains the same. I am not yet sure how to use an existing
>> Java transform (like BigTable IO reader in Java) from a Python pipeline.
>> The examples take a user-defined sample transform and then show their
>> usage.
>>
>> On Fri, 7 Jan, 2022, 11:10 Chamikara Jayalath, <[email protected]>
>> wrote:
>>
>>> Actually this is the correct link for multi-language Python
>>> documentation:
>>> https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines
>>> We also have a quickstart guide which might be a better starting point:
>>> https://beam.apache.org/documentation/sdks/python-multi-language-pipelines/
>>>
>>> We haven't looked into developing a cross-language wrapper for the Java
>>> BigTable connector yet. I created
>>> https://issues.apache.org/jira/browse/BEAM-13607 for tracking this.
>>> It would be great if you could contribute to this.
>>>
>>> Thanks,
>>> Cham
>>>
>>>
>>> On Thu, Jan 6, 2022 at 8:35 PM Sayak Paul <[email protected]> wrote:
>>>
>>>> Luke, I studied the resources you provided. However, it's still a
>>>> little unclear to me how I could use the BigTableIO
>>>> <https://beam.apache.org/releases/javadoc/2.1.0/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.html>
>>>> in Java from a Python pipeline. The examples and documentation first
>>>> implement a demo class in Java and then show how to use it.
>>>>
>>>> I was wondering if there is a guide on using the existing connectors
>>>> (i.e., without defining them first) from Python pipelines. I am probably
>>>> mistaken somewhere, so I am happy to be corrected if that's the case.
>>>>
>>>> Sayak Paul | sayak.dev
>>>>
>>>>
>>>>
>>>> On Thu, Jan 6, 2022 at 10:35 PM Sayak Paul <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks!
>>>>>
>>>>> On Thu, 6 Jan, 2022, 22:27 Luke Cwik, <[email protected]> wrote:
>>>>>
>>>>>> +1 on using cross language to get the Java Bigtable connector that
>>>>>> already exists.
>>>>>>
>>>>>> You could also take a look at this other xlang documentation[1] and
>>>>>> look at an existing implementation such as kafka[2] that is xlang.
>>>>>>
>>>>>> Finally there was support added to use many transforms in Java using
>>>>>> the class name and builder methods[3].
>>>>>>
>>>>>> 1: https://beam.apache.org/documentation/patterns/cross-language/
>>>>>> 2: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/kafka.py
>>>>>> 3: https://issues.apache.org/jira/browse/BEAM-12769
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 6, 2022 at 4:41 AM Sayak Paul <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> My project needs to read data from Cloud BigTable. We are aware that
>>>>>>> an IO connector for BigTable is available in the Java SDK. So we could
>>>>>>> probably make use of the cross-language capabilities
>>>>>>> <https://beam.apache.org/documentation/programming-guide/#1311-creating-cross-language-java-transforms>
>>>>>>> of Beam and make it work. I am, however, looking for
>>>>>>> guidance/resources/pointers that could be beneficial to build a Beam
>>>>>>> pipeline in Python that reads data from Cloud BigTable. Any relevant
>>>>>>> clue would be greatly appreciated.
>>>>>>>
>>>>>>> Sayak Paul | sayak.dev
>>>>>>>
>>>>>>>
