There's some work needed to make the Java connector available as a
cross-language transform for Python. More specifically,

(1) Add a Java builder and registrar to register Java transforms with the
expansion service (see [1] and [2] for Kafka)
(2) Add a Python wrapper (see [3] for Kafka)
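
As a rough illustration of step (2), here is a minimal sketch of what such a Python wrapper could look like, loosely modeled on the Kafka wrapper in [3]. The URN, the class and function names, and every configuration field below are hypothetical placeholders, not the actual SnowflakeIO API; the URN would have to match whatever the Java registrar from step (1) registers. Only ExternalTransform and NamedTupleBasedPayloadBuilder are existing Beam classes.

```python
import typing

# Hypothetical URN: it must match the one registered on the Java side.
SNOWFLAKE_READ_URN = "beam:external:java:snowflake:read:v1"


class ReadFromSnowflakeSchema(typing.NamedTuple):
    """Hypothetical configuration sent to the Java builder."""
    server_name: str
    database: str
    schema: str
    table: typing.Optional[str] = None
    query: typing.Optional[str] = None


def build_read_config(server_name, database, schema, table=None, query=None):
    """Validates the arguments and returns the configuration tuple."""
    # Assumed rule for this sketch: read from a table or a query, not both.
    if (table is None) == (query is None):
        raise ValueError("Specify exactly one of 'table' or 'query'.")
    return ReadFromSnowflakeSchema(server_name, database, schema, table, query)


def read_from_snowflake(expansion_service, **config):
    """Returns a cross-language transform backed by the Java connector."""
    # Imported lazily so the configuration helpers above can be used
    # (and tested) without Apache Beam installed.
    from apache_beam.transforms.external import ExternalTransform
    from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder

    return ExternalTransform(
        SNOWFLAKE_READ_URN,
        NamedTupleBasedPayloadBuilder(build_read_config(**config)),
        expansion_service)
```

A pipeline would then apply read_from_snowflake(...) like any other PTransform, passing the address of a running expansion service.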

Thanks,
Cham

[1]
https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L396
[2]
https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1429
[3]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/kafka.py

On Wed, Jun 17, 2020 at 8:57 AM Shashanka Balakuntala <
[email protected]> wrote:

> Hi All,
> With regard to this discussion, I created a JIRA issue [1]. Now that
> there is talk here of a cross-language connector, should I just close the
> issue with a link to the Java Snowflake connector, or does anyone think
> writing a Python-based connector has some advantage in terms of performance
> or usability? Please let me know what you think, so that I can take the
> necessary steps on this.
>
> [1] - https://issues.apache.org/jira/browse/BEAM-9466
>
> *Regards*
>   Shashanka Balakuntala Srinivasa
>
>
>
> On Wed, Mar 11, 2020 at 2:25 AM Chamikara Jayalath <[email protected]>
> wrote:
>
>>
>>
>> On Tue, Mar 10, 2020 at 1:18 PM Tyler Akidau <[email protected]> wrote:
>>
>>> On Tue, Mar 10, 2020 at 1:27 AM Elias Djurfeldt <
>>> [email protected]> wrote:
>>>
>>>> From what I can tell, the only difference is that the Python connector
>>>> is a pure Python implementation and doesn't rely on ODBC or JDBC (it's just
>>>> pip-installable), whereas the Java version needs JDBC.
>>>>
>>>
>>> Correct me if I'm wrong, but this sounds like a concern around having to
>>> install Java dependencies for the cross-language transform. If so, I think
>>> the question is: how frictionless can we make the user experience here? If
>>> it can be relatively straightforward, even for a Python user with zero Java
>>> familiarity, it's going to be a win from a maintainability perspective to
>>> only have one implementation (Java, in this case) to keep up to date, as
>>> Cham pointed out. Kasia, do you have a sense yet for what the experience
>>> for a Python user would be for using the Python-wrapped Java SnowflakeIO
>>> connector?
>>>
>>
>> There are many aspects related to the usability of cross-language
>> transforms that are currently being worked on. We are making some of these
>> usability improvements to cross-language Kafka, but the end goal is to make
>> using cross-language transforms as seamless as possible for end users. For
>> example,
>> (1) Expansion service can be started up automatically if users have Java
>> installed on their system.
>> (2) Native language wrappers can be aware of the immediate dependencies
>> needed for the expansion service.
>> (3) Additional dependencies can be obtained as a part of the new
>> environment
>> <https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L1280>
>> received through the cross-language transform expansion protocol.
>>
>> Also, we need to add better support for converting arbitrary Java types to
>> arbitrary Python types using the Row coder (
>> https://issues.apache.org/jira/browse/BEAM-8732).
>>
>> So hopefully, the user experience of using cross-language Java transforms
>> from Python can be as seamless as "just install JRE and use the transforms
>> in Python xyz_io.py".
>>
>> There might be additional Snowflake specific considerations I'm not aware
>> of.
>>
>> Thanks,
>> Cham
>>
>>
>>>
>>> -Tyler
>>>
>>>
>>>>
>>>> I don't know enough about the Java side of Beam (or Java in general
>>>> really) to say if that's an issue or not though :)
>>>>
>>>> Cheers,
>>>>
>>>> On Mon, 9 Mar 2020 at 18:06, Chamikara Jayalath <[email protected]>
>>>> wrote:
>>>>
>>>>> Thank you. Elias and Shashanka, do you think the Python connector (and
>>>>> API) can offer some additional benefits that a Java cross-language
>>>>> <https://beam.apache.org/roadmap/connectors-multi-sdk/> connector
>>>>> cannot? It's fine to develop Java and Python versions if it makes sense,
>>>>> but if the cross-language Java version offers the same benefits as Python,
>>>>> having just one implementation will reduce the maintenance burden.
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>
>>>>> On Mon, Mar 9, 2020 at 5:41 AM Katarzyna Kucharczyk <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> My colleague Dariusz and I are currently working on a Java
>>>>>> connector, and we are planning to use cross-language to add Python
>>>>>> support as well.
>>>>>> The proposal should arrive on the dev list in the near future.
>>>>>> We would also be happy to help with your current work if needed.
>>>>>>
>>>>>> Cheers,
>>>>>> Kasia
>>>>>>
>>>>>> On Mon, Mar 9, 2020 at 9:41 AM Elias Djurfeldt <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Cool Shashanka! Feel free to tag me in the JIRA and update me on any
>>>>>>> progress / ponderings.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Elias
>>>>>>>
>>>>>>> On Sat, 7 Mar 2020 at 03:43, Chamikara Jayalath <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Absolutely. Please create a JIRA and coordinate with Elias and any
>>>>>>>> others that would like to contribute to this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Cham
>>>>>>>>
>>>>>>>> On Fri, Mar 6, 2020 at 10:46 AM Shashanka Balakuntala <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Chamikara and Elias,
>>>>>>>>> This seems like an interesting feature. Can I start working on
>>>>>>>>> this?
>>>>>>>>> *Regards*
>>>>>>>>>   Shashanka Balakuntala Srinivasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Mar 7, 2020 at 12:00 AM Chamikara Jayalath <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I don't think we have this but contributions are welcome.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Cham
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 3, 2020 at 4:46 AM Elias Djurfeldt <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I've stumbled upon a use case where I might need a SnowflakeIO
>>>>>>>>>>> in Python. Has anyone worked on this before, or are there any
>>>>>>>>>>> discussions surrounding it?
>>>>>>>>>>>
>>>>>>>>>>> There is a Snowflake Python library available [1], so it looks
>>>>>>>>>>> feasible to implement in Beam.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://docs.snowflake.net/manuals/user-guide/python-connector.html
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Elias
>>>>>>>>>>>
>>>>>>>>>>
