Re: [DISCUSS] Dependency management in Apache Beam Python SDK

2022-08-25 Thread Brian Hulette via dev
Thanks for writing this up, Valentyn! I'm curious, Jarek: does Airflow take any dependencies on popular libraries like pandas, numpy, pyarrow, scipy, etc., which users are likely to have their own dependency on? I think these dependencies are challenging in a different way than the client

Re: [DISCUSS] Dependency management in Apache Beam Python SDK

2022-08-25 Thread Valentyn Tymofieiev via dev
Hi Jarek, Thanks a lot for the detailed feedback and for sharing the Airflow story; this is exactly what I was hoping to hear in response from the mailing list! 600+ dependencies is very impressive, so I'd be happy to chat more and learn from your experience. On Wed, Aug 24, 2022 at 5:50 AM Jarek

Re: [ANNOUNCE] Apache Beam 2.41.0 Released

2022-08-25 Thread Pablo Estrada via dev
Thank you Kiley! On Thu, Aug 25, 2022 at 10:55 AM Kiley Sok wrote: > The Apache Beam team is pleased to announce the release of version 2.41.0. > > Apache Beam is an open source unified programming model to define and > execute data processing pipelines, including ETL, batch and stream >

[ANNOUNCE] Apache Beam 2.41.0 Released

2022-08-25 Thread Kiley Sok
The Apache Beam team is pleased to announce the release of version 2.41.0. Apache Beam is an open source unified programming model to define and execute data processing pipelines, including ETL, batch and stream (continuous) processing. See https://beam.apache.org You can download the release

Beam Dependency Check Report (2022-08-25)

2022-08-25 Thread Apache Jenkins Server

Re: SingleStore IO

2022-08-25 Thread John Casey via dev
Hi Adalbert, The nature of scheduling work with splittable DoFns is such that trying to start all splits at the same time isn't really supported. In addition, the general assumption of splitting work in Beam is that a split can be retried in isolation from other splits, which doesn't look

SingleStore IO

2022-08-25 Thread Adalbert Makarovych
Hello, I'm working on the SingleStore IO connector and would like to discuss it with Beam developers. It would be great if the connector could use SingleStore parallel read. In the

Java object serialization error, java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible

2022-08-25 Thread Elliot Metsger
Howdy folks, super-new to Beam, and attempting to get a simple example working with Go, using the portable runner and Spark. There seems to be an incompatibility between Java components, and I’m not quite sure where the disconnect is, but at the root it seems to be an incompatibility with object

Beam High Priority Issue Report (70)

2022-08-25 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/22854 [Bug]: Type