tvalentyn opened a new issue, #37854:
URL: https://github.com/apache/beam/issues/37854

   ### What needs to happen?
   
   We recently started to allow newer versions of gRPC: 
https://github.com/apache/beam/pull/37817.
   
   I wanted to regenerate containers to start using newer versions of gRPC and 
clean up some tech debt we had in our code because the grpc version was 
restricted; notably, envoy-data-plane was pinned to a very old version.
   
   I ran into dependency hell in some of our test suites. The crux of it seems 
to be:
   
   - our ml_test dependency requires tensorflow-transform
   - tensorflow-transform doesn't support protobuf 6
   - The Beam `tft` extra requires an even older tensorflow-transform that 
doesn't support protobuf 5. Our ml_test suites (e.g. precommit XLang YAML) end 
up using protobuf 3 due to various dependency constraints.
   - grpcio-status==1.78.0 requires protobuf 6, and protobuf 5 will reach EOL 
at the end of March '26, within weeks.
   - If we build Beam gRPC stubs with protobuf 5+, these stubs no longer work 
with protobuf 3: 
https://github.com/apache/beam/pull/37822/changes/531176cbbfe680e688567949f34ebc320a17074e#r2932955415.
 In other words, `pip install apache-beam[tft]` will not work.
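
To make the conflict above concrete, here is a minimal sketch that models each requirement as the set of protobuf major versions it accepts and intersects them. The names `CONSTRAINTS`, `compatible_majors`, and `conflicting_pairs` are hypothetical, and the accepted-version sets are a simplification of the bullets above, not values read from the packages' real metadata.

```python
# Hypothetical, simplified model of the constraints listed in this issue.
# Each entry maps a requirement to the protobuf major versions it is
# ASSUMED to accept; the ranges are illustrative only.
CANDIDATE_MAJORS = {3, 4, 5, 6}

CONSTRAINTS = {
    "tensorflow-transform (latest)": {3, 4, 5},       # no protobuf 6 support
    "tensorflow-transform (beam[tft] pin)": {3, 4},   # no protobuf 5 support
    "grpcio-status==1.78.0": {6},                     # requires protobuf 6
}


def compatible_majors(constraints, candidates=CANDIDATE_MAJORS):
    """Return the protobuf majors accepted by every constraint."""
    result = set(candidates)
    for accepted in constraints.values():
        result &= accepted
    return result


def conflicting_pairs(constraints):
    """Return pairs of requirements that share no protobuf major."""
    names = sorted(constraints)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if not (constraints[a] & constraints[b])
    ]


if __name__ == "__main__":
    print("compatible majors:", compatible_majors(CONSTRAINTS))
    for a, b in conflicting_pairs(CONSTRAINTS):
        print("conflict:", a, "<->", b)
```

Under these assumed sets, no protobuf major satisfies all three requirements at once, and both conflicts involve grpcio-status, which is consistent with the need to move to a newer TFT.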
   
   We need to upgrade to newer versions of TFT, which is currently blocked by 
https://github.com/tensorflow/transform/issues/347 .
   
   ### Issue Priority
   
   Priority: 2 (default / most normal work should be filed as P2)
   

