lazarillo commented on issue #32781: URL: https://github.com/apache/beam/issues/32781#issuecomment-2422409550
I will think more about my biggest pain points going cross-language, and provide a specific example when I remember or experience one. But for now, I'll give one pain point: I am _really_ bad at Java.

I am a machine learning / data engineer. If I could, I'd work with modern languages like Go or Rust. But I need to use a language that is well supported for my career. So that means either Java, Python, or Scala. (Less so Scala these days, but it was quite big when Spark initially started.)

I chose Python because it's easiest to work with, but also because I _vehemently_ believe that code should be easy to read, and that object-oriented design is a choice that is _only sometimes_ the right choice. So Python seemed better than Java. Lastly, the JVM was far more useful before Docker came onto the scene. So even that perk lost its luster.

I rely heavily on reading the code I find when documentation is insufficient. I can read just about any Python code, no matter how complex, and understand what it is doing and how I can interact with it (and whether I should or not). But with Java (or Go), I am too ignorant of the language: I need good docs in order to know what to do and how to interact with the SDK.

I think the Apache Beam docs are _great_. But Beam is such a complex beast that I often learn what I really need to know by going straight to the source code (which is linked to throughout the documentation, which is awesome).

Because of all this, I try to stay with pure Python code as much as possible. It gives me more confidence when creating my pipeline. Even a "simple" pipeline is quite complex, in my experience.
