Hi All,

I have recently been reading about Apache Beam and am interested in exploring how it fits into our stack.
We currently run Hive and Spark pipelines. We have issues with late-arriving data and have to reprocess a couple of steps to make sure the data is consumed. A couple of questions on top of my mind:

1. Does Beam use the existing cluster, or does it need its own cluster?
2. How does Beam fit with the existing Hive and Spark jobs? What changes might be required in those jobs to get started with Beam?

Best,
Yash
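P.S. For concreteness, here is a toy plain-Python sketch of the late-data situation I mean. This is not Beam API, just an illustration of the event-time windowing model Beam formalizes; the window size, allowed-lateness bound, and the "watermark = max event time seen" rule are all made up for the example:

```python
from collections import defaultdict

WINDOW = 60             # fixed window size in seconds (illustrative)
ALLOWED_LATENESS = 120  # how long past the window end we still accept data

def window_of(ts):
    """Assign an event timestamp to the start of its fixed window."""
    return ts - ts % WINDOW

def process(events):
    """events: (event_time, key, value) tuples in arrival order.

    Maintains per-window sums ("panes"). Late data that arrives within
    the allowed lateness updates its pane in place, so no manual
    reprocessing step is needed; data later than that is dropped.
    """
    panes = defaultdict(int)
    watermark = 0
    dropped = []
    for ts, key, val in events:
        watermark = max(watermark, ts)  # toy watermark: max event time seen
        w = window_of(ts)
        if w + WINDOW + ALLOWED_LATENESS < watermark:
            dropped.append((ts, key, val))  # beyond allowed lateness: discard
        else:
            panes[(key, w)] += val          # on-time or tolerably late: refine pane
    return dict(panes), dropped

events = [
    (5, 'clicks', 1), (30, 'clicks', 1),  # window [0, 60)
    (70, 'clicks', 1),                    # window [60, 120)
    (10, 'clicks', 1),                    # late for [0, 60) but within lateness
    (400, 'clicks', 1),                   # advances the watermark to 400
    (20, 'clicks', 1),                    # now beyond allowed lateness: dropped
]
panes, dropped = process(events)
print(panes)    # {('clicks', 0): 3, ('clicks', 60): 1, ('clicks', 360): 1}
print(dropped)  # [(20, 'clicks', 1)]
```

In Beam terms, the equivalent knobs would be the windowing strategy, allowed lateness, and triggers on a PCollection; today we approximate this by rerunning Hive/Spark steps when late data shows up.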
