aromanenko-dev commented on PR #22446: URL: https://github.com/apache/beam/pull/22446#issuecomment-1246923780
@mosche I totally agree that the current name is not very practical: it is quite long and, even worse, very confusing, since it contains the word `Streaming` but this runner doesn't support streaming mode at all (we know the reasons, but it is what it is). So it would be better to rename it, though I'm not sure about `SparkSqlRunner` as the new name. IMHO, it may also be confusing and create the false expectation that it supports only Spark (or Beam?) SQL pipelines. I'd suggest the name `SparkDatasetRunner`, since it's based on the Spark Dataset API. This name is quite short and gives a basic idea of what to expect from this runner. The old runner could be called `SparkRDDRunner`, but let's keep it as it is: just `SparkRunner`.

On the other hand, this renaming will require many incompatible changes, starting with new package and artifact names. However, I'm pretty sure that most users who run Beam pipelines on Spark still use the old, classic Spark(RDD)Runner. We can check on user@ and Twitter, if needed.
