I'm concerned that the term "Classic" carries a negative connotation.
On Mon, Jul 22, 2024 at 5:11 PM Hyukjin Kwon <gurwls...@apache.org> wrote:

> Yeah, that's what I intended. Thanks for the clarification.
>
> Let me start the vote.

On Tue, 23 Jul 2024 at 08:14, Sadha Chilukoori <sage.quoti...@gmail.com> wrote:

> Hi Dongjoon,
>
> *To be clear, is the proposal aiming to make us say A instead of B in our documentation?*
>
> *A. Since `Spark Connect` mode has no RDD API, we need to use `Spark Classic` mode instead.*
> *B. Since `Spark Connect` mode has no RDD API, we need to use `Spark without Spark Connect` mode instead.*
>
> Correct, the thread is recommending option A, consistently across all the documentation.
>
> -Sadha

On Mon, Jul 22, 2024, 10:25 AM Dongjoon Hyun <dongj...@apache.org> wrote:

> Thank you for opening this thread, Hyukjin.
>
> In this discussion thread, we have three terminologies, (1) ~ (3).
>
> > Spark Classic (vs. Spark Connect)
>
> 1. Spark
> 2. Spark Classic (= a proposal for Spark without Spark Connect)
> 3. Spark Connect
>
> As Holden and Jungtaek mentioned,
>
> - (1) is definitely the existing code base, which includes everything (the RDD API, Spark Thrift Server, Spark Connect, and so on).
>
> - (3) is a very specific use case where a user runs a Spark binary distribution with the `--remote` option (or with the related features enabled). Like Spark Thrift Server, after the query planning steps there is no fundamental difference on the execution side in Spark clusters or Spark jobs.
>
> - (2) By the proposed definition, `Spark Classic` is not (1) `Spark`. Like `--remote`, it's one of the runnable modes.
>
> To be clear, is the proposal aiming to make us say A instead of B in our documentation?
>
> A. Since `Spark Connect` mode has no RDD API, we need to use `Spark Classic` mode instead.
> B. Since `Spark Connect` mode has no RDD API, we need to use `Spark without Spark Connect` mode instead.
>
> Dongjoon.

On 2024/07/22 12:59:54 Sadha Chilukoori wrote:

> +1 (non-binding) for classic.

On Mon, Jul 22, 2024 at 3:59 AM Martin Grund <mar...@databricks.com.invalid> wrote:

> +1 for classic. It's simple, easy to understand, and it doesn't carry negative meanings the way "legacy" does, for example.

On Sun, Jul 21, 2024 at 23:48 Wenchen Fan <cloud0...@gmail.com> wrote:

> Classic SGTM.

On Mon, Jul 22, 2024 at 1:12 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

> I'd propose not to change the name of "Spark Connect" - the name represents the characteristic of the mode (the separation of the client and server layers). Trying to remove the "Connect" part would just cause confusion.
>
> +1 for Classic for the existing mode, until someone comes up with a better alternative.

On Mon, Jul 22, 2024 at 8:50 AM Hyukjin Kwon <gurwls...@apache.org> wrote:

> I was thinking about a similar option too, but I ended up giving it up. It's quite unlikely at this moment, but suppose we had another Spark Connect-ish component in the far future; it would be challenging to come up with yet another name. Another case is that we might have to cope with combinations like Spark Connect vs. Spark (with Spark Connect) vs. Spark (without Spark Connect).

On Sun, 21 Jul 2024 at 09:59, Holden Karau <holden.ka...@gmail.com> wrote:

> I think perhaps Spark Connect could be phrased as "Basic* Spark" and existing Spark could be "Full Spark", given the API limitations of Spark Connect.
>
> *I was also thinking "Core" here, but we've used "core" to refer to the RDD APIs for too long to reuse it here.
>
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

On Sat, Jul 20, 2024 at 8:02 PM Xiao Li <gatorsm...@gmail.com> wrote:

> Classic is much better than Legacy. : )

On Thu, Jul 18, 2024 at 16:58, Hyukjin Kwon <gurwls...@apache.org> wrote:

> Hi all,
>
> I noticed that we need to standardize our terminology before moving forward. For instance, when documenting, "Spark without Spark Connect" is too long and verbose. Additionally, I've observed that we use various names for Spark without Spark Connect: Spark Classic, Classic Spark, Legacy Spark, etc.
>
> I propose that we consistently refer to it as Spark Classic (vs. Spark Connect).
>
> Please share your thoughts on this. Thanks!