Hi Dongjoon,

*To be clear, is the proposal aiming to have us say A instead of B in our documentation?*
*A. Since `Spark Connect` mode has no RDD API, we need to use `Spark Classic` mode instead.*
*B. Since `Spark Connect` mode has no RDD API, we need to use `Spark without Spark Connect` mode instead.*

Correct, the thread is recommending option A, used consistently throughout the documentation.

-Sadha

On Mon, Jul 22, 2024, 10:25 AM Dongjoon Hyun <dongj...@apache.org> wrote:

> Thank you for opening this thread, Hyukjin.
>
> In this discussion thread, we have three terminologies, (1) ~ (3).
>
> > Spark Classic (vs. Spark Connect)
>
> 1. Spark
> 2. Spark Classic (= a proposal for Spark without Spark Connect)
> 3. Spark Connect
>
> As Holden and Jungtaek mentioned,
>
> - (1) is definitely the existing code base, which includes everything
>   (RDD API, Spark Thrift Server, Spark Connect, and so on).
>
> - (3) is a very specific use case: a Spark binary distribution used with
>   the `--remote` option (or with the related features enabled). Like
>   Spark Thrift Server, after the query planning steps there is no
>   fundamental difference on the execution side in Spark clusters or
>   Spark jobs.
>
> - (2) By the proposed definition, (2) `Spark Classic` is not (1) `Spark`.
>   Like `--remote`, it is one of the runnable modes.
>
> To be clear, is the proposal aiming to have us say A instead of B in our
> documentation?
>
> A. Since `Spark Connect` mode has no RDD API, we need to use
>    `Spark Classic` mode instead.
> B. Since `Spark Connect` mode has no RDD API, we need to use
>    `Spark without Spark Connect` mode instead.
>
> Dongjoon.
>
>
> On 2024/07/22 12:59:54 Sadha Chilukoori wrote:
> > +1 (non-binding) for Classic.
> >
> > On Mon, Jul 22, 2024 at 3:59 AM Martin Grund
> > <mar...@databricks.com.invalid> wrote:
> >
> > > +1 for Classic. It's simple, easy to understand, and it doesn't carry
> > > the negative connotations of, for example, "legacy".
> > >
> > > On Sun, Jul 21, 2024 at 23:48 Wenchen Fan <cloud0...@gmail.com> wrote:
> > >
> > >> Classic SGTM.
> > >>
> > >> On Mon, Jul 22, 2024 at 1:12 PM Jungtaek Lim <
> > >> kabhwan.opensou...@gmail.com> wrote:
> > >>
> > >>> I'd propose not to change the name of "Spark Connect" - the name
> > >>> represents the characteristic of the mode (the separation of the
> > >>> client and server layers). Trying to remove the "Connect" part would
> > >>> just cause confusion.
> > >>>
> > >>> +1 for Classic for the existing mode, until someone comes up with a
> > >>> better alternative.
> > >>>
> > >>> On Mon, Jul 22, 2024 at 8:50 AM Hyukjin Kwon <gurwls...@apache.org>
> > >>> wrote:
> > >>>
> > >>>> I was thinking about a similar option too, but I ended up giving it
> > >>>> up. It's quite unlikely at this moment, but suppose we add another
> > >>>> Spark Connect-ish component in the far future; it would be
> > >>>> challenging to come up with yet another name. Another case is that
> > >>>> we might have to cope with wording like Spark Connect vs. Spark
> > >>>> (with Spark Connect) and Spark (without Spark Connect).
> > >>>>
> > >>>> On Sun, 21 Jul 2024 at 09:59, Holden Karau <holden.ka...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> I think perhaps Spark Connect could be phrased as "Basic* Spark" &
> > >>>>> existing Spark could be "Full Spark", given the API limitations of
> > >>>>> Spark Connect.
> > >>>>>
> > >>>>> *I was also thinking "Core" here, but we've used "core" to refer
> > >>>>> to the RDD APIs for too long to reuse it here.
> > >>>>>
> > >>>>> Twitter: https://twitter.com/holdenkarau
> > >>>>> Books (Learning Spark, High Performance Spark, etc.):
> > >>>>> https://amzn.to/2MaRAG9
> > >>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
> > >>>>>
> > >>>>> On Sat, Jul 20, 2024 at 8:02 PM Xiao Li <gatorsm...@gmail.com> wrote:
> > >>>>>
> > >>>>>> Classic is much better than Legacy. : )
> > >>>>>>
> > >>>>>> Hyukjin Kwon <gurwls...@apache.org> wrote on Thu, Jul 18, 2024 at 16:58:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I noticed that we need to standardize our terminology before
> > >>>>>>> moving forward. For instance, when documenting, 'Spark without
> > >>>>>>> Spark Connect' is too long and verbose. Additionally, I've
> > >>>>>>> observed that we use various names for Spark without Spark
> > >>>>>>> Connect: Spark Classic, Classic Spark, Legacy Spark, etc.
> > >>>>>>>
> > >>>>>>> I propose that we consistently refer to it as Spark Classic
> > >>>>>>> (vs. Spark Connect).
> > >>>>>>>
> > >>>>>>> Please share your thoughts on this. Thanks!
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
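For concreteness, the distinction the thread is naming can be shown with a minimal PySpark sketch. This is only an illustration, not part of the thread: it assumes PySpark 3.4+ with the Spark Connect client dependencies installed (e.g. `pip install "pyspark[connect]"`) and a Spark Connect server already running at the assumed default address `sc://localhost:15002` (for example started with `sbin/start-connect-server.sh`).

```python
from pyspark.sql import SparkSession

# "Spark Classic" mode: the session runs in-process and the full API
# surface, including the RDD API, is available.
classic = SparkSession.builder.master("local[*]").getOrCreate()
df = classic.range(5)
print(df.rdd.map(lambda r: r.id * 2).collect())   # RDD API works here
classic.stop()

# "Spark Connect" mode: the session is a thin client that talks to a
# Spark Connect server over gRPC (assumed to be at sc://localhost:15002).
# Run this part in a separate process so the two modes are not mixed
# inside one interpreter.
connect = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
df = connect.range(5)
print(df.select((df.id * 2).alias("doubled")).collect())  # DataFrame API works
# Accessing df.rdd here would raise an error, because Spark Connect mode
# has no RDD API.
connect.stop()
```

With wording A, documentation describing the first half of this sketch can simply say "`Spark Classic` mode" instead of spelling out "Spark without Spark Connect" each time.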