Arrow Compute/C++ Engine for Relational Operations?

On Fri, May 20, 2022 at 9:31 AM Aldrin <akmon...@ucsc.edu.invalid> wrote:
> Perhaps:
> "Acero (aˈsɛɾo): *A* *c*ompute *e*ngine for Ar*ro*w" - Most similar to
> Will's 3rd option, but enforces that it purposefully sounds like "ACE
> Arrow" or something to that effect. Then, it's also easy to use a
> shortened, canonical name--compute engine.
>
> Optionally "C++" can be inserted ("A C++ compute...")
>
>
> Aldrin Montana
> Computer Science PhD Student
> UC Santa Cruz
>
>
> On Thu, May 19, 2022 at 6:07 PM Will Jones <will.jones...@gmail.com>
> wrote:
>
> > > A relatively obscure name at least makes it easy to search for. I guess
> > > we'll want to write a blog post to help get the name into search rankings
> > > and officially 'introduce' what contributors have been up to?
> >
> > Yes. I think the name is very comparable to Gandiva in this respect.
> >
> > To Antoine’s point, we may wish to have a canonical way to refer to the
> > engine when introducing it, both to help understanding meaning and
> > pronunciation. Given the unique name, I think people shouldn't have too
> > hard a time remembering the purpose once familiar. Here's my initial
> > attempt at that (but I'm sure there are others who have a better
> > description):
> >
> > “Apache Arrow Acero (aˈsɛɾo)” - provides association with Arrow, but not
> > its purpose, so not great.
> > "The Acero (aˈsɛɾo) Query Engine" - provides meaning, but not association
> > with Arrow
> > "Acero (aˈsɛɾo): A C++ Arrow-based modular query engine" - a tag line
> > provides opportunity for mentioning Arrow and purpose. This seems to be
> > what Gandiva went for in the original blog post [1].
> >
> > [1] https://arrow.apache.org/blog/2018/12/05/gandiva-donation/
> >
> >
> > On Thu, May 19, 2022 at 16:37 David Li <lidav...@apache.org> wrote:
> >
> > > I like Acero. A relatively obscure name at least makes it easy to search
> > > for. I guess we'll want to write a blog post to help get the name into
> > > search rankings and officially 'introduce' what contributors have been up
> > > to?
> > >
> > > We could also come up with a backronym if we really want justification for
> > > the name.
> > >
> > > On Thu, May 19, 2022, at 19:29, Sutou Kouhei wrote:
> > > > I'm OK with "Acero".
> > > >
> > > > In <CAJPUwMDHAv=qqkjjwhmifkddwlt4i59v9heqtxuhggdnseu...@mail.gmail.com>
> > > > "Re: [DISCUSS] "Naming" the Arrow C++ execution engine subproject?"
> > > > on Thu, 19 May 2022 12:02:25 -0700,
> > > > Wes McKinney <wesmck...@gmail.com> wrote:
> > > >
> > > >> Any more thoughts about names? How should we decide? The “Acero” name
> > > >> seems like it does not generate any obvious conflicts.
> > > >>
> > > >> On Tue, May 10, 2022 at 12:14 PM Andy Grove <andygrov...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> I like Acero too. I like it because (as a non-Spanish speaker, at
> > > >>> least) it has no obvious meaning or connotation and once the
> > > >>> community starts to use this name for the project, that is the
> > > >>> meaning that it will come to have. Just like Gandiva (a word I was
> > > >>> not familiar with when I learned about the project). I do strongly
> > > >>> prefer names like this over acronyms because it is easier for the
> > > >>> meaning to change over time as well.
> > > >>>
> > > >>> On Tue, May 10, 2022 at 12:50 PM Eduardo Ponce <edponc...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>> > As a Spanish speaking person, I cannot think of a misleading or bad
> > > >>> > connotation for the word "acero". The word is generally used to
> > > >>> > refer to either steel materials (actual definition) or as a
> > > >>> > simile/metaphor comparing to something very strong. We can view
> > > >>> > this as a self-laud on the robust and powerful functionality of
> > > >>> > the Arrow C++ compute engine.
> > > >>> > In terms of rhyming "acero" and Arrow, it depends on your accent.
> > > >>> > For example, I do not consider them to rhyme.
> > > >>> > Also, I do not think we need to treat it as an acronym, it can
> > > >>> > simply be a name.
> > > >>> >
> > > >>> > ~Eduardo
> > > >>> >
> > > >>> > On Tue, May 10, 2022 at 2:29 PM Will Jones <will.jones...@gmail.com>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > "Acero" has a nice ring to it. Almost as if you said "ACE Arrow"
> > > >>> > > really fast. And maybe the steel / iron meaning gives a sort of
> > > >>> > > close-to-metal vibe (similar to what Rust's name evokes), though
> > > >>> > > I'm not a Spanish speaker with a meaningful understanding of the
> > > >>> > > word's connotations.
> > > >>> > >
> > > >>> > > On Tue, May 10, 2022 at 11:06 AM Wes McKinney <wesmck...@gmail.com>
> > > >>> > > wrote:
> > > >>> > >
> > > >>> > > > A couple of other names derivative from the Ace- vibe:
> > > >>> > > >
> > > >>> > > > Acero ("steel" or sometimes "sword" in Spanish but apparently
> > > >>> > > > also "maple" in Italian). Also rhymes with Arrow but not sure
> > > >>> > > > if this is good or bad
> > > >>> > > > Acera ("pavement" or "sidewalk" in Spanish)
> > > >>> > > >
> > > >>> > > > On Tue, May 10, 2022 at 9:53 AM Will Jones
> > > >>> > > > <will.jones...@gmail.com> wrote:
> > > >>> > > > >
> > > >>> > > > > I think it is important to give the C++ execution engine a
> > > >>> > > > > separate name, as has been said by Wes and Jacques. Two
> > > >>> > > > > reasons for that IMO:
> > > >>> > > > >
> > > >>> > > > > 1. The more things we lend the Arrow brand outside of the
> > > >>> > > > > format, the harder it becomes for outside users to grasp
> > > >>> > > > > what "Arrow" is.
> > > >>> > > > > 2. Giving the C++ engine a name under the Arrow umbrella
> > > >>> > > > > gives it undue weight relative to other Arrow-based engines
> > > >>> > > > > (such as DataFusion, Polars), which may not generate good
> > > >>> > > > > faith in the Arrow community.
> > > >>> > > > >
> > > >>> > > > > If the "ACE" name has stuck, one option might be to simply
> > > >>> > > > > adopt the word "Ace" and call it the "Ace Query Engine".
> > > >>> > > > > "Ace" both taking meaning from the modern "a person who
> > > >>> > > > > excels at some activity" or the original "playing card ...
> > > >>> > > > > with a single pip" [1] (as an indication of
> > > >>> > > > > single-noded-ness).
> > > >>> > > > >
> > > >>> > > > > Antoine did point out the ACE name is taken by a C++
> > > >>> > > > > library. The "Ace" name is also used by the JavaScript
> > > >>> > > > > library [2], but I think it is a general enough word that
> > > >>> > > > > no single library has much specific claim to it.
> > > >>> > > > >
> > > >>> > > > > Some other names I thought of:
> > > >>> > > > > Arrow Recurve
> > > >>> > > > > Ace Archer
> > > >>> > > > > Arrow Ricochet
> > > >>> > > > >
> > > >>> > > > > [1] https://en.wikipedia.org/wiki/Ace
> > > >>> > > > > [2] https://ace.c9.io/
> > > >>> > > > >
> > > >>> > > > > On Tue, May 10, 2022 at 12:44 AM Antoine Pitrou
> > > >>> > > > > <anto...@python.org> wrote:
> > > >>> > > > >
> > > >>> > > > > >
> > > >>> > > > > > Do we have to give it a particular name at all? Most of
> > > >>> > > > > > the C++ subcomponents simply have a description ("the
> > > >>> > > > > > datasets layer", etc.). There are probably more important
> > > >>> > > > > > topics to spend our time on.
> > > >>> > > > > >
> > > >>> > > > > > Regards
> > > >>> > > > > >
> > > >>> > > > > > Antoine.
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > > > On 09/05/2022 at 21:44, Ian Cook wrote:
> > > >>> > > > > > > Reflecting on this discussion six weeks after Wes’s
> > > >>> > > > > > > initial message: I like the “ACE” name.
> > > >>> > > > > > > I have been using it to refer to the Arrow C++
> > > >>> > > > > > > execution engine in verbal conversations with
> > > >>> > > > > > > contributors, and it has been a much-needed convenient
> > > >>> > > > > > > monosyllabic shorthand for a part of the Arrow project
> > > >>> > > > > > > that has not previously had a clear and memorable name.
> > > >>> > > > > > >
> > > >>> > > > > > > I agree with Sasha that it would be ideal to use some
> > > >>> > > > > > > metaphorical or symbolic Archery-adjacent name prefaced
> > > >>> > > > > > > with “Arrow,” but no such name has evolved organically
> > > >>> > > > > > > to date. And it’s not for lack of trying: a few months
> > > >>> > > > > > > back I floated the idea to some people that we should
> > > >>> > > > > > > call it “Chiron” after the centaur from Greek mythology
> > > >>> > > > > > > associated with archery, but it never caught on :)
> > > >>> > > > > > > Since there is no clear consensus about which such
> > > >>> > > > > > > creative name we might invent now, I think adopting a
> > > >>> > > > > > > creative name would require strong advocacy and
> > > >>> > > > > > > consensus-building work from someone central to the
> > > >>> > > > > > > project, and this has not emerged. Thus, a more literal
> > > >>> > > > > > > descriptive name seems like our best choice.
> > > >>> > > > > > >
> > > >>> > > > > > > If we do go with “ACE” as the acronym, then we will
> > > >>> > > > > > > need to establish what that stands for.
> > > >>> > > > > > > If we make the full name clear to the community and we
> > > >>> > > > > > > use it alongside the acronym on the website, that
> > > >>> > > > > > > should help with problems of Googlability of the
> > > >>> > > > > > > acronym.
> > > >>> > > > > > >
> > > >>> > > > > > > That raises the question of what the “C” stands for. I
> > > >>> > > > > > > agree with Jacques that it is less than ideal to have
> > > >>> > > > > > > the “C” stand for “Compute” because it could create a
> > > >>> > > > > > > misleading and undesirable connotation of primacy. I
> > > >>> > > > > > > also agree with Andy that it is less than ideal for the
> > > >>> > > > > > > “C” to stand for “C++” because it is intended to be
> > > >>> > > > > > > used from other languages. I am unsure how we should
> > > >>> > > > > > > weigh these two concerns. More input on this question
> > > >>> > > > > > > would be appreciated.
> > > >>> > > > > > >
> > > >>> > > > > > > Ian
> > > >>> > > > > > >
> > > >>> > > > > > > On Mon, Apr 18, 2022 at 5:31 PM Jacques Nadeau
> > > >>> > > > > > > <jacq...@apache.org> wrote:
> > > >>> > > > > > >>
> > > >>> > > > > > >> I'm -0.9 on Arrow Compute engine. It makes it sound
> > > >>> > > > > > >> like it is THE canonical Arrow one, second classing
> > > >>> > > > > > >> DataFusion and Gandiva.
> > > >>> > > > > > >>
> > > >>> > > > > > >> No strong feelings on other names. Naming in general
> > > >>> > > > > > >> is an extremely subjective process...
> > > >>> > > > > > >>
> > > >>> > > > > > >>
> > > >>> > > > > > >>
> > > >>> > > > > > >> On Thu, Mar 31, 2022, 2:33 PM Weston Pace
> > > >>> > > > > > >> <weston.p...@gmail.com> wrote:
> > > >>> > > > > > >>
> > > >>> > > > > > >>> I'm +1 for "arrow compute engine".
> > > >>> > > > > > >>> In the docs we currently refer to it as the
> > > >>> > > > > > >>> "streaming execution engine". I do like the word
> > > >>> > > > > > >>> "streaming" as it is the difference between the
> > > >>> > > > > > >>> engine and the general "compute" module but the word
> > > >>> > > > > > >>> is also overloaded and we can easily include the
> > > >>> > > > > > >>> word "streaming" in the first sentence of whatever
> > > >>> > > > > > >>> description we have for the engine.
> > > >>> > > > > > >>>
> > > >>> > > > > > >>>> I'd personally like to see such a word for the
> > > >>> > > > > > >>>> query engine (otherwise we'd have to call Arrow
> > > >>> > > > > > >>>> Flight "Arrow Wire Protocol" 😅). Even something
> > > >>> > > > > > >>>> like "Arrow Archer" or "Arrow Bow" would be
> > > >>> > > > > > >>>> sufficient for me.
> > > >>> > > > > > >>>
> > > >>> > > > > > >>> I do like the idea of calling it just "bow" and I'm
> > > >>> > > > > > >>> not against either of these names (+0). I think I
> > > >>> > > > > > >>> still lean towards something more plain and
> > > >>> > > > > > >>> descriptive (arrow wire protocol has a nice ring to
> > > >>> > > > > > >>> it...)
> > > >>> > > > > > >>>
> > > >>> > > > > > >>> On Tue, Mar 29, 2022 at 9:10 AM Sasha Krassovsky
> > > >>> > > > > > >>> <krassovskysa...@gmail.com> wrote:
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>> In my view, the Arrow project has the core format
> > > >>> > > > > > >>>> specification (called Arrow), and then ancillary
> > > >>> > > > > > >>>> libraries for actually *doing* stuff with Arrow
> > > >>> > > > > > >>>> data, such as Arrow Flight and the query engine
> > > >>> > > > > > >>>> (within the `arrow` subdirectory in particular).
> > > >>> > > > > > >>>> I think these ancillary libraries should all follow
> > > >>> > > > > > >>>> a similar naming convention. Seems like the
> > > >>> > > > > > >>>> precedent set by Arrow Flight is "Arrow <mildly
> > > >>> > > > > > >>>> archery-related, descriptive word>", so I'd
> > > >>> > > > > > >>>> personally like to see such a word for the query
> > > >>> > > > > > >>>> engine (otherwise we'd have to call Arrow Flight
> > > >>> > > > > > >>>> "Arrow Wire Protocol" 😅). Even something like
> > > >>> > > > > > >>>> "Arrow Archer" or "Arrow Bow" would be sufficient
> > > >>> > > > > > >>>> for me.
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>> Sasha Krassovsky
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>> On Tue, Mar 29, 2022 at 9:25 AM Gavin Ray
> > > >>> > > > > > >>>> <ray.gavi...@gmail.com> wrote:
> > > >>> > > > > > >>>>
> > > >>> > > > > > >>>>> "Arrow Compute Engine" sounds quite nice to me, tbh
> > > >>> > > > > > >>>>> Agreeing with the points made above about ACE
> > > >>> > > > > > >>>>> being difficult to google, and AQE being a loaded
> > > >>> > > > > > >>>>> term in query engines already.
> > > >>> > > > > > >>>>>
> > > >>> > > > > > >>>>>
> > > >>> > > > > > >>>>> On Tue, Mar 29, 2022 at 10:07 AM Andy Grove
> > > >>> > > > > > >>>>> <andygrov...@gmail.com> wrote:
> > > >>> > > > > > >>>>>
> > > >>> > > > > > >>>>>> Just my 2 cents on this. If you were to call it
> > > >>> > > > > > >>>>>> ACE, I would make the C stand for "Compute"
> > > >>> > > > > > >>>>>> rather than C++ since it is intended to be used
> > > >>> > > > > > >>>>>> from other languages, such as Python.
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> The problem with ACE is that it is a common word
> > > >>> > > > > > >>>>>> and it will make it hard to Google for
> > > >>> > > > > > >>>>>> documentation. Even the combination of Arrow and
> > > >>> > > > > > >>>>>> ACE already has plenty of results.
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> Also, I saw in the linked doc a reference to AQE
> > > >>> > > > > > >>>>>> (for Arrow Query Engine). I would not recommend
> > > >>> > > > > > >>>>>> using this since many people know AQE as Adaptive
> > > >>> > > > > > >>>>>> Query Execution (especially Spark users).
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> "Arrow Compute Engine" in full doesn't sound bad
> > > >>> > > > > > >>>>>> perhaps?
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> With DataFusion, I made a list of words related
> > > >>> > > > > > >>>>>> to the project (data, query, compute, engine,
> > > >>> > > > > > >>>>>> etc) and then a list of completely unrelated
> > > >>> > > > > > >>>>>> words and then looked at the combinations to see
> > > >>> > > > > > >>>>>> what sounded good to me.
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> Andy.
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>> On Mon, Mar 28, 2022 at 4:31 PM Antoine Pitrou
> > > >>> > > > > > >>>>>> <anto...@python.org> wrote:
> > > >>> > > > > > >>>>>>
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>> ACE is already the name of a well-known C++
> > > >>> > > > > > >>>>>>> library, though I'm not sure how widely used it
> > > >>> > > > > > >>>>>>> is nowadays:
> > > >>> > > > > > >>>>>>> http://www.dre.vanderbilt.edu/~schmidt/ACE.html
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>> I would name it "execution engine" or "Arrow C++
> > > >>> > > > > > >>>>>>> execution engine" in full.
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>> Regards
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>> Antoine.
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>>
> > > >>> > > > > > >>>>>>> On 29/03/2022 at 00:15, Wes McKinney wrote:
> > > >>> > > > > > >>>>>>>> hi all,
> > > >>> > > > > > >>>>>>>>
> > > >>> > > > > > >>>>>>>> There has been a steady stream of work over the
> > > >>> > > > > > >>>>>>>> last year and a half or so to create a set of
> > > >>> > > > > > >>>>>>>> query engine building blocks in C++ to evaluate
> > > >>> > > > > > >>>>>>>> queries against Arrow Datasets and input
> > > >>> > > > > > >>>>>>>> streams, which can be of use to applications
> > > >>> > > > > > >>>>>>>> that are already building on top of the Arrow
> > > >>> > > > > > >>>>>>>> C++ project.
> > > >>> > > > > > >>>>>>>> This effort has a smaller surface area than
> > > >>> > > > > > >>>>>>>> DataFusion since SQL parsing and query
> > > >>> > > > > > >>>>>>>> optimization are being left to other tools.
> > > >>> > > > > > >>>>>>>>
> > > >>> > > > > > >>>>>>>> I thought it would be useful to have a name for
> > > >>> > > > > > >>>>>>>> this subproject similar to how we have Gandiva,
> > > >>> > > > > > >>>>>>>> Plasma, DataFusion, and other named Apache
> > > >>> > > > > > >>>>>>>> Arrow subprojects. We had discussed creating a
> > > >>> > > > > > >>>>>>>> project like this a few years ago [1], but
> > > >>> > > > > > >>>>>>>> since there are now multiple Arrow-native or
> > > >>> > > > > > >>>>>>>> Arrow-compatible query engines in the wild, it
> > > >>> > > > > > >>>>>>>> would be helpful to disambiguate.
> > > >>> > > > > > >>>>>>>>
> > > >>> > > > > > >>>>>>>> One simple name is ACE - Arrow C++ Engine. I'm
> > > >>> > > > > > >>>>>>>> not very good at naming things, so if there are
> > > >>> > > > > > >>>>>>>> other suggestions from the community I would
> > > >>> > > > > > >>>>>>>> love to hear them!
> > > >>> > > > > > >>>>>>>>
> > > >>> > > > > > >>>>>>>> Thanks,
> > > >>> > > > > > >>>>>>>> Wes
> > > >>> > > > > > >>>>>>>>
> > > >>> > > > > > >>>>>>>> [1]: https://docs.google.com/document/d/10RoUZmiMQRi_J1FcPeVAUAMJ6d_ZuiEbaM2Y33sNPu4/edit#heading=h.2k6k5a4y9b8y