One more suggestion for the bucket:
"Apache Arrow is a computational platform for efficient in-memory data
representation and processing."

On Mon, May 17, 2021 at 2:49 PM Wes McKinney <wesmck...@gmail.com> wrote:

> I think less is better in the description, but unfortunately the
> association of Arrow as being "just a data format" has been actively
> harmful in some ways to community growth. We have a data format, yes,
> but we are also creating a computational platform to go hand-in-hand
> with the data format to make it easier to build fast applications that
> use the data format. So the description needs to capture both of these
> ideas.
>
> On Mon, May 17, 2021 at 12:15 PM Julian Hyde <jhyde.apa...@gmail.com>
> wrote:
> >
> > I think that the “cross-language development platform for” is noise.
> (I’m sure that JPEG developers think that JPEG is a “cross-language
> development platform” too. But it isn’t. It is an image format.)
> >
> > "Apache Arrow is a data format for efficient in-memory processing."
> >
> > I’ll note that, in marketing speak, we are developing a high-concept
> pitch [1] here. Every company needs a name, a brand, a high-concept pitch,
> and a 3- or 4-sentence description. But every Apache project needs these too.
> It’s worth spending the time on the description, too, and then using them in
> all the places that we describe Arrow.
> >
> > Julian
> >
> > [1] https://www.growthink.com/content/whats-your-high-concept-pitch
> >
> >
> >
> > > On May 17, 2021, at 7:38 AM, Eduardo Ponce <edponc...@gmail.com>
> wrote:
> > >
> > > I agree with Nate's and Brian's suggestions, but would like to add
> that we
> > > can make it a one-liner for more conciseness and consistency with other
> > > Apache projects.
> > > Apologies if it seems I am going around the suggestions loop again.
> > >
> > > "Apache Arrow is a cross-language development platform enabling
> efficient
> > > in-memory data processing and transport."
> > >
> > >
> > >
> > >
> > > On Mon, May 17, 2021 at 10:11 AM Brian Hulette <bhule...@apache.org>
> wrote:
> > >
> > >> Thank you for bringing this up Dominik. I sampled some of the
> descriptions
> > >> for other Apache projects I frequent, the ones with a meaningful
> > >> description have a single sentence:
> > >>
> > >> github.com/apache/spark - Apache Spark - A unified analytics engine
> for
> > >> large-scale data processing
> > >> github.com/apache/beam - Apache Beam is a unified programming model
> for
> > >> Batch and Streaming
> > >> github.com/apache/avro - Apache Avro is a data serialization system
> > >>
> > >> Several others (Flink, Hadoop, ...) just have  "[Mirror of] Apache
> <name>"
> > >> as the description.
> > >>
> > >> +1 for Nate's suggestion "Apache Arrow is a cross-language development
> > >> platform for in-memory data. It enables systems to process and
> transport
> > >> data more efficiently."
> > >>
> > >> On Mon, May 17, 2021 at 5:23 AM Wes McKinney <wesmck...@gmail.com>
> wrote:
> > >>
> > >>> It's probably best for the description to limit mentions of specific
> > >>> features. There are some high level features mentioned in the
> > >>> description now ("computational libraries and zero-copy streaming
> > >>> messaging and interprocess communication"), but now in 2021 since the
> > >>> project has grown so much, it could leave people with a limited view
> > >>> of what they might find here.
> > >>>
> > >>> On Mon, May 17, 2021 at 12:14 AM Mauricio Vargas
> > >>> <mauri...@ursacomputing.com> wrote:
> > >>>>
> > >>>> How about
> > >>>> 'Apache Arrow is a cross-language development platform for in-memory
> > >>> data.
> > >>>> It enables systems to process and transport data efficiently,
> > >> providing a
> > >>>> simple and fast library for partitioning of large tables'?
> > >>>>
> > >>>> Sorry for the delay, long election day
> > >>>>
> > >>>> On Sun, May 16, 2021, 2:27 PM Nate Bauernfeind <
> > >>> natebauernfe...@deephaven.io>
> > >>>> wrote:
> > >>>>
> > >>>>> Suggestion: faster -> more efficiently
> > >>>>>
> > >>>>> "Apache Arrow is a cross-language development platform for
> in-memory
> > >>>>> data. It enables systems to process and transport data more
> > >>> efficiently."
> > >>>>>
> > >>>>> On Sun, May 16, 2021 at 11:35 AM Wes McKinney <wesmck...@gmail.com
> >
> > >>> wrote:
> > >>>>>
> > >>>>>> Here's what's there now:
> > >>>>>>
> > >>>>>> "Apache Arrow is a cross-language development platform for
> > >> in-memory
> > >>>>>> data. It specifies a standardized language-independent columnar
> > >>> memory
> > >>>>>> format for flat and hierarchical data, organized for efficient
> > >>>>>> analytic operations on modern hardware. It also provides
> > >>> computational
> > >>>>>> libraries and zero-copy streaming messaging and interprocess
> > >>>>>> communication…"
> > >>>>>>
> > >>>>>> How about something shorter like
> > >>>>>>
> > >>>>>> "Apache Arrow is a cross-language development platform for
> > >> in-memory
> > >>>>>> data. It enables systems to process and transport data faster."
> > >>>>>>
> > >>>>>> Suggestions / refinements from others welcome
> > >>>>>>
> > >>>>>>
> > >>>>>> On Sat, May 15, 2021 at 9:12 PM Dominik Moritz <domor...@cmu.edu>
> > >>> wrote:
> > >>>>>>>
> > >>>>>>> Super minor issue but could someone make the description on
> > >> GitHub
> > >>>>>> shorter?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> GitHub puts the description into the title of the page, which makes
> > >>>>>>> the page hard to find in URL autocomplete.
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>>
> > >>>
> > >>
> >
>
