Animesh,

Thanks for your thoughtful response.

I think we’re now on the same page about the opportunities for collaboration. 
And I saw that Wes posted to this thread too. I hope you find ways to make 
Arrow and Crail work well together.

Julian


> On Sep 5, 2018, at 3:49 AM, Animesh Trivedi <[email protected]> wrote:
> 
> Hi Julian,
> 
> Thanks for posting your thoughts.
> 
> [As a Crail committer]: We agree that the notion of "we" creates confusion.
> The Crail blog follows the trend in community projects, where a blog post
> falls into one of two categories. The first type is where a developer talks
> about recent improvements, features, performance evaluation, etc. The
> second type is where "a user" presents how they used the system for their
> use case. The Albis blog post falls into the second category. We can (and,
> for future reference, should) categorize posts and mark them clearly that
> way. We also encourage anyone in the community who tries Crail to reach
> out to us to present their story on the Crail blog. Crail is committed to
> providing the best possible performance to all its users, be it Albis, Arrow,
> ORC, or Parquet.
> 
> [As a developer of Albis and user of Crail]: I understand your sentiment
> regarding the format wars, and it is not the aim of Albis to establish yet
> another file format. Albis started as a prototype to quickly "explore"
> various design choices for storing relational data for a variety of
> scenarios with high-performance storage/networking devices - the kind of
> devices Crail targets. This is something that I cannot easily do with
> Arrow, ORC, or Parquet with HDFS (or something similar) within a reasonable
> effort and time-frame as they all have already chosen certain design points
> and trade-offs. Crail and Albis are not tied to each other (nor is either
> preferred over other choices), though since both come from the same set of
> developers, I can see why the confusion arises. Having said this, I will be
> happy to contribute back to the Arrow community about the findings from
> Albis, and would appreciate any help with that. I had a brief discussion
> with Julien Le Dem at last DataWorks summit in San Jose about Albis as
> well. I have not done a thorough investigation of Arrow over Crail, but
> perhaps that is something that can be picked up now as a starting point.
> 
> I hope this clarifies the confusion. We will fix the blog post.
> 
> Thanks,
> --
> Animesh
> 
> On Tue, Sep 4, 2018 at 9:59 PM Julian Hyde <[email protected]> wrote:
> 
>> I just read the blog post [1] about Crail and file formats. (I have to
>> declare my interests up front: I have been a huge supporter of Apache
>> Arrow, and I am a PMC member. I’m speaking here as an Arrow contributor and
>> enthusiast, not as a mentor of Crail.)
>> 
>> I am a bit troubled about the endorsement of Albis in a Crail blog post.
>> For example, “we have developed a new file format called Albis”. Since the
>> blog post is not signed, I take it that “We” means the authors of the paper
>> [2] mentioned in the blog post. But I hope that “we” does not mean “we as
>> Crail committers and PMC members”.
>> 
>> I know that there are different forces at play if you work for a
>> corporation, or are a researcher, or are an idealistic open source
>> contributor. As a researcher, you need to invent new stuff and prove that
>> it is better than everything that has been done before.
>> 
>> But I’ve been through the file format wars — ORC vs Parquet — driven in
>> large part by two competing vendors. It was sickening, and a huge waste of
>> effort. Please, please don’t let this happen again. If you want to make
>> Crail successful, you should make it absolutely clear to the Arrow, ORC and
>> Parquet communities that you will help to make Crail work as well as it
>> possibly can.
>> 
>> Also, on paper Albis looks very similar to Arrow, and the performance gap
>> is fairly narrow. If you have found insights that would improve Arrow, I
>> encourage you to share them and make Arrow better. It may be good research
>> practice to accentuate the differences between the two, but it’s good open
>> source practice to find consensus between technologies, and merge
>> communities. There is a lot of work to be done, and too few people to do it.
>> 
>> Lastly, I know I seem to be giving mixed messages here. I do believe that
>> content about Crail will help drive engagement and build community
>> (controversial content even more so). I am delighted that the Crail team is
>> writing blog posts and posting them to Twitter. But be careful not to
>> alienate communities that could help Crail gain widespread adoption.
>> 
>> Julian
>> 
>> [1] http://crail.incubator.apache.org/blog/2018/08/sql-p1.html
>> 
>> [2] https://www.usenix.org/conference/atc18/presentation/trivedi