Thanks Julian. Sounds worth a listen. 

Donald E. Foss (mobile-US ET)

> On Nov 19, 2016, at 1:48 PM, Julian Hyde <jh...@apache.org> wrote:
> 
> Matei Zaharia just spoke at the AMPlab seminar [1], and showed a couple of 
> slides about Weld. In the video of the day [2], his talk starts at 4:05:00, 
> and he starts talking about Weld at 4:28:30.
> 
> The essence is an intermediate language for row-level expressions, with the 
> ability to do limited iteration, with the goal of making it easier to pass 
> data between UDFs written in different languages. Sounds familiar? I would 
> presume that an implementation of the language would be strongly tied to a 
> memory format. Or maybe it allows multiple possible implementations, one of 
> which would be Arrow in Java.
> 
> The slide listed Pandas as one of the supported front ends, so I wondered if 
> Wes knew something about the project.
> 
> I have been thinking of doing something similar in the Calcite / Drill / 
> Arrow world. In Calcite we have RexNodes as an expression language, and we 
> have a Java code generator that can target data represented as Java arrays, 
> and another variant that can target data represented as Java structs. Drill 
> of course has a code generator that can target data in Arrow. I have been 
> thinking for a while of abstracting the code generators so that the person 
> implementing, say, the Filter+Project for “select x + y … where x > 5” 
> doesn’t have to get their hands dirty with code generation. There are a lot 
> of optimizations to be done, e.g. remembering that you’ve already made sure 
> that x is not null.
> 
> Julian
> 
> [1] https://amplab.cs.berkeley.edu/endofproject/
> 
> [2] https://youtu.be/KAacs9jYPHU
> 
> 
> 
>> On Nov 19, 2016, at 4:31 AM, Donald Foss <donald.f...@gmail.com> wrote:
>> 
>> Did you find that at <https://cs.stanford.edu/~matei/>?  That’s the only 
>> thing I can find via Google about it.  Do you have more detail or a link to 
>> the paper itself?  I get the feeling that it is not yet fully complete, 
>> despite the 21 November camera-ready deadline for CIDR 2017.
>> 
>> For those who aren’t familiar with CIDR, it is a conference that occurs 
>> every other year.  This year’s agenda/program may be found at 
>> <http://cidrdb.org/cidr2017/program.html>.  CIDR here does not stand for 
>> Classless Inter-Domain Routing (network subnet masks, the first thing I 
>> thought of) but for the Conference on Innovative Data Systems Research, 
>> which focuses primarily on systems.  I hate to admit this, but I’m 
>> unfamiliar with the conference, apparently because I’ve been out of 
>> academia for far too long; it seems to feature quite a few interesting 
>> papers.  Judging by title alone, a poor yet humorous criterion, I like:
>> - “Dependency-Driven Analytics: A Compass for Uncharted Data Oceans” 
>> (Donald - Why just data lakes when you can have data oceans?)
>> - “My Weak Consistency is Strong” (Donald - Great title, reminds me of Star 
>> Wars and the “Force”)
>> - “SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale 
>> Machine Learning” (Donald - Another brilliant backronym.)
>> 
>> The Weld paper is the last paper to be presented on 10 January 2017 between 
>> 2:30 and 4:05 (UTC-8).
>> 
>> On a side note, looking down that page a little, I love the title of the 
>> last paper in 2016, Yggdrasil: An Optimized System for Training Deep 
>> Decision Trees at Scale 
>> <https://cs.stanford.edu/~matei/papers/2016/nips_yggdrasil.pdf>.  When I see 
>> Yggdrasil, the first thing I think of is a really big tree and Norse 
>> mythology.  It’s a great name.  I’m going to read some of his other papers 
>> this weekend.
>> 
>> Donald Foss
>> donald.f...@gmail.com
>> ------ __o
>> ----_`\<,_
>> ---(_)/ (_)
>> 
>>> On Nov 18, 2016, at 4:42 PM, Julian Hyde <jh...@apache.org> wrote:
>>> 
>>> Anyone know anything about Matei Zaharia’s Weld project?
>>> 
>>>    • S. Palkar, J. Thomas, A. Shanbhag, H. Pirk, M. Schwarzkopf, S. 
>>> Amarasinghe and M. Zaharia. Weld: A Common Runtime for High Performance 
>>> Data Analytics, to appear at CIDR 2017.
>>> 
>>> It seems to have similar goals to Arrow.
>>> 
>>> Julian
>>> 
>> 
> 
