Great points, Julian, especially about algebra. Couldn’t agree more.

In fact, we have been strong advocates of the viewpoint that it is all about 
the algebraic framework. Furthermore, we have argued that the relational 
algebraic framework is the right “core” on which to build a platform. With it 
you can not only go well beyond warehousing/SQL but also (with small 
extensions) build:

#1: JSON document stores (see Argo 
<http://pages.cs.wisc.edu/~chasseur/pubs/argo-short.pdf>), 

#2: Iterative graph analytics (see Grail 
<http://www.cs.wisc.edu/~jignesh/publ/Grail.pdf>), 

#3: Relational learning (see QuickFOIL 
<http://www.cs.wisc.edu/~jignesh/publ/QuickFoil.pdf>), and

#4: Biological data management (see Periscope/SQ 
<http://www.vldb.org/conf/2007/papers/demo/p1406-tata.pdf> and Periscope/GQ 
<http://www.vldb.org/pvldb/1/1454184.pdf>).

If all of that is not enough, there are nice synergies between deeper 
integration of common classes of machine learning and relational data 
representation. A key idea here is factorized learning, which my student Arun 
Kumar (co-advised with Naughton) introduced last year 
<http://pages.cs.wisc.edu/~arun/orion/LearningOverJoinsSIGMOD.pdf>. Arun will 
present a far deeper follow-on paper 
<http://pages.cs.wisc.edu/~arun/hamlet/OptFSSIGMOD.pdf> on this topic at SIGMOD 
in a few weeks. Interestingly, many other papers are starting to build on these 
initial ideas. There is still a bunch of theory to figure out, but as a research 
community we are collectively getting very close to nailing it.

In my keynote @ SIGMOD last year 
<http://dl.acm.org/citation.cfm?doid=2723372.2723374>, I talked about how 
theory (see papers above) has shown that these seemingly different applications 
converge to a single platform powered by an extended relational algebraic 
core. This converged platform is the long-term vision 
for Quickstep. Yup — I hear you, I need to write this up for the community. You 
are right and I’m adding it to my list :-) 

We have shown prototypes for all of the above, but haven’t put it all together. 
That is the hard part, and we are at the start of that journey. That effort is 
also revealing all kinds of interesting systems research issues — so good for 
the students on the project. Potentially exciting times ahead!

Cheers,
Jignesh 

> On Jun 14, 2016, at 2:32 PM, Julian Hyde <[email protected]> wrote:
> 
> Having that representation reduces coupling in your architecture, so is 
> useful even if you don’t decide to use a library for SQL parsing/planning. 
> But I think once you have it you will realize that all of the interesting 
> problems for the project happen after the query has been converted to algebra.
> 
> Julian
