On 2017-12-15 10:57, Riccardo Tommasini <[email protected]> wrote: 
> A final remark on efficiency. All the mentioned features are very interesting 
> but not computationally nice. Idk what’s calcite position on this, but in a 
> big data community i think we should be careful. SPARQL did some crazy stuff 
> and is still paying them (see gutierrez papers)

I don't worry too much about efficiency per se; I focus on keeping the 
algebra clean enough that we can recognize the simple cases. I would make the 
algebra simple and moderately powerful at first, and make it more 
expressive/powerful later only if we have use cases that need it.

It's analogous to joins. Calcite's join operator can express theta-joins, 
left/right/full outer joins, semi-joins, and joins over sorted/bucketed data sets, 
but we can easily recognize an inner equi-join when we see one. Most of the 
transformation rules are written on inner equi-joins first, and generalized to 
other kinds of joins when we're sure it's safe. I imagine us taking a similar 
path with iteration, e.g. when is it safe to push a filter into, and through, 
an iteration?
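
To make "recognize an inner equi-join when we see one" concrete, here is a
rough sketch in Python. It is a toy model, not Calcite's API: the Col, Call
and Join classes and the equi_keys function are hypothetical names invented
for this illustration. The idea is that the join operator stays general, and
a rule first asks whether the join it was handed is the simple case.

```python
from dataclasses import dataclass


# Toy expression and operator nodes. These are hypothetical stand-ins,
# not Calcite classes.
@dataclass(frozen=True)
class Col:
    side: str  # "left" or "right" input of the join
    name: str


@dataclass(frozen=True)
class Call:
    op: str  # e.g. "=", "<", "AND"
    operands: tuple


@dataclass(frozen=True)
class Join:
    join_type: str  # "inner", "left", "right", "full", "semi"
    condition: object


def conjuncts(expr):
    """Flatten nested ANDs into a flat list of conjuncts."""
    if isinstance(expr, Call) and expr.op == "AND":
        out = []
        for operand in expr.operands:
            out.extend(conjuncts(operand))
        return out
    return [expr]


def equi_keys(join):
    """Return [(left_col, right_col), ...] for an inner equi-join, else None."""
    if join.join_type != "inner":
        return None
    keys = []
    for c in conjuncts(join.condition):
        if not (isinstance(c, Call) and c.op == "=" and len(c.operands) == 2):
            return None
        a, b = c.operands
        if not (isinstance(a, Col) and isinstance(b, Col)):
            return None
        if a.side == "left" and b.side == "right":
            keys.append((a.name, b.name))
        elif a.side == "right" and b.side == "left":
            keys.append((b.name, a.name))
        else:
            return None  # both columns come from the same input
    return keys


simple = Join("inner", Call("AND", (
    Call("=", (Col("left", "deptno"), Col("right", "deptno"))),
    Call("=", (Col("right", "loc"), Col("left", "loc"))),
)))
theta = Join("inner", Call("<", (Col("left", "sal"), Col("right", "hisal"))))
outer = Join("left", Call("=", (Col("left", "deptno"), Col("right", "deptno"))))

print(equi_keys(simple))  # the simple case: a rule can fire
print(equi_keys(theta))   # not an equi-join: the rule declines
print(equi_keys(outer))   # not inner: the rule declines until proven safe
```

A rule written this way simply does nothing when equi_keys returns None, and
can be widened to other join types later once we are sure it is safe.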

Lastly, because it's an algebra, we take a 'RISC' approach. The "Iterate" operator 
doesn't have to do everything; we can have other operators such as Aggregate, 
Filter, Project before, after and inside it.
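
A minimal sketch of that composition, again as toy Python rather than Calcite
code. The semantics of iterate here (apply a step until no new rows appear)
is an assumption made for the example, and iterate, filter_, project and
extend are hypothetical names. It also gives one instance of the filter
question above: in this toy, a filter on a column that the step leaves
unchanged can be pushed into the seed, while a filter on a column the step
rewrites cannot.

```python
def filter_(pred):
    """Filter operator: keep the rows that satisfy pred."""
    return lambda rows: {r for r in rows if pred(r)}


def project(fn):
    """Project operator: map each row through fn."""
    return lambda rows: {fn(r) for r in rows}


def iterate(seed, step):
    """Assumed semantics: apply step to the newest rows until nothing new appears."""
    result = set(seed)
    frontier = set(seed)
    while frontier:
        frontier = step(frontier) - result
        result |= frontier
    return result


def extend(edges):
    """Step that extends each path (src, dst) by one edge leaving dst."""
    def step(paths):
        return {(s, d2) for (s, d) in paths for (d1, d2) in edges if d == d1}
    return step


edges = {(1, 2), (2, 3), (3, 4), (5, 6)}
step = extend(edges)

# Filter on src, which the step never changes.
src_is_1 = filter_(lambda r: r[0] == 1)
after = src_is_1(iterate(edges, step))    # Filter after Iterate
pushed = iterate(src_is_1(edges), step)   # Filter pushed into the seed
print(after == pushed)

# Filter on dst, which the step rewrites: pushing it changes the answer.
dst_is_4 = filter_(lambda r: r[1] == 4)
print(dst_is_4(iterate(edges, step)) == iterate(dst_is_4(edges), step))

# Project after Iterate: the nodes reachable from node 1.
print(sorted(project(lambda r: r[1])(pushed)))
```

Iterate itself knows nothing about filtering or projection; those stay
separate operators that a rule can move around it when that is safe.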
