Thanks Sasha! A nice advantage of parentheses is that most editors can track and highlight the sections between them. Those parentheses could also be optional when we detect new lines (for users who don't want to deal with many parentheses); in that case, we would just need to require indentation.
Percy

On Thu, Nov 3, 2022 at 12:47 PM Sasha Krassovsky <krassovskysa...@gmail.com> wrote:

> Hi Percy,
> Thanks for the input! New lines would be no problem at all, they’d just be
> treated the same as any other whitespace. One thing to point out about the
> function call style when written that way is that it looks a lot like the
> list style, it’s just that there are more parentheses to keep track of,
> though it does make it more obvious what delineates a subtree.
>
> Sasha
>
> > On Nov 3, 2022, at 10:35, Percy Camilo Triveño Aucahuasi
> > <percy.camilo...@gmail.com> wrote:
> >
> > Hi Sasha,
> >
> > I like the function call-style variant. Quick question about the parser:
> > Do you think we can parse with new lines too? That way it would be even
> > more similar to a json-like/declarative approach and could mitigate a bit
> > the nesting issue (which would make it easier to read as well), for
> > instance:
> >
> > sink(
> >   project(
> >     filter(
> >       source(
> >         …)
> >       …)
> >     …)
> >   …)
> >
> > Percy
> >
> >> On Tue, Oct 18, 2022 at 5:54 PM Sasha Krassovsky
> >> <krassovskysa...@gmail.com> wrote:
> >>
> >> Hi everyone,
> >> We recently had some discussions about parsing expressions. I currently
> >> have a PR [1] up for that taking into account the feedback. Next I
> >> wanted to tackle something for ExecPlans, as manually specifying one
> >> using code is currently cumbersome. I’m currently deciding between 2
> >> variants:
> >>
> >> - Function call-style: This would be a similar syntax to the
> >> expressions, where we would have something along the lines of
> >> `sink(project(filter(source(…)…)…)…)`. The problem with this syntax is
> >> that it involves tons of nesting, which although an improvement over
> >> handwriting the C++ code, is still cumbersome to write. On the other
> >> hand, this syntax is pretty intuitive and meshes well with the
> >> expression syntax. A minor modification could be to make the last
> >> argument rather than the first be the input to a node, which would at
> >> least keep a node’s parameters together.
> >>
> >> - List style: This syntax completely eliminates nesting and would
> >> probably be easier to write but has a steeper learning curve.
> >> Essentially, since we know how many inputs each type of node takes, we
> >> can implicitly reconstruct a tree from a list of node names (formally,
> >> we are converting from/to a pre-order traversal of the query tree). For
> >> example, it would look something like:
> >>
> >> ```
> >> sink
> >> project <list of names/expressions>
> >> filter <expression>
> >> source
> >> ```
> >>
> >> The key is that we know that a source takes no inputs, and so source
> >> nodes are leaf nodes. To take an example with a join, it could be
> >> something like
> >>
> >> ```
> >> order_by_sink <sort key>
> >> hash_join <join arguments>
> >> filter <expression>
> >> source
> >> filter <expression>
> >> source
> >> ```
> >>
> >> Since we know that a join always takes two arguments, we interpret the
> >> first (filter source) slice as the first argument and the second as the
> >> second argument. It should be noted that the current C++ code already
> >> resembles this kind of syntax, it just has much more clutter.
> >>
> >> Thanks!
> >> Sasha Krassovsky
> >>
> >> [1] https://github.com/apache/arrow/pull/14287
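
For anyone following the thread, here is a rough sketch of the list-style idea quoted above: because each node type takes a known number of inputs, a flat pre-order list can be turned back into a tree with a simple recursive descent. This is only an illustration in Python, not the parser from the PR and not any Arrow API. The `ARITY` table, the `Node` class, and the `parse_list_style` and `to_call_style` helpers are all hypothetical names, and the arities are assumed from the examples in the thread.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed number of inputs per node type (hypothetical table, taken from
# the examples in the thread; not read from Arrow).
ARITY = {
    "source": 0,
    "filter": 1,
    "project": 1,
    "sink": 1,
    "order_by_sink": 1,
    "hash_join": 2,
}


@dataclass
class Node:
    name: str
    args: str = ""  # raw, unparsed argument text such as "<expression>"
    inputs: List["Node"] = field(default_factory=list)


def parse_list_style(text: str) -> Node:
    """Rebuild a plan tree from a pre-order list of node lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    pos = 0

    def build() -> Node:
        nonlocal pos
        if pos >= len(lines):
            raise ValueError("plan ended before all inputs were supplied")
        # First token is the node name, the rest is its argument text.
        name, _, args = lines[pos].partition(" ")
        pos += 1
        if name not in ARITY:
            raise ValueError(f"unknown node type: {name}")
        node = Node(name, args.strip())
        # Consume exactly as many subtrees as this node type has inputs.
        for _ in range(ARITY[name]):
            node.inputs.append(build())
        return node

    root = build()
    if pos != len(lines):
        raise ValueError("extra nodes after the plan was complete")
    return root


def to_call_style(node: Node) -> str:
    """Render the tree in function call style, inputs first, then args."""
    parts = [to_call_style(child) for child in node.inputs]
    if node.args:
        parts.append(node.args)
    return f"{node.name}({', '.join(parts)})"
```

Feeding the join example from the thread through `parse_list_style` gives an `order_by_sink` root whose single input is a `hash_join` with two `filter` subtrees, and `to_call_style` renders the same tree in the nested form. That suggests the two variants carry the same information and differ only in how subtree boundaries are marked.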