Concretely, the best thing you can do is log a Jira case and contribute a 
feature to Calcite. You will have to figure out how to give it similar 
‘packaging’ to similar tools, but it will get scrutiny and improvement.

You should also write tests that are adversarial (e.g. https://xkcd.com/327/).

The parser has an entry point, I believe, for expressions. Use that entry 
point. Splice the resulting AST into a SelectNode and then validate it.

> On Mar 18, 2025, at 11:37 AM, Kristian Lein-Mathisen <kristianl...@gmail.com> 
> wrote:
> 
> Hi Mihai and Julian,
> and thanks for getting back to me.
> 
> I'm glad to hear that I'm on the right track by not using Rules and the
> Planner for this, thanks for clarifying.
> 
> I got it working, and now my tests are passing. Thinking I was being smart,
> I changed your suggestion from
>    "SELECT " + filterExpression + " FROM " + table
> into
>    "SELECT * FROM " + table + " WHERE " + filterExpression
> since that felt more in line with what I'm doing. But that breaks down when
> filterExpression is, for example, just "true" as the optimizer removes the
> LogicalFilter altogether. So I'm going to stick with your approach.
> 
> I have to admit that concatenating strings like this is a bit
> uncomfortable, however. Will this be subject to SQL injections? For a
> malicous user's table, she might specify this filter expression:
>   * FROM sensitive_table --
> But that will luckily not give her access to sensitive_table since we're
> only extracting the first RexNode and not the resulting TableScan. But
> perhaps there are other tricks I'm not considering?
> 
> This is a good starting point, thank you for helping out!
> 
> K.
> 
> 
> On Mon, Mar 17, 2025 at 7:25 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> 
>> It seems that your requirement is for an ‘expression parser’ that works in
>> the context of a table, and goes from SQL text via AST and validation to a
>> RexNode.
>> 
>> You can achieve this by generating a query, “select “ + expression + “
>> from “ + table, and sending it through the parse/validate/SQL-to-rel
>> process, after which you will surely receive a plan that looks like this:
>> 
>>  Project(rex)
>>    TableScan(table)
>> 
>> From this plan you can mine ‘rex’ and splice it into other queries. If you
>> do this ahead of time, the RexNode will refer to types in a different type
>> factory (a soluble problem).
>> 
>> Julian
>> 
>>> On Mar 17, 2025, at 9:35 AM, Mihai Budiu <mbu...@gmail.com> wrote:
>>> 
>>> I don't think the planner is the right place; in general planner
>> transformations preserve semantics, whereas your transformation won't
>> necessarily. Planner rules are also designed for optimization, and here
>> there is nothing to optimize.
>>> 
>>> The visitor-based APIs are probably the main tool you need.
>>> https://www.feldera.com/blog/calcite-irs#9099c1483ca6
>>> 
>>> Mihai
>>> 
>>> ________________________________
>>> From: Kristian Lein-Mathisen <kristianl...@gmail.com>
>>> Sent: Monday, March 17, 2025 7:57 AM
>>> To: dev@calcite.apache.org <dev@calcite.apache.org>
>>> Subject: Parsing expression inside RelShuttle
>>> 
>>> Hi!
>>> 
>>> I'm new to Calcite and I'd like to use it to transform incoming SQL
>> queries
>>> to conform to some generic policy rules, like wrapping all table-scans
>> with
>>> a row-level filter. I hope this is the right place to ask questions like
>>> this. Here's an example of what I'm trying to acheive:
>>> 
>>> -- admin says: ALTER TABLE orders SET TBLPROPERTIES (policy.filter =
>>> 'country IN user_countries')
>>> SELECT * FROM orders -- <-- from user
>>> SELECT * FROM (SELECT * FROM orders WHERE country IN user_countries())
>> --
>>> <-- to backend
>>> 
>>> This is fairly simple to acheive when you have a static filter
>> expression.
>>> Here's what I've got working:
>>> 
>>>   public static RelNode policyFilter(RelNode root, RelBuilder builder) {
>>>       return root.accept(new RelShuttleImpl() {
>>>           @Override
>>>           public RelNode visit(TableScan scan) {
>>>               RelNode n = super.visit(scan);
>>>               RexNode x = builder.in(builder.field("country"),
>>>                       builder.literal("se"),
>>>                       builder.literal("nl"),
>>>                       builder.literal("dk"));
>>>               return builder
>>>                       .push(n)
>>>                       .filter(x)
>>>                       .build();
>>>           }
>>>       });
>>>   }
>>> 
>>> This works, provided all my tables have a "country" column. What I'd like
>>> to do, however, is allow user-defined filter expressions: make "RexNode
>> x"
>>> from a per-table String. I'd like this to be evaluated in the context of
>>> the current table (so you can reference column names) and session (so
>>> current_user(), for example, is available). I've looked at the Planner,
>> but
>>> it doesn't seem it's exposing enough of its internals (I don't have
>> access
>>> to its validator.validateExpression, for example).
>>> 
>>> Note that I'm new to Calcite, so I could be going around this completely
>>> wrong. Looking at other uses of Calcite, I'm getting the impression that
>> I
>>> should be extending RelNodes and adding some Rules. I've also assumed
>> that
>>> I should be doing this with RelNodes instead of SqlNodes.
>>> 
>>> For the record, Kyuubi does something very similar using Spark here:
>>> 
>> https://github.com/apache/kyuubi/blob/master/extensions/spark/kyuubi-spark-authz/src/main/scala/org/apache/kyuubi/plugin/spark/authz/rule/rowfilter/RuleApplyRowFilter.scala#L42
>>> 
>>> Thanks for your feedback,
>>> - Kris
>> 
>> 

Reply via email to