I can convert the Logical RelNode to Enumerable RelNode, and then I added
another step to convert Enumerable RelNode to Elasticsearch RelNode:
def toEsPhysicalPlan(root):
print(f"toEsPhysicalPlan root={root}")
planner = root.getCluster().getPlanner()
planner.clear()
for rule in ElasticsearchRules.RULES:
planner.addRule(rule)
program = Programs.of(RuleSets.ofList(planner.getRules()))
traits = root.getTraitSet().replace(ElasticsearchRel.CONVENTION)
return program.run(planner, root, traits, ImmutableList.of(),
ImmutableList.of())
Now the program.run failed with below exception, what can be the reason? Is
there any example that shows how to convert a SQL to Elasticsearch Json? BTW,
my input schema/table is not of ElasticsearchSchema/Table type.
org.apache.calcite.plan.RelOptPlanner.CannotPlanException:
org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough
rules to produce a node with desired properties: convention=ELASTICSEARCH,
sort=[].
Missing conversion is EnumerableProject[convention: ENUMERABLE -> ELASTICSEARCH]
There is 1 empty subset: rel#67:RelSubset#3.ELASTICSEARCH.[], the relevant part
of the original plan is as follows
65:EnumerableProject(BugId=[$0], ARB=[$1])
60:EnumerableTableScan(subset=[rel#64:RelSubset#2.ENUMERABLE.[]],
table=[[root, table_bugx]])
Root: rel#67:RelSubset#3.ELASTICSEARCH.[]
Original rel:
EnumerableProject(subset=[rel#58:RelSubset#1.ENUMERABLE.[]], BugId=[$0],
ARB=[$1]): rowcount = 100.0, cumulative cost = {100.0 rows, 200.0 cpu, 0.0 io},
id = 62
EnumerableTableScan(subset=[rel#61:RelSubset#0.ENUMERABLE.[]], table=[[root,
table_bugx]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0
io}, id = 60
Sets:
Set#2, type: RecordType(BIGINT BugId, VARCHAR ARB)
rel#64:RelSubset#2.ENUMERABLE.[], best=rel#60
rel#60:EnumerableTableScan.ENUMERABLE.[](table=[root,
table_bugx]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
Set#3, type: RecordType(BIGINT BugId, VARCHAR ARB)
rel#66:RelSubset#3.ENUMERABLE.[], best=rel#65
rel#65:EnumerableProject.ENUMERABLE.[](input=RelSubset#64,inputs=0..1),
rowcount=100.0, cumulative cost={200.0 rows, 301.0 cpu, 0.0 io}
rel#67:RelSubset#3.ELASTICSEARCH.[], best=null
rel#68:AbstractConverter.ELASTICSEARCH.[](input=RelSubset#66,convention=ELASTICSEARCH,sort=[]),
rowcount=100.0, cumulative cost={inf}
Graphviz:
digraph G {
root [style=filled,label="Root"];
subgraph cluster2{
label="Set 2 RecordType(BIGINT BugId, VARCHAR ARB)";
rel60 [label="rel#60:EnumerableTableScan\ntable=[root,
table_bugx]\nrows=100.0, cost={100.0 rows, 101.0 cpu, 0.0
io}",color=blue,shape=box]
subset64 [label="rel#64:RelSubset#2.ENUMERABLE.[]"]
}
subgraph cluster3{
label="Set 3 RecordType(BIGINT BugId, VARCHAR ARB)";
rel65
[label="rel#65:EnumerableProject\ninput=RelSubset#64,inputs=0..1\nrows=100.0,
cost={200.0 rows, 301.0 cpu, 0.0 io}",color=blue,shape=box]
rel68
[label="rel#68:AbstractConverter\ninput=RelSubset#66,convention=ELASTICSEARCH,sort=[]\nrows=100.0,
cost={inf}",shape=box]
subset66 [label="rel#66:RelSubset#3.ENUMERABLE.[]"]
subset67 [label="rel#67:RelSubset#3.ELASTICSEARCH.[]",color=red]
}
root -> subset67;
subset64 -> rel60[color=blue];
subset66 -> rel65[color=blue]; rel65 -> subset64[color=blue];
subset67 -> rel68; rel68 -> subset66;
}
Thanks,
Jusitn
From: Justin Huang
Sent: Monday, September 27, 2021 1:56 PM
To: [email protected]
Subject: Need help: exception when using Elasticsearch adapter
Hi Calcite developers,
I tried to convert a simple 'SELECT * FROM xxx' statement to Elasticsearch DSL
using the Elasticsearch adapter.
The toPhysicalPlan(relNode) failed when running program.run. Can anyone shed
some lights on this? Any requirements on the Table/Schema? The table I used in
my class is a DaskTable
<https://github.com/dask-contrib/dask-sql/blob/main/planner/src/main/java/com/dask/sql/schema/DaskTable.java>
(from dask-sql project) which implements the interface
ProjectableFilterableTable.
Thanks,
Justin
I am using Calcite in Python using with Jpype module, and the function sql2dsl
converts RelNode to Elasticsearch DSL:
def toPhysicalPlan(root):
rules = (
EnumerableRules.ENUMERABLE_PROJECT_RULE,
EnumerableRules.ENUMERABLE_FILTER_RULE,
EnumerableRules.ENUMERABLE_AGGREGATE_RULE,
EnumerableRules.ENUMERABLE_SORT_RULE,
EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE
)
planner = root.getCluster().getPlanner()
planner.clear()
for rule in rules:
planner.addRule(rule)
program = Programs.of(RuleSets.ofList(planner.getRules()))
traits = root.getTraitSet().replace(EnumerableConvention.INSTANCE)
return program.run(planner, root, traits, ImmutableList.of(),
ImmutableList.of())
def toElasticsearchQuery(root):
relImplementor =
EnumerableRelImplementor(root.getCluster().getRexBuilder(), ImmutableMap.of())
elasticsearchImplementor = ElasticsearchRel.Implementor()
elasticsearchImplementor.visitChild(0, root)
for x in elasticsearchImplementor.list:
print(f"x: {x}")
def sql2dsl(relNode):
esPhysicalPlan = toPhysicalPlan(relNode)
esJson = toElasticsearchQuery(esPhysicalPlan)
print(f"esJson={str(esJson)}")
Here are the error messages:
org.apache.calcite.plan.RelOptPlanner.CannotPlanException:
org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough
rules to produce a node with desired properties: convention=ENUMERABLE, sort=[].
Missing conversion is LogicalTableScan[convention: NONE -> ENUMERABLE]
There is 1 empty subset: rel#51:RelSubset#0.ENUMERABLE.[], the relevant part of
the original plan is as follows
47:LogicalTableScan(table=[[xxx]])
Root: rel#51:RelSubset#0.ENUMERABLE.[]
Original rel:
LogicalTableScan(subset=[rel#51:RelSubset#0.ENUMERABLE.[]], table=[[xxx]]):
rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 47
Sets:
Set#0, type: RecordType(VARCHAR name, BIGINT age, VARCHAR position, BIGINT
BugId)
rel#50:RelSubset#0.NONE.[], best=null
rel#47:LogicalTableScan.NONE.[](table=[xxx]), rowcount=100.0,
cumulative cost={inf}
rel#51:RelSubset#0.ENUMERABLE.[], best=null
rel#52:AbstractConverter.ENUMERABLE.[](input=RelSubset#50,convention=ENUMERABLE,sort=[]),
rowcount=100.0, cumulative cost={inf}
Graphviz:
digraph G {
root [style=filled,label="Root"];
subgraph cluster0{
label="Set 0 RecordType(VARCHAR name, BIGINT age, VARCHAR
position, BIGINT BugId)";
rel47 [label="rel#47:LogicalTableScan\ntable=[xxx]\nrows=100.0,
cost={inf}",shape=box]
rel52
[label="rel#52:AbstractConverter\ninput=RelSubset#50,convention=ENUMERABLE,sort=[]\nrows=100.0,
cost={inf}",shape=box]
subset50 [label="rel#50:RelSubset#0.NONE.[]"]
subset51 [label="rel#51:RelSubset#0.ENUMERABLE.[]",color=red]
}
root -> subset51;
subset50 -> rel47;
subset51 -> rel52; rel52 -> subset50;
}