Hi Tim, it looks like your physical converter rule for a Join node does not convert it's inputs to your custom FLOWDB convention. And because of it the PhysicalJoin is trying to get input rows from the LogicalScan. You have:
PhysicalJoin[FLOWDB] LogicalTableScan[NONE] <- logical rels have infinite cost LogicalTableScan[NONE] <- logical rels have infinite cost but it should be PhysicalJoin[FLOWDB] PhysicalTableScan[FLOWDB] PhysicalTableScan[FLOWDB] In order to achieve it you need to convert both inputs of the PhysicalJoin node to the FLOWDB convention using RelOptRule#convert() and RelTraitSet#replace(FLOWDBConvention.INSTANCE) methods. You can find examples in any join converter rule, i.e. BindableJoinRule#convert [1] [1] https://github.com/apache/calcite/blob/3755eb5871860f1fd5dc51990129784caa8ac0a4/core/src/main/java/org/apache/calcite/interpreter/Bindables.java#L476 -- Kind Regards Roman Kondakov On 11.04.2020 14:22, Tim Fox wrote: > Hi All, > > I have recently started using Calcite as the query parser/planner for a > side project. I have created a set of RelNodes corresponding to my physical > nodes, and a set of rules. I have created my own convention. > > All works well for queries without a join - my physical nodes are > created fine (aggregates, projections, filters, table scans, all ok). > > When I try and transform to my physical plan where the query contains a > join, I get the following error: > > "There are not enough rules to produce a node with desired properties: > convention=FLOWDB, sort=[]. All the inputs have relevant nodes, however the > cost is still infinite." > > (full error output at bottom of the post) > > I stumbled upon this post when googling this: > > https://issues.apache.org/jira/browse/CALCITE-3255 > > I have checked and I am specifying my convention when transforming to the > physical plan, and my rules seem to be set up ok. > > There is one comment in the above linked issue that is perhaps relevant > > "You should also supply the metadata in you convention nodes, so that our > metadata system can compute the cumulative cost correctly." > > But I don't really understand what this means. Can someone explain to a > newb like me what metadata is required and how I provide it? > > Many thanks, > > full error report: > > _INITIAL: There are not enough rules to produce a node with desired > properties: convention=FLOWDB, sort=[]. All the inputs have relevant nodes, > however the cost is still infinite. > > Root: rel#76:Subset#3.FLOWDB.[] > > Original rel: > > LogicalProject(subset=[rel#76:Subset#3.FLOWDB.[]], sensor_id=[$0], > temp=[$2], name=[$4], country=[$5]): rowcount = 1500.0, cumulative cost = > {1500.0 rows, 6000.0 cpu, 0.0 io}, id = 74 > > LogicalJoin(subset=[rel#73:Subset#2.NONE.[]], condition=[=($1, $3)], > joinType=[left]): rowcount = 1500.0, cumulative cost = {1500.0 rows, 0.0 > cpu, 0.0 io}, id = 72 > > LogicalTableScan(subset=[rel#70:Subset#0.NONE.[]], > table=[[latest_sensor_readings]]): rowcount = 100.0, cumulative cost = > {100.0 rows, 101.0 cpu, 0.0 io}, id = 65 > > LogicalTableScan(subset=[rel#71:Subset#1.NONE.[]], > table=[[current_locations]]): rowcount = 100.0, cumulative cost = {100.0 > rows, 101.0 cpu, 0.0 io}, id = 66 > > Sets: > > Set#0, type: RecordType(VARCHAR sensor_id, BIGINT location_id, DOUBLE temp) > > rel#70:Subset#0.NONE.[], best=null, importance=0.7290000000000001 > > rel#65:LogicalTableScan.NONE.[](table=[latest_sensor_readings]), > rowcount=100.0, cumulative cost={inf} > > rel#84:Subset#0.FLOWDB.[], best=rel#83, importance=0.36450000000000005 > > rel#83:PhysicalTableScan.FLOWDB.[](table=[latest_sensor_readings]), > rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io} > > Set#1, type: RecordType(BIGINT location_id, VARCHAR name, VARCHAR country) > > rel#71:Subset#1.NONE.[], best=null, importance=0.7290000000000001 > > rel#66:LogicalTableScan.NONE.[](table=[current_locations]), rowcount=100.0, > cumulative cost={inf} > > rel#82:Subset#1.FLOWDB.[], best=rel#81, importance=0.36450000000000005 > > rel#81:PhysicalTableScan.FLOWDB.[](table=[current_locations]), > rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io} > > Set#2, type: RecordType(VARCHAR sensor_id, BIGINT location_id, DOUBLE temp, > BIGINT location_id0, VARCHAR name, VARCHAR country) > > rel#73:Subset#2.NONE.[], best=null, importance=0.81 > > rel#72:LogicalJoin.NONE.[](left=RelSubset#70,right=RelSubset#71,condition==($1, > $3),joinType=left), rowcount=1500.0, cumulative cost={inf} > > rel#78:Subset#2.FLOWDB.[], best=null, importance=0.9 > > rel#80:PhysicalJoin.FLOWDB.[](left=RelSubset#70,right=RelSubset#71,condition==($1, > $3),joinType=left), rowcount=1500.0, cumulative cost={inf} > > Set#3, type: RecordType(VARCHAR sensor_id, DOUBLE temp, VARCHAR name, > VARCHAR country) > > rel#75:Subset#3.NONE.[], best=null, importance=0.9 > > rel#74:LogicalProject.NONE.[](input=RelSubset#73,inputs=0,exprs=[$2, $4, > $5]), rowcount=1500.0, cumulative cost={inf} > > rel#76:Subset#3.FLOWDB.[], best=null, importance=1.0 > > rel#77:AbstractConverter.FLOWDB.[](input=RelSubset#75,convention=FLOWDB,sort=[]), > rowcount=1500.0, cumulative cost={inf} > > rel#79:PhysicalProject.FLOWDB.[](input=RelSubset#78,inputs=0,exprs=[$2, $4, > $5]), rowcount=1500.0, cumulative cost={inf} > > Graphviz: > > digraph G { > > root [style=filled,label="Root"]; > > subgraph cluster0{ > > label="Set 0 RecordType(VARCHAR sensor_id, BIGINT location_id, DOUBLE > temp)"; > > rel65 > [label="rel#65:LogicalTableScan\ntable=[latest_sensor_readings]\nrows=100.0, > cost={inf}",shape=box] > > rel83 > [label="rel#83:PhysicalTableScan\ntable=[latest_sensor_readings]\nrows=100.0, > cost={100.0 rows, 101.0 cpu, 0.0 io}",color=blue,shape=box] > > subset70 [label="rel#70:Subset#0.NONE.[]"] > > subset84 [label="rel#84:Subset#0.FLOWDB.[]"] > > } > > subgraph cluster1{ > > label="Set 1 RecordType(BIGINT location_id, VARCHAR name, VARCHAR country)"; > > rel66 > [label="rel#66:LogicalTableScan\ntable=[current_locations]\nrows=100.0, > cost={inf}",shape=box] > > rel81 > [label="rel#81:PhysicalTableScan\ntable=[current_locations]\nrows=100.0, > cost={100.0 rows, 101.0 cpu, 0.0 io}",color=blue,shape=box] > > subset71 [label="rel#71:Subset#1.NONE.[]"] > > subset82 [label="rel#82:Subset#1.FLOWDB.[]"] > > } > > subgraph cluster2{ > > label="Set 2 RecordType(VARCHAR sensor_id, BIGINT location_id, DOUBLE temp, > BIGINT location_id0, VARCHAR name, VARCHAR country)"; > > rel72 > [label="rel#72:LogicalJoin\nleft=RelSubset#70,right=RelSubset#71,condition==($1, > $3),joinType=left\nrows=1500.0, cost={inf}",shape=box] > > rel80 > [label="rel#80:PhysicalJoin\nleft=RelSubset#70,right=RelSubset#71,condition==($1, > $3),joinType=left\nrows=1500.0, cost={inf}",shape=box] > > subset73 [label="rel#73:Subset#2.NONE.[]"] > > subset78 [label="rel#78:Subset#2.FLOWDB.[]"] > > } > > subgraph cluster3{ > > label="Set 3 RecordType(VARCHAR sensor_id, DOUBLE temp, VARCHAR name, > VARCHAR country)"; > > rel74 [label="rel#74:LogicalProject\ninput=RelSubset#73,inputs=0,exprs=[$2, > $4, $5]\nrows=1500.0, cost={inf}",shape=box] > > rel77 > [label="rel#77:AbstractConverter\ninput=RelSubset#75,convention=FLOWDB,sort=[]\nrows=1500.0, > cost={inf}",shape=box] > > rel79 > [label="rel#79:PhysicalProject\ninput=RelSubset#78,inputs=0,exprs=[$2, $4, > $5]\nrows=1500.0, cost={inf}",shape=box] > > subset75 [label="rel#75:Subset#3.NONE.[]"] > > subset76 [label="rel#76:Subset#3.FLOWDB.[]"] > > } > > root -> subset76; > > subset70 -> rel65; > > subset84 -> rel83[color=blue]; > > subset71 -> rel66; > > subset82 -> rel81[color=blue]; > > subset73 -> rel72; rel72 -> subset70[label="0"]; rel72 -> > subset71[label="1"]; > > subset78 -> rel80; rel80 -> subset70[label="0"]; rel80 -> > subset71[label="1"]; > > subset75 -> rel74; rel74 -> subset73; > > subset76 -> rel77; rel77 -> subset75; > > subset76 -> rel79; rel79 -> subset78; >
