It's definitely getting closer, thanks Jinfeng! I end up with this set of plans <https://gist.github.com/jyates/f11eb44a44af715b483859f497b9ea89#file-expanded-table-planning>. However, the only column that is returned is * and it contains just the joined id column, rather than the full column expansion.
This is in contrast to the raw file join physical plan <https://gist.github.com/jyates/f11eb44a44af715b483859f497b9ea89#file-raw-sql-physical-plan>, which has an expanded projection condition. I'm not sure how I can force the top projection to select correctly. Do I inject it as a wrapper scan around the created join, or create a new rule that takes (project, join) and replaces it with the correct project condition? Naturally, I'd have to type that rule/join to not just be the logical one, so that it converts only this generated base case.

Thanks!
--Jesse

P.S. Updated the gist <https://gist.github.com/jyates/f11eb44a44af715b483859f497b9ea89#file-drill-to-rel-table-example-java> to cover the offset changes for the degenerate 2-table case.

On Wed, Jun 1, 2016 at 10:26 PM Jinfeng Ni <[email protected]> wrote:

> Seems to me that the reason you run into problem when build the join
> condition is that you are trying to compose the condition while adding
> the fields on-the-fly in [1]. This is different from what Drill is
> doing in Calcite library. For Drill, before construct the join
> condition (RexNode) t1.id = t2.id, SqlValidator will make sure those
> fields exists in the table. For dynamic table, such check will lead
> to one additional field in table's rowType. After validation, when
> Calcite builds join condition in SqlToRelConverter, it already knows
> the field list for each input table (at that time, table's rowType
> should be immutable), and hence it can adjust the reference to the
> right table field, by adding the # of LHS fields to the index. That
> is, =($1, $1) would become =($1, $3).
>
> I feel you may follow such logic, by calling field(table1, fieldName)
> on each side of join, before construct the join condition.
>
>
> [1]
> https://gist.github.com/jyates/f11eb44a44af715b483859f497b9ea89#file-drill-to-rel-table-example-java-L94-L95
>
> On Wed, Jun 1, 2016 at 9:50 PM, Jesse Yates <[email protected]> wrote:
> > I'm building a layer "on top"[1] that hides the details of accessing
> > underlying "partitions" (stored as individual tables) and picks the right
> > tables based on the query (time partitioned, so prune tables which won't
> > fulfill the query).
> >
> > More concretely, let's take a set of files, each one fulfilling the role of
> > "table":
> > /drill
> >   /table1.json
> >   /table2.json
> > or in Drill parlance, dfs.`/drill/table1.json` and dfs.`/drill/table2.json`.
> > The query that I want to make externally is: SELECT * FROM EVENTS and I
> > want to translate that internally to
> >
> > SELECT * FROM dfs.`/drill/table1.json` AS t1 JOIN
> > dfs.`/drill/table2.json` AS t2 ON t1.id = t2.id.
> >
> > Executing that expanded (second) query from the top-level
> > java.sql.Connection executes fine, hence my thoughts about the sub-table
> > not actually being found.
> >
> > Here is what I am trying to do in my table handler - the 'EVENTS' table
> > (gist <https://gist.github.com/jyates/f11eb44a44af715b483859f497b9ea89>).
> >
> > Does that help at all?
> >
> > Thanks,
> > Jesse
> >
> > [1] Where on top in this case includes components inside Drill.
> >
> > On Wed, Jun 1, 2016 at 9:27 PM Jinfeng Ni <[email protected]> wrote:
> >
> >> I'm not sure if I understand your problem correctly. Are you trying to
> >> build some non-SQL interface on top of Drill, to join a set of dynamic
> >> tables? Can you give more concrete example?
> >>
> >> When Drill handles join over two dynamic tables, except for * column
> >> query, the dynamic tables have a list of fields defined, since those
> >> fields are referred in the query (even though the planner does not
> >> know each field's type).
> >> Therefore, the join condition will never be
> >> =($1,$1); it would be resolved to reference to left/right tables'
> >> fields.
> >>
> >>
> >>
> >> On Wed, Jun 1, 2016 at 7:43 PM, Jesse Yates <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I'm trying to rewrite a query of a table (ala Table#toRel) to join a set of
> >> > dynamic (sub-)tables on a couple of known columns but am getting stuck
> >> > building the condition. The sub-tables are not part of the original query,
> >> > but rather selected on-the-fly at logical query time.
> >> >
> >> > I can't use UNION-ALL because, outside of a couple of known columns, the
> >> > remainder are completely dynamic.
> >> >
> >> > Using RelBuilder I can construct the join via a series of scans and then
> >> > joins on the known fields[1]. However, this only creates RexInputRefs which
> >> > are not at all associated with the current ref numbering because the
> >> > sub-tables are not present in the original query. Thus, we get conditions
> >> > like:
> >> > =($1,$1), which appears to be always TRUE but actually should reference the
> >> > left/right tables' fields.
> >> >
> >> > I tried playing around with RexRangeRef and manually managing the field
> >> > offsets in query (similar to Blackboard), but that call gets translated
> >> > into an actual ALWAYS-TRUE condition and also fails the
> >> > JoinUtil#checkCartesianJoin case.
> >> >
> >> > If I construct the query via standard SQL at the top level (using known
> >> > tables), everything works fine, I think because of the ref-numbering to
> >> > which I cannot get access in #toRel().
> >> >
> >> > Any thoughts on the right way to go about this?
> >> >
> >> > Thanks much,
> >> > Jesse Yates
> >> >
> >> >
> >> > [1] Actually, this meant digging into RelBuilder
> >> > <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L309>
> >> > since dynamic tables require the field names to already be set and then
> >> > uses the column name from the list to get the field index
> >> >
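
For readers following the thread, the index adjustment discussed above (adding the number of left-hand fields to the right-hand field's index, so that =($1, $1) becomes =($1, $3)) can be sketched in plain Java. This is only an illustration of the arithmetic, not Calcite or Drill code: the class JoinRefOffset, its methods, and the two-field row types ("*" followed by "id") are hypothetical names and assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: plain Java, no Calcite or Drill classes are used. */
final class JoinRefOffset {

  /** Index of a right-hand field in the join's combined row: LHS field count plus its own index. */
  static int rightRef(int leftFieldCount, int rightIndex) {
    return leftFieldCount + rightIndex;
  }

  /** Renders "=($l, $r)" for an equi-join on a field name that exists in both inputs. */
  static String equiJoin(List<String> left, List<String> right, String name) {
    int l = left.indexOf(name);
    int r = right.indexOf(name);
    if (l < 0 || r < 0) {
      // Mirrors the point made in the thread: both fields must exist before the condition is built.
      throw new IllegalArgumentException("field not in both inputs: " + name);
    }
    return "=($" + l + ", $" + rightRef(left.size(), r) + ")";
  }

  public static void main(String[] args) {
    // Assumed row types: each dynamic table exposes "*" plus the already-known "id" field.
    List<String> t1 = Arrays.asList("*", "id");
    List<String> t2 = Arrays.asList("*", "id");
    System.out.println(equiJoin(t1, t2, "id")); // prints "=($1, $3)"
  }
}
```

The sketch resolves both field lists before building the condition, which is the ordering Jinfeng recommends; composing the condition while fields are still being added leaves the left field count unknown, so the right-hand reference cannot be shifted.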
