Hi all, as discussed in earlier exchanges (see [1]), I started to work on implementing MATCH_RECOGNIZE based on Julian (Hydes) work [2]. I think I made some progress and resolved some of the problems but I’m at a stage now where I’d need some advice from more seasoned Calcite dev’s on how to continue.
Basically, I improved the Matching (based on a DFA now) and ensured that we have the full path of symbols (not just rows) as base for the “Emitter”. The Matcher code should also be working now EXCEPT for the PREV / NEXT Commands. I think we could handle them pretty easily as I introduced a “MemoryEnumerable”, i.e., an Enumerable which keeps all records in a window around the current record (history and future). I think this should work here, as we have NO unbounded windows like for regular window functions. So, basically the Matcher gets for each step a “Memory” Object which has a “get(n)” method to get the current (n = 0), present (n < 0) or future (n > 0) row. So, an expression of the Form “PREV(UP.$4, $5)” should be converted to something like “row.get($5).$4”. I have no real clue how to do this the “right” way, perhaps by a custom InputGetter which automatically introduces the “.get()”? When this is implemented the Matcher should be finished (?). Another thing the current implementation is missing is the ordering inside the partitions (which is similar to window functions). Do you think we can simply reuse the code from there? Generally, MATCH_RECOGNIZE could be implemented as regular WINDOW Function in the situation where one output row is generated for each input row, but I think this does not help us much as there also is the other “MODE” where it outputs a single row for each (possibly arbitrary long) match. Parallel to this mail I submitted a PR to merge my branch back to Julians work to have a common “checkpoint” for the next steps. I would really value if someone could step in (Julian?) with either implement parts of the problems stated above or give me some hints on how to address this properly so that I can try to go further. I don’t know what the usual way is but if it helps perhaps we can arrange a Screen sharing session or something to walk through the new code, if necessary. Best JulianF [1] https://lists.apache.org/thread.html/98f67c4534c32b544e48d54abca19f0e89fe8a163e5d5b822d80e6f0@%3Cdev.calcite.apache.org%3E [2] https://github.com/julianhyde/calcite/tree/1935-match-recognize
