I have been musing on this JIRA:

Path - multiple symbol matches per row
https://issues.apache.org/jira/browse/MADLIB-943

and have become concerned about combinatorial explosion, even for a modest number of symbol hits per row. If each row matches n symbols and there are m rows in a partition, the number of symbol combinations per partition is n^m. For example, n=2 and m=50 gives 2^50, or ~1e15 symbol combinations, which we certainly don't want to traverse.

Does anyone have experience with, or an opinion on, this topic?

In the current version of MADlib.path()
http://madlib.incubator.apache.org/docs/latest/group__grp__path.html
a given row can only match one symbol. If a row matches multiple symbols, the symbol that comes first in the symbol definition list takes precedence.

In some examples I have seen around
https://aster-community.teradata.com/community/learn-aster/blog/2015/07/01/super-sweet-npath-examples-with-source-code
it seems that multiple symbols per row are used.

The question is: do we need to address this at all?

Frank
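
To make the n^m count above concrete, here is a minimal Python sketch. It is only an illustration, not how MADlib implements path(): the function names are hypothetical, and it assumes each row's matching symbols are already listed in symbol-definition order.

```python
from itertools import product


def count_symbol_sequences(n, m):
    """Number of distinct symbol sequences when each of m rows matches n symbols."""
    return n ** m


def enumerate_symbol_sequences(row_matches):
    """Enumerate every symbol sequence for a partition.

    row_matches is a list with one entry per row; each entry is the list of
    symbols that row matches (assumed to be in symbol-definition order).
    Only feasible for tiny inputs, which is exactly the concern.
    """
    return ["".join(seq) for seq in product(*row_matches)]


def first_match_sequence(row_matches):
    """Current behaviour as described above: the first-defined symbol wins,
    so each partition yields exactly one sequence."""
    return "".join(symbols[0] for symbols in row_matches)


if __name__ == "__main__":
    print(count_symbol_sequences(2, 50))   # 1125899906842624, i.e. ~1.1e15

    # Toy partition of 3 rows: two rows match both A and B, one matches only C.
    rows = [["A", "B"], ["A", "B"], ["C"]]
    print(enumerate_symbol_sequences(rows))  # ['AAC', 'ABC', 'BAC', 'BBC']
    print(first_match_sequence(rows))        # AAC
```

With first-match precedence the work per partition stays linear in m, whereas enumerating all combinations grows as n^m.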
