zhtaoxiang commented on PR #12856: URL: https://github.com/apache/pinot/pull/12856#issuecomment-2048899423
> > > It also replaces *? with * for custom metric rules as the surrounding pattern matching is well defined.
> >
> > Can you please elaborate more on this? Maybe you can use an example to explain why it is needed?
>
> @zhtaoxiang As @Jackie-Jiang pointed out in point 3 [here](https://github.com/apache/pinot/pull/12739#discussion_r1554045076), let's say a metric is of the format `somePrefix.tableName.someMetric`.
>
> Case 1 using `*?` : pattern = `somePrefix\\.([^.]*?)\\.someMetric`
> The parser will do conservative iterations of
>
> 1 . `somePrefix.t` -> fails as `.` is not encountered after it
> 2 . `somePrefix.ta` -> fails as `.` is not encountered after it
> .
> .
> n . `somePrefix.tableName` -> success as `.` is encountered after it
>
> Case 1 using `*` : pattern = `somePrefix\\.([^.]*)\\.someMetric`
> The parser will match `somePrefix.tableName` in the first iteration itself as it's greedy and only breaks at the `.` in `.someMetric`

I get the logic for `*?`.

For Case 1 using `*`, if it's greedy, I feel `*` should match `tableName.someMetric` first, then `tableName.someMetri`, then `tableName.someMetr`, and so on. If my understanding is correct, isn't greedy matching slower? In most cases the suffix will be much longer than the prefix, so greedy would need more attempts.

For example, if the pattern is `somePrefix\\.([^.]*?)\\.someMetric\\.suffix1\\.suffix2\\.suffix3` and the metric is `somePrefix.sometable.someMetric.suffix1.suffix2.suffix3`, we need to try 11 times; if the pattern is `somePrefix\\.([^.]*)\\.someMetric\\.suffix1\\.suffix2\\.suffix3`, we need to try 25 times.
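One way to check what each quantifier actually consumes is a small sketch like the one below. It uses Python's `re` module as a stand-in for the Java regex engine, which is an assumption on my part; both are backtracking engines, but this only shows what gets captured and does not measure step counts. The metric and the two patterns come from the example above, written with a single backslash because Python raw strings do not need the doubled escaping; `dot_greedy` and `first_pass` are hypothetical names added purely for contrast and are not part of the PR.

```python
import re

# Sample metric name taken from the example in this comment.
metric = "somePrefix.sometable.someMetric.suffix1.suffix2.suffix3"

# Shared tail of both patterns.
suffix = r"\.someMetric\.suffix1\.suffix2\.suffix3"

# The two variants under discussion: lazy (*?) and greedy (*).
lazy = re.compile(r"somePrefix\.([^.]*?)" + suffix)
greedy = re.compile(r"somePrefix\.([^.]*)" + suffix)

lazy_capture = lazy.fullmatch(metric).group(1)
greedy_capture = greedy.fullmatch(metric).group(1)

# With no suffix to force backtracking, this shows how far the greedy
# negated class extends by itself: [^.] cannot match a literal dot.
first_pass = re.match(r"somePrefix\.([^.]*)", metric).group(1)

# Contrast only (hypothetical, not in the PR): an unrestricted greedy
# wildcard does run to the end of the input.
dot_greedy = re.compile(r"somePrefix\.(.*)")
dot_capture = dot_greedy.fullmatch(metric).group(1)

print("lazy   [^.]*? captures:", lazy_capture)
print("greedy [^.]*  captures:", greedy_capture)
print("greedy [^.]*  alone   :", first_pass)
print("greedy .*     captures:", dot_capture)
```

If `first_pass` comes back as just the table name, then the greedy class stops at the first `.` rather than running into the suffix, and the consume-then-back-off behaviour I described would only apply to an unrestricted `.*`.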
