Daniel Naber <daniel.na...@languagetool.org> wrote:
> Hi,
>
> there's a regex that makes tests quite slow in PatternTestTools.java:
>
> CHAR_SET_PATTERN =
>     Pattern.compile("(\\(\\?-i\\))?.*(?<!\\\\)\\[^?([^\\]]+)\\]")
>
> I don't fully understand it, does it need to be that complicated? If I
> simplify it like this:
>
> CHAR_SET_PATTERN = Pattern.compile("\\[^?([^\\]]+)\\]");
>
> The tests become much faster (45 seconds -> 8 seconds for Polish when
> running just the disambiguation tests).
>
> Regards
> Daniel

Hi Daniel,

I have not had time to look at what this regexp is used for, but glancing
at it, I see that it contains a zero-width negative lookbehind, i.e. the
(?<!…) part. That can be very slow, I think. In Vim at least, zero-width
lookbehinds are documented as very slow (see :help \@<! in Vim), and I
suspect that it's the same for other regexp engines. Perhaps the regexp
can be rewritten in a way that avoids the zero-width lookbehind.

Regards
Dominique

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
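
P.S. To check where the time actually goes, a small harness along the following lines might help. It is only a sketch: the class name CharSetPatternSketch, the NO_PREFIX variant, the helper methods and the test input are all made up for illustration, and it assumes the pattern is applied with Matcher.find(), which I have not verified against PatternTestTools.java. It times the original pattern, Daniel's simplified one, and a hypothetical middle ground that keeps the lookbehind but drops the leading ".*", so the cost of each part can be measured separately.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharSetPatternSketch {

    // Original pattern as quoted in the thread; group 2 holds the char set body.
    public static final Pattern ORIGINAL =
        Pattern.compile("(\\(\\?-i\\))?.*(?<!\\\\)\\[^?([^\\]]+)\\]");

    // Simplified pattern proposed in the thread; group 1 holds the char set body.
    public static final Pattern SIMPLIFIED =
        Pattern.compile("\\[^?([^\\]]+)\\]");

    // Hypothetical variant (not from the thread): lookbehind kept, ".*" prefix dropped.
    public static final Pattern NO_PREFIX =
        Pattern.compile("(?<!\\\\)\\[^?([^\\]]+)\\]");

    // Returns the requested group of the first find() hit, or null if there is none.
    public static String charSet(Pattern p, int group, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(group) : null;
    }

    // Crude wall-clock timing of repeated find() calls; not a rigorous benchmark.
    public static long timeNanos(Pattern p, String input, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            p.matcher(input).find();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Made-up input without any char set, so every find() has to scan to the end.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("abc ");
        }
        String input = sb.toString();
        int rounds = 50;

        System.out.println("original   ns: " + timeNanos(ORIGINAL, input, rounds));
        System.out.println("simplified ns: " + timeNanos(SIMPLIFIED, input, rounds));
        System.out.println("no prefix  ns: " + timeNanos(NO_PREFIX, input, rounds));
    }
}
```

Two caveats on behaviour rather than speed. The simplified pattern also matches a bracket that is escaped with a backslash, which the original skips. And because the original starts with a greedy ".*", it finds the last unescaped char set in the input, whereas the NO_PREFIX variant finds the first one, so neither is a drop-in replacement if the tested patterns can contain escaped brackets or several char sets.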