[Resurrecting a semi-dead thread]
Hi Shevek,

What about using a bytecode-emitting solution (something like JOni, from JRuby)?

Some work along these lines was also done in Nashorn:

http://mail.openjdk.java.net/pipermail/nashorn-dev/2013-May/001063.html

http://cr.openjdk.java.net/~hannesw/8012269/

Thanks to Charlie Nutter for jogging my memory about the prior art for
this type of approach. If anyone here worked on the relevant part of
Nashorn, they can probably speak to this much better than I can from
my very limited experience with it.
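
For reference, the "any one of a number of substrings" case in the quoted
message is what the textbook table-driven matchers handle. Below is a
minimal sketch of one of them (Aho-Corasick) in plain Java. It is not taken
from JOni or Nashorn, and the names SubstringFilter and matchesAny are made
up for illustration. It assumes matching on Java chars and uses boxed maps
for brevity, so it shows the technique rather than a tuned implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: answers "does the text contain any of the needles?" */
final class SubstringFilter {
    // Outgoing trie transitions per state; state 0 is the root.
    private final List<Map<Character, Integer>> next = new ArrayList<>();
    private final int[] fail;         // failure link per state
    private final boolean[] terminal; // a needle ends at this state or at a suffix of it

    SubstringFilter(List<String> needles) {
        next.add(new HashMap<>());
        List<Boolean> ends = new ArrayList<>();
        ends.add(false);
        // Phase 1: insert every needle into the trie.
        for (String needle : needles) {
            int state = 0;
            for (int i = 0; i < needle.length(); i++) {
                char c = needle.charAt(i);
                Integer child = next.get(state).get(c);
                if (child == null) {
                    child = next.size();
                    next.add(new HashMap<>());
                    ends.add(false);
                    next.get(state).put(c, child);
                }
                state = child;
            }
            ends.set(state, true);
        }
        int n = next.size();
        fail = new int[n];
        terminal = new boolean[n];
        for (int s = 0; s < n; s++) {
            terminal[s] = ends.get(s);
        }
        // Phase 2: breadth-first pass computing failure links.
        // Depth-1 states keep the default failure link (root).
        ArrayDeque<Integer> queue = new ArrayDeque<>(next.get(0).values());
        while (!queue.isEmpty()) {
            int state = queue.remove();
            for (Map.Entry<Character, Integer> e : next.get(state).entrySet()) {
                char c = e.getKey();
                int child = e.getValue();
                int f = fail[state];
                while (f != 0 && !next.get(f).containsKey(c)) {
                    f = fail[f];
                }
                Integer target = next.get(f).get(c);
                fail[child] = (target == null) ? 0 : target;
                // Inherit matches that end at the longest proper suffix.
                terminal[child] |= terminal[fail[child]];
                queue.add(child);
            }
        }
    }

    /** Single pass over the text; stops at the first needle found. */
    boolean matchesAny(CharSequence text) {
        if (terminal[0]) {
            return true; // an empty needle matches everything
        }
        int state = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (state != 0 && !next.get(state).containsKey(c)) {
                state = fail[state];
            }
            Integer target = next.get(state).get(c);
            state = (target == null) ? 0 : target;
            if (terminal[state]) {
                return true;
            }
        }
        return false;
    }
}
```

The automaton is built once from the configured substrings and then shared
read-only across threads, so there is no per-record Matcher state to
allocate. A version meant for a billion records would likely flatten the
maps into int[] tables to avoid boxing on the hot path.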

Cheers,

Ben

On Thu, 15 Nov 2018 at 02:08, Shevek <[email protected]> wrote:
>
> Dear wizards, please advise.
>
> I need to offer a user configuration feature for pattern matching, to
> exclude objects from my billion object sort-merge (which is now working
> fairly well, thank you all).
>
> What we're mostly trying to do is exclude any record which contains any
> one of a number of substrings. The computer science textbooks give
> various fast-string-searching algorithms with pre-computed tables, any
> of which would suit our use case, but I don't see a practical
> implementation of any of them floating around...
>
> Current practical options:
> * java.util.regex, precompiled patterns
>    - Reputedly slow at matching, but our patterns are simple.
>    - We are using a single regex containing a large alternation.
>    - perf says this regex matcher is 50% of our runtime.
>    - WHY isn't Matcher.usePattern() allocation-free? It totally could
> be. This means that the int[] array allocation is a major drain on the GC.
>    - If we use ThreadLocal<Matcher> instead, the ThreadLocal can't be
> static, and hits the previously discussed issue with blowing out the
> ThreadLocalMap.
> * re2j
>    - Not tried yet - has anyone tried this?
> * brics
>    - Trying and failing on brics may make sense before falling back to
> java regex.
> * Groovy Closure
>    - Faster than regex/pattern assuming you have a better strategy for
> matching. But now you're down to repeated contains() calls.
>
> What are the suggestions?
>
> Thank you.
>
> S.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
