Hi David,

> however there’s no public signup for the ASF Jira anymore. I’m hoping my
> report here will suffice. If not, please create an account for me and I’ll
> file a proper ticket for it.
>

If you provide this information we can set up an account for you (sorry
about this two-step process; our JIRA has been under constant spam lately):

- Name
- Email
- Preferred user name

> Hopefully, I’ve explained it clearly enough. Please reach out with any
> further questions.
>

I think you explained it really well. If you could copy/paste it into a JIRA
issue, that would be great, as you would also get notifications for updates
from JIRA/Git/etc. there, or in this email thread.

It would also be great if you were able to provide a use case and/or unit
test. While working on the fix, we can also check whether it'd be possible to
include a benchmark test case with JMH to prevent future regressions.
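
To make concrete the kind of test case meant here, below is a rough,
self-contained sketch of the behaviour described in the report. It does not
use commons-text at all: SketchMatcher, LiteralMatcher, NaiveOrMatcher,
DelegatingOrMatcher and the copiedChars counter are hypothetical stand-ins,
and the method signatures are simplified compared with the real
StringMatcher. It only illustrates why a custom matcher that inherits a
copying CharSequence default pays a full copy per call, while one that
overrides it and delegates does not.

```java
/** Stand-in types only; none of these are the commons-text classes. */
final class MatcherCopySketch {

    /** Total chars copied by the default CharSequence path (for the demo only). */
    static long copiedChars = 0;

    /** Hypothetical matcher, loosely shaped like the interface in the report. */
    interface SketchMatcher {
        /** Returns the match length at pos, or 0 if there is no match. */
        int isMatch(char[] buffer, int pos);

        /** Default CharSequence path: copies the whole buffer on every call. */
        default int isMatch(CharSequence buffer, int pos) {
            char[] copy = new char[buffer.length()];
            for (int i = 0; i < copy.length; i++) {
                copy[i] = buffer.charAt(i);
            }
            copiedChars += copy.length;
            return isMatch(copy, pos);
        }
    }

    /** Matches a fixed string and overrides the CharSequence path to avoid copying. */
    static final class LiteralMatcher implements SketchMatcher {
        private final char[] chars;

        LiteralMatcher(String literal) {
            this.chars = literal.toCharArray();
        }

        @Override
        public int isMatch(char[] buffer, int pos) {
            if (pos + chars.length > buffer.length) {
                return 0;
            }
            for (int i = 0; i < chars.length; i++) {
                if (buffer[pos + i] != chars[i]) {
                    return 0;
                }
            }
            return chars.length;
        }

        @Override
        public int isMatch(CharSequence buffer, int pos) {
            if (pos + chars.length > buffer.length()) {
                return 0;
            }
            for (int i = 0; i < chars.length; i++) {
                if (buffer.charAt(pos + i) != chars[i]) {
                    return 0;
                }
            }
            return chars.length;
        }
    }

    /** Custom "or" matcher that inherits the copying default. */
    static final class NaiveOrMatcher implements SketchMatcher {
        private final SketchMatcher[] delegates;

        NaiveOrMatcher(SketchMatcher... delegates) {
            this.delegates = delegates;
        }

        @Override
        public int isMatch(char[] buffer, int pos) {
            for (SketchMatcher d : delegates) {
                int len = d.isMatch(buffer, pos);
                if (len > 0) {
                    return len;
                }
            }
            return 0;
        }
    }

    /** Same matcher, but the CharSequence path is delegated instead of copied. */
    static final class DelegatingOrMatcher implements SketchMatcher {
        private final SketchMatcher[] delegates;

        DelegatingOrMatcher(SketchMatcher... delegates) {
            this.delegates = delegates;
        }

        @Override
        public int isMatch(char[] buffer, int pos) {
            for (SketchMatcher d : delegates) {
                int len = d.isMatch(buffer, pos);
                if (len > 0) {
                    return len;
                }
            }
            return 0;
        }

        @Override
        public int isMatch(CharSequence buffer, int pos) {
            for (SketchMatcher d : delegates) {
                int len = d.isMatch(buffer, pos);
                if (len > 0) {
                    return len;
                }
            }
            return 0;
        }
    }

    /** Probes every position of text, roughly as a substitutor scan would. */
    static int countMatches(SketchMatcher matcher, CharSequence text) {
        int count = 0;
        for (int pos = 0; pos < text.length(); pos++) {
            if (matcher.isMatch(text, pos) > 0) {
                count++;
            }
        }
        return count;
    }
}
```

With a text of n characters, the naive variant copies n characters at each
of the n probed positions, so the copying grows quadratically, which would
be consistent with the slowdown reported on large inputs. A real unit test
or JMH benchmark would of course need to exercise StringSubstitutor itself.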

Cheers

-Bruno


On Wed, 8 Feb 2023 at 23:41, David Becker <dbec...@employers.com> wrote:

> I would like to report a performance regression when using
> StringSubstitutor with large strings that our application experienced after
> upgrading to v1.9+, however there’s no public signup for the ASF Jira
> anymore. I’m hoping my report here will suffice. If not, please create an
> account for me and I’ll file a proper ticket for it.
>
> As of v1.9, StringSubstitutor no longer pre-converts the TextStringBuilder
> to a char[], see this (
> https://github.com/apache/commons-text/commit/248af06171e14648e00ce0873c5f95e03041a6c7
> ) commit, and opts to reuse the TextStringBuilder API instead. A new
> default method (
> https://github.com/apache/commons-text/blob/master/src/main/java/org/apache/commons/text/matcher/StringMatcher.java#L146
> ) was added to StringMatcher that takes CharSequence (which
> TextStringBuilder implements) to handle the conversion. However, it calls
> CharSequenceUtils.toCharArray(buffer), which is not aware of
> TextStringBuilder and cannot optimize the conversion to char[] since
> CharSequence has no way to do so, and it’s not a String (which does). When
> using a custom StringMatcher implementation that does not override this
> default method (as the stock matchers do), it results in a full copy of the
> CharSequence being made, which adds up very quickly when the text is large
> and lots of replacements are being made.
>
> Methods of ours which used to take 3 seconds, now take upwards of a
> minute. Fortunately, our custom matcher is a simple OrStringMatcher (not
> provided out-of-the-box) which delegates to stock StringMatchers created by
> StringMatcherFactory.stringMatcher(…) which have their own optimized
> implementation of the method, so we were able to resolve this ourselves by
> overriding the method and delegating it directly to the optimized
> implementation of the stock matchers – thus bypassing the
> CharSequenceUtils.toCharArray(buffer) penalty completely.
>
> But others may not be as fortunate. Perhaps the default method could be
> made aware of TextStringBuilder and use its package protected getBuffer()
> method instead? Or maybe there’s a better way to solve it.
>
> Hopefully, I’ve explained it clearly enough. Please reach out with any
> further questions.
>
> --
> David Becker
> Senior IT Engineer
>
