[jira] [Commented] (LANG-935) Possible performance improvement on string escape functions

Fabian Lange (JIRA) Sat, 14 Mar 2015 02:55:17 -0700

    [ 
https://issues.apache.org/jira/browse/LANG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361694#comment-14361694
 ]


Fabian Lange commented on LANG-935:
-----------------------------------

Thomas,
I can guarantee you it outperforms the current implementation in all the 
benchmarks from this test, all unit tests (of which there are way too few) and 
in all internal usages I have found.

If you say it performs slower in a very specific use case, then I am happy to 
address this, but I have not found any so far. If you look at it from an 
algorithmic complexity point of view, you will find that my patch is 
significantly better.
Consider this: the old "let's take a hash code" approach also has to iterate 
over all chars, because none of the substring hash codes are pre-populated. In 
fact, it turns out the current implementation performs worse than character 
iteration in ALL cases, simply because the current algorithm requires this.
My patch skips a lot of that iteration completely in many frequent use cases.
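
To make the argument above concrete, here is a minimal sketch of one way a 
lookup-based translator can avoid substring creation and hashing at most 
positions. It assumes a pre-check on the first character of each lookup key; 
the class name PrefixLookupSketch and its fields are hypothetical and are not 
taken from the attached patch or from Commons Lang itself.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the patch attached to LANG-935.
public class PrefixLookupSketch {
    private final Map<String, String> lookup = new HashMap<>();
    private final Set<Character> firstChars = new HashSet<>();
    private int shortest = Integer.MAX_VALUE;
    private int longest = 0;

    // Each pair is {key, replacement}; keys are assumed to be non-empty.
    public PrefixLookupSketch(String[][] pairs) {
        for (String[] pair : pairs) {
            lookup.put(pair[0], pair[1]);
            firstChars.add(pair[0].charAt(0));
            shortest = Math.min(shortest, pair[0].length());
            longest = Math.max(longest, pair[0].length());
        }
    }

    public String translate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            String replacement = null;
            int consumed = 0;
            // Cheap pre-check: only build and hash substrings when some key
            // actually starts with the current character.
            if (firstChars.contains(c)) {
                int max = Math.min(longest, input.length() - i);
                // Try the longest candidate first (greedy match).
                for (int len = max; len >= shortest; len--) {
                    String r = lookup.get(input.substring(i, i + len));
                    if (r != null) {
                        replacement = r;
                        consumed = len;
                        break;
                    }
                }
            }
            if (replacement != null) {
                out.append(replacement);
                i += consumed;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

In this sketch, an input position whose character starts no key costs one set 
lookup and no substring hashing. Keys that share a first character and a 
length still go through the hash lookup, so they gain less, which matches the 
caveat about equally sized strings with the same first character.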

So please back up your objection with a concrete example we can benchmark. Yes, 
many equally sized strings with the same first character do not benefit as 
much as the other use cases, but they do in fact still benefit.

> Possible performance improvement on string escape functions
> -----------------------------------------------------------
>
>                 Key: LANG-935
>                 URL: https://issues.apache.org/jira/browse/LANG-935
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Peter Wall
>            Priority: Minor
>              Labels: performance
>             Fix For: Patch Needed
>
>         Attachments: tempproject1.zip
>
>
> The escape functions for HTML etc. use the same code and the same 
> initialisation tables for the escape and unescape functions, and while this 
> is an elegant approach it leads to a number of deficiencies:
> 1. The code is very much less efficient than it could be
> 2. A new output string is created even when no conversion is required
> 3. No mapping is provided for characters that do not have a specific 
> representation (for example HTML 0x101 should become &#257;)
> The proposal is to use a new mapping technique to address these issues
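
Points 2 and 3 of the description can be illustrated with a minimal sketch, 
assuming a lazily allocated output buffer and a numeric character reference 
fallback for code points above 0x7E. The class EscapeSketch, its method names 
and the 0x7E threshold are assumptions made for illustration; they are not 
taken from the issue's attachment or from Commons Lang.

```java
// Hypothetical illustration only; not the mapping technique from the issue.
public class EscapeSketch {

    public static String escapeHtml(String input) {
        StringBuilder out = null; // allocated only once something needs escaping
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            String replacement = replacementFor(cp);
            if (replacement != null) {
                if (out == null) {
                    out = new StringBuilder(input.length() + 16);
                    out.append(input, 0, i); // copy the untouched prefix once
                }
                out.append(replacement);
            } else if (out != null) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        // Point 2: return the original instance when nothing changed.
        return out == null ? input : out.toString();
    }

    private static String replacementFor(int cp) {
        switch (cp) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            case '"': return "&quot;";
            default:
                // Point 3: numeric fallback for characters without a named mapping
                // (assumed threshold: anything above 0x7E).
                return cp > 0x7E ? "&#" + cp + ";" : null;
        }
    }
}
```

With this sketch, escapeHtml returns the same String instance for input that 
needs no escaping, and the character 0x101 is written as &#257;.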



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
