[ https://issues.apache.org/jira/browse/CODEC-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581834#comment-13581834 ]

Julius Davies commented on CODEC-166:
-------------------------------------


Here's a before-and-after run on my machine.  I also threw in my patch for a
run.  It doesn't conflict with TN's patch!  (Nice!)


{noformat}
Trunk
-----------------------------------------
TINY DATA new byte[12]
encode 18.0 MB/s    decode 25.0 MB/s
encode 18.0 MB/s    decode 24.0 MB/s

MEDIUM DATA new byte[1234]
encode 137.0 MB/s    decode 186.0 MB/s
encode 137.0 MB/s    decode 182.0 MB/s
{noformat}



{noformat}
TN Patch
-----------------------------------------
TINY DATA new byte[12]
encode 159.0 MB/s    decode 139.0 MB/s
encode 147.0 MB/s    decode 140.0 MB/s

MEDIUM DATA new byte[1234]
encode 311.0 MB/s    decode 212.0 MB/s
encode 299.0 MB/s    decode 221.0 MB/s
{noformat}


{noformat}
TN Patch + ApacheModifiedMiGBase64 Patch
-----------------------------------------
TINY DATA new byte[12]
encode 275.0 MB/s    decode 178.0 MB/s
encode 279.0 MB/s    decode 178.0 MB/s

MEDIUM DATA new byte[1234]
encode 553.0 MB/s    decode 261.0 MB/s
encode 558.0 MB/s    decode 263.0 MB/s
{noformat}


I find it kind of weird that my patch (ApacheModifiedMiGBase64) consistently 
runs about 8% faster than MiGBase64 on encode.  I can't explain that.  I do 
worse on decode because I handle the "AA==AA==" situation (padding in the 
middle), which MiGBase64 normally doesn't check for.
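
To make the "AA==AA==" case concrete, here is a minimal sketch of what checking for padding in the middle can look like.  This is not the code from either patch.  The class and method names are hypothetical, it relies on java.util.Base64 (Java 8+) rather than Commons Codec, and treating each padded run as an independent segment is an assumption about how such input should be interpreted.

```java
import java.io.ByteArrayOutputStream;
import java.util.Base64;

// Hypothetical helper, for illustration only; not part of Commons Codec or MiGBase64.
final class InteriorPaddingSketch {

    /** Returns true if a '=' is followed later by a character that is not '='. */
    static boolean hasInteriorPadding(String s) {
        int firstPad = s.indexOf('=');
        if (firstPad < 0) {
            return false; // no padding at all
        }
        for (int i = firstPad + 1; i < s.length(); i++) {
            if (s.charAt(i) != '=') {
                return true; // data continues after a padding run
            }
        }
        return false;
    }

    /**
     * Decodes input made of concatenated, individually padded Base64 segments.
     * Assumption: each run of '=' ends one segment and the next one starts fresh.
     */
    static byte[] decodeConcatenated(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int start = 0;
        while (start < s.length()) {
            int end = s.indexOf('=', start);
            if (end < 0) {
                end = s.length(); // last segment has no padding
            } else {
                while (end < s.length() && s.charAt(end) == '=') {
                    end++; // include the whole padding run in this segment
                }
            }
            byte[] part = Base64.getDecoder().decode(s.substring(start, end));
            out.write(part, 0, part.length);
            start = end;
        }
        return out.toByteArray();
    }
}
```

With this sketch, decodeConcatenated("AA==AA==") yields two zero bytes.  The extra scan for '=' is the kind of per-input work that a decoder skipping this check does not pay for.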


And here's an interesting number to keep in mind.  We're all somewhat close to 
this (off by a factor of 5 or 10).  Surely this is a reasonable (and 
unattainable) upper bound:

{noformat}
Just counting bytes 1-by-1...
encode 2570.0 MB/s    decode 2328.0 MB/s
encode 2243.0 MB/s    decode 2243.0 MB/s
{noformat}
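
For anyone who wants to reproduce a baseline like this, here is a rough sketch of a byte-counting throughput loop.  It is not the code from base64bench.zip.  The class name, the iteration count, and the choice of 1 MB = 1,000,000 bytes are all assumptions, and a naive timing loop like this is only indicative.

```java
// Hypothetical baseline, for illustration only; not taken from the attached benchmark.
final class ByteCountBaseline {

    // Written after each run so the JIT cannot discard the counting loop.
    static volatile long sink;

    /** Touches every byte exactly once and returns the sum of their unsigned values. */
    static long countBytes(byte[] data) {
        long sum = 0;
        for (byte b : data) {
            sum += b & 0xFF;
        }
        return sum;
    }

    /** Measures throughput in MB/s, assuming 1 MB = 1,000,000 bytes. */
    static double megabytesPerSecond(byte[] data, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += countBytes(data);
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        sink = total;
        double bytes = (double) data.length * iterations;
        return bytes / 1_000_000.0 / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        byte[] medium = new byte[1234];
        megabytesPerSecond(medium, 200_000); // warm-up pass, result ignored
        System.out.println("count " + megabytesPerSecond(medium, 200_000) + " MB/s");
    }
}
```

Any real encoder or decoder has to do at least this much work per byte, plus table lookups and writes to an output array, which is why the counting loop serves as a ceiling rather than a target.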



                
> Base64 could be faster
> ----------------------
>
>                 Key: CODEC-166
>                 URL: https://issues.apache.org/jira/browse/CODEC-166
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Julius Davies
>            Assignee: Julius Davies
>             Fix For: 1.8
>
>         Attachments: base64bench.zip, CODEC-166.patch, CODEC-166_speed.patch
>
>
> Our Base64 consistently performs 3 times slower compared to MiGBase64 and 
> iHarder in the byte[] and String encode() methods.
> We are pretty good on decode(), though a little slower (approx. 33% slower) 
> than MiGBase64.
> We always win in the Streaming methods (MiGBase64 doesn't do streaming).  
> Yay!  :-) :-) :-)
> I put together a benchmark.  Here's a typical run:
> {noformat}
>   LARGE DATA new byte[12345]
> iHarder...
> encode 486.0 MB/s    decode 158.0 MB/s
> encode 491.0 MB/s    decode 148.0 MB/s
> MiGBase64...
> encode 499.0 MB/s    decode 222.0 MB/s
> encode 493.0 MB/s    decode 226.0 MB/s
> Apache Commons Codec...
> encode 142.0 MB/s    decode 146.0 MB/s
> encode 138.0 MB/s    decode 150.0 MB/s
> {noformat}
> I believe the main approach we can consider to improve performance is to 
> avoid array copies at all costs.   MiGBase64 even counts the number of valid 
> Base64 characters ahead of time on decode() to precalculate the result's size 
> and avoid any array copying!
> I suspect this will mean writing out separate execution paths for the String 
> and byte[] methods, and keeping them out of the streaming logic, since the 
> streaming logic is founded on array copy.
> Unfortunately this means we will diminish internal reuse of the streaming 
> implementation, but I think it's the only way to improve performance, if we 
> want to.
