Hi Axel,

On May 29, 2005, at 8:00 AM, Axel Weiß wrote:

Hi all,

I've re-implemented the iconv transcode methods that return a newly
allocated string (the fixed-buffer variants are untouched) and attached
a patch to this e-mail.

Thanks!

The new implementation works as follows:
- Start with a string of (scalable) fixed initial size.

What if you make this initial buffer a static buffer? And allocate the exact size needed at the end? That keeps the memory cost constant, at the expense of a guaranteed extra string copy.

For cases where you exceed the static buffer size, continue to do your size-doubling on overflow; in those cases you'll continue to use excess memory.

I think this tradeoff would be worth it...
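
A rough sketch of what I mean, in C. It is an illustration under assumptions, not code from the patch: the name transcode_exact and the 256-byte STACK_CAP are made up, "static buffer" is read here as a fixed-size buffer on the stack, and only the wide-to-multibyte direction via wctomb is shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

enum { STACK_CAP = 256 };  /* assumed size of the fixed buffer */

/* Hypothetical helper: transcode into a fixed buffer first, fall back to a
   doubling heap buffer on overflow, and return an exact-size copy.
   The caller frees the result. Returns NULL on error. */
char *transcode_exact(const wchar_t *src)
{
    assert(src != NULL);
    char fixed[STACK_CAP];
    char *buf = fixed;              /* current working buffer */
    size_t cap = sizeof fixed;
    size_t len = 0;

    (void)wctomb(NULL, 0);          /* reset any shift state */
    for (; *src != L'\0'; ++src) {
        if (cap - len < (size_t)MB_CUR_MAX) {
            /* Not enough room for one more character: double. */
            size_t new_cap = cap * 2;
            char *grown;
            if (buf == fixed) {
                grown = malloc(new_cap);
                if (grown) memcpy(grown, fixed, len);
            } else {
                grown = realloc(buf, new_cap);
            }
            if (!grown) {
                if (buf != fixed) free(buf);
                return NULL;
            }
            buf = grown;
            cap = new_cap;
        }
        int n = wctomb(buf + len, *src);
        if (n < 0) {                /* character not representable */
            if (buf != fixed) free(buf);
            return NULL;
        }
        len += (size_t)n;
    }

    /* The guaranteed extra copy: allocate exactly what is needed. */
    char *result = malloc(len + 1);
    if (result) {
        memcpy(result, buf, len);
        result[len] = '\0';
    }
    if (buf != fixed) free(buf);
    return result;
}
```

Strings that fit in the fixed buffer cost one malloc of the exact size; longer ones hold the oversized working buffer only until the final copy is made.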

-jdb

- Do the transcoding via wctomb and mbtowc, which means it is done
per symbol.
- Each time the allocated buffer turns out to be too small, it is
reallocated at double the size.
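
The steps above can be sketched in C roughly as follows. This is only an illustration, not the code from the patch: the function name transcode_doubling and its initial_cap parameter are made up, only the wide-to-multibyte direction via wctomb is shown, and the final shift sequence that stateful encodings would need is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: transcode a wide string to the current locale's
   multibyte encoding, one symbol at a time, doubling the buffer whenever
   it gets too small. The caller frees the result. Returns NULL on error. */
char *transcode_doubling(const wchar_t *src, size_t initial_cap)
{
    assert(src != NULL);

    /* Always keep room for one full character plus the terminator. */
    size_t slack = (size_t)MB_CUR_MAX + 1;
    size_t cap = initial_cap < slack ? slack : initial_cap;
    size_t len = 0;

    char *out = malloc(cap);
    if (!out) return NULL;

    (void)wctomb(NULL, 0);          /* reset any shift state */
    for (; *src != L'\0'; ++src) {
        if (cap - len < slack) {
            /* Buffer too small: reallocate at double the size. */
            size_t new_cap = cap * 2;
            char *grown = realloc(out, new_cap);
            if (!grown) {
                free(out);
                return NULL;
            }
            out = grown;
            cap = new_cap;
        }
        int n = wctomb(out + len, *src);
        if (n < 0) {                /* character not representable */
            free(out);
            return NULL;
        }
        len += (size_t)n;
    }
    out[len] = '\0';
    return out;                     /* may be up to about 2x larger than needed */
}
```

Because the capacity doubles, the number of reallocations grows only logarithmically with the string length, which is where the O(n) overall cost comes from.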

I expect the following pros and cons:

Pros:
- Double speed (if not even better;) with small strings
- Complexity O(n) with huge strings. The overall speed performance with
large strings should always be better than before.
- String handling is done explicitly and locally, showing the maximum
possible performance, compared to some general string handling.

Cons:
There is a significant memory penalty, depending on the initial string
size:
- With very small strings (e.g. len=1), the memory eaten is several times
larger than the required minimum.
- With large strings, the statistically expected memory usage is 1.5
times the required minimum. It is guaranteed not to be worse than a
factor of 2.0.
- String handling is done explicitly and locally, and should be
generalized, once the advantages have been proven.
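
The factor-of-2.0 bound on memory can be checked with a small sketch. The function capacity_after_doubling is hypothetical and models only the doubling rule itself; it ignores any terminator or per-character slack a real implementation would add, and it assumes the sizes are small enough that doubling does not overflow size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: the capacity reached when starting at `initial`
   bytes and doubling until `needed` bytes fit. */
size_t capacity_after_doubling(size_t initial, size_t needed)
{
    assert(initial > 0);
    size_t cap = initial;
    while (cap < needed)
        cap *= 2;
    return cap;
}
```

Whenever at least one doubling happened, half of the final capacity was too small, so the final capacity is strictly less than twice the needed size. For strings shorter than the initial size the ratio is not bounded this way, which is the small-string penalty listed above.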

The disadvantage of over-consuming memory could be eliminated by a
final, additional string allocation (with the exact required size) and a
copy operation. However, I have not implemented it yet, for performance
reasons. Eventually, I'd like to offer some options to the user
(e.g. through configure), so she can choose the tradeoff between
memory consumption and speed.
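
One possible shape for that final step, sketched in C. Note that this swaps in a different technique from the allocate-and-copy described above: it asks realloc to shrink the block in place, which works only if the buffer came from malloc/realloc and would not apply to a custom allocator. The name shrink_to_fit is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: trim a NUL-terminated heap buffer down to the
   exact size it uses. Returns the (possibly moved) buffer. */
char *shrink_to_fit(char *buf)
{
    assert(buf != NULL);
    size_t used = strlen(buf) + 1;
    char *exact = realloc(buf, used);
    /* If realloc fails, the original block is still valid, so keep it. */
    return exact ? exact : buf;
}
```

Whether this is cheaper than an explicit allocation plus copy depends on the allocator, so it would need to be measured.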

The iconv transcoder could be some kind of prototype for re-organizing
the other transcoders, too. I can't do much benchmarking here, so I hope
that some people will compare my new implementation to the old one, and
report their results.

If it is of interest here, I could post a recently made complexity
analysis of the above algorithm.

Cheers,
            Axel

--
Humboldt-Universität zu Berlin
Institut für Informatik
Signalverarbeitung und Mustererkennung
Dipl.-Inf. Axel Weiß
Rudower Chaussee 25
12489 Berlin-Adlershof
+49-30-2093-3050
** www.freesp.de **

<iconv-transcoder.diff>
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

