Hi Gerd,

1) always release memory for mdr19, not only if it was filled for the device.
2) check if sort key has zero length, if yes, don't allocate new buffer
3) don't use MultiSortKey for mdr7 if --x-split-name-index is not used
4) count occurrences of the generated key strings, cache sort keys for those
keys which occur more than 3 times.
5) create smaller copy of byte array if number of allocated bytes was too high
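
For readers without the patch at hand, items 2, 4 and 5 can be sketched roughly as below. This is not the patch itself: the class CountedKeyCacheSketch, the methods buildKeys() and trim(), and the keyMaker parameter are hypothetical names used only for illustration, and the real code in Mdr7 and Sort may be organised quite differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not mkgmap code.
final class CountedKeyCacheSketch {
    // "more than 3 times" from item 4
    static final int MIN_USES = 4;
    // Shared instance so that zero-length keys need no new buffer (item 2)
    static final byte[] EMPTY = new byte[0];

    /** Builds one key per name; only names seen at least MIN_USES times share a cached array. */
    static List<byte[]> buildKeys(List<String> names, Function<String, byte[]> keyMaker) {
        // Pass 1: count how often each key string occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String name : names) {
            counts.merge(name, 1, Integer::sum);
        }

        // Pass 2: cache keys for frequent strings, build the rest directly.
        Map<String, byte[]> cache = new HashMap<>();
        List<byte[]> keys = new ArrayList<>(names.size());
        for (String name : names) {
            if (counts.get(name) >= MIN_USES) {
                keys.add(cache.computeIfAbsent(name, keyMaker));
            } else {
                keys.add(keyMaker.apply(name));
            }
        }
        return keys;
    }

    /** Item 5: return a smaller copy when the buffer was over-allocated. */
    static byte[] trim(byte[] buf, int used) {
        if (used == 0) {
            return EMPTY;
        }
        return used < buf.length ? Arrays.copyOf(buf, used) : buf;
    }
}
```

Here keyMaker stands in for whatever creates the sort key bytes. Strings that occur two or three times still get separate arrays in this scheme.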

@Steve: Please review, I think 1-4 are okay, 5) may cause problems if my 
understanding of Sort.fillkey() is wrong.

Looks OK.

I agree with adding a cache in Mdr7; I don't know why I omitted it
to begin with.

But does the complex counted cache improve memory use over the simple
case?  As the cache is temporary, after the routine returns there will
be a greater memory use from all the keys that are not de-duplicated.
It seems to me that the temporary memory required during
preWriteImpl() would also be larger unless there are a vast number of
single use keys.
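
To make the comparison concrete, the "simple case" I have in mind is roughly the following. It is only a sketch: SimpleKeyCacheSketch, buildKeys(), distinctArrays() and keyMaker are hypothetical names, not code from mkgmap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not mkgmap code.
final class SimpleKeyCacheSketch {
    /** Every key goes through the cache, so equal strings always share one array. */
    static List<byte[]> buildKeys(List<String> names, Function<String, byte[]> keyMaker) {
        Map<String, byte[]> cache = new HashMap<>();
        List<byte[]> keys = new ArrayList<>(names.size());
        for (String name : names) {
            keys.add(cache.computeIfAbsent(name, keyMaker));
        }
        return keys;
    }

    /** Counts the distinct array instances (by identity) that stay referenced. */
    static int distinctArrays(List<byte[]> keys) {
        Set<byte[]> seen = Collections.newSetFromMap(new IdentityHashMap<byte[], Boolean>());
        seen.addAll(keys);
        return seen.size();
    }
}
```

With this version the number of key arrays that survive after the routine returns equals the number of distinct strings, whereas a counted cache keeps a separate array for every occurrence of a string seen three times or fewer. The price is one temporary map entry per distinct string while the routine runs.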

I was also surprised to see that we require such long sort keys, even with 
--latin1.
I see many 0 bytes in the created keys, maybe that can be optimized further?

It's possible, see for example:

  http://www.unicode.org/reports/tr10/#Implementation_Notes

As I was working out how the Srt files worked, I gradually came to
realise that it was pretty much the same way that Java does collation.
I changed the syntax of the resource/sort/*.txt files to more closely
match the language that is used by RuleBasedCollator().  There is also
the icu4j project http://site.icu-project.org/

I can't remember how close either of them were to working as needed
for mkgmap. In the end my code was faster, so I stayed with it.
It's possible that the built-in Java code could be made to work and
it might be faster now, or perhaps I missed a trick to make it so
at the time.
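
For reference, the built-in route would look something like the sketch below, using only java.text. The rule string is a made-up example that sorts 'b' before 'a'; it is not taken from the mkgmap sort files, and the class and method names are hypothetical.

```java
import java.text.CollationKey;
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical sketch, not mkgmap code.
final class CollatorKeySketch {
    // Illustrative rules only: 'b' sorts before 'a'.
    static final String RULES = "< b < a";

    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalArgumentException("bad collation rules", e);
        }
    }

    /** Returns the sort key bytes that Java generates for the string. */
    static byte[] sortKey(RuleBasedCollator collator, String s) {
        CollationKey key = collator.getCollationKey(s);
        return key.toByteArray();
    }

    /** Compares two strings via their collation keys rather than directly. */
    static int compareByKey(RuleBasedCollator collator, String x, String y) {
        return collator.getCollationKey(x).compareTo(collator.getCollationKey(y));
    }
}
```

Comparing the pre-computed CollationKey objects gives the same ordering as calling compare() on the strings, which is the same idea as the sort keys we build ourselves.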

..Steve
