Ulf Zibis wrote:
On 24.04.2010 01:09, Xueming Shen wrote:

I changed the data file "format" a bit, so now the overall uniName.dat is less than 88k (the last version was 122+k), but I can no longer use cpLen as the capacity for the hashmap. I'm now using a hardcoded 20000 for 5.2.

Again, is 88k the compressed or the uncompressed size?

Yes, it's the size of the compressed data. Your smart "save one more byte" suggestion will save
400+ bytes, a tiny 0.5%, unfortunately :-)
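
(As a side note on the hardcoded capacity above, here is a minimal sketch of how an initial capacity interacts with HashMap's default load factor, assuming a map keyed by code point; the constant name and the sample entry are made up for illustration, and the real code may size things differently than cpLen did.)

import java.util.HashMap;
import java.util.Map;

public class NameMapSketch {
    // Assumption: roughly this many named code points in Unicode 5.2.
    private static final int EXPECTED_NAMES = 20000;

    static Map<Integer, String> newNameMap() {
        // HashMap rehashes once size > capacity * loadFactor (0.75 by default),
        // so sizing the initial capacity for the expected entry count avoids
        // repeated rehashing while the map is being populated.
        return new HashMap<Integer, String>((int) (EXPECTED_NAMES / 0.75f) + 1);
    }

    public static void main(String[] args) {
        Map<Integer, String> names = newNameMap();
        names.put(0x0041, "LATIN CAPITAL LETTER A");   // sample entry
        System.out.println(names.get(0x0041));
    }
}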


-- Is it faster to first copy the whole data into a byte[] and then use ByteBuffer.getInt etc., versus using DataInputStream methods directly?
The current implementation uses neither ByteBuffer nor DataInputStream, so there is nothing to compare here. Yes, using DataInputStream would definitely make the code look better (no more of those "ugly" shifts), but it would also slow things down a little, since it adds one more layer. But speed
may not really be a concern here.
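
For what it's worth, here is a minimal sketch of the two styles, reading a big-endian int out of a byte[] with manual shifts versus wrapping the bytes in a DataInputStream; the sample data is made up and does not reflect the real uniName.dat layout.

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ReadIntSketch {
    // The "ugly shifts" style: big-endian int straight out of the byte[].
    static int readInt(byte[] a, int off) {
        return ((a[off]     & 0xff) << 24) |
               ((a[off + 1] & 0xff) << 16) |
               ((a[off + 2] & 0xff) <<  8) |
                (a[off + 3] & 0xff);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 0x00, 0x01, (byte) 0xe2, 0x40 };   // 123456, big-endian

        System.out.println(readInt(data, 0));                // 123456

        // The DataInputStream style: cleaner, but one extra layer of calls.
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(in.readInt());                    // 123456
    }
}

Both read the same big-endian value; the wrapper just trades a few extra method calls for readability.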

-- You could create a very long String with the whole data and then use substring for the individual strings, which could share the same backing char[].

The disadvantage of using one big buffer String to hold everything and then having the individual names substring from it is that it might simply break the SoftReference logic here. The big char[] will never be gc-ed as long as a single name object (substring-ed from it) is still walking around somewhere in the system.
I don't think the vm/gc is that smart, is it?

But this approach would definitely be faster, given the cost of creating a String from bytes (we put in an optimization for that
earlier, so the operation should already be faster now compared to 6u).
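
For illustration, a minimal sketch of the two caching styles with a made-up byte layout: copying each name into its own String versus taking substrings of one big String held through a SoftReference. substring shares the backing char[] with the source String, which is exactly what would keep the whole buffer reachable for as long as any single name is still referenced.

import java.lang.ref.SoftReference;

public class NameCacheSketch {
    // Hypothetical raw name data; not the real uniName.dat layout.
    static final byte[] RAW =
        "LATIN SMALL LETTER ALATIN SMALL LETTER B".getBytes();

    // Current style: each name is a fresh String copied out of the byte[],
    // so nothing but the caller holds on to the decoded characters.
    static String nameByCopy(int off, int len) {
        return new String(RAW, off, len);
    }

    // Suggested style: one big String, names via substring().
    // Each substring shares the big backing char[], so a single cached
    // name keeps the entire buffer strongly reachable and the
    // SoftReference never gets a chance to free it.
    static final SoftReference<String> BIG =
        new SoftReference<String>(new String(RAW));

    static String nameBySubstring(int off, int len) {
        String big = BIG.get();
        return (big == null) ? null : big.substring(off, off + len);
    }

    public static void main(String[] args) {
        System.out.println(nameByCopy(0, 20));        // LATIN SMALL LETTER A
        System.out.println(nameBySubstring(20, 20));  // LATIN SMALL LETTER B
    }
}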

-Sherman
