On Feb 15, Artur Biesiadowski wrote:
> Jochen Hoenicke wrote:
> [...]
> > It is probably slower if
> > many different methods are called on the same character (since I don't
> > have a cached CharAttr), but faster if they are called on similar
> > characters (characters in the same block).
>
> Unfortunately cachedBlock in your Character is possibly dangerous - if
> thread will switch after compares and before return you can get a wrong
> result.
I hadn't thought about concurrency; you are right here. But I think
there is a simple fix: copy cachedBlock to a local variable before
checking whether it is the right block:
--- Character.java~	Thu Dec 30 20:02:41 1999
+++ Character.java	Wed Feb 16 15:44:18 2000
@@ -577,4 +577,5 @@
     private static int getBlock(char ch) {
-        if (ch >= blocks[cachedBlock] && ch <= blocks[cachedBlock+1])
-            return cachedBlock;
+        int lastCached = cachedBlock;
+        if (ch >= blocks[lastCached] && ch <= blocks[lastCached+1])
+            return lastCached;
         // simple binary search
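
To make the effect of the patch easier to see, here is a minimal,
self-contained sketch of the same idea. The class name, the block table,
and the layout of blocks[] as (first, last) pairs are assumptions made up
for illustration; they are not taken from the real Character.java.

```java
// Sketch only: BlockCacheSketch and its table are hypothetical.
final class BlockCacheSketch {
    // Assumed layout: for even i, blocks[i] is the first and blocks[i + 1]
    // the last character of a block. Not the real Unicode block table.
    private static final char[] blocks = {
        0x0000, 0x007F,
        0x0080, 0x00FF,
        0x0100, 0x017F,
        0x0180, 0xFFFF,
    };

    // Shared, unsynchronized cache of the last block index that was found.
    private static int cachedBlock = 0;

    static int getBlock(char ch) {
        // Read the shared field exactly once. Both comparisons and the
        // return use this snapshot, so a write from another thread cannot
        // change the index between the range check and the return.
        int lastCached = cachedBlock;
        if (ch >= blocks[lastCached] && ch <= blocks[lastCached + 1])
            return lastCached;

        // simple binary search for the first pair whose end is >= ch
        int low = 0;
        int high = blocks.length / 2 - 1;
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (ch > blocks[2 * mid + 1])
                low = mid + 1;
            else
                high = mid;
        }
        int found = 2 * low;
        // Unsynchronized write: another thread may see it late or not at
        // all, but every value ever stored is a valid even index.
        cachedBlock = found;
        return found;
    }

    public static void main(String[] args) {
        System.out.println(getBlock('A'));
        System.out.println(getBlock((char) 0x0150));
    }
}
```

A stale cachedBlock only costs an extra binary search, because the range
check is always done against the same local copy that is returned.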
> Maybe later we can do something faster, looking at space/speed benefits
> - like creating constant size blocks, which would be accessible by just
> shifting char. Something like
>
> charData = data[block[ch>>11]+(ch&0x1f)];
I have tested how big the arrays would be for each shift:

  shift   data array   block array
      0          158         65536
      1          558         32768
      2         1528         16384
      3         2944          8192
      4         4288          4096
      5         6144          2048
      6         7808          1024
      7         9088           512
      8        10496           256
      9        13312           128
     10        18432            64
     11        30720            32
     12        45056            16
     13        65536             8
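
For anyone who wants to reproduce numbers of this kind, here is a rough
sketch of one way to compute them. It assumes the data array stores each
distinct block of (1 << shift) attribute indices exactly once, which is
consistent with the figures above but is my assumption about the method.
The class name and the toy attribute table are made up; feeding it the
real per-character attribute indices would be needed to get the real
sizes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch only: ShiftSizeSketch and toyAttributes() are hypothetical.
final class ShiftSizeSketch {
    // Toy stand-in for the real table: attribute index 1 for 'A'..'Z',
    // 2 for 'a'..'z', and 0 for every other character.
    static byte[] toyAttributes() {
        byte[] attr = new byte[65536];
        for (char c = 'A'; c <= 'Z'; c++) attr[c] = 1;
        for (char c = 'a'; c <= 'z'; c++) attr[c] = 2;
        return attr;
    }

    // Returns { data array length, block array length } for one shift,
    // assuming identical blocks are stored only once in the data array.
    static int[] sizes(byte[] attr, int shift) {
        int blockSize = 1 << shift;
        int blockCount = attr.length >> shift;
        Set<String> distinct = new HashSet<>();
        for (int b = 0; b < blockCount; b++) {
            byte[] slice = Arrays.copyOfRange(
                attr, b * blockSize, (b + 1) * blockSize);
            // The string form of the slice serves as a content-based key.
            distinct.add(Arrays.toString(slice));
        }
        return new int[] { distinct.size() * blockSize, blockCount };
    }

    public static void main(String[] args) {
        byte[] attr = toyAttributes();
        for (int shift = 0; shift <= 13; shift++) {
            int[] s = sizes(attr, shift);
            System.out.println("shift: " + shift
                + " data array: " + s[0] + " block array: " + s[1]);
        }
    }
}
```

With shift 0 every block is a single entry, so the data array length is
just the number of distinct attribute indices.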
I think the optimum is shift = 6:
  data:  byte[7808]
  block: char[1024]
  flags: byte[158]
  lowercase, uppercase, numValue: char[158] each
Total size: 7808 + 2*1024 + 158 + 3*2*158 = 10962 bytes
(more or less, depending on the Unicode version)
Each lookup would take three array accesses, e.g.:
flags[data[block[ch>>6] + (ch & 0x3f)] & 0xff]
(The & 0xff is needed because Java bytes are signed and data holds
indices up to 157.)
toUpperCase would be:
return (char) (ch + uppercase[data[block[ch>>6] + (ch & 0x3f)] & 0xff]);
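
Here is a small sketch of how such a two-level table could be built and
queried with shift = 6. Everything named here (the class, the constructor,
the toy input in main) is hypothetical and only illustrates the lookup
expressions above; it is not the proposed Character implementation. The
& 0xff mask is there because Java bytes are signed, so an index above 127
would otherwise come out negative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch only: TwoLevelTableSketch is a made-up illustration.
final class TwoLevelTableSketch {
    static final int SHIFT = 6;
    static final int MASK = (1 << SHIFT) - 1; // 0x3f

    final char[] block;     // ch >> SHIFT -> start offset of its block in data
    final byte[] data;      // offset + (ch & MASK) -> attribute index
    final char[] uppercase; // attribute index -> offset added to ch (mod 65536)

    // classOf must have 65536 entries: the attribute index of each char.
    TwoLevelTableSketch(byte[] classOf, char[] uppercase) {
        this.uppercase = uppercase;
        int blockSize = 1 << SHIFT;
        block = new char[classOf.length >> SHIFT];
        byte[] buf = new byte[classOf.length];
        int used = 0;
        Map<String, Integer> seen = new HashMap<>();
        for (int b = 0; b < block.length; b++) {
            byte[] slice = Arrays.copyOfRange(
                classOf, b * blockSize, (b + 1) * blockSize);
            String key = Arrays.toString(slice);
            Integer offset = seen.get(key);
            if (offset == null) {
                // First time this block content appears: append it to data.
                offset = used;
                System.arraycopy(slice, 0, buf, used, blockSize);
                used += blockSize;
                seen.put(key, offset);
            }
            block[b] = (char) offset.intValue();
        }
        data = Arrays.copyOf(buf, used);
    }

    int classIndex(char ch) {
        // Two array accesses; the mask turns the signed byte into 0..255.
        return data[block[ch >> SHIFT] + (ch & MASK)] & 0xff;
    }

    char toUpperCase(char ch) {
        // Third array access; the cast wraps the sum back into char range.
        return (char) (ch + uppercase[classIndex(ch)]);
    }

    public static void main(String[] args) {
        byte[] classOf = new byte[65536];
        for (char c = 'a'; c <= 'z'; c++) classOf[c] = 1;
        // Index 0: no case mapping. Index 1: subtract 32, stored mod 65536.
        char[] upper = { 0, (char) ('A' - 'a') };
        TwoLevelTableSketch t = new TwoLevelTableSketch(classOf, upper);
        System.out.println(t.toUpperCase('q'));
    }
}
```

Storing the case offset as a char works because the addition is cast back
to char, so a negative offset such as -32 can be kept as 0xFFE0.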
Jochen