On Feb 15, Artur Biesiadowski wrote:
> Jochen Hoenicke wrote:
> [...]
> > It is probably slower if
> > many different methods are called on the same character (since I don't
> > have a cached CharAttr), but faster if they are called on similar
> > characters (characters in the same block).  
> 
> Unfortunately cachedBlock in your Character is possibly dangerous - if
> thread will switch after compares and before return you can get a wrong
> result. 

I hadn't thought about concurrency; you are right here.  But I think
there is a simple fix.  Just copy cachedBlock to a local variable before
checking whether it's the right block, so that the compares and the
return all use the same value:

--- Character.java~     Thu Dec 30 20:02:41 1999
+++ Character.java      Wed Feb 16 15:44:18 2000
@@ -577,4 +577,5 @@
   private static int getBlock(char ch) {
-    if (ch >= blocks[cachedBlock] && ch <= blocks[cachedBlock+1])
-      return cachedBlock;
+    int lastCached = cachedBlock;
+    if (ch >= blocks[lastCached] && ch <= blocks[lastCached+1])
+      return lastCached;
     // simple binary search
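
For illustration, here is a minimal self-contained sketch of the same
idea.  It is not the real Character.java: the class name, the toy
contents of blocks, and the layout (blocks[2*i] is the first and
blocks[2*i+1] the last char of block i) are assumptions made only for
this example.  Reads and writes of an int are atomic in Java, and the
local copy is validated against ch before it is returned, so a stale
cache value costs at most a binary search.

```java
final class BlockCache {
    // Hypothetical toy table: pairs of (first char, last char) per block.
    private static final char[] blocks = {
        0x0000, 0x007f,
        0x0080, 0x00ff,
        0x0100, 0xffff
    };

    // Shared between threads; always an even index into blocks.
    private static int cachedBlock = 0;

    static int getBlock(char ch) {
        // Read the shared field exactly once, so the two compares and
        // the return all see the same value.
        int lastCached = cachedBlock;
        if (ch >= blocks[lastCached] && ch <= blocks[lastCached + 1])
            return lastCached;

        // Simple binary search over the (first, last) pairs.
        int low = 0;
        int high = blocks.length / 2 - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (ch < blocks[2 * mid]) {
                high = mid - 1;
            } else if (ch > blocks[2 * mid + 1]) {
                low = mid + 1;
            } else {
                cachedBlock = 2 * mid;   // single atomic int write
                return 2 * mid;
            }
        }
        throw new AssertionError("blocks must cover the whole char range");
    }
}
```

A racing thread may overwrite cachedBlock at any time, but since the
result is computed only from the local copy and the immutable table,
every caller still gets the block that really contains ch.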

> Maybe later we can do something faster, looking at space/speed benefits
> - like creating constant size blocks, which would be accesible by just
> shifting char. Something like
> 
> charData = data[block[ch>>11]+(ch&0x1f)];

I have tested how big the arrays (in entries) would be for each shift:

shift:  0   data array:   158  block array: 65536
shift:  1   data array:   558  block array: 32768
shift:  2   data array:  1528  block array: 16384
shift:  3   data array:  2944  block array:  8192
shift:  4   data array:  4288  block array:  4096
shift:  5   data array:  6144  block array:  2048
shift:  6   data array:  7808  block array:  1024
shift:  7   data array:  9088  block array:   512
shift:  8   data array: 10496  block array:   256
shift:  9   data array: 13312  block array:   128
shift: 10   data array: 18432  block array:    64
shift: 11   data array: 30720  block array:    32
shift: 12   data array: 45056  block array:    16
shift: 13   data array: 65536  block array:     8
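
A sketch of one way such numbers could be measured, assuming the
blocks are compressed simply by storing each distinct chunk of
1 << shift entries once.  Whether the figures above were produced
exactly this way is an assumption; the class name TableSizer and the
input array attr (one attribute index per char, 65536 entries) are
hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class TableSizer {
    /**
     * Returns { data array length, block array length } for one shift.
     * attr holds one attribute index per char; its length must be a
     * multiple of 1 << shift.
     */
    static int[] sizes(byte[] attr, int shift) {
        int chunk = 1 << shift;
        // Maps the contents of a chunk to its offset in the data array.
        Map<String, Integer> seen = new HashMap<>();
        int dataLength = 0;
        for (int start = 0; start < attr.length; start += chunk) {
            String key = Arrays.toString(
                Arrays.copyOfRange(attr, start, start + chunk));
            if (!seen.containsKey(key)) {
                seen.put(key, dataLength);  // first time we see this chunk
                dataLength += chunk;        // it needs its own data slots
            }
        }
        // One block entry per chunk, whether or not the chunk is shared.
        return new int[] { dataLength, attr.length >> shift };
    }
}
```

With shift 0 every chunk is a single entry, so the data array length
is just the number of distinct attribute indices; with large shifts
almost no two chunks are identical and the data array approaches the
full 65536 entries.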

I think the optimum is: 
  shift = 6,  
  data  : byte[7808], block:char[1024],
  flags : byte[158], lowercase,uppercase,numValue : char[158]
Total size: 10962 bytes (+/- Unicode Version Number)  
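
The total can be checked with a few lines of arithmetic.  This is only
a sketch: the class and method names are made up, and it counts raw
element bytes (1 per byte, 2 per char), ignoring array object headers.

```java
final class TableBudget {
    /** Raw element bytes for one choice of shift. */
    static int totalBytes(int dataLength, int blockLength, int attrCount) {
        int data = dataLength;            // byte[]: 1 byte per entry
        int block = blockLength * 2;      // char[]: 2 bytes per entry
        int flags = attrCount;            // byte[]: 1 byte per entry
        int offsets = 3 * attrCount * 2;  // lowercase, uppercase, numValue: char[]
        return data + block + flags + offsets;
    }
}
```

totalBytes(7808, 1024, 158) gives 10962.  Under the same accounting the
neighbouring rows of the table come out larger: 11346 for shift 5
(6144 / 2048) and 11218 for shift 7 (9088 / 512).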

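The data would be accessed with three array accesses, e.g.:
  flags[data[block[ch>>6] + (ch & 0x3f)] & 0xff]

(The & 0xff is needed because Java bytes are signed and data holds
indices up to 157.)

toUpperCase would be:
  return (char) (ch + uppercase[data[block[ch>>6] + (ch & 0x3f)] & 0xff]);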
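
A runnable sketch of this lookup with toy tables.  The table contents
here are invented for the example (only 'a'..'z' get an uppercase
mapping) and are not real Unicode data; the class name is hypothetical
too.  The uppercase array stores offsets as chars, so a negative
offset such as -32 is stored modulo 65536 and the final (char) cast
wraps the sum back into range.  The & 0xff mask keeps the index
non-negative, since Java bytes are signed.

```java
final class TwoLevelLookup {
    static final int SHIFT = 6;
    static final int MASK = (1 << SHIFT) - 1;   // 0x3f

    // block[ch >> SHIFT] is the offset of that chunk in data.
    static final char[] block = new char[65536 >> SHIFT];
    // Two chunks: chunk at offset 0 (all attribute 0) and one for 64..127.
    static final byte[] data = new byte[2 << SHIFT];
    // Attribute 0: no case change.  Attribute 1: subtract 32 (mod 65536).
    static final char[] uppercase = { 0, (char) (-32) };

    static {
        // Chars 64..127 use the second chunk; everything else shares chunk 0.
        block['a' >> SHIFT] = (char) (1 << SHIFT);
        for (char c = 'a'; c <= 'z'; c++)
            data[block[c >> SHIFT] + (c & MASK)] = 1;
    }

    static char toUpperCase(char ch) {
        int attr = data[block[ch >> SHIFT] + (ch & MASK)] & 0xff;
        return (char) (ch + uppercase[attr]);
    }
}
```

Here toUpperCase('q') returns 'Q', while 'Q' and any character outside
64..127 come back unchanged because they map to attribute 0.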

  Jochen
