[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965625#action_12965625
 ] 

Thomas Koch commented on ZOOKEEPER-324:
---------------------------------------

- The immutability of a path represented as byte[] can be guaranteed by 
wrapping the byte[] in a Path class (and never handing the byte[] itself out of 
the class). I could change my Path class from ZOOKEEPER-849 to use byte[] 
internally.
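
A minimal sketch of the wrapping idea above. The class name BytePath and all of
its methods are hypothetical and chosen for illustration only; this is not the
Path class from ZOOKEEPER-849. It assumes UTF-8 when converting from a String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical immutable wrapper around the UTF-8 bytes of a path. */
final class BytePath {
    private final byte[] bytes; // the internal array is never handed out

    private BytePath(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Copies the input so later changes by the caller cannot leak in. */
    static BytePath of(byte[] raw) {
        return new BytePath(raw.clone());
    }

    /** Assumption: paths are encoded as UTF-8. */
    static BytePath of(String path) {
        return new BytePath(path.getBytes(StandardCharsets.UTF_8));
    }

    int length() {
        return bytes.length;
    }

    byte byteAt(int index) {
        return bytes[index];
    }

    /** Returns a defensive copy, never the internal array. */
    byte[] toByteArray() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof BytePath
                && Arrays.equals(bytes, ((BytePath) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The copy on the way in and on the way out is what keeps the object immutable;
read-only accessors such as byteAt avoid the copy for callers that only scan
the path.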

- Are you sure you want to use UTF8 as the encoding rather than UTF16 (UCS-2), 
which is the internal String encoding in the JVM? Converting to and from 
Strings might be faster that way.

- Actually, I'm not sure whether UTF16 is guaranteed to be the internal 
encoding; I just read it here:
http://web.archive.org/web/20040411230912/http://www.i18nfaq.com/java.html#4

- I suppose that the wire encoding should be the same as the internal encoding 
in the server? Avro and jute use UTF8.
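
To make the encoding trade-off concrete, here is a small sketch comparing
encoded sizes and scanning for '/' directly on UTF-8 bytes. The class
EncodingDemo and its method names are made up for this example. It relies on
two facts: the String API exposes UTF-16 code units (how a given JVM stores
them internally is an implementation detail), and in UTF-8 every byte of a
multi-byte sequence has its high bit set, so the byte 0x2F can only ever be a
'/'.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper comparing encodings for path strings. */
final class EncodingDemo {
    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes when encoded as UTF-16 (big endian, no byte order mark). */
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /**
     * Counts '/' separators without decoding to a String. Bytes belonging to
     * multi-byte UTF-8 sequences are negative as Java bytes, so they can
     * never compare equal to '/'.
     */
    static int countSeparators(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if (b == '/') {
                count++;
            }
        }
        return count;
    }
}
```

For an all-ASCII path such as /hello/world/there, utf8Length gives one byte per
character and utf16Length gives two, which is the space argument made in the
issue description.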

- When using a cache for byte[] reuse, there are actually two possible ways:
  - cache the full path
  - cache every path part separately, e.g. /hello/world/there would be saved as 
a List<byte[]> and use three cache entries: "hello", "world", "there"
  The second option may save memory but be more CPU intensive.
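
  Both options can be sketched roughly as follows. PathCaches and its methods
  are hypothetical names, the map is unbounded and not thread-safe, and
  ByteBuffer is used as the key only because it compares by content while a
  plain byte[] does not.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache that returns one shared byte[] per distinct content. */
final class PathCaches {
    private final Map<ByteBuffer, byte[]> cache = new HashMap<>();

    /** Returns the cached array equal to bytes[from, to), copying on a miss. */
    private byte[] intern(byte[] bytes, int from, int to) {
        byte[] hit = cache.get(ByteBuffer.wrap(bytes, from, to - from));
        if (hit == null) {
            hit = Arrays.copyOfRange(bytes, from, to);
            cache.put(ByteBuffer.wrap(hit), hit);
        }
        return hit;
    }

    /** Option 1: one cache entry per full path. */
    byte[] internFullPath(byte[] path) {
        return intern(path, 0, path.length);
    }

    /** Option 2: one cache entry per path part, split on '/'. */
    List<byte[]> internComponents(byte[] path) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= path.length; i++) {
            if (i == path.length || path[i] == '/') {
                if (i > start) { // skip empty parts such as the leading "/"
                    parts.add(intern(path, start, i));
                }
                start = i + 1;
            }
        }
        return parts;
    }

    int size() {
        return cache.size();
    }
}
```

  With option 2, /hello/world/there and /hello/world/here share the entries
  for "hello" and "world", at the cost of one scan and several lookups per
  path instead of a single lookup.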

- I could provide a cache for Path reuse in my Path class.




> do not materialize strings in the server
> ----------------------------------------
>
>                 Key: ZOOKEEPER-324
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-324
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>            Reporter: Benjamin Reed
>
> We convert paths and authentication information to strings rather than byte[] 
> even though we could work just as well with byte[] for our needs since we 
> don't really interpret the strings. we are just doing basic pattern matching. 
> the only real string manipulation we do with paths is to look for '/', but 
> we could easily do that with byte[] since we use utf8 encoding for the 
> strings. by not materializing the strings we save time doing the 
> serializations and also space since most (almost all) of our strings are 
> ASCII and thus just one byte.
> we could probably get by without even changing the jute spec if we make the 
> generated classes check for a flag to see whether strings should be treated 
> as byte[] or String.
