Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
Please don't use reflection to reach into private internals of classes you don't maintain. We have public and private for a reason. Furthermore, this access may throw a SecurityException if a SecurityManager is in use.

On Mon, May 31, 2010 at 11:25 AM, David Dabbs dmda...@gmail.com wrote:

> > The approach I found worked the best: 1. Copy the string into a
> > pre-allocated and re-used char[] array. This is needed since the JDK
> > does not permit access to the String's char[], to enforce immutability.
> > This is a performance loss vs. the JDK, which can access the char[]
> > directly.
>
> Evan, you may access a String's internals via reflection in a safe,
> albeit potentially implementation-specific way. See class code below. As
> long as your java.lang.String uses "value" for the char[] and "offset"
> for the storage offset, this should work. No sun.misc.Unsafe used. Only
> tested/used on JDK6.
>
> David
>
> [snip -- ReflectionUtils code, quoted in full in the original message below]
RE: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
Even with the extra call to access the offset, I would think there would be some advantage to not making the data copies, which generate garbage cruft. Am interested in your patch whenever it surfaces.

I seem to remember you saying that using an Encoder/Decoder didn't pay off when the number of strings to en/decode was small. Did the same hold true when using a ThreadLocal?

David

-----Original Message-----
From: Evan Jones [mailto:ev...@mit.edu]
Sent: Monday, May 31, 2010 4:32 PM
To: David Dabbs
Cc: Protocol Buffers
Subject: Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements

On May 31, 2010, at 14:25 , David Dabbs wrote:
> you may access a String's internals via reflection in a safe, albeit
> potentially implementation-specific way. See class code below. As long
> as your java.lang.String uses "value" for the char[] and "offset" for
> the storage offset, this should work. No sun.misc.Unsafe used. Only
> tested/used on JDK6.

Good idea! Unfortunately, this isn't much faster for small strings. It is faster if you just get the value char[] array. However, when I modified my implementation to get both the char[] value and int offset, it ended up being about the same speed for my test data set, which is composed mostly of short UTF-8 and ASCII strings. Unfortunately, a correct implementation will need to get both values. Since this is also somewhat dangerous, it doesn't seem like a great idea for my data.

At any rate: I'll try to find some time to try and prepare a protocol buffer patch with my encode-to-a-temporary-ByteBuffer trick, which does make things a bit faster. I won't necessarily advocate this patch to be included, but after having wasted this much time on this stuff, I'll certainly try to maintain the patch for a while, in case others are interested.

Evan

--
Evan Jones
http://evanjones.ca/
--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Jun 1, 2010, at 2:29 , David Dabbs wrote:
> Even with the extra call to access the offset, I would think there would
> be some advantage to not making the data copies, which generate garbage
> cruft.

However, the way I am doing it doesn't generate any garbage: I keep a temporary char[] buffer around to use with String.getChars(). The cost is copying chars vs. using reflection to access two fields. With the small strings I tested (average ~30 bytes per string), the copy is a bit cheaper than the reflection access. I assume that for larger strings, the reflection approach will probably be better. Which reminds me: I really need to test this with larger strings to make sure it isn't dog slow in that case.

> I seem to remember you saying that using an Encoder/Decoder didn't pay
> off when the number of strings to en/decode was small. Did the same hold
> true when using a ThreadLocal?

From memory, the ThreadLocal appears to be very cheap, and not make much performance difference, but I should double-check this as well.

Evan

--
Evan Jones
http://evanjones.ca/
RE: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
> The approach I found worked the best: 1. Copy the string into a
> pre-allocated and re-used char[] array. This is needed since the JDK
> does not permit access to the String's char[], to enforce immutability.
> This is a performance loss vs. the JDK, which can access the char[]
> directly.

Evan, you may access a String's internals via reflection in a safe, albeit potentially implementation-specific way. See class code below. As long as your java.lang.String uses "value" for the char[] and "offset" for the storage offset, this should work. No sun.misc.Unsafe used. Only tested/used on JDK6.

David

import java.lang.reflect.*;

public final class ReflectionUtils {

    /** There is no explicit Modifier constant for package-private, so 0 is used. */
    public static final int MODIFIER_PACKAGE_PRIVATE = 0;

    /** Field object for accessing String::value character storage. */
    public static final Field STRING_VALUE_FIELD =
        getFieldAccessible(String.class, "value");

    /** Field object for accessing String::offset, the first index used in the value[] char storage. */
    public static final Field STRING_OFFSET_FIELD =
        getFieldAccessible(String.class, "offset");

    /**
     * Package-private String constructor which shares the value array for speed.
     * Use when a number of String objects share the same char[] storage.
     * See String(int offset, int count, char value[]).
     */
    public static final Constructor<?> STRING_PP_CTOR =
        getConstructor(String.class, MODIFIER_PACKAGE_PRIVATE,
                       int.class, int.class, char[].class);

    /**
     * To avoid violating final field semantics, take care to only _read_
     * the char[] value returned.
     */
    public static char[] getChars(final String s) {
        try {
            // use reflection to read the char[] value from the string . . .
            return (char[]) STRING_VALUE_FIELD.get(s);
        } catch (Throwable t) {
            return null;
        }
    }

    public static String sharedString(final char[] chars, final int offset, final int length) {
        try {
            return (String) STRING_PP_CTOR.newInstance(offset, length, chars);
        } catch (InstantiationException e) {
            e.printStackTrace();
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        } catch (InvocationTargetException e) {
            e.printStackTrace();
        } catch (Throwable t) {
            t.printStackTrace();
        }
        return null;
    }

    public static Field getFieldAccessible(final Class<?> clazz, final String fieldName) {
        Field fld = null;
        try {
            fld = clazz.getDeclaredField(fieldName);
            fld.setAccessible(true);
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
        } catch (SecurityException e) {
            e.printStackTrace();
        }
        return fld;
    }

    public static Constructor<?> getConstructor(final Class<?> clazz,
            final int searchModifier, final Class<?>... paramTypes) {
        if (clazz == null) {
            throw new IllegalArgumentException("A class parameter is required");
        }
        try {
            // There is no explicit Modifier constant for package-private, so 0 is used.
            for (Constructor<?> ctor : clazz.getDeclaredConstructors()) {
                if (searchModifier == (ctor.getModifiers()
                        & (Modifier.PUBLIC | Modifier.PRIVATE | Modifier.PROTECTED))) {
                    // access modifier matches . . .
                    final Class<?>[] parameterTypes = ctor.getParameterTypes();
                    if (parameterTypes.length == paramTypes.length) {
                        // same number of parameters . . .
                        for (int i = 0; i < parameterTypes.length; i++) {
                            if (!parameterTypes[i].equals(paramTypes[i])) {
                                // one parameter not of correct type, so bail . . .
                                // note ctor variable used as success marker below
                                ctor = null;
                                break;
                            }
                        }
                        if (ctor != null) {
                            // all ctor parameter types match, so make accessible and return
                            ctor.setAccessible(true);
                            return ctor;
                        }
                    }
                }
            }
        } catch (SecurityException e) {
            e.printStackTrace();
        }
        return null;
    }
}
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On May 31, 2010, at 14:25 , David Dabbs wrote:
> you may access a String's internals via reflection in a safe, albeit
> potentially implementation-specific way. See class code below. As long
> as your java.lang.String uses "value" for the char[] and "offset" for
> the storage offset, this should work. No sun.misc.Unsafe used. Only
> tested/used on JDK6.

Good idea! Unfortunately, this isn't much faster for small strings. It is faster if you just get the value char[] array. However, when I modified my implementation to get both the char[] value and int offset, it ended up being about the same speed for my test data set, which is composed mostly of short UTF-8 and ASCII strings. Unfortunately, a correct implementation will need to get both values. Since this is also somewhat dangerous, it doesn't seem like a great idea for my data.

At any rate: I'll try to find some time to try and prepare a protocol buffer patch with my encode-to-a-temporary-ByteBuffer trick, which does make things a bit faster. I won't necessarily advocate this patch to be included, but after having wasted this much time on this stuff, I'll certainly try to maintain the patch for a while, in case others are interested.

Evan

--
Evan Jones
http://evanjones.ca/
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On May 18, 2010, at 0:33 , Kenton Varda wrote:
> What if you did a fast scan of the bytes first to see if any are
> non-ASCII? Maybe only do this fast scan if the data is short enough to
> fit in L1 cache?

I didn't try this exact idea, but my guess is that it may not be a win: to get fast access to the char[] in the string, you need to copy it into a separate char[] array. If you are doing this much work, you might as well encode it into UTF-8 while you are at it.

Conclusion: 10% performance win for ASCII (garbage collection savings); no win for general UTF-8 text. Not worth it for protocol buffers, but I'll try digging into the decoding.

The approach I found worked the best:

1. Copy the string into a pre-allocated and re-used char[] array. This is needed since the JDK does not permit access to the String's char[], to enforce immutability. This is a performance loss vs. the JDK, which can access the char[] directly.

2. Encode into a pre-allocated byte[] array. This will completely handle short strings. For long strings, you end up needing to allocate additional temporary space. This is better than the JDK, which allocates a new temporary buffer of 4 * str.length().

3. Allocate the final byte[] array, and System.arraycopy into it. This is the same as the JDK.

Conclusion: This is only better than the JDK in that it reduces allocation and garbage collection. It is worse than the JDK because it requires a copy from the String into another char[]. In my tests with ASCII-only data, it ends up ~10% faster. In my tests with UTF-8 data, it ends up about the same speed.

In other words: this probably isn't worth it for Protocol Buffers: it is a small performance improvement, but complicates the code significantly. However: for applications that can encode the string directly to the output buffer, this custom code can be significantly faster. Since protocol buffers needs to encode to another buffer first to get the string length, that advantage doesn't apply here.

I may spend some time poking around at string *decoding*, because there we should be able to decode directly from the input buffer, saving a copy and an allocation of a temporary byte[] array. Unclear if this will actually be significantly faster, but it might be slightly faster.

Evan

--
Evan Jones
http://evanjones.ca/
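[Editor's sketch] The three-step approach Evan describes (copy into a re-used char[], encode into a re-used byte[], then allocate and arraycopy the exact-size result) can be sketched as below. This is an illustrative reconstruction, not Evan's actual code; the class name and buffer sizes are assumptions, and unpaired surrogates are simply encoded as-is rather than handled properly.

```java
import java.util.Arrays;

/** Sketch of the copy-then-encode approach (illustrative, not the actual patch). */
final class ReusableUtf8Encoder {
    private char[] charBuf = new char[256];   // step 1: re-used char[] copy target
    private byte[] byteBuf = new byte[1024];  // step 2: re-used encode target

    public byte[] encode(String s) {
        final int len = s.length();
        if (charBuf.length < len) charBuf = new char[len];
        s.getChars(0, len, charBuf, 0);       // copy out of the String (no reflection)

        // Worst case is 4 bytes per char pair; grow the shared buffer if needed.
        if (byteBuf.length < len * 4) byteBuf = new byte[len * 4];
        int out = 0;
        for (int i = 0; i < len; i++) {
            char c = charBuf[i];
            if (c < 0x80) {                   // ASCII fast path: 1 byte
                byteBuf[out++] = (byte) c;
            } else if (c < 0x800) {           // 2-byte sequence
                byteBuf[out++] = (byte) (0xC0 | (c >> 6));
                byteBuf[out++] = (byte) (0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c) && i + 1 < len
                    && Character.isLowSurrogate(charBuf[i + 1])) {
                int cp = Character.toCodePoint(c, charBuf[++i]);  // 4-byte sequence
                byteBuf[out++] = (byte) (0xF0 | (cp >> 18));
                byteBuf[out++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                byteBuf[out++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                byteBuf[out++] = (byte) (0x80 | (cp & 0x3F));
            } else {                          // 3-byte sequence (unpaired surrogates encoded as-is in this sketch)
                byteBuf[out++] = (byte) (0xE0 | (c >> 12));
                byteBuf[out++] = (byte) (0x80 | ((c >> 6) & 0x3F));
                byteBuf[out++] = (byte) (0x80 | (c & 0x3F));
            }
        }
        return Arrays.copyOf(byteBuf, out);   // step 3: final exact-size allocation + copy
    }
}
```

For repeated calls, the two shared buffers are allocated once and re-used, which is where the garbage-collection savings come from.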
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
That seems simple enough and likely to produce a net win often enough.

--Chris

On May 17, 2010, at 9:33 PM, Kenton Varda ken...@google.com wrote:
> What if you did a fast scan of the bytes first to see if any are
> non-ASCII? Maybe only do this fast scan if the data is short enough to
> fit in L1 cache?
>
> On Mon, May 17, 2010 at 7:59 PM, Christopher Smith cbsm...@gmail.com wrote:
>> This does somewhat suggest that it might be worthwhile specifically
>> tagging a field as ASCII only. There are enough cases of this that it
>> could be a huge win.
>>
>> On 5/17/10, Evan Jones ev...@mit.edu wrote:
>>> On May 17, 2010, at 15:38 , Kenton Varda wrote:
>>>> I see. So in fact your code is quite possibly slower in non-ASCII
>>>> cases? In fact, it sounds like having even one non-ASCII character
>>>> would force extra copies to occur, which I would guess would defeat
>>>> the benefit, but we'd need benchmarks to tell for sure.
>>>
>>> Yes. I've been playing with this a bit in my spare time since the
>>> last email, but I don't have any results I'm happy with yet. Rough notes:
>>>
>>> * Encoding is (quite a bit?) faster than String.getBytes() if you
>>> assume one byte per character.
>>> * If you guess the number of bytes per character poorly and have to
>>> do multiple allocations and copies, the regular Java version will win.
>>> If you get it right (even if you first guess 1 byte per character) it
>>> looks like it can be slightly faster or on par with the Java version.
>>> * Re-using a temporary byte[] for string encoding may be faster than
>>> String.getBytes(), which effectively allocates a temporary byte[] each time.
>>>
>>> I'm going to try to rework my code with a slightly different policy:
>>>
>>> a) Assume 1 byte per character and attempt the encode. If we run out of space:
>>> b) Use a shared temporary buffer and continue the encode. If we run out of space:
>>> c) Allocate a worst-case 4-byte-per-character buffer and finish the encode.
>>>
>>> This should be much better than the JDK version for ASCII, a bit
>>> better for short strings that fit in the shared temporary buffer, and
>>> not significantly worse for the rest, but I'll need to test it to be
>>> sure. This is sort of just a fun experiment for me at this point, so
>>> who knows when I may get around to actually finishing this.
>>>
>>> Evan
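[Editor's sketch] The a/b/c policy quoted above is optimistic: start with a 1-byte-per-character output and only pay for bigger buffers when non-ASCII appears. A minimal sketch of that idea follows; it is not the actual patch, and for brevity tiers (b) and (c) are collapsed into a plain fallback to the JDK encoder.

```java
import java.nio.charset.Charset;

/** Sketch of the optimistic policy: assume 1 byte/char, fall back on non-ASCII. */
final class TieredEncoder {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    public static byte[] encode(String s) {
        final int len = s.length();
        byte[] out = new byte[len];           // (a) assume 1 byte per character
        for (int i = 0; i < len; i++) {
            char c = s.charAt(i);
            if (c >= 0x80) {
                // Ran out of the 1-byte assumption. The real policy would
                // continue into a shared temporary buffer (b), then a 4x
                // worst-case buffer (c); this sketch just defers to the JDK.
                return s.getBytes(UTF8);
            }
            out[i] = (byte) c;
        }
        return out;                           // pure ASCII: exact size, single pass
    }
}
```

For ASCII input this does one allocation of exactly the right size and a single scan, which is the common case the thread is optimizing for.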
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
This does somewhat suggest that it might be worthwhile specifically tagging a field as ASCII only. There are enough cases of this that it could be a huge win.

On 5/17/10, Evan Jones ev...@mit.edu wrote:
> On May 17, 2010, at 15:38 , Kenton Varda wrote:
>> I see. So in fact your code is quite possibly slower in non-ASCII
>> cases? In fact, it sounds like having even one non-ASCII character
>> would force extra copies to occur, which I would guess would defeat the
>> benefit, but we'd need benchmarks to tell for sure.
>
> Yes. I've been playing with this a bit in my spare time since the last
> email, but I don't have any results I'm happy with yet. Rough notes:
>
> * Encoding is (quite a bit?) faster than String.getBytes() if you assume
> one byte per character.
> * If you guess the number of bytes per character poorly and have to do
> multiple allocations and copies, the regular Java version will win. If
> you get it right (even if you first guess 1 byte per character) it looks
> like it can be slightly faster or on par with the Java version.
> * Re-using a temporary byte[] for string encoding may be faster than
> String.getBytes(), which effectively allocates a temporary byte[] each time.
>
> I'm going to try to rework my code with a slightly different policy:
>
> a) Assume 1 byte per character and attempt the encode. If we run out of space:
> b) Use a shared temporary buffer and continue the encode. If we run out of space:
> c) Allocate a worst-case 4-byte-per-character buffer and finish the encode.
>
> This should be much better than the JDK version for ASCII, a bit better
> for short strings that fit in the shared temporary buffer, and not
> significantly worse for the rest, but I'll need to test it to be sure.
> This is sort of just a fun experiment for me at this point, so who knows
> when I may get around to actually finishing this.
>
> Evan

--
Sent from my mobile device

Chris
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
What if you did a fast scan of the bytes first to see if any are non-ASCII? Maybe only do this fast scan if the data is short enough to fit in L1 cache?

On Mon, May 17, 2010 at 7:59 PM, Christopher Smith cbsm...@gmail.com wrote:
> This does somewhat suggest that it might be worthwhile specifically
> tagging a field as ASCII only. There are enough cases of this that it
> could be a huge win.
>
> On 5/17/10, Evan Jones ev...@mit.edu wrote:
>> [snip]
>
> --
> Sent from my mobile device
>
> Chris
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
I think the 30-80% speed boost would be well worth the extra code size / memory overhead. Please send me the patch!

On Sun, Jan 17, 2010 at 9:33 AM, Evan Jones ev...@mit.edu wrote:
> I've implemented a rough prototype of an optimization to the Java
> implementation that serializes Java strings once when optimize_for ==
> SPEED, rather than twice. Brief overview:
>
> * Add a private volatile byte[] cache for each string field.
> * When computing the serialized size, serialize the string and store it
> in the cache.
> * When serializing, use the cache if available, then set the cache to null.
>
> I used the ProtoBench.java program included in the SVN repository, using
> the messages included in the repository. Brief summary of results:
>
> * Serializing a protocol buffer more than once is possibly slightly
> slower (~2-10%). I'm guessing the reason is that since it already has
> the message length, the extra conditionals and fields for string caching
> just get in the way.
> * Serializing a protocol buffer once, with the length prepended, is
> significantly faster (~33% for SpeedMessage1, with 10 strings; ~80% for
> SpeedMessage2, with lots of strings; the benchmark measures the time to
> create the protocol buffer, so the improvement is probably actually larger).
> * Classes are a bit bigger (SpeedMessage1.class with 8 strings: 24802 -> 25577).
> * Size messages are unchanged.
> * Messages without strings are unchanged.
>
> Pros:
> * Faster serialization of messages that contain strings.
>
> Cons:
> * More code (extra fields; conditional checking of the cache)
> * More RAM (extra fields)
> * Some changes to CodedOutputStream (more code to handle cached strings).
>
> Does this seem like a worthwhile optimization? If so, I'll clean up my
> patch a bit and submit it for code review.
>
> Thanks,
> Evan
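[Editor's sketch] The serialize-once cache Evan outlines (fill the byte[] cache in getSerializedSize(), consume and clear it in the write path) can be sketched roughly as below. This is an illustration only: the class and field names are invented, writeTo() is simplified to return the raw bytes, and the real generated code also writes tags and length varints.

```java
import java.nio.charset.Charset;

/** Sketch of the serialize-once string cache (names are illustrative). */
final class SpeedMessageSketch {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    private final String name;
    private volatile byte[] nameBytesCache;   // filled by getSerializedSize()

    SpeedMessageSketch(String name) { this.name = name; }

    int getSerializedSize() {
        byte[] bytes = name.getBytes(UTF8);   // the one UTF-8 conversion
        nameBytesCache = bytes;               // cache for the writeTo() that follows
        return bytes.length;                  // (+ tag/length varint in real code)
    }

    byte[] writeTo() {
        byte[] bytes = nameBytesCache;
        if (bytes != null) {
            nameBytesCache = null;            // assume one serialization; drop the cache
            return bytes;
        }
        return name.getBytes(UTF8);           // cache miss: re-encode
    }
}
```

The cache-miss branch is what makes repeated serializations of the same message slightly slower, matching the ~2-10% regression Evan measured.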
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Wed, Dec 23, 2009 at 7:44 AM, Evan Jones ev...@mit.edu wrote:
> On Dec 22, 2009, at 19:59 , Kenton Varda wrote:
>> I wonder if we can safely discard the cached byte array during
>> serialization on the assumption that most messages are serialized only
>> once?
>
> This is a good idea, and it seems to me that this should definitely be
> possible. It would need to be done somewhat carefully, since Message
> objects are supposed to be thread safe, but I don't think this is
> particularly hard.

Right. Assuming pointer reads and writes are atomic -- a reasonable assumption in general, but we can guarantee it with volatile -- then it is safe for one thread to set the cached value to null even if another thread is reading it simultaneously. Either the other thread will successfully get the pointer and be able to use it, or it will get null and have to rebuild it.

> The only additional tricky part: on subsequent serializations, it would
> be useful to know the serialized size of the string, in order to
> serialize the string directly into the output buffer, rather than
> needing to create a temporary byte[] array to get the length. This is
> also a solvable problem, and my lame benchmarks suggest this is only a
> small improvement anyway.

We could cache the size separately, but since it would only be useful in the case of multiple serializations, it's probably not worthwhile. Who serializes the same immutable message multiple times?

>> Solution 3: Maintain a private freelist of encoder objects within
>> CodedOutputStream. Allocate one the first time a string is encoded on a
>> particular stream object, and return it to the freelist on flush()
>> (which is always called before discarding the stream unless an
>> exception interrupts serialization). It may make sense for the freelist
>> to additionally be thread-local to avoid locking, but if it's only one
>> lock per serialization maybe it's not a big deal?
>
> I would guess that this might be more expensive than the ThreadLocal,
> but I don't know that for sure. It would avoid the one encoder/decoder
> per thread overhead. Do you think it is worth it?

If a simple ThreadLocal is faster, use that. I was just thinking that if ThreadLocal lookup is slow, then it would be useful to only have to do that lookup once per CodedOutputStream object -- then subsequent uses are just dereferencing a field.
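[Editor's sketch] The race-safety argument above ("either the other thread gets the pointer, or it gets null") depends on snapshotting the volatile field exactly once before clearing it. A tiny sketch of that idiom, with invented names:

```java
/** Sketch of the race-safe cache consumption: snapshot the volatile once. */
final class CachedBytes {
    private volatile byte[] cache;

    CachedBytes(byte[] initial) { this.cache = initial; }

    /** Returns the cached array (clearing it), or null if a rebuild is needed. */
    byte[] take() {
        byte[] snapshot = cache;   // single volatile read: a valid array or null
        cache = null;              // safe even if another thread reads concurrently
        return snapshot;           // caller re-encodes when this is null
    }
}
```

Reading the field twice (for example, checking `cache != null` and then returning `cache`) would reintroduce the race, since another thread could null it between the two reads.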
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Wed, Dec 23, 2009 at 1:26 AM, Evan Jones ev...@mit.edu wrote:
> I've done some quick and dirty benchmarking of Java string
> encoding/decoding to/from UTF-8 for an unrelated project, but I've
> realized that these performance improvements could be added to protobufs.
> The easy way to do UTF-8 conversions is the way
> CodedInputStream/CodedOutputStream does it: using String.getBytes() and
> new String(). It turns out that using the
> java.nio.charset.CharsetDecoder/CharsetEncoder *can* be faster. However,
> to make it faster the objects need to be reused, due to the cost of
> allocating temporary buffers and objects.
>
> Before I attempt to make any improvements, I want to see if anyone
> (Kenton primarily) has any opinions on whether these make sense. They
> would add ~100 lines of code to replace something which is now a few
> lines of code, and it is a small improvement (approximately 40% less
> time per encode/decode, on a list of 1400 strings in different
> languages). I haven't tried adding this to protobufs yet, so final
> performance improvements are unknown.
>
> Problem 1: A Java protobuf string is stored as a String instance. It
> typically gets converted to UTF-8 *twice*: once in getSerializedSize()
> via a call to CodedOutputStream.computeStringSize, then again in
> writeTo().
>
> Solution: Cache the byte[] version of String fields. This would increase
> the memory size of each message (an additional pointer per string, plus
> the space for the byte[]), but would HALVE the number of conversions. I
> suspect this will be a fair bit faster. If added, it should only be
> added for the SPEED generated messages.

I noticed this as well before ... the solution could be applied to *all* generated messages for efficiency. There should be a writeByteArray(int fieldNumber, byte[] value) in CodedOutputStream so that the cached bytes of strings would be written directly. The ByteString would not help; it adds more memory since it creates a copy of the byte array.

> Problem 2: Using the NIO encoders/decoders can be faster than
> String.getBytes, but only if it is used >= 4 times. If used only once,
> it is worse. The same is approximately true about decoding. Lame
> results: http://evanjones.ca/software/java-string-encoding.html
>
> Solution 1: Add a custom encoder/decoder to CodedOutputStream, allocated
> as needed. This could be *bad* for applications that call
> Message.toByteString or .toByteArray frequently for messages with few
> strings, since that creates and throws away a single CodedOutputStream
> instance.
>
> Solution 2: Add a custom encoder/decoder per thread via a ThreadLocal.
> This requires fetching the ThreadLocal, which is slightly expensive, and
> adds some per-thread memory overhead (~4kB, tunable). However the
> allocations are done ONCE per thread, which should be significantly
> better.
>
> --
> Evan Jones
> http://evanjones.ca/

--
When the cat is away, the mouse is alone.
- David Yu
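[Editor's sketch] Solution 2 above (one reusable CharsetEncoder per thread via ThreadLocal) can be sketched as follows. The class name is invented and the buffer sizing is simplistic; the real proposal would also reuse the output ByteBuffer rather than allocating one per call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Sketch of Solution 2: one reusable CharsetEncoder per thread. */
final class ThreadLocalUtf8 {
    private static final ThreadLocal<CharsetEncoder> ENCODER =
        new ThreadLocal<CharsetEncoder>() {
            @Override protected CharsetEncoder initialValue() {
                // Allocated ONCE per thread; subsequent calls reuse it.
                return Charset.forName("UTF-8").newEncoder();
            }
        };

    public static byte[] encode(String s) {
        CharsetEncoder enc = ENCODER.get();   // the per-call ThreadLocal lookup
        enc.reset();
        ByteBuffer out = ByteBuffer.allocate(
            (int) Math.ceil(s.length() * (double) enc.maxBytesPerChar()));
        enc.encode(CharBuffer.wrap(s), out, true);
        enc.flush(out);
        byte[] bytes = new byte[out.position()];
        out.flip();
        out.get(bytes);
        return bytes;
    }
}
```

The ThreadLocal.get() is the "slightly expensive" fetch Evan mentions; everything else is reuse, which is where the >= 4-uses break-even comes from.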
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Tue, Dec 22, 2009 at 7:06 PM, David Yu david.yu@gmail.com wrote:
> There should be a writeByteArray(int fieldNumber, byte[] value) in CodedOutputStream so that the cached bytes of strings could be written directly. ByteString would not help; it adds more memory since it creates a copy of the byte array.

We could cache the bytes as a ByteString. Converting a String to a ByteString does not require a redundant copy, as ByteString has methods for this. I think it would be better to do it this way because, in the long run, we actually want to extend ByteString to allow avoiding copies in some cases. For example, if you are serializing a message to a ByteString (you called toByteString()) or parsing from a ByteString, then handling bytes fields shouldn't require any copy. Instead, it should be possible to construct a ByteString which is a substring of some other ByteString in O(1) time, as well as concatenate ByteStrings in O(1) time.

So this way, if the size-computation step converted the String to a ByteString and cached that, no further copy of the bytes would ever be needed in many cases.
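The O(1) substring idea described above can be sketched with a simple view type. This is a hypothetical illustration of the concept, not ByteString's actual design; a real implementation would also need bounds checks and a concatenation strategy (e.g. a rope/tree of segments for O(1) concat):

```java
/**
 * Hypothetical sketch of a byte string whose substring() shares the
 * backing array instead of copying it, making substring O(1).
 * Safe only because the backing array is never mutated after construction.
 */
public final class SharedBytes {
    private final byte[] bytes; // shared, treated as immutable
    private final int offset;
    private final int length;

    public SharedBytes(byte[] bytes) {
        this(bytes, 0, bytes.length);
    }

    private SharedBytes(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    /** O(1): returns a new view over the same array, no copy. */
    public SharedBytes substring(int from, int to) {
        return new SharedBytes(bytes, offset + from, to - from);
    }

    public int size() {
        return length;
    }

    public byte byteAt(int index) {
        return bytes[offset + index];
    }
}
```

With something like this, a message parsed from one large buffer could hand out its bytes fields as views into that buffer, which is the "no further copy" scenario Kenton describes.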
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Tue, Dec 22, 2009 at 8:18 PM, David Yu david.yu@gmail.com wrote:
> [quoting Kenton's suggestion, above, to cache the bytes as a ByteString; trimmed]
>
> Cool. Btw, the ByteString snippet is: return new ByteString(text.getBytes("UTF-8")); Another improvement would be avoiding the lookup and instead caching the Charset.forName("UTF-8") object and using it. I believe you Google guys have also been evangelizing this :-) (PDF from http://code.google.com/p/guava-libraries/)

I tried doing that at one point and found that it was *much slower* -- apparently String.getBytes("UTF-8") is highly optimized, whereas creating a Charset object (even statically) and using that is not. :/
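Kenton's observation is easy to check with a rough micro-benchmark along these lines. This is a sketch, not a rigorous benchmark (a single run, crude warmup, no interleaving); both overloads exist since Java 6, and the timings will vary by JDK version:

```java
import java.nio.charset.Charset;

/**
 * Rough sketch comparing String.getBytes("UTF-8") against
 * String.getBytes(Charset) with a cached Charset object.
 */
public final class EncodeBench {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    public static void main(String[] args) throws Exception {
        String s = "The quick brown fox jumps over the lazy dog \u00e9\u00e8";

        // Crude warmup so the JIT compiles both paths before timing.
        for (int i = 0; i < 100000; i++) {
            s.getBytes("UTF-8");
            s.getBytes(UTF8);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < 1000000; i++) {
            s.getBytes("UTF-8");
        }
        long byName = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        for (int i = 0; i < 1000000; i++) {
            s.getBytes(UTF8);
        }
        long byCharset = System.nanoTime() - t1;

        System.out.println("getBytes(\"UTF-8\"):  " + (byName / 1000000) + " ms");
        System.out.println("getBytes(Charset):  " + (byCharset / 1000000) + " ms");
    }
}
```

Both paths must of course produce identical bytes; the difference David Yu identifies below is purely in how java.lang.StringCoding handles the two entry points.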
Re: [protobuf] Java UTF-8 encoding/decoding: possible performance improvements
On Wed, Dec 23, 2009 at 12:21 PM, Kenton Varda ken...@google.com wrote:
> [quoting the earlier exchange about caching the Charset.forName("UTF-8") object; trimmed]
>
> I tried doing that at one point and found that it was *much slower* -- apparently String.getBytes("UTF-8") is highly optimized, whereas creating a Charset object (even statically) and using that is not. :/

Just checked the code ... and you're absolutely right. java.lang.StringCoding (line 277) creates an unnecessary copy of the char array, which makes it slow.
I'm not sure, but it might be a Sun JDK 6 bug.

-- When the cat is away, the mouse is alone. - David Yu