I see. So in fact your code is quite possibly slower in non-ASCII cases?
In fact, it sounds like having even one non-ASCII character would force
extra copies to occur, which I would guess would defeat the benefit, but
we'd need benchmarks to tell for sure.
On Fri, May 7, 2010 at 6:21 PM, Evan
On May 17, 2010, at 15:38 , Kenton Varda wrote:
I see. So in fact your code is quite possibly slower in non-ASCII
cases? In fact, it sounds like having even one non-ASCII character
would force extra copies to occur, which I would guess would defeat
the benefit, but we'd need benchmarks to
Yeah I don't think we should add a way to inject decoders into ByteString...
I'd be very interested to hear why the JDK is not optimal here.
On Mon, May 3, 2010 at 6:16 PM, Evan Jones ev...@mit.edu wrote:
On May 3, 2010, at 21:11 , Evan Jones wrote:
Yes, I actually changed ByteString, since
On May 7, 2010, at 18:54 , Kenton Varda wrote:
I'd be very interested to hear why the JDK is not optimal here.
I dug into this. I *think* the problem is that the JDK ends up
allocating a huge temporary array for the UTF-8 data. Hence, the
garbage collection cost is higher for the JDK's
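
To make the temporary-array point concrete, here is a rough sketch of the two encoding paths being compared. It is not the code from the patch: the class and the helper `encodeWithReusedBuffer` are hypothetical names made up for illustration, and the idea that `String.getBytes` allocates a worst-case-sized temporary on every call is the guess from this message, not something the sketch proves. The alternative path uses a `CharsetEncoder` with a caller-owned scratch buffer, so the only per-call allocation is the exact-size result.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8EncodeSketch {
    // Hypothetical helper (not from the patch): encode into a scratch buffer
    // that the caller allocates once and reuses across calls.
    static byte[] encodeWithReusedBuffer(String s, CharsetEncoder encoder, ByteBuffer scratch) {
        encoder.reset();
        scratch.clear();
        CoderResult result = encoder.encode(CharBuffer.wrap(s), scratch, true);
        if (result.isUnderflow()) {
            result = encoder.flush(scratch);
        }
        if (!result.isUnderflow()) {
            // Scratch buffer too small or input malformed: fall back to the JDK path.
            return s.getBytes(StandardCharsets.UTF_8);
        }
        // Copy out exactly the bytes written; scratch must be a heap buffer
        // from ByteBuffer.allocate so that array() starts at offset 0.
        return Arrays.copyOf(scratch.array(), scratch.position());
    }

    public static void main(String[] args) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer scratch = ByteBuffer.allocate(4096); // allocated once, reused

        String text = "h\u00e9llo"; // one non-ASCII character
        byte[] viaJdk = text.getBytes(StandardCharsets.UTF_8);
        byte[] viaEncoder = encodeWithReusedBuffer(text, encoder, scratch);

        // Both paths should produce identical bytes; only allocation behavior differs.
        System.out.println(Arrays.equals(viaJdk, viaEncoder));
    }
}
```

Whether the reused-buffer path actually wins would still need benchmarks, particularly for strings with non-ASCII characters and on other VMs.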
Interesting. Since this seems like a JVM implementation issue, I wonder if
the results are different on Dalvik (Android). Also, the extra code sounds
undesirable for lite mode, but my guess is that you had to place this code
inside CodedOutputStream which is shared by lite mode. So yeah, there
On May 3, 2010, at 21:11 , Evan Jones wrote:
Yes, I actually changed ByteString, since ByteString.copyFromUtf8 is
how protocol buffers get UTF-8 encoded strings at this point.
Although now that I think about it, I think it might be possible to
enable this only for SPEED messages, if that