Hi all,

My first post here.  Not sure it's the best place for it, but hoping 
someone here might be able to assist.  We're developers of a streaming 
application that does 100k+ messages per second processing, so anything 
that allocates to the heap can cause GC pressure.  We've been targeting 
removing allocations recently and one of the difficult ones is conversion from 
String to UTF-8 to OutputStream.  The advised methods for String to UTF-8 
all create byte[] as intermediary objects.
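
For illustration, the usual approach looks roughly like the sketch below. The class name NaiveUtf8Writer and the single length byte are assumptions made up for this example, not our actual code. The point is only that getBytes returns a fresh byte[] on every call:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class NaiveUtf8Writer
{
    /** Writes a one-byte length followed by the UTF-8 bytes; returns the byte count. */
    static int write(final String str, final OutputStream os) throws IOException
    {
        // getBytes allocates a new byte[] for every message written.
        final byte[] encoded = str.getBytes(StandardCharsets.UTF_8);

        os.write(encoded.length);
        os.write(encoded, 0, encoded.length);
        return encoded.length;
    }
}
```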

We've been able to read Strings from streams using a ThreadLocal ByteBuffer 
and CharBuffer which will allocate the char[] and String object only.  For 
reference, we've done the following.  In a one-minute JMC Flight Recorder 
recording, char[] and String make up about 1 GB of allocations, which is 
unavoidable because we're processing a lot of string data:

    public static final class StringBuffers
    {
        ByteBuffer buffer = ByteBuffer.allocate(512);
        CharBuffer charBuffer = CharBuffer.allocate(300);
        CharsetDecoder decoder = Charset.forName("UTF8").newDecoder();
    }

    public static final class U8Utf8MethodHandleReader extends AbstractReader
    {
        private final ThreadLocal<StringBuffers> buffers = new ThreadLocal<StringBuffers>()
        {
            @Override
            public StringBuffers initialValue()
            {
                return new StringBuffers();
            }
        };

        public U8Utf8MethodHandleReader(final MethodHandle setHandle)
        {
            super(setHandle);
        }

        @Override
        public void read(final Object o, final TypeInputStream in) throws Throwable
        {
            final int len = in.read();

            // Grab a thread local set of buffers to use temporarily.
            final StringBuffers buf = buffers.get();

            // get a reference to the buffers.
            final ByteBuffer b = buf.buffer;
            final CharBuffer c = buf.charBuffer;

            b.clear();
            c.clear();

            // read the stream into the byte buffer.
            in.getStream().read(b.array(), 0, len);
            b.limit(len);

            // decode the bytes into the char buffer.
            final CharsetDecoder decoder = buf.decoder;
            decoder.reset();
            decoder.decode(b, c, true);

            // flip the char buffer.
            c.flip();

            // get a copy of the decoded chars as a String.
            final String str = c.toString();

            // finally set the string value via method handle.
            setHandle.invoke(o, str);
        }
    }

For writing Strings we've tried a similar method:

    public static final class StringBuffers
    {
        ByteBuffer buffer = ByteBuffer.allocate(512);
        CharsetEncoder encoder = Charset.forName("UTF8").newEncoder();
    }

    public static final class U8Utf8MethodHandleWriter extends AbstractWriter
    {
        private final ThreadLocal<StringBuffers> buffers = new ThreadLocal<StringBuffers>()
        {
            @Override
            public StringBuffers initialValue()
            {
                return new StringBuffers();
            }
        };

        public U8Utf8MethodHandleWriter(final MethodHandle getHandle)
        {
            super(getHandle);
        }

        @Override
        public void write(final Object o, final TypeOutputStream out) throws Throwable
        {
            // get the string value via method handle.
            final String str = (String) getHandle.invoke(o);

            final OutputStream os = out.getStream();

            // null strings just write 0 for length.
            if (str == null)
            {
                os.write(0);
                return;
            }

            // Grab a thread local set of buffers to use temporarily.
            final StringBuffers buf = buffers.get();

            // get a reference to the buffers.
            final ByteBuffer b = buf.buffer;

            // this does allocate an object, but at least it isn't copying the buffer!
            final CharBuffer c = CharBuffer.wrap(str);

            // clear the byte buffer.
            b.clear();

            // encode the chars into the byte buffer.
            final CharsetEncoder encoder = buf.encoder;
            encoder.reset();
            encoder.encode(c, b, true);

            // flip the byte buffer.
            b.flip();

            final int size = b.limit();

            if (size > 255)
            {
                throw new TypeException("u8ascii: String length exceeded "
                    + "max length of 255.  len =" + size);
            }

            if (writeNotNull)
            {
                os.write(1);
            }

            os.write(size);
            os.write(b.array(), 0, size);
        }
    }

The offending CharBuffer.wrap(str) currently allocates 766MB in a 
one-minute period and is the largest single allocation site in the 
application's profile.
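
One untested direction, sketched under assumptions: copy the String's chars into a reusable heap CharBuffer with String.getChars instead of wrapping the String. This trades the per-call wrapper allocation for a char copy, and whether that is a net win would need measuring. The class name ReusableUtf8Writer and the buffer sizes below are hypothetical, and in practice one instance would live in the ThreadLocal:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical per-thread scratch state; sizes are illustrative only.
final class ReusableUtf8Writer
{
    // UTF-8 needs at most 3 bytes per Java char, so 1024 bytes covers 256 chars.
    private final CharBuffer chars = CharBuffer.allocate(256);
    private final ByteBuffer bytes = ByteBuffer.allocate(1024);
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();

    /** Encodes str as UTF-8 into os and returns the number of bytes written. */
    int write(final String str, final OutputStream os) throws IOException
    {
        final int len = str.length();
        if (len > chars.capacity())
        {
            throw new IllegalArgumentException("string too long: " + len);
        }

        // Copy the chars into the reusable buffer's backing array
        // rather than calling CharBuffer.wrap(str).
        chars.clear();
        str.getChars(0, len, chars.array(), 0);
        chars.limit(len);

        bytes.clear();
        encoder.reset();

        // Anything other than UNDERFLOW means malformed input or overflow.
        CoderResult result = encoder.encode(chars, bytes, true);
        if (!result.isUnderflow())
        {
            result.throwException();
        }
        result = encoder.flush(bytes);
        if (!result.isUnderflow())
        {
            result.throwException();
        }

        bytes.flip();
        final int size = bytes.limit();
        os.write(bytes.array(), 0, size);
        return size;
    }
}
```

The length prefix and the 255-byte check from our writer are left out here to keep the sketch short.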

Interested if anyone else has found a better solution for this or can 
suggest alternative solutions.

Thanks,
David.



