David M. Lloyd wrote:
<snip/>
I think that using a byte[] (for instance in the encoder) and transforming it into a ByteBuffer is another way to deal with the problem.

One important point is that ByteBuffers are just meant to contain a fixed amount of data. It's a buffer, not a data structure. Transforming ByteBuffers to make them able to expand twists their intrinsic semantics.

Yes, it makes far more sense to accumulate buffers until you can decode your message from them.
Or decode the stream as it comes, creating the object on the fly. A stateful decoder...

So I would say that BB should be used at the very low level (reading and sending data), but then the other layers should use byte[] or a stream of bytes.

I don't see the advantage of using byte[] honestly - using at least a wrapper object seems preferable.
This is what we are doing in ADS: LDAP messages are built on the fly, simply by working directly with ByteBuffers.

Consider that accumulating BBs to create a big byte[] should be understood as: transforming the BBs directly into the targeted wrapper objects. Thanks for correcting me :)
And if you're going to use a wrapper object, why not just use ByteBuffer?
Because you may receive more than one BB before you can build the wrapper object.

This will lead to very interesting performance questions:
- how to handle large streams of data?

One buffer at a time. :-)
Well, I tried to think about other strategies, but, eh, you are just plain right! It's up to the codec filter to deal with the complexity of the data it has to decode!

- should we serialize the stream at some point?

What do you mean by "serialize"?
Write to disk if the received data are too big. See my previous point (it's up to the decoder to deal with this).

- how to write an efficient decoder, when you may receive fractions of what you are waiting for?

An ideal decoder would be a state machine which can be entered and exited at any state. This way, even a partial buffer can be fully consumed before returning to wait for the next buffer.
This is what we have in ADS: a stateful decoder. Not as simple as if you had the whole data in memory, especially if you have to deal with multi-byte markers, but not too complex either.
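Just to sketch the idea (this is illustrative only, not the actual ADS code): the decoder records where it is in the state machine between calls, so a partial buffer can be fully consumed and decoding resumed when the next buffer arrives. A minimal TLV-style example:

public class StatefulDecoder {
    private enum State { READ_TAG, READ_LENGTH, READ_VALUE }

    private State state = State.READ_TAG;
    private int tag;
    private int length;
    private final java.io.ByteArrayOutputStream value = new java.io.ByteArrayOutputStream();

    // May be called with any fragment; picks up where the last call stopped.
    public void decode(java.nio.ByteBuffer in) {
        while (in.hasRemaining()) {
            switch (state) {
                case READ_TAG:
                    tag = in.get() & 0xFF;
                    state = State.READ_LENGTH;
                    break;
                case READ_LENGTH:
                    length = in.get() & 0xFF; // single-byte length, for simplicity
                    state = State.READ_VALUE;
                    break;
                case READ_VALUE:
                    while (length > 0 && in.hasRemaining()) {
                        value.write(in.get());
                        length--;
                    }
                    if (length == 0) {
                        emit(tag, value.toByteArray()); // one complete element decoded
                        value.reset();
                        state = State.READ_TAG;
                    }
                    break;
            }
        }
    }

    protected void emit(int tag, byte[] data) { /* hand off to the next layer */ }
}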

However many decoders are not ideal due to various constraints. In the worst case, you could accumulate ByteBuffer instances until you have a complete message that can be handled. What I do at this point is to create a DataInputStream that encapsulates all the received buffers.
Yeah, 100% agree.
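For illustration, such a stream could look like this (a hypothetical helper, not an existing MINA class); once the message is known to be complete, wrap it in a DataInputStream:

import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class ByteBufferInputStream extends InputStream {
    private final Deque<ByteBuffer> buffers = new ArrayDeque<ByteBuffer>();

    // Accumulate a received buffer.
    public void add(ByteBuffer buffer) {
        buffers.addLast(buffer);
    }

    @Override
    public int read() {
        // Discard buffers that have been fully consumed.
        while (!buffers.isEmpty() && !buffers.peekFirst().hasRemaining()) {
            buffers.removeFirst();
        }
        if (buffers.isEmpty()) {
            return -1; // nothing accumulated yet
        }
        return buffers.peekFirst().get() & 0xFF;
    }
}

Then new DataInputStream(stream) gives you readInt(), readFully(), etc., spanning buffer boundaries transparently.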

Note that a buffer might contain data from more than one message as well. So it's important to use only a slice of the buffer in this case.
Not a big deal. Again, it's the decoder's task to handle such a case. We have experienced such a case in LDAP too. (Makes me think that we should describe the LDAP codec on the MINA site, just to give some insight to people who want to write a stateful decoder.)
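For the record, plain ByteBuffer already supports this via slice(); the core of it is something like (a sketch, with messageLength assumed to be known):

// 'in' holds one complete message of messageLength bytes plus the start of the next.
int savedLimit = in.limit();
in.limit(in.position() + messageLength);
java.nio.ByteBuffer message = in.slice(); // independent view of the first message only
in.position(in.limit());                  // skip past the consumed message
in.limit(savedLimit);                     // the rest belongs to the next message
decode(message);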

- how to write an efficient encoder when you have no idea about the size of the data you are going to send ?

Use a buffer factory, such as IoBufferAllocator, or use an even simpler interface like this:

public interface BufferFactory {
    /** Returns a fresh, pre-sized buffer (e.g. ByteBuffer.allocate(4096)). */
    ByteBuffer createBuffer();
}

which mass-produces pre-sized buffers. In the case of stream-oriented systems like TCP or serial, you could probably send buffers as you fill them. For message-oriented protocols like UDP, you can accumulate all the buffers to send, and then use a single gathering write to send them as a single message (yes, this stinks in the current NIO implementation, as Trustin pointed out in DIRMINA-518, but it's no worse than the repeated copying that auto-expanding buffers use; and APR and other possible backends [and, if I have any say at all in it, future OpenJDK implementations] would hopefully not suffer from this limitation).
That's an idea. But this does not solve one little problem: if the reader is slow, you may saturate the server memory with prepared BBs. So you may need a kind of throttle mechanism, or a blocking queue, to manage this issue: a new BB should not be created unless the previous one has been completely sent.
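A sketch of what that could look like, combining the bounded queue with a gathering write (the names and the bound are illustrative; it assumes a blocking channel):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.GatheringByteChannel;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ThrottledEncoderOutput {
    // At most 8 prepared buffers in flight; put() blocks beyond that.
    private final BlockingQueue<ByteBuffer> pending =
            new ArrayBlockingQueue<ByteBuffer>(8);

    // Called by the encoder: blocks when the reader is too slow.
    public void enqueue(ByteBuffer buffer) throws InterruptedException {
        pending.put(buffer);
    }

    // Called by the writer thread: drains and sends everything in one gathering write.
    public void flush(GatheringByteChannel channel) throws IOException {
        List<ByteBuffer> drained = new ArrayList<ByteBuffer>();
        pending.drainTo(drained);
        ByteBuffer[] batch = drained.toArray(new ByteBuffer[drained.size()]);
        while (hasRemaining(batch)) {
            channel.write(batch); // gathering write over all pending buffers
        }
    }

    private static boolean hasRemaining(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return true;
            }
        }
        return false;
    }
}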

For all these reasons, the mail I sent a few days ago expresses my personal opinion that IoBuffer may be a little bit of overkill (remember that this class -and the associated tests- represents around 13% of all MINA common code!)

Yes, that's very heavy. I looked at resolving DIRMINA-489 more than once, and was overwhelmed by the sheer number of methods that had to be implemented, and the overly complex class structure.

One option could be to use ByteBuffer with some static support methods,
+1
and streams to act as the "user interface" into collections of buffers. For example, an InputStream that reads from a collection of buffers, and an OutputStream that is configurable to auto-allocate buffers, performing an action every time a buffer is filled:

public interface BufferSink {
    /** Invoked each time a buffer has been filled. */
    void handleBuffer(ByteBuffer buffer);
}
That's an option.
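A minimal sketch of the OutputStream side, tying BufferFactory and BufferSink together (illustrative only, not a proposed final API):

import java.io.OutputStream;
import java.nio.ByteBuffer;

public class BufferSinkOutputStream extends OutputStream {
    private final BufferFactory factory;
    private final BufferSink sink;
    private ByteBuffer current;

    public BufferSinkOutputStream(BufferFactory factory, BufferSink sink) {
        this.factory = factory;
        this.sink = sink;
        this.current = factory.createBuffer();
    }

    @Override
    public void write(int b) {
        current.put((byte) b);
        if (!current.hasRemaining()) {
            handOff(); // buffer full: perform the configured action
        }
    }

    @Override
    public void flush() {
        if (current.position() > 0) {
            handOff();
        }
    }

    private void handOff() {
        current.flip();             // switch from writing to reading
        sink.handleBuffer(current); // e.g. queue it for a write, or send it
        current = factory.createBuffer();
    }
}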

Another option is to skip ByteBuffers and go with raw byte[] objects (though this closes the door completely to direct buffers).
Well, ByteBuffers are so intimately wired into NIO that I don't think we can easily use byte[] without losing performance... (not sure though...)

Yet another option is to have a simplified abstraction for byte arrays, like Trustin proposes, and use the stream classes for the buffer state implementation.

This is all in addition to Trustin's idea of providing a byte array abstraction and a buffer state abstraction class.
I'm afraid that offering a byte[] abstraction might lead to more complexity, with respect to what you wrote about the way a codec should handle data. At some point, your ideas are just the right ones, IMHO: use BBs, and let the codec deal with them. No need to add a more complex data structure on top of it.

Otherwise, the idea may be to define some simple codec which transforms a BB into a byte[], for those who need it. As we have a cool Filter chain, let's use it...
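Such a filter would be almost trivial with the plain ByteBuffer API; the core of it is just (a sketch):

// Copies the readable bytes of a buffer into a fresh byte[].
public static byte[] toByteArray(java.nio.ByteBuffer buffer) {
    byte[] bytes = new byte[buffer.remaining()];
    buffer.get(bytes); // advances the buffer's position
    return bytes;
}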

wdyt ?


--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org

