[Architecture] MSS chunked multipart data processing support

Samiyuru Senarathne Mon, 09 Nov 2015 20:21:39 -0800

In MSS, we implemented support for streaming chunked HTTP requests by
providing the following interface.


HttpStreamHandler

chunk()

finished()

error()

When a resource method implements this interface to handle chunked
input, chunk() is called each time a data chunk is received, finished() is
called once all chunks have been received, and error() is called when an
exception is thrown while processing the request.
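To make the callback order concrete, here is a minimal sketch. The method signatures are assumptions (the real MSS interface may take extra parameters such as a responder), and RecordingHandler and ChunkDriver are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the interface; the real MSS signatures may differ.
interface HttpStreamHandler {
    void chunk(ByteBuffer data);   // called once per received chunk
    void finished();               // called after the last chunk
    void error(Throwable cause);   // called if processing fails
}

// Hypothetical handler that records the order of the callbacks.
class RecordingHandler implements HttpStreamHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void chunk(ByteBuffer data) {
        events.add("chunk:" + StandardCharsets.UTF_8.decode(data));
    }

    @Override
    public void finished() {
        events.add("finished");
    }

    @Override
    public void error(Throwable cause) {
        events.add("error:" + cause.getMessage());
    }
}

// Hypothetical stand-in for the transport layer: feeds chunks to a handler.
class ChunkDriver {
    static void deliver(HttpStreamHandler handler, String... chunks) {
        try {
            for (String c : chunks) {
                handler.chunk(ByteBuffer.wrap(c.getBytes(StandardCharsets.UTF_8)));
            }
            handler.finished();
        } catch (RuntimeException e) {
            handler.error(e);
        }
    }
}
```

Calling ChunkDriver.deliver(handler, "ab", "cd") on a RecordingHandler records two chunk events followed by one finished event.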

We can use this behaviour to handle large request payloads in a zero-copy
manner without consuming much memory, i.e., we can write the chunks
directly to another stream (a file output stream, etc.) without first
aggregating them in memory.
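As a rough sketch of this idea, the handler below forwards each chunk to a WritableByteChannel as soon as it arrives. The interface signatures and the ChannelSinkHandler name are assumptions, and whether the write is truly zero copy depends on the buffer and channel types in use; the point here is only that nothing is aggregated.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Assumed shape of the interface; the real MSS signatures may differ.
interface HttpStreamHandler {
    void chunk(ByteBuffer data);
    void finished();
    void error(Throwable cause);
}

// Hypothetical handler that pipes every chunk straight into a channel
// (for example a FileChannel) instead of collecting chunks in memory.
class ChannelSinkHandler implements HttpStreamHandler {
    private final WritableByteChannel sink;
    private long bytesWritten;

    ChannelSinkHandler(WritableByteChannel sink) {
        this.sink = sink;
    }

    @Override
    public void chunk(ByteBuffer data) {
        try {
            // A channel may accept fewer bytes than offered, so loop.
            while (data.hasRemaining()) {
                bytesWritten += sink.write(data);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void finished() {
        closeQuietly();
    }

    @Override
    public void error(Throwable cause) {
        closeQuietly();
    }

    long bytesWritten() {
        return bytesWritten;
    }

    private void closeQuietly() {
        try {
            sink.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if closing the sink fails.
        }
    }
}
```

Memory use stays bounded by the size of a single chunk, regardless of the total payload size.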

But this streaming behaviour falls short when the payload is not a
single entity, i.e., when the payload is multipart [1] (common when HTML
forms are submitted with files). To parse such payloads, we either need
the whole request body in place or we have to parse the HTTP chunks as
they stream in to produce meaningful elements. Having the whole request
body in place works for small payloads but is not a solution for large
ones, since they can consume a huge amount of memory. Therefore, for
large payloads we have to support a way of processing the chunks while
streaming to get meaningful entities out of them. If we provide
developers with only the low-level HttpStreamHandler interface, they will
have to do the stream processing themselves, which is a big burden and
causes a lot of trouble.
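One reason this is hard to do by hand: HTTP chunk boundaries are unrelated to part boundaries, so a delimiter can arrive split across two chunks. The sketch below is a deliberately simplified illustration of that single problem, not a multipart parser (it ignores part headers, CRLF handling and the closing delimiter). DelimiterSplitter and PartCollector are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of MSS: splits a chunked byte stream on a
// fixed delimiter while holding back only a few bytes between chunks.
class DelimiterSplitter {
    interface Listener {
        void data(byte[] buf, int off, int len); // bytes of the current part
        void partEnd();                          // the current part is complete
    }

    private final byte[] delim;
    private final Listener listener;
    // Tail of the previous chunk that might be the start of a delimiter.
    // It is always shorter than the delimiter itself.
    private byte[] carry = new byte[0];

    DelimiterSplitter(byte[] delim, Listener listener) {
        if (delim.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        this.delim = delim.clone();
        this.listener = listener;
    }

    void feed(byte[] chunk) {
        // Prepend the bytes held back from the previous chunk.
        byte[] buf = new byte[carry.length + chunk.length];
        System.arraycopy(carry, 0, buf, 0, carry.length);
        System.arraycopy(chunk, 0, buf, carry.length, chunk.length);

        int start = 0;
        int i = 0;
        while (i + delim.length <= buf.length) {
            if (matchesAt(buf, i)) {
                emit(buf, start, i - start);
                listener.partEnd();
                i += delim.length;
                start = i;
            } else {
                i++;
            }
        }
        // Bytes before i cannot start a delimiter; the rest must wait.
        emit(buf, start, i - start);
        carry = Arrays.copyOfRange(buf, i, buf.length);
    }

    void finish() {
        emit(carry, 0, carry.length);
        carry = new byte[0];
        listener.partEnd();
    }

    private boolean matchesAt(byte[] buf, int pos) {
        for (int k = 0; k < delim.length; k++) {
            if (buf[pos + k] != delim[k]) {
                return false;
            }
        }
        return true;
    }

    private void emit(byte[] buf, int off, int len) {
        if (len > 0) {
            listener.data(buf, off, len);
        }
    }
}

// Collects parts as strings; for demonstration only, since a real
// listener would stream the bytes onward instead of buffering them.
class PartCollector implements DelimiterSplitter.Listener {
    final List<String> parts = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    @Override
    public void data(byte[] buf, int off, int len) {
        current.append(new String(buf, off, len, StandardCharsets.ISO_8859_1));
    }

    @Override
    public void partEnd() {
        parts.add(current.toString());
        current.setLength(0);
    }
}
```

Feeding the chunks "a-", "-Xb", "c--", "Xd" with the delimiter "--X" and then calling finish() yields the parts "a", "bc" and "d", even though both delimiters straddle a chunk boundary. Every developer using the low-level interface directly would have to get this kind of bookkeeping right.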

Considering the above, I suggest that we discourage the use of the
HttpStreamHandler interface and provide a high-level alternative
streaming solution. For that, we can introduce a new high-level
@Context-injectable stream parameter for our JAX-RS-like resource
methods. Behind the scenes, this new stream should process the streaming
HTTP chunks and provide a meaningful object stream, i.e., the objects in
the stream will be key-value pairs for form fields and input streams for
files. Basically, this stream will be a form of pipe whose implementation
consumes very little memory. Developers will be able to conveniently use
this new stream inside resource methods without worrying about the
low-level details of the requests, as shown in the following code fragment.

// Note: newStream is the suggested new stream. We should give it a proper
// name later.
// newStream should be injected into the resource method using @Context.

newStream.listen(elem -> {
    if (elem instanceof Field) {
        ...
    } else if (elem instanceof StreamedFile) {
        ...
        // For these types of content we can provide another zero-copy
        // solution to pipe the stream to a file etc.
        // e.g.: elem.stream(outputStream1, outputStream2).whenFinished(() -> {});
    }
});
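To give the discussion something concrete to react to, here is one possible shape for the types in the fragment above. Everything here is an assumption: ElementStream is a placeholder name, publish() stands in for whatever the chunk parser would call internally, and StreamedFile.stream() is a simplified blocking copy to a single output rather than the whenFinished() style shown above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical element for a simple form field.
final class Field {
    final String name;
    final String value;

    Field(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical element for an uploaded file whose content is streamed.
final class StreamedFile {
    final String fileName;
    private final InputStream content;

    StreamedFile(String fileName, InputStream content) {
        this.fileName = fileName;
        this.content = content;
    }

    // Copies the content to the target through a small fixed buffer, so
    // memory use does not grow with the file size.
    void stream(OutputStream target) {
        byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = content.read(buffer)) != -1) {
                target.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Placeholder for the proposed injectable stream.
class ElementStream {
    private Consumer<Object> listener = elem -> { };

    // Registers the callback that receives each parsed element.
    void listen(Consumer<Object> listener) {
        this.listener = listener;
    }

    // Would be called by the framework each time the parser has produced
    // a complete field or has reached the start of a file part.
    void publish(Object elem) {
        listener.accept(elem);
    }
}
```

A resource method would then only call listen() and branch on the element type, while the framework drives publish() from the incoming chunks.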

WDYT?

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.2


Regards,
Samiyuru

-- 
Samiyuru Senarathne
*Software Engineer*
Mobile : +94 (0) 71 134 6087
[email protected]

_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
