On Thu, Mar 6, 2008 at 7:14 AM, Jon Blower <[EMAIL PROTECTED]> wrote:
[...]
> We have an existing RESTful web application that involves clients
> downloading multiple streams of data simultaneously. Our current
> implementation is based on servlets and we are experiencing
> scalability problems with the number of threads involved in serving
> multiple large data streams simultaneously. I recently came across
> Restlet and was attracted by the potential to use NIO under the hood
> to enable more scalable large file transfers.
Cool.

> In our case we are not necessarily serving large files that already
> exist on disk: we are essentially creating the files ourselves on the
> fly (so they are of unknown length when the file transfer starts). I
> was wondering if anyone could offer advice on how to support the
> serving of such data streams through Restlet in a scalable manner
> (ideally without creating a new thread on the server for each file
> transfer)?

What do you mean by "large files"? I.e., are we talking about generating
content that is merely large relative to a web page (i.e., measured in
megabytes), or are you talking about something like complete hi-def video
(GBs in size), or something both large and nominally endless like live
video streams?

For the first case, if they are small enough I'd start by just fully
rendering the contents to a Representation as usual and profile how well
you can do with the existing Jetty connector (with tuning, etc.). As you
add more simultaneous clients, add more servers. Also, run your
experiments with the new Grizzly connector and track it as it and v1.1+
stabilize.

For the second case (or where you have the content sizes of the first
case but lots of slow clients), I'd actually have that part of my origin
servers either be fronted by a reverse caching proxy (e.g., Squid), or
generate and dump the contents from the origin server into a local file
and redirect the client to get that content from, e.g., lighttpd
(+mod_secdownload). Depending on the nature of your client applications,
the potential reuse of the generated content, etc., you can tune how you
clean up the caches.

For the last case, if I controlled the clients then I'd probably have the
clients request good-sized chunks of the data in a loop and devolve to
the appropriate combination of the first two approaches. Of course,
that's more or less presuming that you can generate those chunks more or
less independently (i.e., with minimal state information needed to keep
the continuity from chunk to chunk). If you have heavy amounts of state
and/or you don't control the clients, then I'd want to know a good bit
more before making any recommendation.

Hope this helps,
John
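
P.S. To make the first case concrete, here's a rough, untested sketch of
what "rendering the contents to a Representation" could look like with
the 1.1-style Resource API, using an OutputRepresentation so the payload
is generated while the connector writes it (the class name and the fake
payload are made up for the example):

import java.io.IOException;
import java.io.OutputStream;

import org.restlet.Context;
import org.restlet.data.MediaType;
import org.restlet.data.Request;
import org.restlet.data.Response;
import org.restlet.resource.OutputRepresentation;
import org.restlet.resource.Representation;
import org.restlet.resource.Resource;
import org.restlet.resource.Variant;

// Hypothetical resource that builds its payload on the fly and lets the
// connector pull the bytes as it writes the response, so nothing has to
// be fully buffered (or be of known length) up front.
public class GeneratedDataResource extends Resource {

    public GeneratedDataResource(Context context, Request request, Response response) {
        super(context, request, response);
        getVariants().add(new Variant(MediaType.APPLICATION_OCTET_STREAM));
    }

    @Override
    public Representation represent(Variant variant) {
        return new OutputRepresentation(MediaType.APPLICATION_OCTET_STREAM) {
            @Override
            public void write(OutputStream out) throws IOException {
                // Stand-in for the real generation logic.
                for (int i = 0; i < 1000; i++) {
                    out.write(("record " + i + "\n").getBytes());
                }
            }
        };
    }
}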
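
For the second case, the "generate into a local file and redirect"
variant might look roughly like the sketch below; the cache directory,
the static-server URL, and the use of Response.redirectSeeOther are all
assumptions here, with lighttpd (or whatever static server) left to do
the actual byte-shovelling:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.restlet.Context;
import org.restlet.data.Request;
import org.restlet.data.Response;
import org.restlet.data.Status;
import org.restlet.resource.Resource;

// Hypothetical resource that writes the generated content into a
// directory served by a separate static file server and then redirects
// the client there, so slow transfers never tie up an origin thread.
public class RedirectingDataResource extends Resource {

    // Both locations are made up for the sketch.
    private static final File CACHE_DIR = new File("/var/cache/generated");
    private static final String STATIC_BASE = "http://static.example.com/generated/";

    public RedirectingDataResource(Context context, Request request, Response response) {
        super(context, request, response);
    }

    @Override
    public void handleGet() {
        String name = "dataset-" + System.currentTimeMillis() + ".dat";
        try {
            OutputStream out = new FileOutputStream(new File(CACHE_DIR, name));
            try {
                generate(out); // stand-in for the real on-the-fly generation
            } finally {
                out.close();
            }
        } catch (IOException e) {
            getResponse().setStatus(Status.SERVER_ERROR_INTERNAL);
            return;
        }
        getResponse().redirectSeeOther(STATIC_BASE + name);
    }

    private void generate(OutputStream out) throws IOException {
        out.write("generated content...\n".getBytes());
    }
}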
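
And for the last case, a chunk-pulling client could be as dumb as the
loop below (plain java.net here since the client may not be Restlet
based; the "chunk" query parameter and the 204-means-no-more-chunks
convention are inventions for the sketch):

import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical client that pulls a long (or endless) stream as a
// sequence of bounded chunks, one request per chunk, appending each
// chunk to a local file.
public class ChunkedDownloader {

    public static void main(String[] args) throws Exception {
        String base = "http://origin.example.com/streams/42?chunk=";
        OutputStream sink = new FileOutputStream("stream.dat");
        try {
            for (int chunk = 0; ; chunk++) {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(base + chunk).openConnection();
                if (conn.getResponseCode() == HttpURLConnection.HTTP_NO_CONTENT) {
                    break; // assumed convention: 204 means "no more chunks"
                }
                InputStream in = conn.getInputStream();
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) != -1; ) {
                    sink.write(buf, 0, n);
                }
                in.close();
                conn.disconnect();
            }
        } finally {
            sink.close();
        }
    }
}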

