Re: streaming request body
John Keyes wrote:
> My point here is that if I have X requests then there can be X * CONTENT_LENGTH_CHUNKED bytes in memory at one time.

I see what you mean, but the above calculation does not make sense: CONTENT_LENGTH_CHUNKED is a (negative) integer that signals to HttpClient that you do not want the request to be buffered; it is not a chunk size. As previously pointed out by others, the (implicit) small buffering in memory is hardly avoidable and you should just accept it as a fact. Even with thousands of connections this should not be too much of a problem if you configure your JVM to use enough memory.

Currently the buffer size is hard-coded to 4096 bytes in EntityEnclosingMethod::writeRequestBody. Maybe we could make this configurable if that helps you.

Odi

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
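The copy loop Odi describes can be sketched in plain java.io. This is a hypothetical stand-in, not HttpClient's actual internals: the class and method names are illustrative, with the hard-coded 4096-byte buffer lifted into a parameter as proposed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch of a writeRequestBody-style copy loop with a
// configurable buffer size (names are mine, not HttpClient's API).
public class StreamCopy {
    public static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize]; // only this much body data is in memory at once
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Whatever the buffer size, the per-connection memory cost stays bounded by it, which is the "small buffering" Odi says is unavoidable.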
Re: streaming request body
On 25 Feb 2004, at 14:43, Ortwin Glück wrote:
>> My point here is that if I have X requests then there can be X * CONTENT_LENGTH_CHUNKED bytes in memory at one time.
> I see what you mean. But the above calculation does not make sense: CONTENT_LENGTH_CHUNKED is a (negative) integer that signals to HttpClient that you do not want the request to be buffered.

Doh! I didn't research that; I assumed it meant a chunk size.

> As previously pointed out by others the (implicit) small buffering in memory is hardly avoidable and you should just accept it as a fact. Even with thousands of connections this should not be too much of a problem if you configure your JVM to use enough memory.

Yeah. I need to put some more thought into what I was saying. Memory shouldn't be an issue, but we are trying to cover the bases.

> Currently the buffer size is hard coded in EntityEnclosingMethod::writeRequestBody to 4096 bytes. Maybe we could make this configurable if this helps you.

It's always good to have things configurable :-)

Thanks,
-John K
RE: streaming request body
> It's always good to have things configurable :-)

John, feel free to file a feature request in Bugzilla if you want to keep track of the issue's resolution and provide us with some feedback: http://nagoya.apache.org/bugzilla/enter_bug.cgi?product=Commons

I deem this feature fairly easy to implement based on our new preference architecture.

Oleg

-Original Message-
From: John Keyes [mailto:[EMAIL PROTECTED]
Sent: Wednesday, February 25, 2004 18:54
To: Commons HttpClient Project
Subject: Re: streaming request body

lots_snipped/

*** The information in this email is confidential and may be legally privileged. Access to this email by anyone other than the intended addressee is unauthorized. If you are not the intended recipient of this message, any review, disclosure, copying, distribution, retention, or any action taken or omitted to be taken in reliance on it is prohibited and may be unlawful. If you are not the intended recipient, please reply to or forward a copy of this message to the sender and delete the message, any attachments, and any copies thereof from your system. ***
Re: streaming request body
Eric wrote:
> For (a), Oleg's response is correct. You might easily be confused, in the sense that HttpClient's API inverts the control. It is not that you write to an OutputStream to send your data; it is that you provide HttpClient with an InputStream, and it reads that stream and sends the data. HttpClient is designed to accommodate your concern, and if your configuration is correct (as per the examples), it will not buffer the entire contents of your InputStream, but rather read it and send it in small chunks. As another post points out, you may still have to buffer what you're sending to *disk*, but not in memory.

So you think buffering all requests to disk to support streaming is an acceptable solution? If I am dealing with XXX,000 requests, that sure as hell would suck with all the disk I/O going on. Does this not suggest that there is a problem with the architecture?

> As for (b), this is again under your control via HttpMethod.getResponseBodyAsStream(). As with (a), you can also invoke HttpClient such that it does cache the entire contents (HttpMethod.getResponseBodyAsString()). In both cases, it is possible to get the behavior that you desire.

No it is not. Again, think of XXX,000 requests.

> Connection pooling is only part of the concern. HttpClient supports HTTP 1.1 persistent connections. It doesn't expose the underlying socket's InputStream and OutputStream. If it did, it could not ensure that persistent connections work properly.

I still don't see the problem. The OutputStream and InputStream can be wrapped so there is no loss of control. Why do you think control would be lost?

-John K

> -Eric.
>
> John Keyes wrote:
>> Guys, a colleague pointed out to me that this does not in fact resolve the situation. The solutions pointed out allow me to read the attachment as a stream. The contents are still held in memory prior to writing them on the wire. To fully support this you would need access to the OutputStream. If we could pass a HttpClient to the HttpMethod then we could get access to the output stream via the getRequestOutputStream method. I don't understand the connection pooling argument. I thought it should be a user preference whether to have connection pooling. Any ideas on this?
>> -John K
>>
>> On 23 Feb 2004, at 13:02, Kalnichevski, Oleg wrote:
>>> John, HttpClient's entity enclosing methods (POST, PUT) do support content streaming when (1) the content length does not need to be automatically calculated or (2) chunk encoding is used. Please refer to the following sample applications for details.
>>>
>>> Unbuffered post:
>>> http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-commons/httpclient/src/examples/UnbufferedPost.java?content-type=text%2Fplain&rev=1.2.2.1
>>>
>>> Chunk-encoded post:
>>> http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-commons/httpclient/src/examples/ChunkEncodedPost.java?content-type=text%2Fplain&rev=1.4.2.1
>>>
>>> Hope this helps
>>> Oleg
>>>
>>> -Original Message-
>>> From: John Keyes [mailto:[EMAIL PROTECTED]
>>> Sent: Monday, February 23, 2004 13:54
>>> To: [EMAIL PROTECTED]
>>> Subject: streaming request body
>>>
>>> Hi, I notice you have separated out the functions of the connection and the content creation. So the code must be something like:
>>>
>>>   HttpClient client = new HttpClient( url );
>>>   HttpMethod method = new GetMethod();
>>>   method.setRequestHeader( ... );
>>>   ...
>>>   method.setRequestBody( ... );
>>>   client.execute( method );
>>>
>>> If I want to send a large attachment and I don't want it all to be in memory then I can't do it. The issue is that you have to write your data to the HttpMethod. The HttpMethod doesn't know where to write this data until you call execute and pass the client which has the connection to write to. So there isn't really a way around this because of the separation of the connection from the HttpMethod. So my question is: is there a way to stream the request body rather than having to store the request in memory prior to writing it on the wire?
>>>
>>> Thanks,
>>> -John K
Re: streaming request body
John Keyes schrieb:
lots_snipped/
> So you think buffering all requests to disk to support streaming is an acceptable solution? If I am dealing with XXX,000 requests that sure as hell would suck with all the disk I/O going on. Does this not suggest that there is a problem with the architecture?

I am missing something here from both views. Maybe I am wrong but as I understand it, I can provide any InputStream. And that must not be a file on disk (which I dislike also - except for large files or live streams that cannot be put to memory in total) but can be any object in memory. So in case of sending it there should be no problem.. Correct?

Best Regards,
Stefan
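Stefan's reading is right: HttpClient only needs an InputStream, and the JDK can produce one from purely in-memory data with no file on disk involved. A minimal sketch; the helper class and method names are mine, not part of any API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Wrap an in-memory payload as an InputStream - no disk involved.
public class InMemoryBody {
    public static InputStream of(String payload) {
        return new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8));
    }

    // Drain a stream back into a String, standing in for whatever reads the body.
    public static String drain(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}
```

Such a stream could then be handed to a method like setRequestBody just as a FileInputStream would be.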
Re: streaming request body
John Keyes wrote:
>> In both cases, it is possible to get the behavior that you desire.
> No it is not. Again, think of XXX,000 requests.

I am getting a little angry by now. C'mon man, we wrote this baby and we know very well what's possible with it. So please don't tell us it can not do unbuffered requests. It's as simple as:

  InputStream dataStream = ...; // get this from wherever, use a pipe or something
  PostMethod method = new PostMethod("/myservlet");
  method.setRequestContentLength(EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED);
  method.setRequestBody(dataStream);
  client.execute(method);

If you still can not figure out how you can get an InputStream for your in-memory data, then please refer to other Java resources; but this is certainly not an issue with HttpClient.

hope that helps
Ortwin Glück
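Ortwin's "use a pipe or something" can be shown with plain JDK piped streams: a producer thread writes to an OutputStream while the paired InputStream is read incrementally, which is the same shape as handing an InputStream to setRequestBody. The sketch below uses nothing HttpClient-specific; the reader here merely stands in for HttpClient draining the stream.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Producer/consumer pipe: data flows through a small fixed-size pipe
// buffer instead of being fully materialized before sending.
public class PipeDemo {
    public static byte[] roundTrip(byte[] payload) throws IOException, InterruptedException {
        PipedOutputStream sink = new PipedOutputStream();
        PipedInputStream source = new PipedInputStream(sink);
        Thread producer = new Thread(() -> {
            try (sink) {                 // closing the pipe lets the reader see EOF
                sink.write(payload);
            } catch (IOException ignored) {
            }
        });
        producer.start();
        byte[] received = source.readAllBytes(); // stands in for HttpClient reading the body
        producer.join();
        return received;
    }
}
```

In the real scenario, `source` would be the stream passed to setRequestBody, and the producer thread would generate the request body on the fly.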
Re: streaming request body
Stefan Dingfelder wrote:
> I am missing something here from both views. Maybe I am wrong but as I understand it, I can provide any InputStream. And that must not be a file on disk (which I dislike also - except for large files or live streams that cannot be put to memory in total) but can be any object in memory. So in case of sending it there should be no problem.. Correct?
>
> Best Regards,
> Stefan

"must not be a file on disk" should read "does not have to be a file on disk" (this is a common German mistake). But you are absolutely correct: there is no need to use the disk at all.
RE: streaming request body
John,

Just for the record: the HttpClient 2.0 design is completely broken in many, many wonderful ways, and we are perfectly aware of that. Excuse my lack of understanding, however; I do not think that applies to the current implementation of content streaming.

Allow me to reiterate that HttpClient does provide enough flexibility to avoid content buffering. Albeit, instead of exposing an OutputStream interface and letting the caller handle the content writing, HttpClient accepts an InputStream as input and does all the reading from that stream and the writing into the output stream of the underlying socket itself. The idea here is to avoid exposure of the underlying socket in order to prevent possible misuse/abuse. Think of maintaining the integrity of a persistent connection, for instance.

One situation where content buffering seems unavoidable is object serialization. I believe this is what Eric was referring to. There's currently a request pending for better OutputStream-based serialization; see Mohammad's post for more details.

Hope this helps
Oleg

-Original Message-
From: John Keyes [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 24, 2004 15:19
To: Commons HttpClient Project
Subject: Re: streaming request body

lots_snipped/
Re: streaming request body
John Keyes wrote:
> So you think buffering all requests to disk to support streaming is an acceptable solution? If I am dealing with XXX,000 requests that sure as hell would suck with all the disk I/O going on. Does this not suggest that there is a problem with the architecture?

Many on the mailing list are aware of architectural limitations in the 2.0 design of HttpClient. This was a conscious compromise that we made many months ago to live within certain constraints, with the key trade-off being a final version of the 2.0 implementation sooner. This was apparently a good choice for you too, in that you've started using it actively! The very issue you raise is on the list of possible tasks to address for the 3.0 release, as per someone else's post (see: http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=6015 ). See bug http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26070 referred to in that post. You should read the discussion there, as it also describes an implementation approach to get around the specific limitation. If you'd like to second the request to get the change in for 3.0, provide additional work-arounds, or add to the discussion, you might look there. A patch to address the issue would be welcome, I'm sure.

>> As for (b), this is again under your control via HttpMethod.getResponseBodyAsStream(). As with (a), you can also invoke HttpClient such that it does cache the entire contents (HttpMethod.getResponseBodyAsString()). In both cases, it is possible to get the behavior that you desire.
> No it is not. Again, think of XXX,000 requests.

I have thought of many requests. I still maintain it is possible. Your argument may be that it requires more coding on your part for it to work well, or that it requires massive disk caching, which could dramatically affect performance. I don't disagree.

>> Connection pooling is only part of the concern. HttpClient supports HTTP 1.1 persistent connections. It doesn't expose the underlying socket's InputStream and OutputStream. If it did, it could not ensure that persistent connections work properly.
> I still don't see the problem. The OutputStream and InputStream can be wrapped so there is no loss of control. Why do you think control would be lost?

We're saying the same thing here. I'm saying they're not exposed, and you're saying they could be wrapped, thus hiding them. Since they are already hidden, your issue would seem to be a problem with *how* they are exposed (or not). Again, comments and feedback or a patch for bug 26070 would be welcome.

-Eric.
Re: streaming request body
On 24 Feb 2004, at 14:36, Stefan Dingfelder wrote:
lots_snipped/
> I am missing something here from both views. Maybe I am wrong but as I understand it, I can provide any InputStream. And that must not be a file on disk (which I dislike also - except for large files or live streams that cannot be put to memory in total) but can be any object in memory. So in case of sending it there should be no problem.. Correct?

Correct. But a *segment* will be held in memory prior to writing to the output stream. For XXX,000 requests I think this is an unreasonable memory overhead. I am looking at avoiding Sun's connection class, as it buffers all of the content prior to writing it to the wire.

-John K
Re: streaming request body
On 24 Feb 2004, at 14:39, Ortwin Glück wrote:
> I am getting a little angry by now. C'mon man, we wrote this baby and we know very well what's possible with it.

Ortwin, if I am wrong just correct me; maybe I just can't explain myself properly.

> So please don't tell us it can not do unbuffered requests. It's as simple as:
>
>   InputStream dataStream = ...; // get this from wherever, use a pipe or something
>   PostMethod method = new PostMethod("/myservlet");
>   method.setRequestContentLength(EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED);
>   method.setRequestBody(dataStream);
>   client.execute(method);

My point here is that if I have X requests then there can be X * CONTENT_LENGTH_CHUNKED bytes in memory at one time.

-John K
Re: streaming request body
lots_snipped/
> Again, comments and feedback or a patch for bug 26070 would be welcome.

Okay, I'll investigate it more and see what I come up with.

-John K
Re: streaming request body
Ortwin Glück schrieb:
lots_snipped/
> "must not be a file on disk" should read "does not have to be a file on disk" (this is a common German mistake). But you are absolutely correct. There is no need to use the disk at all.

Ah, gotcha. Thanks for reading it even though it contains that bug. And yes, you are perfectly right ;)
RE: streaming request body
> But a *segment* will be held in memory prior to writing to the output stream though. For XXX,000 requests I think this is an unreasonable memory overhead.

John,

Just to make sure I understand you correctly: you are saying that your application will be processing XXX,000 requests *concurrently*? What kind of application is it, if I may ask?

Oleg

-Original Message-
From: John Keyes [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 24, 2004 16:59
To: Commons HttpClient Project
Subject: Re: streaming request body

lots_snipped/
Re: streaming request body
John Keyes schrieb:
lots_snipped/
> My point here is that if I have X requests then there can be X * CONTENT_LENGTH_CHUNKED bytes in memory at one time.

That is right for different origins. For streaming the same file you could think of sending one chunk to all before proceeding, but I guess that is not your intention. IMHO it depends upon the data you are sending. If the chunk size is e.g. 1 KB per request, you will need that much memory per connection. This in fact seems not too much for 100,000 users, especially when you are doing user login and so on. Smaller chunks will require less memory but will increase network traffic - not always a good idea. Alternatively, you may implement your own specialized data handling that sends to x connections at a time, so one group gets new data while the others wait (1 KB for each of the first 100 users, then 1 KB for the next 100). It depends upon the time for preparing and managing it all.

Regards,
Stefan Dingfelder
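Stefan's back-of-the-envelope calculation can be written down explicitly. The numbers below (1 KB chunks, 100,000 concurrent users) are the illustrative figures from his post, not measurements:

```java
// Worst-case buffer memory for chunked sending: one in-flight
// chunk buffer per concurrent connection.
public class ChunkMemory {
    public static long worstCaseBytes(int concurrentConnections, int chunkSizeBytes) {
        return (long) concurrentConnections * chunkSizeBytes;
    }
}
```

With 100,000 concurrent connections and 1 KB chunks, the worst case is 100,000 * 1,024 = 102,400,000 bytes, roughly 100 MB, which is why the chunk size versus traffic trade-off matters at that scale.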
Re: streaming request body
On 24 Feb 2004, at 16:22, Kalnichevski, Oleg wrote:
>> But a *segment* will be held in memory prior to writing to the output stream though. For XXX,000 requests I think this is an unreasonable memory overhead.
> Just to make sure I understand you correctly, you are saying that your application will be processing XXX,000 requests *concurrently*? What kind of application is it, if I may ask you?

It's a SOAP processor. We just want to stop using the J2SDK connection class, hide the connection class behind an API (in case other impls come along), and work from there. We need to process around 70,000 requests a minute.

-John K

lots_snipped/
RE: streaming request body
> come along) and work from there - we need to process around 70,000
> requests a minute.

But not concurrently, right? So, the memory overhead is (number of
concurrent connections) * (buffer size). Even if you had 1,000 concurrent
SOAP requests, with a 2 KB buffer you would still end up with 2,048 *
1,000 bytes, about 2 MB in total. Allow me to speculate that even a fairly
modern PDA would be able to afford that amount of memory overhead. Am I
missing something too?

Oleg
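Oleg's back-of-envelope calculation (per-connection buffer size times number of concurrent connections) can be sketched in a couple of lines. The class and method names here are illustrative only, not part of HttpClient:

```java
public class BufferOverhead {
    // Worst-case streaming-buffer memory: one fixed buffer per open connection.
    public static long overheadBytes(int concurrentConnections, int bufferSizeBytes) {
        return (long) concurrentConnections * bufferSizeBytes;
    }

    public static void main(String[] args) {
        // Oleg's example: 1,000 concurrent requests, 2 KB buffer each
        System.out.println(overheadBytes(1000, 2048)); // 2048000 bytes, about 2 MB
    }
}
```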
Re: streaming request body
John Keyes wrote:

>   method.setRequestBody( ... );
>   client.execute( method );
>
> So my question is, is there a way to stream the request body rather
> than having to store the request in memory prior to writing it on the
> wire.

setRequestBody accepts an InputStream. You could use a piped stream to
provide your data. Unfortunately HttpClient cannot give you the
OutputStream of the underlying socket because of the connection pooling.
Please also note that an unbuffered request cannot automatically be
retried.

HTH

Ortwin Glück
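Ortwin's piped-stream suggestion can be sketched as follows. This is a minimal, self-contained illustration using only java.io; the HttpClient calls appear only in comments to show where the library would sit as the consumer, and the class name and SOAP payload are made up for the example:

```java
import java.io.*;

public class PipedBodyDemo {
    // Produce the request body in one thread and consume it from a
    // PipedInputStream in another, so only the pipe's small internal
    // buffer is ever held in memory.
    public static String sendThroughPipe(final String body) throws Exception {
        final PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    out.write(body.getBytes("US-ASCII")); // generate on the fly
                    out.close(); // signals end-of-stream to the reader
                } catch (IOException ignored) {
                }
            }
        });
        producer.start();

        // In real code the consumer would be HttpClient, e.g.:
        //   method.setRequestBody(in);
        //   client.executeMethod(method);
        // Here we simply drain the pipe to show the data arrives intact.
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            received.write(buf, 0, n);
        }
        producer.join();
        return received.toString("US-ASCII");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendThroughPipe("<soap:Envelope/>"));
    }
}
```

Note Ortwin's caveat still applies: a body fed through a pipe cannot be replayed, so such a request cannot automatically be retried.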
RE: streaming request body
John,

HttpClient's entity enclosing methods (POST, PUT) do support content
streaming when (1) the content length does not need to be automatically
calculated or (2) chunk-encoding is used.

Please refer to the following sample applications for details.

Unbuffered post:
http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-commons/httpclient/src/examples/UnbufferedPost.java?content-type=text%2Fplain&rev=1.2.2.1

Chunk-encoded post:
http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-commons/httpclient/src/examples/ChunkEncodedPost.java?content-type=text%2Fplain&rev=1.4.2.1

Hope this helps

Oleg

-----Original Message-----
From: John Keyes [mailto:[EMAIL PROTECTED]]
Sent: Monday, February 23, 2004 13:54
To: [EMAIL PROTECTED]
Subject: streaming request body

Hi,

I notice you have separated out the functions of the connection and the
content creation. So the code must be something like:

  HttpClient client = new HttpClient( url );
  HttpMethod method = new GetMethod();
  method.setRequestHeader( ... );
  ...
  method.setRequestBody( ... );
  client.execute( method );

If I want to send a large attachment and I don't want it all to be in
memory then I can't do it. The issue is that you have to write your data
to the HttpMethod. The HttpMethod doesn't know where to write this data
until you call execute and pass the client which has the connection to
write to. So there isn't really a way around this because of the
separation of the connection from the HttpMethod.

So my question is, is there a way to stream the request body rather than
having to store the request in memory prior to writing it on the wire.

Thanks,
-John K
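As a rough illustration of what Oleg's option (2) means on the wire (this is not HttpClient code - the library does the framing for you): HTTP/1.1 chunked transfer coding frames each piece of the body as a hex length, CRLF, the data, CRLF, terminated by a zero-length chunk, which is what lets a sender stream a body whose total length is unknown up front:

```java
import java.io.*;

public class ChunkedEncodingDemo {
    // Encode data using HTTP/1.1 chunked transfer coding: each chunk is
    // "<size-in-hex>\r\n<data>\r\n", terminated by the zero chunk "0\r\n\r\n".
    public static byte[] chunkEncode(byte[] data, int chunkSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            out.write(Integer.toHexString(len).getBytes("US-ASCII"));
            out.write("\r\n".getBytes("US-ASCII"));
            out.write(data, off, len);
            out.write("\r\n".getBytes("US-ASCII"));
        }
        out.write("0\r\n\r\n".getBytes("US-ASCII")); // end of body
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = chunkEncode("hello world".getBytes("US-ASCII"), 4);
        System.out.print(new String(wire, "US-ASCII"));
    }
}
```

Because each chunk carries its own length, no Content-Length header is needed, so nothing has to be buffered just to count bytes.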
Re: streaming request body
Thanks for the information guys.

-John K

On 23 Feb 2004, at 13:02, Kalnichevski, Oleg wrote:

> HttpClient's entity enclosing methods (POST, PUT) do support content
> streaming when (1) the content length does not need to be automatically
> calculated or (2) chunk-encoding is used.
Re: streaming request body
Guys,

A colleague pointed out to me that this does not in fact resolve the
situation. The solutions pointed out allow me to read the attachment as a
stream. The contents are still held in memory prior to writing it on the
wire. To fully support this you would need access to the OutputStream. If
we could pass a HttpClient to the HttpMethod then we could get access to
the output stream via the getRequestOutputStream method.

I don't understand the connection pooling argument. I thought it should be
a user preference whether to have connection pooling. Any ideas on this?

-John K

On 23 Feb 2004, at 13:02, Kalnichevski, Oleg wrote:

> HttpClient's entity enclosing methods (POST, PUT) do support content
> streaming when (1) the content length does not need to be automatically
> calculated or (2) chunk-encoding is used.
RE: streaming request body
Have a look here:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26070

Thanks
Moh

-----Original Message-----
From: John Keyes [mailto:[EMAIL PROTECTED]]
Sent: Monday, February 23, 2004 3:49 PM
To: Commons HttpClient Project
Subject: Re: streaming request body

Guys,

A colleague pointed out to me that this does not in fact resolve the
situation. The solutions pointed out allow me to read the attachment as a
stream. The contents are still held in memory prior to writing it on the
wire. To fully support this you would need access to the OutputStream.
Re: streaming request body
John,

Two separate questions:

a) sending a large post/put request, without buffering it in memory.
b) reading a large response to a request.

For (a), Oleg's response is correct. You might easily be confused, in the
sense that HttpClient's API inverts the control. It is not that you write
to an OutputStream to send your data, it is that you provide HttpClient
with an InputStream, and it reads that stream and sends the data.
HttpClient is designed to accommodate your concern, and if your
configuration is correct (as per the examples), it will not buffer the
entire contents of your InputStream, but rather read it and send it in
small chunks. As another post points out, you may still have to buffer
what you're sending to *disk*, but not to memory.

As for (b), this is again under your control via
HttpMethod.getResponseBodyAsStream(). As with (a), you can also invoke
HttpClient such that it does cache the entire contents
(HttpMethod.getResponseBodyAsString()).

In both cases, it is possible to get the behavior that you desire.

Connection pooling is only part of the concern. HttpClient supports HTTP
1.1 persistent connections. It doesn't expose the underlying socket's
InputStream and OutputStream. If it did, it could not ensure that
persistent connections work properly.

-Eric.

John Keyes wrote:

> A colleague pointed out to me that this does not in fact resolve the
> situation. The solutions pointed out allow me to read the attachment as
> a stream. The contents are still held in memory prior to writing it on
> the wire. To fully support this you would need access to the
> OutputStream.
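Both of Eric's points come down to the same fixed-buffer copy loop. A minimal sketch, using a ByteArrayInputStream as a stand-in for the stream returned by getResponseBodyAsStream() so the example needs no live connection:

```java
import java.io.*;

public class StreamCopyDemo {
    // Drain an InputStream to an OutputStream 4 KB at a time, so the full
    // body never has to fit in memory at once.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        byte[] fakeResponse = new byte[10000]; // stand-in for a large response body
        long copied = copy(new ByteArrayInputStream(fakeResponse),
                           new ByteArrayOutputStream());
        System.out.println(copied); // 10000
    }
}
```

The per-request memory cost is just the 4 KB buffer, which is the overhead Oleg's calculation earlier in the thread multiplies by the number of concurrent connections.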