Re: ATTN Open-source projects using HttpClient

stack Thu, 30 Sep 2004 09:17:33 -0700

Oleg Kalnichevski wrote:

We have already had a few reports of semantic incompatibilities between IBM JSSE and Sun JSSE. It appears that the IBM JSSE implementation, unlike Sun's, does not like attempts to set socket parameters when the socket is closed. I believe it is clearly a bug in IBM JSSE, but we can think about working around it in HttpClient.

That'd be grand if it's possible (we like the IBM JVMs' speed and more detailed thread dumps). We used to subclass HttpClient so we could do the below, moving the setting of the socket timeout until after the open. HttpClient 3.0 now seemingly sets the timeout, etc., after the open, so our subclass is no longer necessary (Hurray!).

           // HERITRIX: Moved this timeout to after connection.open.
           // connection.setSoTimeout(soTimeout);

           if (!connection.isOpen()) {
               connection.setConnectionTimeout(connectionTimeout);
               connection.open();
               // HERITRIX: Move socket timeout here.  It used to be done
               // before connection.open.
               connection.setSoTimeout(soTimeout);

...

+                inputStream = httpRecorder.inputWrap((InputStream)
+                    (new BufferedInputStream(socket.getInputStream(),
+                    inbuffersize)));
+                outputStream = httpRecorder.outputWrap((OutputStream)
+                    (new BufferedOutputStream(socket.getOutputStream(),
+                    outbuffersize)));
+            }
+            // END HERITRIX change.
+

What exactly does httpRecorder do? We could probably think of a less intrusive way of getting the same thing done.

HttpRecorder duplicates all bytes sent and received to files on disk.  It wraps the 
(buffered) socket streams with input/output streams that do the duplication.  
Subsequently, the file is fed to a set of processors to do with as they will.  Link 
extraction is the main task performed by the processors.

We need to record what was sent over the wire preserving order and all bytes sent back 
and forth (We're trying to archive the web).  If there's a less intrusive way of 
getting what we need, we'd love to hear of it.
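
To make the wrapping concrete, here is a minimal sketch of the tee-style duplication described above. It is not the actual HttpRecorder code: the class name RecordingStreams, the nested stream classes, and the inputWrap/outputWrap signatures shown here are assumptions for illustration, and the record sink is any OutputStream (a FileOutputStream would give the on-disk copy). Note that two separate record sinks do not by themselves capture how requests and responses interleave.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for HttpRecorder; all names here are assumptions.
class RecordingStreams {

    /** Copies every byte read from the wrapped stream into a record sink. */
    static final class RecordingInputStream extends FilterInputStream {
        private final OutputStream record;

        RecordingInputStream(InputStream in, OutputStream record) {
            super(in);
            this.record = record;
        }

        @Override
        public int read() throws IOException {
            int b = in.read();
            if (b != -1) {
                record.write(b);
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            if (n > 0) {
                record.write(buf, off, n);
            }
            return n;
        }

        // Disable mark/reset so replayed bytes are never recorded twice.
        // Bytes passed over with skip() are not recorded in this sketch.
        @Override
        public boolean markSupported() {
            return false;
        }
    }

    /** Copies every byte written to the wrapped stream into a record sink. */
    static final class RecordingOutputStream extends FilterOutputStream {
        private final OutputStream record;

        RecordingOutputStream(OutputStream out, OutputStream record) {
            super(out);
            this.record = record;
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            record.write(b);
        }

        @Override
        public void write(byte[] buf, int off, int len) throws IOException {
            out.write(buf, off, len);
            record.write(buf, off, len);
        }
    }

    static InputStream inputWrap(InputStream in, OutputStream record) {
        return new RecordingInputStream(in, record);
    }

    static OutputStream outputWrap(OutputStream out, OutputStream record) {
        return new RecordingOutputStream(out, record);
    }

    public static void main(String[] args) throws IOException {
        byte[] response = "HTTP/1.1 200 OK\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        InputStream in = inputWrap(new ByteArrayInputStream(response), received);
        while (in.read() != -1) {
            // Drain the stream; each byte is copied into 'received' as it is read.
        }
        System.out.println(received.size() + " bytes recorded");
    }
}
```

Because the wrappers only see bytes that pass through read and write, they can sit on top of the buffered socket streams without the HTTP code above them needing to know about the recording.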

...

+                value = new StringBuffer(line.substring(colon + 1).trim());
             }
-            name = line.substring(0, colon).trim();
-            value = new StringBuffer(line.substring(colon + 1).trim());
+            // END HERITRIX change.
         }

This is a known problem. Basically, it appears there's no one right way
to parse the HTTP status line and headers that fits all types of
applications. Our plan is to provide a plugin mechanism for custom HTTP
parsers in version 4:
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25468

Sounds good.

Thanks again for the software.
Yours,
St.Ack

Cheers,

Oleg

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
