Hi John,

Let me clean up the code a bit and then post it.  If my understanding of the HTTP protocol
is correct, the client requests a resource and, if credentials are required, the server
returns a 401 with one or more WWW-Authenticate headers.  If the client can authenticate, it
re-requests the page with credentials.  I'm guessing the client could supply credentials on
the first request, but that may be too soon, since it does not yet know the proper
authentication type and/or realm.
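
To make the 401 step concrete, here is a minimal sketch of reading the challenges out of
the WWW-Authenticate header values before deciding how to re-request.  AuthChallenge is a
hypothetical name, not an existing Nutch class, and the sketch assumes each header value
carries exactly one challenge (a server may legally pack several into one value, which
this does not handle).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Nutch): one parsed WWW-Authenticate challenge. */
final class AuthChallenge {
    // Matches realm="..." anywhere in the header value, ignoring case of the parameter name.
    private static final Pattern REALM =
        Pattern.compile("realm\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    final String scheme; // e.g. "Basic" or "Digest"
    final String realm;  // null when the value has no quoted realm parameter

    AuthChallenge(String scheme, String realm) {
        this.scheme = scheme;
        this.realm = realm;
    }

    /** Parses one header value; assumes a single challenge per value. */
    static AuthChallenge parse(String headerValue) {
        String trimmed = headerValue.trim();
        int space = trimmed.indexOf(' ');
        // The scheme is the first token; everything after it is parameters.
        String scheme = space < 0 ? trimmed : trimmed.substring(0, space);
        Matcher m = REALM.matcher(trimmed);
        return new AuthChallenge(scheme, m.find() ? m.group(1) : null);
    }

    /** Parses every WWW-Authenticate value from a 401 response, keeping server order. */
    static List<AuthChallenge> parseAll(List<String> headerValues) {
        List<AuthChallenge> out = new ArrayList<>();
        for (String value : headerValues) {
            out.add(parse(value));
        }
        return out;
    }
}
```

The client would then pick whichever challenge it has credentials for and repeat the
request with the matching Authorization header.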

I do think the authentication information can be keyed by host, by realm, or by a
combination of the two, but realm is probably what we should target first (although we
should build the code so that it can handle either).  The benefit of a realm is that it
can cover multiple hosts, if an organization has set up its servers that way.
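
As a rough illustration of "handle either", the lookup could try the most specific key
first and fall back to the realm alone.  This is only a sketch under my own assumptions:
Credentials and CredentialStore are hypothetical names, not what I will necessarily post,
and passwords are kept as plain strings purely for brevity.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical value class: a user name and password pair. */
final class Credentials {
    final String user;
    final String password;

    Credentials(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

/** Hypothetical store: credentials keyed by realm, optionally narrowed to one host. */
final class CredentialStore {
    private final Map<String, Credentials> byRealm = new HashMap<>();
    private final Map<String, Credentials> byHostAndRealm = new HashMap<>();

    /** Registers credentials that apply to a realm on any host. */
    void addForRealm(String realm, Credentials credentials) {
        byRealm.put(realm, credentials);
    }

    /** Registers credentials that apply to a realm on one specific host. */
    void addForHost(String host, String realm, Credentials credentials) {
        byHostAndRealm.put(key(host, realm), credentials);
    }

    /** Returns the host-specific entry if present, else the realm-wide one, else null. */
    Credentials lookup(String host, String realm) {
        Credentials specific = byHostAndRealm.get(key(host, realm));
        return specific != null ? specific : byRealm.get(realm);
    }

    // Host names are case-insensitive; realm strings are compared exactly.
    private static String key(String host, String realm) {
        return host.toLowerCase(Locale.ROOT) + "\n" + realm;
    }
}
```

With this shape a single realm-wide entry serves every host that advertises that realm,
and a host-specific entry overrides it only where one has been registered.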

I'll try to post the code today.

Matt



                                                                                       
                                                                           
[EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: [Nutch-dev] Http Protocol
07/06/2004 11:59 PM
Please respond to dev




Matt,

On Fri, Jul 02, 2004 at 10:37:32PM -0700, Matt Tencati wrote:
> I've been interested in using Nutch in a corporate environment where most content
> requires authentication.  I've begun implementing the changes required to include an
> HttpAuthentication set of interfaces and classes in order to support this (my initial
> plan is to key off realms).  However, I have found an issue in the implementation of
> Content.java (and subclasses) which may not make this process as clean as possible.
> The metadata information is stored via Properties, which implements Map and so allows
> only a single value for a given key.  Authentication allows for multiple
> WWW-Authenticate headers so that the client can create a new request and choose any of
> the given challenges as the method to authenticate.
>
> I have reviewed the HTTP protocol (RFC 1945) and it does allow for multiple headers
> using the same name, which makes me think that there may be other headers (now or in
> the future) that would require multiple values.  I have created a class called
> MultipleProperties which will handle this; however, it breaks the contract of the Map
> interface.

Hi, Matt,

Could you post your code? It does not have to be a working patch.
It will be easier to discuss.

One other thing: is this HttpAuthentication information stored in the metadata
of every page? If so, might it be redundant for a large site? Does it need to be
host-based or realm-based?

John


-------------------------------------------------------
This SF.Net email sponsored by Black Hat Briefings & Training.
Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
digital self defense, top technical experts, no vendor pitches,
unmatched networking opportunities. Visit www.blackhat.com
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers




