Hi Tomi,
On 10/22/06, Tomi NA <[EMAIL PROTECTED]> wrote:
Toufeeq, could you say anything more on the topic of nutch in-built
NTLM authentication support?
My work has been limited to 0.7.X version of nutch. Below are some of
my findings..
The file
src/plugin/protocol-httpclient/src/java/org/
Oops..
Sorry about the mail below. Did not know reply-to munging was being done. :)
-Toufeeq
On 10/30/06, Toufeeq Hussain <[EMAIL PROTECTED]> wrote:
dude..
You got Nutch working with NTLM ?
-Toufeeq
On 10/12/06, Guruprasad Iyer <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I need to know how to craw
dude..
You got Nutch working with NTLM ?
-Toufeeq
On 10/12/06, Guruprasad Iyer <[EMAIL PROTECTED]> wrote:
Hi,
I need to know how to crawl (intranet) sites which require authentication.
One suggestion was that I replace protocol-http with protocol-httpclient in
the value field of plugin.includ
2006/10/14, Tomi NA <[EMAIL PROTECTED]>:
2006/10/14, Toufeeq Hussain <[EMAIL PROTECTED]>:
> From internal tests with ntlmaps + Nutch the conclusion we came to was
> that though it "kinda-works" it puts a huge load on the Nutch server
> as ntlmaps is a major memory-hog and the mixture of the two
Yeah seriously - if NTLM auth (or HTTP Basic for that matter) is supported
natively by Nutch, I'd love to read the documentation on it!
-- Jim
On 10/14/06, Tomi NA <[EMAIL PROTECTED]> wrote:
2006/10/14, Toufeeq Hussain <[EMAIL PROTECTED]>:
> From internal tests with ntlmaps + Nutch the conclu
2006/10/14, Toufeeq Hussain <[EMAIL PROTECTED]>:
From internal tests with ntlmaps + Nutch the conclusion we came to was
that though it "kinda-works" it puts a huge load on the Nutch server
as ntlmaps is a major memory-hog and the mixture of the two leads to
performance issues. For a PoC this wil
Hi Tomi,
On 10/13/06, Tomi NA <[EMAIL PROTECTED]> wrote:
Guruprasad,
please use "reply-all" so your messages end up on the list as well. As
far as ntlmaps is concerned, you can read about it here
http://ntlmaps.sourceforge.net/ or download it here
http://sourceforge.net/project/showfiles.php?gro
2006/10/13, Guruprasad Iyer <[EMAIL PROTECTED]>:
Hi Tomi,
"using a ntlmaps proxy"
How do I get this proxy?
"You tell nutch to use the proxy and you provide the proxy with adequate
access privileges."
How do I do this? Can you elaborate?
I am a new Nutch user and am very much in the learning p
Switching from protocol-http to protocol-httpclient will help in
crawling secured sites (https).
If your site supports HTTP Basic authentication, then you can modify
the HTTP class in the protocol-httpclient plugin.
These are minor changes in the configureClient method:
client.getParams().setAu
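The preview above is cut off before the actual change is shown, so the plugin edit itself cannot be reproduced here. As a rough illustration of what HTTP Basic authentication amounts to on the wire, here is a minimal sketch using only the Java standard library. It is not the protocol-httpclient code; the class and method names are made up for this example, and the credentials are the well-known sample pair from the HTTP authentication RFC.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Nutch or protocol-httpclient.
class BasicAuthHeader {
    // Builds the value of the HTTP "Authorization" header for Basic auth:
    // the literal "Basic " followed by base64(user + ":" + password).
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Sample credentials only; a real crawler would read them from config.
        System.out.println("Authorization: "
                + basicAuthHeader("Aladdin", "open sesame"));
    }
}
```

Because the password is only base64-encoded, not encrypted, Basic authentication is normally paired with https on anything but a trusted intranet.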
2006/10/12, Guruprasad Iyer <[EMAIL PROTECTED]>:
Hi,
I need to know how to crawl (intranet) sites which require authentication.
One suggestion was that I replace protocol-http with protocol-httpclient in
the value field of plugin.includes tag in the nutch-default.xml file.
However, this did not
Standard community response: it's not built in, but you could write an
extension!
(I asked this myself a few months back).
-- Jim
On 10/12/06, Guruprasad Iyer <[EMAIL PROTECTED]> wrote:
Hi,
I need to know how to crawl (intranet) sites which require authentication.
One suggestion was that I r
Hi,
I need to know how to crawl (intranet) sites which require authentication.
One suggestion was that I replace protocol-http with protocol-httpclient in
the value field of plugin.includes tag in the nutch-default.xml file.
However, this did not solve the problem.
Can you help me out on this? Th
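For readers trying the plugin.includes swap described above, a small sketch of how such a value selects plugins may help. It assumes the value is a regular expression that has to match a whole plugin id, which the thread itself does not state. The plugin.includes string below is a shortened, hypothetical value, and the class name is invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical check, not Nutch code.
class PluginIncludesCheck {
    // Returns true if the plugin id is selected by the plugin.includes value,
    // assuming the value is a regex that must match the entire id.
    static boolean isIncluded(String pluginIncludes, String pluginId) {
        return Pattern.compile(pluginIncludes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Shortened, illustrative value after replacing protocol-http
        // with protocol-httpclient.
        String includes =
                "protocol-httpclient|urlfilter-regex|parse-(text|html)|index-basic";
        String[] ids = {"protocol-http", "protocol-httpclient", "parse-html"};
        for (String id : ids) {
            System.out.println(id + " -> " + isIncluded(includes, id));
        }
    }
}
```

Under that assumption the swap only changes which protocol plugin is loaded. It does not by itself supply credentials, which would be consistent with the report that the change alone did not solve the problem.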
12 matches