Hi Tomi,

On 10/13/06, Tomi NA <[EMAIL PROTECTED]> wrote:
Guruprasad, please use "reply-all" so your messages end up on the list as well. As far as ntlmaps is concerned, you can read about it here http://ntlmaps.sourceforge.net/ or download it here http://sourceforge.net/project/showfiles.php?group_id=69259&package_id=68110&release_id=303755. If you're using Linux, chances are all you need to do is issue a command like "emerge ntlmaps" or "apt-get install ntlmaps". Read the ntlmaps documentation on how to set it up, or just follow the comments in its config file: /etc/ntlmaps/server.cfg.
From internal tests with ntlmaps + Nutch, the conclusion we came to was that though it "kinda-works", it puts a huge load on the Nutch server: ntlmaps is a major memory hog, and the mixture of the two leads to performance issues. For a PoC this will do, but for production deployments I would not suggest going the ntlmaps way. An alternative would be to have a separate ntlmaps server, a dedicated machine acting as the NTLM proxy for the Nutch box which sits behind it. The right way would be to use the built-in authentication features of Nutch for auth-based crawling.

-Toufeeq

--
blog @ http://toufeeq.net
