re's a place to vote to suggest that compiled versions still be
distributed, I vote for that.
Thanks.
\dmc
--
require a different command-line?
Thanks.
\dmc
--
ct.
\dmc
PS: The robots.txt file shouldn't have any mention of a sitemap,
except possibly to include the URL.
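For illustration only, here is a small Python sketch of a robots.txt whose single sitemap reference is the URL, read back with the standard library's `urllib.robotparser`. The host `example.com`, the paths, and the helper name `sitemap_urls` are placeholders I made up, not anything from the thread, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: example.com and the paths are placeholders.
# The only sitemap-related line is the Sitemap directive carrying the URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: http://example.com/sitemap.xml
"""


def sitemap_urls(robots_txt):
    """Return the sitemap URLs listed in a robots.txt body (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemap_urls(ROBOTS_TXT))
```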
--
here) to
the IP address where Nutch is running and the regular one to all
other IP addresses.
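A rough sketch of that kind of kludge, assuming the web server can call a small function per request. The address, the two robots.txt bodies, and the name `robots_for` are all invented for illustration. The thread does not say what either file contains.

```python
# Hypothetical values: neither the address nor the rules come from the thread.
NUTCH_HOST_IP = "192.0.2.10"  # documentation-range address standing in for the Nutch machine

SPECIAL_ROBOTS = "User-agent: *\nDisallow:\n"    # placeholder body for the Nutch host
REGULAR_ROBOTS = "User-agent: *\nDisallow: /\n"  # placeholder body for everyone else


def robots_for(client_ip):
    """Return the robots.txt body to serve to the given client address."""
    if client_ip == NUTCH_HOST_IP:
        return SPECIAL_ROBOTS
    return REGULAR_ROBOTS
```

A real setup would more likely do this in the web server's own configuration. The function only shows the branching on the client address.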
There may be other kludges available.
Hope this helps.
\dmc
--
user-agents in the
http.robots.agents tag with an asterisk (*), i.e.:
<property>
  <name>http.robots.agents</name>
  <value>my-robot,*</value>
</property>
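To show why the trailing asterisk matters, here is a minimal Python sketch of reading such a comma-separated agent list and choosing a robots.txt section. It is not Nutch's actual parser. The helper names and the rule that earlier names take precedence are my assumptions for the example.

```python
def parse_agents(value):
    """Split a comma-separated agent list into lower-cased names, keeping order."""
    return [name.strip().lower() for name in value.split(",") if name.strip()]


def pick_section(agents, sections):
    """Return the first agent name, in list order, that has a robots.txt section.

    `sections` maps a lower-cased User-agent value to its rules.
    Assumption: names earlier in the list take precedence over later ones.
    """
    for name in agents:
        if name in sections:
            return name
    return None


if __name__ == "__main__":
    agents = parse_agents("my-robot,*")
    # A site that only defines a wildcard section still matches via the trailing "*".
    print(pick_section(agents, {"*": ["Disallow: /private/"]}))
```

Without the `*` entry, a site that only has a `User-agent: *` section would match nothing in this sketch.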
Hope this helps.
\dmc
--
09-09-09 15:46:58,659 INFO fetcher.Fetcher - -activeThreads=0
Thank you in advance,
bye,
Kranthi Reddy. B
--
*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+
David M. Coled...@colegroup.com
Editor & Publisher, NewsInc. <http://newsinc.net>V: (650) 557-2993
Consultant: The Cole Group <http://colegroup.com/> F: (650) 475-8479
*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+
--
which I just pass
along.
I hope my humble little effort helps someone.
\dmc
--
Peter Wang tutorial should work fine, though you do need to have
Java 1.6 installed, as the Hadoop routines require it.
\dmc
--
ports Java 1.6 on 10.5 Intel.
\dmc
PS: And I just used the standard Peter Wang tutorial for installing
Nutch on a Mac; just figure on using Terminal rather than Cygwin.
--
d, you can find an installer.
\dmc
--
Thanks.
\dmc
--
722.
Thanks.
\dmc
--
ine up?
Alternately, is there a way to get basic HTTP authorization without
using httpclient-auth?
Your thoughts would be appreciated.
Thanks.
\dmc
--