When run on inexpensive commodity hardware (dual-core desktop-class CPU,
8GB RAM and a fast hard drive), Nutch can usually handle around 20 million
pages per search node. By simple arithmetic, it would take 50 nodes to
serve a 1 billion page index.
This type of setup would likely
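The node count quoted above is a ceiling division of total pages by pages per node. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and the 20 million pages-per-node figure is the rough estimate from the message, not a measured limit.

```java
// Hypothetical helper, not part of Nutch: estimates how many search
// nodes are needed for an index of a given size.
final class NodeEstimate {

    // Ceiling division: the last, partially filled node still counts as one.
    static long nodesNeeded(long totalPages, long pagesPerNode) {
        if (totalPages < 0 || pagesPerNode <= 0) {
            throw new IllegalArgumentException("totalPages must be >= 0 and pagesPerNode > 0");
        }
        return (totalPages + pagesPerNode - 1) / pagesPerNode;
    }

    public static void main(String[] args) {
        long totalPages = 1_000_000_000L; // 1 billion page index
        long pagesPerNode = 20_000_000L;  // assumed capacity from the message above
        System.out.println(nodesNeeded(totalPages, pagesPerNode) + " nodes");
    }
}
```

Rounding up matters once the index is not an exact multiple of the per-node capacity: 1,000,000,001 pages would need 51 nodes under the same assumption.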
Laurent Laborde wrote:
On Fri, Dec 26, 2008 at 8:51 AM, buddha1021 buddha1...@yahoo.cn wrote:
Thank you very much!
But in this case, how much bandwidth (both for people to reach Nutch over
the internet to search, and for Nutch to fetch pages) would
You're going to be using a lot of bandwidth; start talking terabytes, not
gigabytes.
I'm not sure what type of connections you have available in terms of
capacity and price viability, but you're almost certain to want this type
of setup placed in a data centre.
Try to find a provider that offers
Hi Dennis:
In your opinion, which is the most important reason for Google's fast
search speed:
1. Google's program (code) is excellent, or
2. Google puts all the indexes into RAM?
Which is the most important reason?
And if Nutch puts all the indexes into RAM, can Nutch's search
it.
From: Dennis Kubes ku...@apache.org
To: nutch-user@lucene.apache.org
Sent: Thursday, January 8, 2009 10:22:09 PM
Subject: Re: Search performance for large indexes (100M docs)
buddha1021 wrote:
Hi Dennis:
In your opinion, which is the most
Hi Sean Dean-3:
“I have one index just above 20 million that takes up about 29GB in space.”
That's great! The difficulty for me is the size of the indexes; they are
too large. If the index for 20 million pages is only ~30GB, the difficulty
can be solved!
If your idea becomes a reality, search 100 million
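The quoted figure (about 29GB for just over 20 million pages) can be turned into a rough per-page size and a projection for 100 million pages. The sketch below assumes index size grows linearly with page count and that 1GB means 10^9 bytes; both are simplifying assumptions, and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of Nutch: linear extrapolation of index size.
final class IndexSizeEstimate {

    static final double BYTES_PER_GB = 1_000_000_000.0; // assumes decimal gigabytes

    // Average index bytes per page for a measured index.
    static double bytesPerPage(double indexGb, long pages) {
        if (pages <= 0) {
            throw new IllegalArgumentException("pages must be > 0");
        }
        return indexGb * BYTES_PER_GB / pages;
    }

    // Projected index size in GB, assuming size scales linearly with page count.
    static double projectedGb(double measuredGb, long measuredPages, long targetPages) {
        return bytesPerPage(measuredGb, measuredPages) * targetPages / BYTES_PER_GB;
    }

    public static void main(String[] args) {
        double measuredGb = 29.0;         // figure quoted in the message above
        long measuredPages = 20_000_000L; // "just above 20 million", rounded down
        System.out.printf("~%.0f bytes per page%n", bytesPerPage(measuredGb, measuredPages));
        System.out.printf("~%.0f GB for 100 million pages%n",
                projectedGb(measuredGb, measuredPages, 100_000_000L));
    }
}
```

Under these assumptions the index comes to roughly 1.5KB per page and about 145GB for 100 million pages; real indexes may deviate depending on stored fields and page content.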
Hi:
Does Nutch use the J2SE SDK or the J2EE SDK? Sun's or IBM's?
And which server is best for Nutch: Tomcat, WebSphere, or another?
Thank you!
--
View this message in context:
http://www.nabble.com/nutch-jdk%EF%BC%9F-tp21908182p21908182.html
Sent from the Nutch - User mailing list archive at
Sami Siren-2 wrote:
Dennis Kubes wrote:
JDK 1.5 or better; I am currently on Sun JDK 1.6. For the webapp we use
Tomcat, but it should run on any JSP/servlet container, WebSphere included.
I think you need 1.6 now (for trunk) since we use Hadoop 0.19.
--
Sami Siren
which sdk will be
Sami Siren-2 wrote:
buddha1021 wrote:
Sami Siren-2 wrote:
Dennis Kubes wrote:
JDK 1.5 or better; I am currently on Sun JDK 1.6. For the webapp we use
Tomcat, but it should run on any JSP/servlet container, WebSphere included.
I think you need 1.6 now (for trunk) since we
Hi:
How do I build a cluster to search the web with Nutch?
Is there any documentation?
Thank you!
--
View this message in context:
http://www.nabble.com/How-to-build-clusters--tp22020673p22020673.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Hi:
How many KB does the index take per page, on average?
And when building a distributed search cluster, should each node be a 1U
server or an ordinary PC of the kind people use daily with Windows? Which
gives the best performance?
Hi:
Is there any web search engine based on Nutch?
I mean a general web search engine like Google, not a vertical search
engine.
If anyone knows of any, please list them!
I would like to have a look at them.
If there are none, what is the reason?
Thank you!
Hi:
Does Nutch have More Like This and Spell Checking functions?
In my opinion these two functions are very important for a search engine.
If not, who can add the patches? That would be great!
Thanks very much!
Hi:
You should change it to this:
1: import org.apache.nutch.parse.ParseResult;
return new ParseStatus(ParseStatus.FAILED, ParseStatus.FAILED_EXCEPTION,
    e.toString()).getEmptyParseResult(content.getUrl(), getConf());
ParseResult