Hey guys,

One more addition: we're not using DFS. We have a single XP box with NTFS (so
no distributed index).

Hope this helps, greetings..
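
Since the OutOfMemoryError quoted below points at the JVM heap, one quick
check is how much heap a JVM actually gets with a given set of options. This
is only a minimal sketch: the class name HeapCheck and its helper methods are
made up for illustration and are not part of Nutch or Tomcat. It uses only
the standard java.lang.Runtime calls.

```java
public class HeapCheck {

    /** Converts a byte count to whole mebibytes (rounded down). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Describes the heap limits of the JVM this code is running in. */
    public static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        return "max heap: " + toMiB(rt.maxMemory()) + " MiB, "
             + "currently allocated: " + toMiB(rt.totalMemory()) + " MiB, "
             + "free within allocated: " + toMiB(rt.freeMemory()) + " MiB";
    }

    public static void main(String[] args) {
        // Prints the limits for whatever -Xmx/-Xms this JVM was started with.
        System.out.println(describeHeap());
    }
}
```

Running it with the same heap options as the search webapp (for example
"java -Xmx1024m HeapCheck", where the 1024m value is just a placeholder)
shows the ceiling the searcher would have to fit under. For Tomcat started
from its scripts, those options are usually passed through JAVA_OPTS or
CATALINA_OPTS.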


JoostRuiter wrote:
> 
> OK, thanks for all your input, guys! I'll discuss this with my co-worker.
> Dennis, what more information do you need?
> 
> Thanks everyone!
> 
> 
> Briggs wrote:
>> 
>> One more thing...
>> 
>> Are you using a distributed index?  If this is so, you do not want to
>> do this; indexes should be local to the machine that is being
>> searched.
>> 
>> On 4/23/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>>> Without more information, this sounds like the nutch-site.xml file of
>>> your Tomcat search webapp is set up to use the DFS rather than the local
>>> file system.  Remember that job processing occurs on the DFS, but for
>>> searching, indexes are best moved to the local file system.
>>>
>>> Dennis Kubes
>>>
>>> JoostRuiter wrote:
>>> > Hi All,
>>> >
>>> > First off, I'm quite the noob when it comes to Nutch, so don't bash
>>> > me if the following is an enormously stupid question.
>>> >
>>> > We're using Nutch on a P4 Duo Core system (800 MHz FSB) with 4 GB RAM
>>> > and a 500 GB SATA (3 Gbit/s) HD. We indexed 350,000 pages into 1
>>> > segment of 15 GB.
>>> >
>>> >
>>> > Performance is really poor: when we do get search results, it takes
>>> > multiple minutes. With longer queries we get the following:
>>> >
>>> > "java.lang.OutOfMemoryError: Java heap memory"
>>> >
>>> > What we have tried to improve on this:
>>> > - Slice the segments into smaller chunks (max: 50,000 URLs per segment)
>>> > - Set io.map.index.skip to 8
>>> > - Set indexer.termIndexInterval to 1024
>>> > - Cluster with Hadoop (4 nodes to search)
>>> >
>>> > Any ideas? Missing information? Please let me know, this is my
>>> > graduation internship and I would really like to get a good grade ;)
>>>
>> 
>> 
>> -- 
>> "Conscious decisions by conscious minds are what make reality real"
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Perfomance-problems-and-segmenting-tf3631982.html#a10155864
Sent from the Nutch - Dev mailing list archive at Nabble.com.

