Hey guys, one more addition: we're not using DFS. We've got a single XP box with NTFS (so no distributed index).
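For anyone who wants to double-check the same thing on their own setup, this is roughly what a local-file-system search configuration looks like in the nutch-site.xml that Tomcat picks up. It is only a sketch, not a copy of our actual file: the searcher.dir path is a made-up placeholder, and I'm assuming the property names fs.default.name and searcher.dir as they appear in the default config files of this Nutch/Hadoop generation.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- "local" selects the local file system instead of DFS -->
  <property>
    <name>fs.default.name</name>
    <value>local</value>
  </property>
  <!-- Hypothetical path: root of the crawl directory holding the index and segments -->
  <property>
    <name>searcher.dir</name>
    <value>C:/nutch/crawl</value>
  </property>
</configuration>
```

If fs.default.name pointed at a namenode host:port instead, searches would go through DFS, which is the situation Dennis describes below.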
Hope this helps, greetings.


JoostRuiter wrote:
>
> Ok thanks for all your input guys! I'll discuss this with my co-worker.
> Dennis, what more information do you need?
>
> Thanks everyone!
>
>
> Briggs wrote:
>>
>> One more thing...
>>
>> Are you using a distributed index? If this is so, you do not want to
>> do this; indexes should be local to the machine that is being
>> searched.
>>
>> On 4/23/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>>> Without more information this sounds like your tomcat search
>>> nutch-site.xml file is setup to use the DFS rather than the local file
>>> system. Remember that processing jobs occurs on the DFS but for
>>> searching, indexes are best moved to the local file system.
>>>
>>> Dennis Kubes
>>>
>>> JoostRuiter wrote:
>>> > Hi All,
>>> >
>>> > First off, I'm quite the noob when it comes to Nutch, so don't bash me if
>>> > the following is an enormously stupid question.
>>> >
>>> > We're using Nutch on a P4 Duo Core system (800mhz fsb) with 4gig RAM and a
>>> > 500gig SATA (3gig/sec) HD. We indexed 350 000 pages into 1 segment of 15gig.
>>> >
>>> > Performance is really poor, if we do get search results it will take
>>> > multiple minutes. When the query is longer we are getting the following:
>>> >
>>> > "java.lang.OutOfMemoryError: Java heap memory"
>>> >
>>> > What we have tried to improve on this:
>>> > - Slice the segments into smaller chunks (max: 50000 url/per seg)
>>> > - Set io.map.index.skip to 8
>>> > - Set indexer.termIndexInterval to 1024
>>> > - Cluster with Hadoop (4 nodes to search)
>>> >
>>> > Any ideas? Missing information?
>>> > Please let me know, this is my graduation
>>> > internship and I would really like to get a good grade ;)
>>>
>>
>>
>> --
>> "Conscious decisions by conscious minds are what make reality real"
>>
>>
>
>

--
View this message in context: http://www.nabble.com/Perfomance-problems-and-segmenting-tf3631982.html#a10155864
Sent from the Nutch - Dev mailing list archive at Nabble.com.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers