Folks,

Have you found any solutions? I am facing the same issue.

Thanks,
YT. Thet


Tomi N/A wrote:
>
> On 9/6/06, Andrei Hajdukewycz <[email protected]> wrote:
>> Another problem I've noticed is that it seems the db grows *rapidly* with
>> each successive recrawl. Mine started at 379MB, and it seems to increase
>> by roughly 350MB every time I run a recrawl, despite there not being
>> anywhere near that many additional pages.
>>
>> This seems like a pretty severe problem, honestly, obviously there's a
>> lot of duplicated data in the segments.
>
> I have the same problem: my index grew from 1.5GB after the original
> crawl to over 5GB(!) after the recrawl...from the looks of it, I might
> as well crawl anew every time. :\
>
> t.n.a.
>

--
View this message in context: http://lucene.472066.n3.nabble.com/Recrawling-tp609663p1915415.html
Sent from the Nutch - User mailing list archive at Nabble.com.

