All right. Take a look at this output of the segread command:
060803 132735 PARSED? STARTED           FINISHED          COUNT DIR NAME
060803 132735 true    20060717-14:41:58 20060717-14:41:58 1     crawl-legislacao_copia/segments/20060717144154
060803 132735 true    20060717-14:42:03 20060717-14:43:22 77    crawl-legislacao_copia/segments/20060717144201
060803 132735 true    20060717-14:43:29 20060717-15:08:10 1464  crawl-legislacao_copia/segments/20060717144327
060803 132735 true    20060717-15:08:17 20060717-15:11:58 223   crawl-legislacao_copia/segments/20060717150815
060803 132736 true    20060718-09:02:56 20060718-09:03:10 5     crawl-legislacao_copia/segments/20060718090250
060803 132736 true    20060803-10:55:18 20060803-12:53:49 1541  crawl-legislacao_copia/segments/20060803105509
060803 132736 true    20060803-13:07:15 20060803-13:07:20 4     crawl-legislacao_copia/segments/20060803130707
060803 132736 TOTAL: 3315 entries in 7 segments.

My db.default.fetch.interval is 15. Before I ran the recrawl script I had 5 segments (200607*) and the index pointed to 1537 documents. After running the recrawl, 2 new segments were created and the script then indexed everything. When I analyzed the generated index, I saw it had 1541 documents. But as you can see, the 200607* segments are old and can be deleted. So I did this:

rm -rf segments/200607*

Then I got the NPE. Right, I must re-index the 2 remaining segments. I've done this. Then I analyzed the index again and it has only 1417 documents!

My questions: Why does this happen? How can I know which segments can be deleted?

I hope you can help me.

On 8/3/06, Marko Bauhardt <[EMAIL PROTECTED]> wrote:
Hi,

if you delete segments, be sure that you don't have an index built from those segments. The segment contains the parsed content, and the index is the index of this content. If you delete the segment and then run a search on this index, an NPE occurs because no summary (parsed content) is found.

HTH
Marko

On 03.08.2006 at 16:33, Lourival Júnior wrote:

> Why does the search application get a NullPointerException
> when I delete some segments that have reached the
> db.default.fetch.interval? Periodically I have to
> recrawl my site, and deleting old segments is a problem. Does anyone have a
> suggestion?
>
> Regards
>
> --
> Lourival Junior
> Universidade Federal do Pará
> Curso de Bacharelado em Sistemas de Informação
> http://www.ufpa.br/cbsi
> Msn: [EMAIL PROTECTED]
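For reference, the age check applied by hand to the listing above can be sketched as follows. This is only a minimal sketch in Python, not anything shipped with Nutch: the helper names parse_segments and older_than are hypothetical, and the reference time is assumed to be the timestamp of the segread run. It flags segments that finished more than db.default.fetch.interval days ago. Age alone does not prove that every page in an old segment was fetched again, so the result is a list of candidates, not a list of segments that are safe to delete.

```python
from datetime import datetime, timedelta

# Segment rows copied from the segread output in this message.
SEGREAD_OUTPUT = """\
060803 132735 PARSED? STARTED FINISHED COUNT DIR NAME
060803 132735 true 20060717-14:41:58 20060717-14:41:58 1 crawl-legislacao_copia/segments/20060717144154
060803 132735 true 20060717-14:42:03 20060717-14:43:22 77 crawl-legislacao_copia/segments/20060717144201
060803 132735 true 20060717-14:43:29 20060717-15:08:10 1464 crawl-legislacao_copia/segments/20060717144327
060803 132735 true 20060717-15:08:17 20060717-15:11:58 223 crawl-legislacao_copia/segments/20060717150815
060803 132736 true 20060718-09:02:56 20060718-09:03:10 5 crawl-legislacao_copia/segments/20060718090250
060803 132736 true 20060803-10:55:18 20060803-12:53:49 1541 crawl-legislacao_copia/segments/20060803105509
060803 132736 true 20060803-13:07:15 20060803-13:07:20 4 crawl-legislacao_copia/segments/20060803130707
060803 132736 TOTAL: 3315 entries in 7 segments.
"""

TIME_FORMAT = "%Y%m%d-%H:%M:%S"


def parse_segments(text):
    """Hypothetical helper: turn segread rows into dictionaries."""
    segments = []
    for line in text.splitlines():
        # Keep the last five fields, which drops the log timestamp prefix.
        fields = line.split()[-5:]
        if len(fields) != 5 or fields[0] not in ("true", "false"):
            continue  # header, TOTAL line, or anything else
        parsed, started, finished, count, name = fields
        segments.append({
            "parsed": parsed == "true",
            "started": datetime.strptime(started, TIME_FORMAT),
            "finished": datetime.strptime(finished, TIME_FORMAT),
            "count": int(count),
            "name": name,
        })
    return segments


def older_than(segments, now, interval_days):
    """Hypothetical helper: segments finished before now - interval_days."""
    cutoff = now - timedelta(days=interval_days)
    return [s for s in segments if s["finished"] < cutoff]


segments = parse_segments(SEGREAD_OUTPUT)
# Assumption: "now" is the time of the segread run (060803 132735).
now = datetime(2006, 8, 3, 13, 27, 35)
candidates = older_than(segments, now, interval_days=15)

for seg in candidates:
    print(seg["name"], seg["count"])
```

With this listing the five 200607* segments come out as candidates. Together they hold 1770 of the 3315 entries, which is why the index has to be rebuilt from the remaining segments before they are removed.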
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]