If I understand correctly, You have situation where you have a large main index and then you create small indexes and finally merge to the main index. It can happen that half way through merging, the system crashed and the index got corrupted. I do not think in this case you can use my solution.
What I am trying to do is to remove a corrupt segment and associated files from the index folder, not trying to fix a corrupt segment. This way atleast I can add new documents to the index. Of cource I am sure I didn't loose anything because my index file size was actually 0 bytes. Thanks, George --- Karthik N S <[EMAIL PROTECTED]> wrote: > Hi > > George > > Do u think ,the same would work for MERGED > Indexes.... > Please Can u suggest a solution. > > > Karthik > > -----Original Message----- > From: Honey George [mailto:[EMAIL PROTECTED] > Sent: Thursday, August 19, 2004 2:08 PM > To: Lucene Users List > Subject: RE: Restoring a corrupt index > > > This is what I did. > > There are 2 classes in the lucene source which are > not > public and therefore cannot be accessed from outside > the package. The classes are > 1. org.apache.lucene.index.SegmentInfos > - collection of segments > 2. org.apache.lucene.index.SegmentInfo > -represents a sigle segment > > I took these two files and moved to a separate > folder. > Then created a class with the following code > fragment. > > public void displaySegments(String indexDir) > throws Exception > { > Directory dir = > (Directory)FSDirectory.getDirectory(indexDir, > false); > SegmentInfos segments = new SegmentInfos(); > segments.read(dir); > > StringBuffer str = new StringBuffer(); > int size = segments.size(); > str.append("Index Dir = " + indexDir ); > str.append("\nTotal Number of Segments " + > size); > > str.append("\n--------------------------------------"); > for(int i=0;i<size;i++) > { > str.append("\n"); > str.append((i+1) + ". "); > > str.append(((SegmentInfo)segments.get(i)).name); > } > > str.append("\n--------------------------------------"); > > System.out.println(str.toString()); > } > > > public void deleteSegment(String indexDir, > String > segmentName) > throws Exception > { > Directory dir = > (Directory)FSDirectory.getDirectory(indexDir, > false); > SegmentInfos segments = new SegmentInfos(); > segments.read(dir); > > int size = segments.size(); > String name = null; > boolean found = false; > for(int i=0;i<size;i++) > { > name = > ((SegmentInfo)segments.get(i)).name; > if (segmentName.equals(name)) > { > found = true; > segments.remove(i); > System.out.println("Deleted the > segment with name " + name > + "from the segments file"); > break; > } > } > if (found) > { > segments.write(dir); > } > else > { > System.out.println("Invalid segment > name: > " + segmentName); > } > } > > Use the displaySegments() method to display the > segments and deleteSegment to delete the corrupt > segment. > > Thanks, > George > > --- Karthik N S <[EMAIL PROTECTED]> wrote: > > Hi Guys.... > > > > > > In Our Situation we would be indexing Million > & > > Millions of Information > > documents > > > > with Huge Giga Bytes of Data Indexed and > > finally would be put into a > > MERGED INDEX, Categorized accordingly. > > > > There may be a possibility of Corruption, So > > Please do post the code > > reffrals.... > > > > > > Thx > > Karthik > > > > > > -----Original Message----- > > From: Honey George [mailto:[EMAIL PROTECTED] > > Sent: Wednesday, August 18, 2004 5:51 PM > > To: Lucene Users List > > Subject: Re: Restoring a corrupt index > > > > > > Thanks Erik, that worked. I was able to remove the > > corrupt index and now it looks like the index is > OK. > > I > > was able to view the number of documents in the > > index. > > Before that I was getting the error, > > java.io.IOException: read past EOF > > > > I am yet to find out how my index got corrupted. > > There > > is another thread going on about this topic, > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg03165.html > > > > If anybody is facing similar problem and is > > interested > > in the code I can post it here. > > > > Thanks, > > George > > > > > > > > --- Erik Hatcher <[EMAIL PROTECTED]> > > wrote: > > > The details of the segments file (and all the > > > others) is freely > > > available here: > > > > > > > > > > > > http://jakarta.apache.org/lucene/docs/fileformats.html > > > > > > Also, there is Java code in Lucene, of course, > > that > > > manipulates the > > > segments file which could be leveraged (although > > > probably package > > > scoped and not easily usable in a standalone > > repair > > > tool). > > > > > > Erik > > > > > > > > > On Aug 18, 2004, at 6:50 AM, Honey George wrote: > > > > > > > Looks like problem is not with the hexeditor, > > even > > > in > > > > the ultraedit(i had access to a windows box) I > > am > > > > seeing the same display. The problem is I am > not > > > able > > > > to identify where a record starts with just 1 > > > record > > > > in the file. > > > > > === message truncated === ___________________________________________________________ALL-NEW Yahoo! Messenger - all new features - even more fun! http://uk.messenger.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
