I meant ~182K files ...

> Nick, could you provide additional info:
> (1) Env info - Lucene version, Java version, OS, JVM args (e.g. -XmNNN),
> etc...
> (2) Is this reproducible? By the file sizes there seem to be ~182 indexed
> docs when the problem occurs, so, if this is reproducible it would hopefully
> not take too long. If reproducible, I wonder if you can also create it
> without storing any field... (should go faster).
>
> - Doron
>
> "NIck P" <[EMAIL PROTECTED]> wrote on 10/10/2006 12:24:19:
>
> > Hi, I sent this 30 min ago and it didn't seem to go through so I'm
> > trying again. I apologize if two copies finally arrive.
> >
> > I am working on the development of a product that is using Lucene. A
> > corrupt index was reported by testers and it is in an odd state.
> > The indexes are built in batches (to multiple RAM indexes in parallel)
> > and then eventually merged into a disk index with
> > IndexWriter.addIndexes(Directory[]).
> > Somehow the index got corrupted; there were no indications of a crash or
> > errors in the log. The failure is in SegmentMerger.mergeNorms:
> >
> >   private void mergeNorms() throws IOException {
> >     for (int i = 0; i < fieldInfos.size(); i++) {
> >       FieldInfo fi = fieldInfos.fieldInfo(i);
> >       if (fi.isIndexed && !fi.omitNorms) {
> >         IndexOutput output = directory.createOutput(segment + ".f" + i);
> >         try {
> >           for (int j = 0; j < readers.size(); j++) {
> >             IndexReader reader = (IndexReader) readers.elementAt(j);
> >             int maxDoc = reader.maxDoc();
> >             byte[] input = new byte[maxDoc];
> >             reader.norms(fi.name, input, 0);   <==== ERROR HERE
> >             for (int k = 0; k < maxDoc; k++) {
> >               if (!reader.isDeleted(k)) {
> >                 output.writeByte(input[k]);
> >               }
> >             }
> >           }
> >         } finally {
> >           output.close();
> >         }
> >       }
> >     }
> >   }
> >
> > The problem is that the maxDoc() returned by the IndexReader
> > (FieldsReader in this case) is larger than the size, in bytes, of the
> > norms file. Then there is an error in IndexInput.read(byte[], int,
> > int) because there is not enough data in the file to read.
> > Here is part of the directory listing (there are many stored fields of
> > the same size so omitting all but the first 3):
> >
> > -rw-r--r-- 1 icmadmin db2grp1 811 Sep 27 20:48 _a4.fnm
> > -rw-r--r-- 1 icmadmin db2grp1 1451696 Sep 27 20:49 _a4.fdx
> > -rw-r--r-- 1 icmadmin db2grp1 12736304 Sep 27 20:49 _a4.fdt
> > -rw-r--r-- 1 icmadmin db2grp1 5648544509 Sep 27 21:30 _a4.prx
> > -rw-r--r-- 1 icmadmin db2grp1 1695149231 Sep 27 21:30 _a4.frq
> > -rw-r--r-- 1 icmadmin db2grp1 45688880 Sep 27 21:30 _a4.tis
> > -rw-r--r-- 1 icmadmin db2grp1 673588 Sep 27 21:30 _a4.tii
> > -rw-r--r-- 1 icmadmin db2grp1 181159 Sep 27 21:30 _a4.f2
> > -rw-r--r-- 1 icmadmin db2grp1 181159 Sep 27 21:30 _a4.f1
> > -rw-r--r-- 1 icmadmin db2grp1 181159 Sep 27 21:30 _a4.f0
> >
> > From looking at the code, sizeof(.fdx)/8 should equal sizeof(.f0),
> > but it doesn't in this case.
> >
> > Any ideas? Also, I wasn't sure if this was more appropriate for dev
> > or user so I guessed user.
> >
> > -Nick
> > (programmer working @ ibm)
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
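
The size check described in the quoted message (sizeof(.fdx)/8 compared with sizeof(.f0)) can be sketched as a small standalone program. This is only an illustration under the assumptions stated in the thread: one 8-byte entry per document in the .fdx file and one norm byte per document in each .fN file. The class and method names (NormsSizeCheck, docCountFromFdx, missingNormBytes) are made up for this example and are not part of Lucene.

```java
import java.io.File;

// Hypothetical helper, not part of Lucene: compares the document count
// implied by a segment's .fdx file with the length of one norms file.
class NormsSizeCheck {
    // Assumption taken from the thread: .fdx stores one 8-byte entry per document.
    static final long FDX_BYTES_PER_DOC = 8L;

    // Number of documents implied by the size of the .fdx file.
    static long docCountFromFdx(long fdxBytes) {
        return fdxBytes / FDX_BYTES_PER_DOC;
    }

    // How many norm bytes are missing, assuming one norm byte per document.
    // Zero means the two files agree; a positive value means the norms file is short.
    static long missingNormBytes(long fdxBytes, long normsBytes) {
        return docCountFromFdx(fdxBytes) - normsBytes;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java NormsSizeCheck <segment.fdx> <segment.fN>");
            return;
        }
        long fdxBytes = new File(args[0]).length();
        long normsBytes = new File(args[1]).length();
        System.out.println("docs implied by .fdx: " + docCountFromFdx(fdxBytes));
        System.out.println("norms file bytes:     " + normsBytes);
        System.out.println("missing norm bytes:   " + missingNormBytes(fdxBytes, normsBytes));
    }
}
```

With the sizes from the listing above, 1451696 / 8 gives 181462 documents, while each norms file holds 181159 bytes, so the norms files would be 303 bytes short under these assumptions.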