Yes.

Mike
On Wed, Feb 10, 2010 at 6:36 AM, Naama Kraus <naamakr...@gmail.com> wrote:
> Do you mean by calling
>
> IndexWriter#setInfoStream(PrintStream <http://java.sun.com/j2se/1.5/docs/api/java/io/PrintStream.html> infoStream)
>
> ?
>
> Naama
>
> On Mon, Feb 8, 2010 at 3:22 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>
>> Hmmm... I think that means you're using the default data mode
>> (ordered), which should properly preserve writes if the OS or machine
>> crashes.
>>
>> And actually I was wrong before -- even if the mount had
>> data=writeback, since you are "only" kill -9ing the process (not
>> crashing the machine), the data mount option doesn't matter. That
>> option only affects what happens on a crash...
>>
>> Can you work up a small example showing the problem? And if possible,
>> turn on IndexWriter's infoStream, capture the output as you index up
>> until the kill -9, and post that?
>>
>> Mike
>>
>> On Mon, Feb 8, 2010 at 3:57 AM, Michael McCandless
>> <luc...@mikemccandless.com> wrote:
>> > Thanks for sharing...
>> >
>> > Software RAID should be perfectly fine for Lucene, in general, unless
>> > the mount is configured to ignore fsync (I think the "data=writeback"
>> > mount option for ext3 does so on Linux).
>> >
>> > Can you check the mount options on your RAID filesystem?
>> >
>> > Mike
>> >
>> > On Mon, Feb 8, 2010 at 2:09 AM, Naama Kraus <naamakr...@gmail.com> wrote:
>> >> Hi All,
>> >>
>> >> I am back to this one after a while.
>> >> It appears the file system I was using resides on software RAID disks. I ran
>> >> the same code on the same Linux machine, but on another file system residing
>> >> on SCSI disks. I didn't observe the problem there.
>> >> Both file systems are ext3.
>> >> So I am guessing the problem relates to the RAID disks.
>> >>
>> >> I looked again at the commit() API, and the following comment may explain it:
>> >>
>> >> "Note that this operation calls Directory.sync on the index files. That call
>> >> should not return until the file contents & metadata are on stable storage.
>> >> For FSDirectory, this calls the OS's fsync. But, beware: some hardware
>> >> devices may in fact cache writes even during fsync, and return before the
>> >> bits are actually on stable storage, to give the appearance of faster
>> >> performance. If you have such a device, and it does not have a battery
>> >> backup (for example) then on power loss it may still lose data. Lucene
>> >> cannot guarantee consistency on such devices."
>> >>
>> >> Well, for me, running on the SCSI disks is just fine, but I wanted to
>> >> share my experience anyway.
>> >>
>> >> Naama
>> >>
>> >> On Fri, Jan 8, 2010 at 12:09 AM, Naama Kraus <naamakr...@gmail.com> wrote:
>> >>
>> >>> Thanks all for the hints, I'll get back to my code and do some additional
>> >>> checks.
>> >>> Naama
>> >>>
>> >>> On Thu, Jan 7, 2010 at 6:57 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>> >>>
>> >>>> kill -9 is harsh, but perfectly fine from Lucene's standpoint.
>> >>>> Likewise, if the OS or JVM crashes or power is suddenly lost, the index
>> >>>> will just fall back to the last successful commit. What will cause
>> >>>> corruption is if you have bit errors happening somewhere in the
>> >>>> machine... or if two writers are accidentally allowed to be open on
>> >>>> one index... then you're in trouble.
>> >>>>
>> >>>> What IO system (filesystem & hardware) are you using on Linux?
>> >>>> Boiling down to a smallish test case can help to isolate the
>> >>>> problem...
>> >>>>
>> >>>> Mike
>> >>>>
>> >>>> On Thu, Jan 7, 2010 at 11:51 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >>>> > Can you show us the code where you commit?
>> >>>> >
>> >>>> > And how do you kill your process? Kill -9 is...er...harsh....
>> >>>> >
>> >>>> > Yeah, I'm wondering whether the index file size *stays*
>> >>>> > changed after you kill your process. If it keeps
>> >>>> > growing on every run (after you kill your process
>> >>>> > multiple times), then I'd suspect that you aren't
>> >>>> > adding documents like you think you are. Perhaps
>> >>>> > different fields, different analyzers, etc.
>> >>>> >
>> >>>> > Luke should show you the largest document by ID,
>> >>>> > as well as document counts. Comparing changes
>> >>>> > in the document count and the max doc ID should
>> >>>> > tell you something...
>> >>>> >
>> >>>> > Is it possible that you are updating existing docs
>> >>>> > rather than adding new ones?
>> >>>> >
>> >>>> > Best
>> >>>> > Erick
>> >>>> >
>> >>>> > On Thu, Jan 7, 2010 at 10:41 AM, Naama Kraus <naamakr...@gmail.com> wrote:
>> >>>> >
>> >>>> >> Thanks for the input.
>> >>>> >>
>> >>>> >> 1. While the process is running, I do see the index files growing on disk
>> >>>> >> and the time stamps changing. Should I see a change in size right after
>> >>>> >> killing the process, is that what you mean?
>> >>>> >> 2. Yes, the same directory is being used for indexing and search.
>> >>>> >> 3. Didn't try Luke, good idea. Though I wonder, since the same code runs
>> >>>> >> well on Windows.
>> >>>> >>
>> >>>> >> Naama
>> >>>> >>
>> >>>> >> On Thu, Jan 7, 2010 at 3:37 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >>>> >>
>> >>>> >> > Several questions:
>> >>>> >> > 1> are the index files larger after you kill your process?
>> >>>> >> > Or have the timestamps changed?
>> >>>> >> > 2> are you absolutely sure that your indexer, when you
>> >>>> >> > add documents, is pointing at the same directory your
>> >>>> >> > search is pointing to?
>> >>>> >> > 3> Have you gotten a copy of Luke and examined your index
>> >>>> >> > to see if, perhaps, your documents aren't being added the
>> >>>> >> > way you think they are?
>> >>>> >> >
>> >>>> >> > Erick
>> >>>> >> >
>> >>>> >> > On Thu, Jan 7, 2010 at 7:13 AM, Naama Kraus <naamakr...@gmail.com> wrote:
>> >>>> >> >
>> >>>> >> > > Hi,
>> >>>> >> > >
>> >>>> >> > > I am using the IndexWriter#commit() method in my program to commit
>> >>>> >> > > document additions to the index. I do that once in a while, after a
>> >>>> >> > > bunch of documents were added. Since my indexing process is long, I
>> >>>> >> > > want to make sure I don't lose too many additions in case of a crash.
>> >>>> >> > > When running on Windows, things work as expected. But when running my
>> >>>> >> > > code on Linux, it seems like commit() has no effect. If I kill my
>> >>>> >> > > program and then restart it, I don't see documents that I added and
>> >>>> >> > > then committed (they are not returned by a search operation).
>> >>>> >> > > I am running Lucene 3.0.0
>> >>>> >> > >
>> >>>> >> > > Can anyone help?
>> >>>> >> > >
>> >>>> >> > > Thanks, Naama
>> >>>> >> > >
>> >>>> >> > > --
>> >>>> >> > > "If you want your children to be intelligent, read them fairy tales. If you
>> >>>> >> > > want them to be more intelligent, read them more fairy tales."
>> >>>> >> > > "What really interests me is whether God had any choice in the creation of
>> >>>> >> > > the world."
>> >>>> >> > > (Albert Einstein)
>> >>>>
>> >>>> ---------------------------------------------------------------------
>> >>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> >>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >>
>> >> --
>> >> "If you want your children to be intelligent, read them fairy tales. If you
>> >> want them to be more intelligent, read them more fairy tales."
>> >> "What really interests me is whether God had any choice in the creation of
>> >> the world."
>> >> "A table, a chair, a bowl of fruit and a violin; what else does a man need
>> >> to be happy? "
>> >> (Albert Einstein)
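
For anyone following the "check the mount options" suggestion in this thread, here is a minimal sketch of one way to read them from Java, assuming a Linux machine that exposes its mounts in /proc/mounts. The class name MountDataMode and the method dataMode are made up for illustration and are not part of Lucene. The sketch only reports the data= option for ext3 mounts; it does not tell you whether fsync is actually honored. When no data= option is listed, it reports "(not listed)", which in this thread was read as the default (ordered) mode.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Lucene: reports the ext3 "data=" mount option.
public class MountDataMode {

    /**
     * Returns the value of the data= option on one /proc/mounts line
     * (fields: device, mount point, fs type, options, dump, pass),
     * or "(not listed)" when the options field has no data= entry.
     */
    public static String dataMode(String mountsLine) {
        String[] fields = mountsLine.trim().split("\\s+");
        if (fields.length < 4) {
            throw new IllegalArgumentException("not a mounts line: " + mountsLine);
        }
        // The fourth field is a comma-separated list of mount options.
        for (String option : fields[3].split(",")) {
            if (option.startsWith("data=")) {
                return option.substring("data=".length());
            }
        }
        return "(not listed)";
    }

    public static void main(String[] args) throws IOException {
        // Assumption: running on Linux, where /proc/mounts exists.
        List<String> lines =
                Files.readAllLines(Paths.get("/proc/mounts"), StandardCharsets.UTF_8);
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            // Only ext3 mounts are of interest here (field 3 is the fs type).
            if (fields.length >= 4 && fields[2].equals("ext3")) {
                System.out.println(fields[1] + " -> data=" + dataMode(line));
            }
        }
    }
}
```

Running main on the indexing machine prints one line per ext3 mount point, so the RAID and SCSI file systems can be compared side by side.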