Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

> I don't know enough about the finer points of shutdown hooks to comment
> on the distinction, but my off-the-cuff assumption is that a shutdown
> hook would be a bad idea ... in a long-running program, wouldn't that
> keep the IndexWriter from being GCed until shutdown?

Could be, I haven't used them either... If IW.close() calls RT.removeShutdownHook(), I think this should work.

Doron
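As a rough illustration of the idea in this message, here is a minimal sketch in which a writer registers a shutdown hook when it is created and removes it again in close(), so that a properly closed writer is no longer referenced by the JVM's hook list. `HookedWriter`, `commitPendingChanges()` and the counters are hypothetical stand-ins, not Lucene classes or methods; only `Runtime.addShutdownHook` and `Runtime.removeShutdownHook` are real JDK calls.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an index writer; this is NOT the Lucene API. */
class HookedWriter implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger commits = new AtomicInteger();
    private final Thread hook;
    private volatile boolean hookRemoved = false;

    HookedWriter() {
        // The hook thread references this writer, which is what keeps the
        // writer reachable until the hook is removed or the JVM exits.
        hook = new Thread(this::closeFromHook, "writer-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
    }

    /** Placeholder for whatever a real writer would flush to disk. */
    private void commitPendingChanges() {
        commits.incrementAndGet();
    }

    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            commitPendingChanges();
            try {
                // Deregister so the JVM no longer holds on to this writer.
                hookRemoved = Runtime.getRuntime().removeShutdownHook(hook);
            } catch (IllegalStateException e) {
                // JVM shutdown already in progress; hooks cannot be removed now.
                hookRemoved = false;
            }
        }
    }

    /** Runs only at JVM shutdown, and only does work if close() was never called. */
    private void closeFromHook() {
        if (closed.compareAndSet(false, true)) {
            commitPendingChanges();
        }
    }

    boolean isClosed() { return closed.get(); }
    boolean hookWasRemoved() { return hookRemoved; }
    int commitCount() { return commits.get(); }
}

class ShutdownHookDemo {
    public static void main(String[] args) {
        HookedWriter writer = new HookedWriter();
        writer.close();
        System.out.println("hook removed on close: " + writer.hookWasRemoved());
    }
}
```

The sketch only addresses the GC concern raised above; whether committing from a hook is desirable behaviour at all is the separate question debated in the rest of the thread.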
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I think finalize() is not that trustworthy, in that it may
: never be called, e.g. if GC happens not to collect the specific
: object, and so the way for programmers to guarantee execution of any code
: at shutdown is with shutdown hooks. I guess this is what you meant,

I'm not suggesting that this be documented as a *reliable*, guaranteed way to get a commit, just as a safety net for novice users.

I don't know enough about the finer points of shutdown hooks to comment on the distinction, but my off-the-cuff assumption is that a shutdown hook would be a bad idea ... in a long-running program, wouldn't that keep the IndexWriter from being GCed until shutdown?

: > Yes. Totally unexpected magical behaviour.
: > What if I didn't commit something on purpose?
...
: Applications can call rollback() in this case.

Or, more specifically, along the lines of my original point: people who read the docs carefully are more likely to know about rollback() and call it explicitly, or to see the autoClose option and explicitly set it to false (or use a constructor where it defaults to false).

-Hoss

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I like Uwe's idea. As for Hoss's original suggestion, my initial
: reaction is that if a user understands the need to set the option
: in the first place, they're also more likely to understand the need
: for close().

My intention was that if the user used a "novice"-type API for getting an IndexWriter, the option would default to "true", but any of the non-trivial constructors would default to false.

: > I am against all finalizer stuff, because it also leads to problems and is
: > unreliable - we already removed all finalizer stuff in Lucene left over from

Generally I agree with you: you shouldn't *expect* finalizers to be called, but I'm not aware of any problems that can happen by using the finalizer as a safety net ... rmuir mentioned it could cause a JRE crash, but I don't understand how that would happen.

: > A comparison is relational databases with autocommit off. If I crash my app
: > or don't correctly commit my stuff, it's also reverted on loss of
: > connection or forceful shutdown of the JDBC driver! Where is the difference?

The difference is that a lot of DBs do default to autocommit, and we not only don't have "autocommit" (or "autoclose", as I'm suggesting) as a default, we don't even offer it as an option.

It just seems like the kind of thing that could easily bite someone in the ass, and that we could help prevent. Not just in the case of a person who writes their first Lucene app and doesn't know to call close() or commit() at all, but also in the case of someone who has an app that works fine 90% of the time, but doesn't realize they have a stray code path where they aren't committing/closing properly ... so *most* of the time their app works fine and all of their data is there, but sometimes, for reasons they can't understand, data is missing when they do searches (even though their indexing code logs that it was added successfully).

-Hoss
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
I like Uwe's idea. As for Hoss's original suggestion, my initial reaction is that if a user understands the need to set the option in the first place, they're also more likely to understand the need for close().

FWIW
Erick

On Tue, Mar 22, 2011 at 8:15 AM, Uwe Schindler wrote:
> Hi,
>
>> I know there were good reasons for eliminating the "autoCommit"
>> functionality from IndexWriter, but threads like this make me think that
>> even though "autoCommit" on flush/merge/whatever was bad, having an option
>> for some sort of "autoClose" using a finalizer might be a good idea to
>> give new/novice users a safety net.
>>
>> In the case of totally successful normal operation, this would result in
>> one commit at GC (assuming the JVM calls the finalizer), and if there were
>> any errors it should (if I understand correctly) do an implicit rollback.
>>
>> Anyone see a downside?
>
> I am against all finalizer stuff, because it also leads to problems and is
> unreliable - we already removed all finalizer stuff in Lucene left over from
> the early days, so we should not add it again. A user makes this mistake
> only once; the second time, this user will have a try...finally block
> around his code.
>
> A comparison is relational databases with autocommit off. If I crash my app
> or don't correctly commit my stuff, it's also reverted on loss of
> connection or forceful shutdown of the JDBC driver! Where is the difference?
>
> But I am for adding a recovery tool for uncommitted segments to CheckIndex.
> I think this should not be too hard. Something like looking for cfs/other
> file types and creating SegmentReaders that are then added using addIndexes().
>
> Uwe
RE: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

> I know there were good reasons for eliminating the "autoCommit"
> functionality from IndexWriter, but threads like this make me think that
> even though "autoCommit" on flush/merge/whatever was bad, having an option
> for some sort of "autoClose" using a finalizer might be a good idea to
> give new/novice users a safety net.
>
> In the case of totally successful normal operation, this would result in
> one commit at GC (assuming the JVM calls the finalizer), and if there were
> any errors it should (if I understand correctly) do an implicit rollback.
>
> Anyone see a downside?

I am against all finalizer stuff, because it also leads to problems and is unreliable - we already removed all finalizer stuff in Lucene left over from the early days, so we should not add it again. A user makes this mistake only once; the second time, this user will have a try...finally block around his code.

A comparison is relational databases with autocommit off. If I crash my app or don't correctly commit my stuff, it's also reverted on loss of connection or forceful shutdown of the JDBC driver! Where is the difference?

But I am for adding a recovery tool for uncommitted segments to CheckIndex. I think this should not be too hard. Something like looking for cfs/other file types and creating SegmentReaders that are then added using addIndexes().

Uwe
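For readers unfamiliar with the try...finally pattern mentioned above, here is a minimal sketch of it. `ToyWriter` is a hypothetical in-memory stand-in, not the Lucene IndexWriter; it only assumes the semantics discussed in this thread: close() commits pending changes, while rollback() discards them and closes the writer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical in-memory stand-in for an index writer; NOT the Lucene API. */
class ToyWriter {
    private final List<String> committed;                 // what a reader would see
    private final List<String> pending = new ArrayList<>();
    private boolean open = true;

    ToyWriter(List<String> committedStore) {
        this.committed = committedStore;
    }

    void addDocument(String doc) {
        if (!open) throw new IllegalStateException("writer is closed");
        pending.add(Objects.requireNonNull(doc, "doc"));   // null simulates a bad document
    }

    /** Makes pending documents visible to readers. */
    void commit() {
        committed.addAll(pending);
        pending.clear();
    }

    /** Discards pending documents and closes the writer. */
    void rollback() {
        pending.clear();
        open = false;
    }

    /** Commits pending documents, then closes the writer. */
    void close() {
        if (open) {
            commit();
            open = false;
        }
    }
}

class TryFinallyDemo {
    /** Indexes all docs; commits on success, rolls back if anything throws. */
    static int indexAll(List<String> store, List<String> docs) {
        ToyWriter writer = new ToyWriter(store);
        boolean success = false;
        try {
            for (String doc : docs) {
                writer.addDocument(doc);
            }
            success = true;
        } finally {
            if (success) {
                writer.close();      // the step the original poster forgot
            } else {
                writer.rollback();   // explicit, rather than relying on GC or shutdown
            }
        }
        return store.size();
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        System.out.println("visible docs: " + indexAll(store, List.of("a", "b")));
    }
}
```

The point of the `success` flag is that the decision between commit and rollback is made explicitly by the application in every code path, which is the behaviour Uwe argues users should learn rather than have hidden behind a finalizer.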
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Mon, Mar 21, 2011 at 11:21 PM, Chris Hostetter wrote:
>
> Anyone see a downside?
>

I don't think we should do anything serious in a GC finalizer. Sounds like it's asking for a JRE crash.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

> > I know there were good reasons for eliminating the "autoCommit"
> > functionality from IndexWriter, but threads like this make me think that
> > even though "autoCommit" on flush/merge/whatever was bad, having an
> > option for some sort of "autoClose" using a finalizer might be a good
> > idea to give new/novice users a safety net.
> >
> > In the case of totally successful normal operation, this would result in
> > one commit at GC (assuming the JVM calls the finalizer), and if there were
> > any errors it should (if I understand correctly) do an implicit rollback.
> >
> > Anyone see a downside?
>

I think finalize() is not that trustworthy, in that it may never be called, e.g. if GC happens not to collect the specific object, and so the way for programmers to guarantee execution of any code at shutdown is with shutdown hooks. I guess this is what you meant, that Lucene would add a shutdown hook?

I.e., each IndexWriter object opened for write would add its own method as a shutdown hook, so that at shutdown the writer would check its state, and if it was not closed (and hence also not rolled back) and has pending uncommitted changes, those changes would be committed. Is this what you mean?

I think it is almost okay - it would save the use case of this thread, but could still surprise someone...

Perhaps there's a third option - "semi-commit"? That is, with the proposed shutdown hook, the IndexWriter commits without deleting the previous commit, and marks on the directory that its state is "semi-commit". When that index is then opened for read or write, it would throw a special new exception that indicates this state, and the caller, before continuing to use this index for either read or write, would have to call one of two new utility methods:

- commitSemiCommit(Directory)
- rollbackSemiCommit(Directory)

(Perhaps with better names: rollbackSelfCommit, rollbackPartialCommit, etc.)

After that, it would be possible to open the index as usual.

It seems to me that something like this can work. Not totally convinced that it is worth the effort...?

> Yes. Totally unexpected magical behaviour.
> What if I didn't commit something on purpose?

Applications can call rollback() in this case.

Regards,
Doron
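To make the "semi-commit" proposal above more concrete, here is a toy model of the state machine it describes. Nothing here exists in Lucene: `ToyDirectory`, `SemiCommitException`, `writeSemiCommit()` and `open()` are invented for illustration, and only the two utility method names are taken from the message. It is a sketch of one possible reading of the proposal, not a design.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical exception signalling the "semi-commit" state; name invented here. */
class SemiCommitException extends RuntimeException {
    SemiCommitException(String message) {
        super(message);
    }
}

/** Toy directory that keeps the previous commit next to an unconfirmed one. */
class ToyDirectory {
    List<String> lastCommit = new ArrayList<>();
    List<String> unconfirmed = null;   // non-null means the index is in "semi-commit" state

    /** What the shutdown hook would do: write a new commit but keep the old one. */
    void writeSemiCommit(List<String> pendingDocs) {
        List<String> next = new ArrayList<>(lastCommit);
        next.addAll(pendingDocs);
        unconfirmed = next;
    }

    /** Opening for read or write refuses to proceed until the caller decides. */
    List<String> open() {
        if (unconfirmed != null) {
            throw new SemiCommitException(
                "index has a semi-commit; call commitSemiCommit or rollbackSemiCommit first");
        }
        return new ArrayList<>(lastCommit);
    }
}

/** The two utility methods named in the proposal, applied to the toy directory. */
class SemiCommits {
    /** Accepts the changes written at shutdown and drops the older commit. */
    static void commitSemiCommit(ToyDirectory dir) {
        if (dir.unconfirmed != null) {
            dir.lastCommit = dir.unconfirmed;
            dir.unconfirmed = null;
        }
    }

    /** Discards the changes written at shutdown and keeps the older commit. */
    static void rollbackSemiCommit(ToyDirectory dir) {
        dir.unconfirmed = null;
    }
}

class SemiCommitDemo {
    public static void main(String[] args) {
        ToyDirectory dir = new ToyDirectory();
        dir.writeSemiCommit(List.of("doc1", "doc2"));
        SemiCommits.commitSemiCommit(dir);
        System.out.println("docs after confirming: " + dir.open().size());
    }
}
```

In this reading, the application that never called close() still has to make an explicit choice the next time it opens the index, which keeps the "unexpected magical behaviour" objection in view while avoiding silent data loss.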
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Tue, Mar 22, 2011 at 06:21, Chris Hostetter wrote:
>
> (replying to the dev list, see context below)
>
> : Unfortunately, you can't easily recover from this (except by
> : reindexing your docs again).
> :
> : Failing to call IW.commit() or IW.close() means no segments file was
> : written...
>
> I know there were good reasons for eliminating the "autoCommit"
> functionality from IndexWriter, but threads like this make me think that
> even though "autoCommit" on flush/merge/whatever was bad, having an option
> for some sort of "autoClose" using a finalizer might be a good idea to
> give new/novice users a safety net.
>
> In the case of totally successful normal operation, this would result in
> one commit at GC (assuming the JVM calls the finalizer), and if there were
> any errors it should (if I understand correctly) do an implicit rollback.
>
> Anyone see a downside?

Yes. Totally unexpected magical behaviour.
What if I didn't commit something on purpose?

> ...
>
> : > I had a program running for 2 days to build an index for around 160
> : > million text files, and after the program ended, I tried searching the
> : > index and found the index was not correctly built:
> : > *indexReader.numDocs()* returns 0. I checked the index directory, and it
> : > looked good; all the index data seemed to be there, and the directory is
> : > 1.5 Gigabytes in size.
> : >
> : > I checked my code and found that I forgot to call
> : > *indexWriter.optimize()* and *indexWriter.close()*. I want to know if it
> : > is possible to *re-optimize()* the index so I don't need to rebuild the
> : > whole index from scratch? I don't really want the program to take
> : > another 2 days.
>
> -Hoss

--
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
(replying to the dev list, see context below)

: Unfortunately, you can't easily recover from this (except by
: reindexing your docs again).
:
: Failing to call IW.commit() or IW.close() means no segments file was written...

I know there were good reasons for eliminating the "autoCommit" functionality from IndexWriter, but threads like this make me think that even though "autoCommit" on flush/merge/whatever was bad, having an option for some sort of "autoClose" using a finalizer might be a good idea to give new/novice users a safety net.

In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback.

Anyone see a downside?

...

: > I had a program running for 2 days to build an index for around 160 million
: > text files, and after the program ended, I tried searching the index and found
: > the index was not correctly built: *indexReader.numDocs()* returns 0. I
: > checked the index directory, and it looked good; all the index data seemed to
: > be there, and the directory is 1.5 Gigabytes in size.
: >
: > I checked my code and found that I forgot to call *indexWriter.optimize()* and
: > *indexWriter.close()*. I want to know if it is possible to
: > *re-optimize()* the index so I don't need to rebuild the whole index
: > from scratch? I don't really want the program to take another 2 days.

-Hoss