Re: How do I delete?
I've had success with deletion by running IndexReader.delete(int), then getting an IndexWriter and optimizing the directory. I don't know if that's the right way to do it or not.

On Tue, 1 Feb 2005, Jim Lynch wrote:

> I've been merrily cooking along, thinking I was replacing documents when I
> haven't. My logic is to go through a batch of documents, get a field called
> "reference" (which is unique), build a term from it, and delete it via the
> reader.delete() method. Then I close the reader, open a writer, and reprocess
> the batch, indexing everything. Here is the delete and associated code:
>
>     reader = IndexReader.open(database);
>     Term t = new Term("reference", reference);
>     try {
>         reader.delete(t);
>     } catch (Exception e) {
>         System.out.println("Delete exception: " + e);
>     }
>
> Except it isn't working. I tried to do a commit and a doCommit, but those are
> both protected. I do a reader.close() after processing the batch the first
> time. What am I missing? I don't get an exception. Reference is definitely a
> valid field, 'cause I print out the value at search time and compare it to
> the doc, and they are identical.
>
> Thanks,
> Jim.
>
> - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

---
Joseph B. Ottinger             http://enigmastation.com
IT Consultant                  [EMAIL PROTECTED]
Re: How do I delete?
Well, in LuceneRAR, the delete-by-id code does exactly what I said: it gets the IndexReader, deletes the doc id, then opens a writer and optimizes. Nothing else.

On Tue, 1 Feb 2005, Jim Lynch wrote:

> Thanks, I'd try that, but I don't think it will make any difference. If I
> modify the code to not reindex the documents, no files in the index directory
> are touched, hence there is no record of the deletions anywhere. I checked
> the count coming back from the delete operation and it is zero. I even tried
> to delete another unique term, with similar results. How does one call the
> commit method anyway? Isn't it automatically called?
>
> Jim.
LuceneRAR nearing first release
https://lucenerar.dev.java.net

LuceneRAR is now working, verified, on two containers: the J2EE 1.4 RI and Orion. WebSphere testing is underway, with JBoss to follow.

LuceneRAR is a resource adapter for Lucene, allowing J2EE components to look up an entry in a JNDI tree and use that reference to add and search for documents. It's much like RemoteSearchable would be, except using JNDI semantics for communication instead of RMI, which is a little more elegant in a J2EE environment (where JNDI communication is very common).

LuceneRAR was created to allow J2EE components to legitimately use filesystem indexes (for speed) while not violating J2EE's suggestion not to rely on filesystem access. It also allows distributed access to the index (remote servers simply establish a JNDI connection to the LuceneRAR home).

Please take a look at it if you're interested; the feature set isn't complete, but it's workable. There's a sample application included in the distribution that allows creation, searches, and statistical data about the search. Any comments are welcomed.
Re: LuceneRAR project announcement
On Wed, 19 Jan 2005, Erik Hatcher wrote:
> On Jan 19, 2005, at 2:27 PM, Joseph Ottinger wrote:
> > After babbling endlessly about an RDBMS directory and my lack of success
> > with it, I've created a project on java.net to create a Lucene JCA
> > component, to allow J2EE components to interact with a Lucene service.
> > It's at https://lucenerar.dev.java.net/ currently.
>
> Could you elaborate on some use cases?

Sure, and I'll pick the one that's been driving me along: I have a set of J2EE servers, all of which can generate new content for search, and all of which will be performing searches. They're on separate machines. Sharing directories isn't my idea of doing J2EE correctly. Therefore, I chose to represent Lucene as an enterprise service, one communicated with via a remote service instead, so that every module can communicate with Lucene without realising the communication layer... for the most part. Plus, I no longer violate my purist's sensibilities.

> What drove you to consider JCA rather than some other technique? I'm curious
> why it is important to get all J2EE with it rather than working with Lucene
> much more naturally at a lower level of abstraction.

JCA allows me to provide it as a system service instead of as a dependency represented at each component layer. An EJB would have served almost as well, except an EJB has filesystem restrictions that a Connector does not.

> I briefly browsed the source tree from java.net and saw this comment in your
> Hits.java:
>
>     This method loads a LuceneRAR hits object with its equivalent from the
>     Apache Lucene Hits object. It basically walks the Lucene Hits object,
>     copying values as it goes, so it may not be as light or fast as its
>     Apache equivalent
>
> I'll say!

Haha, it's good to see my propensity for understatement is still alive. :) The Hits object could CERTAINLY use optimization - callbacks into the connector would probably be acceptable, for example. The code you were looking at has a lot of other areas that are, um, surprisingly crippled as well. For example, the add() method... well, first, THAT's the signature. Yes, that's right. It adds constant text. Every time. Likewise, the super-flexible search() -- again, that's the signature. It searches for "time". That's it. Nothing more. Nothing less. This is very much a first-cut "can I get it working?" version. I think, for very limited definitions of "working", the answer is yes. I certainly don't think it's got that show-room floor gleam going for it yet.

> For large result sets, which are more often the norm than the exception for
> a search, you are going to take a huge performance hit doing something like
> this, not to mention possibly even killing the process as you run out of RAM.

*nod* As stated, a callback would be far preferable. Given that Lucene's internal Hits object is final and nonserializable, at least my client's Hit object gives me an opportunity to do that.

> JCA sounds like an unnecessary abstraction around Lucene - though I'm open
> to be convinced otherwise.

I'm more than happy to talk about it. If I can fulfill my needs with no code, hey, that's great! I just haven't been able to successfully do so yet, and everyone to whom I've spoken who says that they HAVE managed... well, they've almost invariably done so by lowering the bar a great deal in order to accept what Lucene requires. I'm certainly not castigating those who've done this - in fact, in many ways, I'm very impressed. It's just something I'd prefer not to do, given any alternative.
Re: LuceneRAR project announcement
First off, Erik, thank you for taking an interest in any way. As I've said before, I'm not trying to represent myself as a Lucene expert, so having someone point out flaws is good.

On Wed, 19 Jan 2005, Erik Hatcher wrote:
> > > Could you elaborate on some use cases?
> >
> > Sure, and I'll pick the one that's been driving me along: I have a set of
> > J2EE servers, all of which can generate new content for search, and all of
> > which will be performing searches. They're on separate machines. Sharing
> > directories isn't my idea of "doing J2EE correctly".
>
> "Doing J2EE correctly" is a funny phrase. If sharing directories works and
> gets the job done right, on time, under budget, can be adjusted later if
> needed, and has been reasonably well tested, then you've done it right. And
> since it's in Java and not on a cell phone, it's basically J2EE.

Absolutely. I'm not trying to make fun of pragmatic, working solutions. Nor am I sneering at those who've done it by sharing filesystems or whatever.

> Also, what about using Lucene over RMI using the RemoteSearchable facility
> built in?

Well, I'd prefer to avoid RMI. App servers typically have far better transport layers than raw RMI internally, and JCA can leverage that.

> > Therefore, I chose to represent Lucene as an enterprise service, one
> > communicated with via a remote service instead, so that every module can
> > communicate with Lucene without realising the communication layer... for
> > the most part.
>
> And this is where I think the abstraction leaks. The Nutch project has a
> very scalable enterprise approach to this type of remote service also.

*nod* I'll look it up.

> > Plus, I no longer violate my purist's sensibilities.
>
> Ah, now we get to the real rationale! :) I'm not giving you, personally, a
> hard time, really... but rather this "purist" approach, where "purist" means
> fitting into the acronyms under the J2EE umbrella. I've been there myself,
> read the specs, and cringed when I saw file system access from a session
> bean, and so on.

Well, in all honesty, there IS a small factor of "Gee, I can use an acronym here!" involved. It's not ALL that's involved, of course - I think the connector's transparency might be a real benefit for others, as well as satisfying my own "I need a deployed component, and not a service I have to tune" need.

> > The Hits object could CERTAINLY use optimization - callbacks into the
> > connector would probably be acceptable, for example.
>
> Gotcha. Yes, callbacks would be the right approach with this type of
> abstraction.

Just as a general question... is it efficient to retrieve a Document by, uh, a sort of Lucene key? (Is there such a thing?) If there is, I can code up a callback procedure in almost no time. (There are some other issues to address first, but THAT would be easy to do.)

> > > JCA sounds like an unnecessary abstraction around Lucene - though I'm
> > > open to be convinced otherwise.
> >
> > I'm more than happy to talk about it. If I can fulfill my needs with no
> > code, hey, that's great!
>
> Would RemoteSearchable get you closer to "no code"?

Dunno, I'll investigate.

> > I just haven't been able to successfully do so yet, and everyone to whom
> > I've spoken who says that they HAVE managed... well, they've almost
> > invariably done so by lowering the bar a great deal in order to accept
> > what Lucene requires.
>
> I'm definitely a skeptic when it comes to generic layers on top of Lucene,
> though there is definitely a yearning for easier management of the
> lower-level details.

On my part, too. If I could get a Directory reliably and quickly using an RDBMS, I'd have gone that route.

> I'll definitely follow your work with LuceneRAR closely and will do what I
> can to help out in this forum. So take my feedback as constructive
> criticism, but keep up the good work!

Again, no problem - and thank you. The things you bring up are issues I might not be aware of, so it's good to see them and evaluate them.
Suggestions for remoting Lucene?
I just realised that the Hits object isn't Serializable, although Document and Field are. I can easily build a Hits equivalent that *is* Serializable, but should that be on my end, or at the Lucene API level?
HELP! Directory is NOT getting closed!
*sigh* Yet again, I apologize. I'm generating altogether too much traffic here lately! I'm stuck.

I have a custom Directory, and I *need* a callback point so I can clean up. There's a method for this: Directory.close(), which I've overridden. It never gets called! According to IndexWriter.java, line 246 (in 1.4.3's codebase), if closeDir is set, it's supposed to close the directory. That's fine - but that leads me to believe that for some reason, closeDir is *not* set. Why? Under what circumstances would this not be true, and under what circumstances would you NOT want to close the Directory?

This is absolutely slaughtering my attempt at a Directory, because I need a single unit of work, and I need a place to commit it when it's done. If I commit it inside the directory's innards, then the UOW gets corrupted (and looks like it's more than one atomic action, which is EXACTLY what I don't need).
Re: HELP! Directory is NOT getting closed!
On Wed, 12 Jan 2005, Morus Walter wrote:
> Joseph Ottinger writes:
> > According to IndexWriter.java, line 246 (in 1.4.3's codebase), if closeDir
> > is set, it's supposed to close the directory. That's fine - but that leads
> > me to believe that for some reason, closeDir is *not* set. Why? Under what
> > circumstances would this not be true, and under what circumstances would
> > you NOT want to close the Directory?
>
> From the sources you can see that it is true only if the directory is
> created by the IndexWriter itself. If you provide a directory to the
> IndexWriter, you have to close it yourself.

ARGH! (I've been saying that a lot lately!) Okay, I was looking at the sources but missed that. Thank you very much. *sigh*
IndexWriter failure leaves lock in place
I'm still working through making my own directory, based on JDBC (and yes, I know, there are some out there already, unsuitable for this reason or that). One thing I've noticed is that the lock procedure in IndexWriter is a little off, I think.

My normal process on application startup is to get an IndexWriter, just to make sure an index is there. If I get an exception (FileNotFoundException for the FSDirectory, for example), I assume the index isn't created properly, so I then create a new IndexWriter set to create the index. With a file-based directory, that works well enough - and I realise there might be a better way to do it (but I don't know it yet). However, the SQL-based directory leaves the lock behind.

I think what's happening is that the IndexWriter constructor (IndexWriter.java:216 from 1.4.3's source distribution) obtains the lock, but then the synchronized block (starting at line 227) gets an IOException from segmentInfos.read(directory), which is rethrown - and the writeLock is never explicitly released once it's obtained. I would think that a try/finally (or something even more predictable, like a try/catch that rethrows the IOException after cleanup) would be appropriate to clear the lock, *provided it's obtained*, in the IndexWriter constructor, and it'd make the code that I typically use work regardless of the specific directory I rely on. Now, to be sure, I'm VERY FAR from a Lucene expert; am I missing something? (I can contribute a patch if you'd like.)
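The try/finally cleanup proposed above can be sketched in isolation. SimpleLock below is a stand-in with the same obtain()/release() shape as Lucene's org.apache.lucene.store.Lock; the class names and the simulated failure are illustrative, not Lucene's actual code:

```java
// Stand-in for a write lock: obtain() fails if already held, release() frees it.
class SimpleLock {
    private boolean held = false;

    boolean obtain() {
        if (held) return false;
        held = true;
        return true;
    }

    void release() { held = false; }

    boolean isLocked() { return held; }
}

class LockDemo {
    // Mimics the suggested fix: if work after obtaining the lock throws
    // (like segmentInfos.read() failing), cleanup still releases the lock
    // before the exception propagates.
    static boolean openWithCleanup(SimpleLock lock, boolean failAfterLock) {
        if (!lock.obtain()) {
            throw new IllegalStateException("Lock obtain timed out");
        }
        try {
            if (failAfterLock) {
                throw new RuntimeException("segments read failed");
            }
            return true;
        } catch (RuntimeException e) {
            lock.release(); // the cleanup the 1.4.3 constructor was missing
            throw e;
        }
    }
}
```

With this shape, a failed open no longer poisons the next attempt: the second open succeeds because the first one released the lock on its way out.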
Re: IndexWriter failure leaves lock in place
On Mon, 10 Jan 2005, Erik Hatcher wrote:
> On Jan 10, 2005, at 8:26 AM, Joseph Ottinger wrote:
> > With a file-based directory, that works well enough - and I realise there
> > might be a better way to do it (but I don't know it yet).
>
> How about using IndexReader.indexExists() instead?

*blank stare* ... uh... because I didn't know it was there to look for it? :) :) :) Thanks. Would the change still be valid, though, just to catch morons who do what I did?
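For reference, in the 1.4-era codebase IndexReader.indexExists() boils down to checking for the "segments" file in the index directory, so the exception-driven probe above can be replaced with a simple test before choosing create=true or create=false. A rough file-level stand-in (the directory path is illustrative):

```java
import java.io.File;

public class IndexCheck {
    // Rough analogue of IndexReader.indexExists(File): the 1.4 code decides
    // by looking for the "segments" file in the index directory.
    public static boolean indexExists(File dir) {
        return new File(dir, "segments").exists();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "index-check-demo");
        dir.mkdirs();
        // Open with create=true only when no index is present yet, instead of
        // catching FileNotFoundException from a failed open.
        boolean create = !indexExists(dir);
        System.out.println(create ? "creating new index" : "appending to existing index");
    }
}
```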
Lock obtain timed out from an MDB
If this is a stupid question, I deeply apologize. I'm stumped.

I have a message-driven EJB using Lucene. In *every* case where the MDB is trying to create an index, I'm getting "Lock obtain timed out". It's in org.apache.lucene.store.Lock.obtain(Lock.java:58), which the user list has referred to before - but I don't see how the suggestions there apply to what I'm trying to do. (It's creating a lock file in /var/tmp/ properly, from what I can see, so it's not write permissions, I imagine.) I set the infoStream in my index writer to System.out, but I don't see any extra information. I'm using a SQL-based Directory object, but I get the same problem if I refer to a file directly.

Is there a way to override the Lock portably so that I can have the lock itself managed in an RDBMS? (It's a J2EE project, so relying on file access is problematic; if the beans using Lucene to write to the index are on multiple servers, multiple locks could exist anyway.)
Re: Lock obtain timed out from an MDB
Sorry to reply to my own post, but I now have a greater understanding of PART of my problem - my SQLDirectory is not *quite* right, I think. So I'm rolling back to FSDirectory.

Now, I have a servlet that writes to the filesystem to simplify things (as I'm not confident enough to debug the RDBMS-based directory yet; that's a task for later, I think). The servlet says it successfully creates the index like so:

    try {
        // open the index with create=false
    } catch (FileNotFoundException e) {
        // open the index with create=true
    }
    index.optimize();
    index.close();

Now, when I fire off any messages to the MDB, it yields the following:

    java.io.IOException: Lock obtain timed out:
        Lock@/var/tmp/lucene-d6b0a3281487d1bc4d169d00426f475d-write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:58)

Now, this is on only two messages to the MDB, not a flood of messages. Two handlers, so I expect a lock in one's case, but not on the first MDB call - it should be the one causing the lock for the second one, if a lock exists at all. I've verified that when the servlet that initializes the index runs, a lock file is NOT present, but again, it looks like every message fired through looks for a lock and finds one, when I would think it wouldn't be there. What am I not understanding?
Re: Lock obtain timed out from an MDB
Well, I think I isolated the problem: stupid error on my part, I think. I was adding an indexed field that had, um, a value of null. Correcting that made the process go much more properly - although note that I haven't scaled up to have multiple elements to index. Good milestone, though. Shouldn't Lucene warn the user if they do something like this?

On Thu, 6 Jan 2005, Erik Hatcher wrote:
> Do you have two threads simultaneously either writing or deleting from the
> index?
Re: Lock obtain timed out from an MDB
On Thu, 6 Jan 2005, Erik Hatcher wrote:
> On Jan 6, 2005, at 10:41 AM, Joseph Ottinger wrote:
> > Shouldn't Lucene warn the user if they do something like this?
>
> When a user indexes a null? Or attempts to write to the index from two
> different IndexWriter instances? I believe you should get an NPE if you try
> to index a null field value, no?

Well, I'd agree - the lack of an exception was rather disturbing, considering how badly it destroyed Lucene for the application (requiring not only a restart but cleanup as well). I don't know Lucene well enough to say according to the code... but NOT adding the null managed to correct the problem entirely.
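Since a null field value apparently slipped into the index without an immediate exception here, a cheap defense is to validate values before they ever reach a Field. A minimal sketch - the class and method names are invented for illustration, not part of Lucene:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SafeFields {
    // Collects name/value pairs destined for a Lucene Document, rejecting
    // null values up front instead of letting one corrupt the index quietly.
    public static Map<String, String> collect(String[][] pairs) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String[] p : pairs) {
            if (p[1] == null) {
                throw new IllegalArgumentException("null value for field '" + p[0] + "'");
            }
            fields.put(p[0], p[1]);
        }
        return fields;
    }
}
```

Failing fast at this boundary turns a silent, index-wrecking condition into a loud, debuggable one.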
Re: How do you pronounce 'Lucene'?
I pronounce it "lieu'-seen" or "loo'-seen", usually the latter, because I'm lazy.

On Mon, 11 Aug 2003, Danny Sofer wrote:
> ...and where does the name come from? we've already developed three ways to
> say 'lucene' and we can't agree on which one we like best. somebody please
> help! many thanks, danny.
>
> ===
> danny sofer
> t. 020 7378 6655
> m. 0795 722 1632
> www.kitsite.com - content management for websites

---
Joseph B. Ottinger             http://enigmastation.com
IT Consultant                  [EMAIL PROTECTED]
J2EE Editor - Java Developer's Journal
IndexReader.delete(int) not working for me
I've got a versioning content system where I want to replace documents in a Lucene repository. To do so, according to the FAQ and the mailing list archives, I need to open an IndexReader, look for the document in question, delete it via the IndexReader, and then add it again. This shouldn't replace the document per se - it should, however, free the index entry (for reuse by documents added later), as I understand it. It should also mark the document as deleted. A query still may return the document (again, as I understand it), requiring a filter to make sure deleted documents aren't returned. If I'm off base in my understanding, I apologize - this is the best I can tell.

In my removeDocument() method (names and parameters are obscured to remove cruft not germane to the problem at hand), I iterate through the IndexReader's documents (because there are non-indexed identifiers used). When I hit a document that contains the correct identifiers, I use ir.delete(idx) and output a log message that I'm deleting the document. This part works as expected. (A log message for one entry is spit out.)

Now, however, when I search for documents, things go awry. I'm using the standard analyzer (StandardAnalyzer, I should say) and IndexSearcher(String). I then use code like the following:

    Hits hits = searcher.search(query, new Filter() {
        public BitSet bits(IndexReader ir) throws IOException {
            BitSet bs = new BitSet();
            for (int idx = 0; idx < ir.maxDoc(); idx++) {
                boolean deleted = ir.isDeleted(idx);
                bs.set(idx, !deleted);
            }
            return bs;
        }
    });

(I also have a log message to output the salient information about the document and whether it's been deleted.) Here's where the problem evinces itself: *every* document here says that it's not deleted, even though the removeDocument() method mentioned above doesn't show all of the documents returned here. It's almost like there are two IndexReaders in action, one noting the deleted documents, and the other not. It's very confusing to me. Can anyone give me any pointers?
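The Filter's bit-setting loop can be exercised on its own, with a boolean array standing in for IndexReader.isDeleted() (the class name here is invented for the demo):

```java
import java.util.BitSet;

public class UndeletedFilter {
    // Same loop as the anonymous Filter in the message: set a bit for every
    // doc id that is NOT marked deleted, so only live documents pass the filter.
    public static BitSet bits(boolean[] deleted) {
        BitSet bs = new BitSet(deleted.length);
        for (int idx = 0; idx < deleted.length; idx++) {
            bs.set(idx, !deleted[idx]);
        }
        return bs;
    }
}
```

The loop itself is sound; as the replies below this message point out, the real question is whether the deletions were ever visible to the reader the filter ran against.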
Re: IndexReader.delete(int) not working for me
Then this means that my IndexReader.delete(i) isn't working properly. What would be the common causes of this? My log shows the documents being deleted, so something's going wrong at that point.

On Wed, 5 Mar 2003, Doug Cutting wrote:
> Joseph Ottinger wrote:
> > This shouldn't replace the document per se - it should, however, free the
> > index entry (for reuse by documents added later), as I understand it. It
> > should also mark the document as deleted. A query still may return the
> > document (again, as I understand it), requiring a filter to make sure
> > deleted documents aren't returned.
>
> Search results do not include deleted documents, so you do not need to
> explicitly filter for them. After a document is deleted, the space consumed
> by it may not be reclaimed for a while, and some term statistics may not be
> updated immediately, but Lucene never returns references to deleted
> documents.
>
> Doug
Re: IndexReader.delete(int) not working for me
Okay, I think I've done something stupid here: on closer examination, it looks like my comparison to find the specific documents to delete is failing. Let me look further at that.

On Wed, 5 Mar 2003, Doug Cutting wrote:
> Joseph Ottinger wrote:
> > Then this means that my IndexReader.delete(i) isn't working properly. What
> > would be the common causes of this? My log shows the documents being
> > deleted, so something's going wrong at that point.
>
> Are you closing the IndexReader after doing the deletes? This is required
> for the deletions to be saved. What makes you think that the delete is not
> working properly?
>
> Doug
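Doug's point - that deletions are buffered by the IndexReader and only persisted when it is closed - can be modeled with a toy pair of classes (purely illustrative; these are not Lucene's types):

```java
import java.util.BitSet;

// Toy model of commit-on-close: deletions accumulate in the reader and
// reach "disk" only when close() runs.
class ToyIndex {
    BitSet deletedOnDisk = new BitSet();
}

class ToyReader {
    private final ToyIndex index;
    private final BitSet pending = new BitSet();

    ToyReader(ToyIndex index) {
        this.index = index;
    }

    // Marks a document deleted in memory; nothing is saved yet.
    void delete(int docNum) {
        pending.set(docNum);
    }

    // Only here do the buffered deletions become durable.
    void close() {
        index.deletedOnDisk.or(pending);
    }
}
```

A delete() that is logged but never followed by close() leaves the on-disk index untouched, which matches the "no files in the index directory are touched" symptom from the earlier thread.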
Re: IndexReader.delete(int) not working for me
Okay, I found the problem: it was a stupid coder. To wit, here's the salient code:

    Document d = indexReader.document(i);
    if (d.getField(key).equals(node.getKey())) {
        // ...
    }

The error, of course, is that the getField().equals() call is comparing FIELDS and not string values. When I changed this to pull the stringValue() out of getField(), everything worked as expected. Turns out my logging actually was spitting out the *wrong* message somewhere else, which deceived me^Wthe stupid coder into thinking the removal was occurring when it was not. Now everything's working fine. Thank you for your time.
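The bug is reproducible without Lucene at all: a class that does not override equals() (as Lucene's Field does not) falls back to reference identity, so comparing it against a String can never succeed. A minimal stand-in - FakeField is invented for this demo:

```java
// FakeField mimics the relevant trait of Lucene's Field: it wraps a string
// but does not override equals(), so equals() is identity comparison.
class FakeField {
    private final String value;

    FakeField(String value) {
        this.value = value;
    }

    String stringValue() {
        return value;
    }
}

public class CompareDemo {
    // The buggy comparison from the message: a field object is never equal
    // to a String, so the match silently fails every time.
    public static boolean wrongCompare(FakeField f, String key) {
        return f.equals(key);
    }

    // The fix: compare the stored text instead.
    public static boolean rightCompare(FakeField f, String key) {
        return f.stringValue().equals(key);
    }
}
```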