Re: Calculate Term Co-occurrence Matrix

2010-08-22 Thread ahmed algohary
Thanks! It is exactly what I need. But isn't there a way to get the matching score? For example, "damaged" co-occurs with "shipment" with a probability of 0.4? On Sun, Aug 22, 2010 at 5:35 AM, Ivan Provalov wrote: > Ahmed, > > FYI, I updated the term collocations package I mentioned earlie
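
One way to read the "probability = 0.4" in the question above is as a conditional document frequency: of the documents that contain "damaged", the fraction that also contain "shipment". The sketch below is plain Java with no Lucene dependency, and it is only an illustration of that reading. The class name CooccurrenceScore and both method names are hypothetical, and this is not necessarily how the collocations package computes its score.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or of the collocations package.
class CooccurrenceScore {

    // Splits text on whitespace and lower-cases it; a stand-in for a real analyzer.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
    }

    // Fraction of the documents containing `term` that also contain `other`.
    // Returns 0.0 when `term` does not occur in any document.
    static double conditionalProbability(List<Set<String>> docs, String term, String other) {
        int withTerm = 0;
        int withBoth = 0;
        for (Set<String> doc : docs) {
            if (doc.contains(term)) {
                withTerm++;
                if (doc.contains(other)) {
                    withBoth++;
                }
            }
        }
        return withTerm == 0 ? 0.0 : (double) withBoth / withTerm;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = Arrays.asList(
                tokens("damaged shipment arrived"),
                tokens("shipment was damaged"),
                tokens("damaged box"),
                tokens("damaged screen"),
                tokens("damaged cable"),
                tokens("shipment delayed"));
        // "damaged" appears in five documents, two of which also contain "shipment".
        System.out.println(conditionalProbability(docs, "damaged", "shipment"));
    }
}
```

Note that this measure is asymmetric: conditioning on "shipment" instead of "damaged" generally gives a different value, so a full co-occurrence matrix built this way is not symmetric.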

Re: Calculate Term Co-occurrence Matrix

2010-08-22 Thread ahmed algohary
I think I got it. In the CollectionIndexer class, I have added the co-occurrence score to the index document: doc.add(new Field("score", collocation.getScore() + "", Field.Store.YES, Field.Index.NOT_ANALYZED)); then in the CollectionSearcher, the scores can be retrieved: d.get

Re: Calculate Term Co-occurrence Matrix

2010-08-22 Thread Ivan Provalov
Ahmed, Instead, I would use the score coming out of the CollocationSearcher class. I changed it a bit to return the LinkedHashMap of collocated terms and their scores relative to the term used in the query. I have attached the new version. Thanks, IP --- On Sun, 8/22/10, ahmed algohary wr

IndexWriter.deleteDocuments(Query[]) not deleting

2010-08-22 Thread Paul J. Lucas
Hi - Using Lucene 2.9.3, I'm indexing the metadata in image files. For each image ("document" in Lucene), I have 2 additional special fields: "FILE-PATH" (containing the full path of the file) and "DIR-PATH" (containing the full path of the directory the file is in). The FILE-PATH Field is cr

Re: IndexWriter.deleteDocuments(Query[]) not deleting

2010-08-22 Thread Erick Erickson
Did you issue a commit on (or close) the IndexWriter after you deleted the documents? I'm assuming nothing really weird happened, like a case change; your NOT_ANALYZED should take care of that at index time, but are you sure your cases match when you submit your term queries? An in

Re: IndexWriter.deleteDocuments(Query[]) not deleting

2010-08-22 Thread Paul J. Lucas
On Aug 22, 2010, at 1:47 PM, Erick Erickson wrote: > Did you issue a commit (or close) the IndexWriter after you deleted the > documents? I originally wrote: > I create/close a new IndexWriter for the delete. So the answer is "yes." > ... are you sure your cases match when you submit your term

Re: IndexWriter.deleteDocuments(Query[]) not deleting

2010-08-22 Thread Erick Erickson
Yep, sure hate it when that happens, which doesn't prevent it happening to me more often than I'd like :). Glad you figured it out. Erick On Sun, Aug 22, 2010 at 3:04 PM, Paul J. Lucas wrote: > On Aug 22, 2010, at 1:47 PM, Erick Erickson wrote: > > > Did you issue a commit (or close) the Index