Re: PorterStemFilter causes wildcard searches to not work

2011-11-29 Thread Ian Lea
This is very hard to follow. I for one don't recall what you described or what you are looking for. Have you worked through http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F? -- Ian. On Tue, Nov 29, 2011 at 7:25 AM, SBS jturn...@uow.edu.au wrote:

Re: PorterStemFilter causes wildcard searches to not work

2011-11-29 Thread SBS
> This is very hard to follow. I for one don't recall what you described or what you are looking for.

Sorry about that, I am using the web interface, where the context of my post is visible to all. To sum up, my original post was: It seems that when I use a PorterStemFilter in my custom

Re: PorterStemFilter causes wildcard searches to not work

2011-11-29 Thread Ian Lea
A Google search for "lucene stemming wildcards" finds some hits implying that the two don't work well together. http://lucene.472066.n3.nabble.com/Conflicts-with-Stemming-and-Wildcard-Prefix-Queries-td540479.html may be a solution. -- Ian. On Tue, Nov 29, 2011 at 10:39 AM, SBS jturn...@uow.edu.au
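To illustrate why the two can conflict, here is a minimal plain-Java sketch, not Lucene code. It assumes the usual explanation, namely that indexed tokens are stemmed while the text of a wildcard or prefix query is compared against the index without being stemmed. The class name, the method names, and the toy suffix stripper are all made up for this example; the stripper is not the Porter algorithm.

```java
import java.util.List;
import java.util.stream.Collectors;

final class StemWildcardDemo {
    // Toy stand-in for a stemmer (assumption: NOT the Porter algorithm).
    // It only strips a trailing "ing" from tokens longer than five characters.
    static String toyStem(String token) {
        if (token.length() > 5 && token.endsWith("ing")) {
            return token.substring(0, token.length() - 3);
        }
        return token;
    }

    // Index time: every token goes through the stemmer before it is stored.
    static List<String> indexTerms(List<String> tokens) {
        return tokens.stream()
                .map(StemWildcardDemo::toyStem)
                .collect(Collectors.toList());
    }

    // Query time: the prefix is compared with the stored terms as typed,
    // without stemming, which is where the mismatch comes from.
    static List<String> matchPrefix(List<String> indexedTerms, String prefix) {
        return indexedTerms.stream()
                .filter(term -> term.startsWith(prefix))
                .collect(Collectors.toList());
    }
}
```

With "searching" stored as "search", the prefix "searchi" finds nothing, while the shorter prefix "sear" still matches.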

Error while re-indexing - cannot overwrite 0.fdt

2011-11-29 Thread Rohan A Ambasta
Hi, I get the error "Cannot overwrite 0.fdt" when I start indexing. Detailed test case:
1) Indexing for the first time works fine.
2) Then I run a search and get the search results.
3) After the search, if I start indexing again, I get the error "Cannot overwrite 0.fdt".
Has anybody faced

Re: Error while re-indexing - cannot overwrite 0.fdt

2011-11-29 Thread Ian Lea
Close the first index writer? http://lmgtfy.com/?q=lucene+Cannot+overwrite+%22_0.fdt%22+file If you can't find the answer and need to post again, include as a minimum details of the OS and lucene version that you are using. -- Ian. On Tue, Nov 29, 2011 at 12:15 PM, Rohan A Ambasta

Quoted search on Analyzed fields

2011-11-29 Thread Mihai Caraman
field = new Field(author, (author).toLowerCase(), Field.Store.NO, Field.Index.NOT_ANALYZED);
field.setIndexOptions(FieldInfo.IndexOptions.DOCS_ONLY);
field.setOmitNorms(true);

When I switched the above configuration from NOT_ANALYZED to ANALYZED, Luke's results for

Re: Scoring a document using LDA topics

2011-11-29 Thread Stephen Thomas
Sujit, Thanks for your reply, and the link to your blog post, which was helpful and got me thinking about Payloads. I still have one more question. I need to be able to compute the Sim(query q, doc d) similarity function, which is defined below: Sim (query q, doc d) = sum_{t in q} sum_{z} P(t,

Re: Quoted search on Analyzed fields

2011-11-29 Thread Mihai Caraman
Still no difference; it may be because of some other hidden bug. Anyway, adding freq and positions will be a no-no because of space :) so bye-bye quotes. Thank you

Re: Quoted search on Analyzed fields

2011-11-29 Thread Robert Muir
Again, there is nothing wrong with the quotes: it's instead how you are configuring the analysis for this field. If you put stuff in quotes and your analyzer breaks it into multiple tokens, then the query parser forms a phrase query. You must index positions to support phrase queries. Normally
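The point about positions can be sketched with a toy in-memory index in plain Java. This is not Lucene code, and the class and method names are made up for illustration. Each term maps to the token positions where it occurs, and a phrase matches only if its terms occur at consecutive positions. An index that records only which documents contain a term has nothing to run this check against.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class PhrasePositionsDemo {
    // Build term -> list of token positions for a single piece of text.
    static Map<String, List<Integer>> indexPositions(String text) {
        Map<String, List<Integer>> postings = new HashMap<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            postings.computeIfAbsent(tokens[i], k -> new ArrayList<>()).add(i);
        }
        return postings;
    }

    // A phrase matches when its terms appear at consecutive positions.
    static boolean matchesPhrase(Map<String, List<Integer>> postings, String... phrase) {
        if (phrase.length == 0) {
            return false;
        }
        for (int start : postings.getOrDefault(phrase[0], List.of())) {
            boolean allAdjacent = true;
            for (int i = 1; i < phrase.length && allAdjacent; i++) {
                allAdjacent = postings.getOrDefault(phrase[i], List.of()).contains(start + i);
            }
            if (allAdjacent) {
                return true;
            }
        }
        return false;
    }
}
```

For the text "John Ronald Smith", the phrase "john ronald" matches, but "john smith" does not, even though both terms are present.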

Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
List, I have written my own CustomAnalyzer, as follows:

public TokenStream tokenStream(String fieldName, Reader reader) {
    // TODO: add calls to RemovePuncation, and SplitIdentifiers here
    // First, convert to lower case
    TokenStream
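The identifier-splitting step itself can be sketched in plain Java, independent of any Lucene filter. This is only an illustration under assumptions: the class name and the boundary regex are made up for this example and are not part of Lucene or Solr. Note that case-based splitting has to see the original case, so it must run before any lowercasing step; here the parts are lowercased only after the split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

final class CamelCaseSplitter {
    // Zero-width boundaries: lower/digit followed by upper ("token|Stream"),
    // or the end of an acronym followed by a capitalised word ("HTTP|Response").
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

    // Split on case boundaries first, then lowercase each part.
    static List<String> split(String identifier) {
        List<String> parts = new ArrayList<>();
        for (String part : BOUNDARY.split(identifier)) {
            if (!part.isEmpty()) {
                parts.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return parts;
    }
}
```

For example, "tokenStream" yields "token" and "stream", while an identifier that has already been lowercased, such as "tokenstream", comes back as a single part.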

RE: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Uwe Schindler
Hi, There is WordDelimiterFilter in Solr that was also ported to the Lucene analysis module in Lucene trunk (4.0). In 3.x you can still add solr.jar to your classpath and use WordDelimiterFilterFactory to produce one (WordDelimiterFilter itself is package-private). - Uwe Schindler H.-H.-Meier-Allee

Re: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
How do you use the WordDelimiterFilterFactory()? I tried the following code:

TokenStream out = new LowerCaseTokenizer(reader);
WordDelimiterFilterFactory wdf = new WordDelimiterFilterFactory();
out = wdf.create(out);
...

But I am getting a runtime error: Exception in thread main

RE: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Uwe Schindler
Hi, Be sure to use the same Solr version as your Lucene version (if >= 3.1). Here is example code from a test case:

WordDelimiterFilterFactory fact = new WordDelimiterFilterFactory();
// we don’t need this if we don’t load external exclusion files:
// ResourceLoader loader = new

Re: Scoring a document using LDA topics

2011-11-29 Thread Sujit Pal
Hi Stephen, We precompute a variant of P(z,d) during indexing, and do the first 3 steps. The resulting documents are ordered by payload score, which is basically z in our case. We don't currently care about P(t,z) but it seems like a good thing to have for disambiguation purposes. So anyway, I