Yeah sorry, I should have created 7 documents in the testindex - in my rush to get a standalone test done and emailed out I botched that. Thanks for the insight into the case issue with the KeywordAnalyzer. I'm starting to think about how I might structure my application to use the Query API in conjunction with the QueryParser. Still, QueryParser is very compelling.
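For reference, a minimal sketch of the corrected test as discussed in this thread: seven separate documents, one Id each, queried once with a PrefixQuery and once through the QueryParser. It assumes Lucene.Net 2.9.4. The RAMDirectory, the class name and the call to SetLowercaseExpandedTerms(false) are my own assumptions rather than anything confirmed above. I believe that setter is the 2.9.x spelling (later releases appear to expose it as a property), so treat it as something to verify against your version.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

// Hypothetical standalone sketch; names other than the Lucene.Net types are illustrative.
public static class PrefixSearchSketch
{
    public static void Main()
    {
        var ids = new[]
        {
            "BAUERREVENUE", "BAUERLOCATION", "BAUERPRODUCT", "BAUERPRODUCTLINE",
            "BAUERSTATE", "BAUERTOTAL", "NOTBAUER"
        };

        // In-memory index (assumption) so the sketch leaves nothing on disk.
        var directory = new RAMDirectory();
        var analyzer = new KeywordAnalyzer();
        var writer = new IndexWriter(directory, analyzer, true,
            IndexWriter.MaxFieldLength.LIMITED);

        // One document per Id, so each matching Id counts as its own hit.
        foreach (var id in ids)
        {
            var document = new Document();
            document.Add(new Field("Id", id, Field.Store.YES,
                Field.Index.NOT_ANALYZED));
            writer.AddDocument(document);
        }
        writer.Close();

        var searcher = new IndexSearcher(IndexReader.Open(directory, true));

        // Option 1: Query API. The term is used exactly as given: no "*", no lowercasing.
        var prefixQuery = new PrefixQuery(new Term("Id", "BAUER"));
        Console.WriteLine("PrefixQuery hits: " + searcher.Search(prefixQuery, 10).TotalHits);

        // Option 2: QueryParser, asked not to lowercase wildcard/prefix terms.
        // Assumed 2.9.x API; check the name against your Lucene.Net version.
        var parser = new QueryParser(Version.LUCENE_29, "content", analyzer);
        parser.SetLowercaseExpandedTerms(false);
        var parsedQuery = parser.Parse("Id:BAUER*");
        Console.WriteLine("QueryParser hits: " + searcher.Search(parsedQuery, 10).TotalHits);

        searcher.Close();
    }
}
```

If the lowercasing really is the only problem, both searches should find the six BAUER* documents and skip NOTBAUER. Indexing the Id in lowercase, as suggested in the quoted reply, is the other way around the same issue.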
Sent from my iPhone

On Jun 26, 2012, at 9:28 PM, "Lingam, ChandraMohan J" <chandramohan.j.lin...@intel.com> wrote:

> Interestingly, the query generated from this var query = queryParser.Parse("Id:BAUER*") is converted to lower case "bauer*" even though you are using KeywordAnalyzer. I am not sure if this is the intended behavior of the keyword analyzer.
>
> So, the best option to make this example work is to index in lowercase:
> document.Add(new Field("Id", "bauerrevenue", Field.Store.YES, Field.Index.NOT_ANALYZED));
>
> Also, the assert will always fail because the hit count, even when it matches, will be 1, since there is only one document with several values associated with the field. You would need to iterate through the fields. If you want to match 6 documents, then you have to add them as six separate documents instead of one document with all the values.
>
> -----Original Message-----
> From: Rob Cecil [mailto:rob.ce...@gmail.com]
> Sent: Tuesday, June 26, 2012 6:55 PM
> To: lucene-net-user@lucene.apache.org
> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>
> Sure, this is self-contained:
>
> [Test]
> public void QueryNonAnalyzedField()
> {
>     var indexPath = Path.Combine(Environment.CurrentDirectory, "testindex");
>     var directory = FSDirectory.Open(new DirectoryInfo(indexPath));
>     var analyzer = new KeywordAnalyzer();
>     var writer = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
>     var document = new Document();
>     document.Add(new Field("Id", "BAUERREVENUE", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "BAUERLOCATION", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "BAUERPRODUCT", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "BAUERPRODUCTLINE", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "BAUERSTATE", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "BAUERTOTAL", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     document.Add(new Field("Id", "NOTBAUER", Field.Store.YES, Field.Index.NOT_ANALYZED));
>     writer.AddDocument(document);
>     writer.Optimize();
>     writer.Close();
>
>     IndexReader reader = IndexReader.Open(directory, true);
>     var queryParser = new QueryParser(Version.LUCENE_29, "content", analyzer);
>     var query = queryParser.Parse("Id:BAUER*");
>     var indexSearch = new IndexSearcher(reader);
>     var hits = indexSearch.Search(query);
>     Assert.AreEqual(6, hits.Length());
> }
>
> On Tue, Jun 26, 2012 at 6:35 PM, Lingam, ChandraMohan J <chandramohan.j.lin...@intel.com> wrote:
>
>> Just did a simple test and KeywordAnalyzer does indeed work like a prefix query if you put a star at the end. Agree with Simon. Most likely Luke was using the keyword analyzer and somehow the UI was not reflecting it?
>>
>> Please post a small snippet of your index code and query code...
>>
>> -----Original Message-----
>> From: Rob Cecil [mailto:rob.ce...@gmail.com]
>> Sent: Tuesday, June 26, 2012 5:25 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>>
>> Thanks, and there is no equivalent QueryParser syntax for that?
>>
>> On Tue, Jun 26, 2012 at 6:21 PM, Lingam, ChandraMohan J <chandramohan.j.lin...@intel.com> wrote:
>>
>>> Actually, that makes sense. The keyword analyzer would try for an exact match. Since you are looking for a prefix-based search, your best option is to simply use PrefixQuery, and there is no need to put a "*" for PrefixQuery.
>>>
>>> -----Original Message-----
>>> From: Rob Cecil [mailto:rob.ce...@gmail.com]
>>> Sent: Tuesday, June 26, 2012 4:57 PM
>>> To: lucene-net-user@lucene.apache.org
>>> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>>>
>>> That is correct. I've verified in Luke 1.0.1 that both analyzers produce the same results.
>>>
>>> To make it interesting, back in my code, I switched over to using the KeywordAnalyzer, and I'm still not getting any results against that NOT_ANALYZED field.
>>>
>>> ?
>>>
>>> On Tue, Jun 26, 2012 at 5:52 PM, Lingam, ChandraMohan J <chandramohan.j.lin...@intel.com> wrote:
>>>
>>>> Luke using the keyword analyzer as default makes sense. However, in the original post, there was a link to a Luke output screenshot which showed that the standard analyzer was in use for query parsing.
>>>>
>>>> -----Original Message-----
>>>> From: Simon Svensson [mailto:si...@devhost.se]
>>>> Sent: Tuesday, June 26, 2012 2:56 PM
>>>> To: lucene-net-user@lucene.apache.org
>>>> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>>>>
>>>> Luke defaults to KeywordAnalyzer, which won't change your term in any way. The QueryParser will still break up your query, so "Name:Jack Bauer" would become (Name:Jack DefaultField:Bauer). I believe you can have per-field analyzers (KeywordAnalyzer for Id, StandardAnalyzer for everything else) using a PerFieldAnalyzerWrapper.
>>>>
>>>> On 2012-06-26 23:06, Lingam, ChandraMohan J wrote:
>>>>> QueryParser has no knowledge of how data was indexed. For your scenario, I don't believe you would be able to use QueryParser with the standard analyzer when the data was originally indexed with the Field.Index.NOT_ANALYZED option.
>>>>>
>>>>> The interesting question is why Luke is working/finding the match. I would have expected Luke to not find any matches.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rob Cecil [mailto:rob.ce...@gmail.com]
>>>>> Sent: Tuesday, June 26, 2012 12:54 PM
>>>>> To: lucene-net-user@lucene.apache.org
>>>>> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>>>>>
>>>>> I can definitely try that. I just expected QueryParser would respect the case of the source string. I was hoping to avoid using the Query API per se, and just let the parser do the work for me.
>>>>>
>>>>> On Tue, Jun 26, 2012 at 1:19 PM, Lingam, ChandraMohan J <chandramohan.j.lin...@intel.com> wrote:
>>>>>
>>>>>>>> var query = _parser.Parse("Id:BAUER*");
>>>>>> In your code, most likely, the value got converted to lower case (i.e. bauer*) by the parse statement, whereas the indexed value is in upper case as it is not analyzed (from the screenshot).
>>>>>>
>>>>>> Can you explicitly try using a prefix query?
>>>>>>
>>>>>>> Same results, apparently, when I use Luke 1.0.1.
>>>>>>>
>>>>>>> When I search for "Id:BAUER*" I get 15 hits in Luke, but in my custom app, zero.
>>>>>>>
>>>>>>> On Tue, Jun 26, 2012 at 12:31 PM, Rob Vesse <rve...@dotnetrdf.org> wrote:
>>>>>>>> You appear to be using Luke 3.5, which per the information on the Luke homepage (http://code.google.com/p/luke/) uses Lucene 3.5.
>>>>>>>>
>>>>>>>> Since Lucene.Net is currently on 2.9.4, I wouldn't be surprised to see different behavior between the API and executing in Luke.
>>>>>>>>
>>>>>>>> If you use a version of Luke which more closely aligns with the version of Lucene.Net (Luke 1.0.1 uses Lucene 3.0.1, which should be close enough since the 2.9.x releases were previews of the 3.0.x releases as I understood it), what behavior do you see?
>>>>>>>>
>>>>>>>> Hope this helps,
>>>>>>>>
>>>>>>>> Rob
>>>>>>>>
>>>>>>>> On 6/26/12 10:50 AM, "Rob Cecil" <rob.ce...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> If I run a query against my index using QueryParser to query a field:
>>>>>>>>>
>>>>>>>>> var query = _parser.Parse("Id:BAUER*");
>>>>>>>>> var topDocs = searcher.Search(query, 10);
>>>>>>>>> Assert.AreEqual(count, topDocs.TotalHits);
>>>>>>>>>
>>>>>>>>> I get 0 for my TotalHits, yet in Luke, the same query phrase yields 15 results. What am I doing wrong? I use the StandardAnalyzer both to create the index and to query.
>>>>>>>>>
>>>>>>>>> The field is defined as:
>>>>>>>>>
>>>>>>>>> new Field("Id", myObject.Id, Field.Store.YES, Field.Index.NOT_ANALYZED)
>>>>>>>>>
>>>>>>>>> and is a string field. The result set back from Luke looks like (screencap):
>>>>>>>>>
>>>>>>>>> http://screencast.com/t/NooMK2Rf
>>>>>>>>>
>>>>>>>>> Thanks!