Get matching fields from a BooleanQuery
Hey everyone,

To start, we are using Lucene 4.3. To search, we prepare several queries and combine them into a BooleanQuery. What we are looking for is a way to determine which specific fields a given document matched on.

For example, I create two queries: one to search in the "Name" field, and another to search in the "Description" field. Combining these into a BooleanQuery and running it returns the matching documents, but we'd like to know, for each document returned, whether the match was in the Name field or in the Description field.

It seems to me that something like the highlighter would need to know this too, but highlighting isn't a goal currently. I've also looked at IndexSearcher.explain(), but the docs say that this is as expensive as running the query against the entire index, so I'd obviously like to avoid running the same queries multiple times :).

Kind regards,
Frederik
Re: Updating the DocValues field doesn't seem to update its associated StoredField value
Hi,

Could anyone help with my issue described below? If I'm not posting on the right mailing list, please direct me to the correct one.

Many thanks,
Joe

On Mon, Jun 12, 2017 at 3:05 PM, Joe Ye wrote:
> Hi,
>
> I have a few NumericDocValuesField fields and also added separate StoredField fields to store the values so that I can access them in query results. I used IndexWriter.updateNumericDocValue to update the value of a DocValues field. Then I first called SearcherManager.maybeRefresh to ensure SearcherManager.acquire will return refreshed instances, and used DocValuesNumbersQuery with the updated value. I did get the matching document in the query result, but when I tried to access its value using Document.get, it's still the old value. It appears that updating the DocValues field doesn't update its associated StoredField value. What am I missing here?
>
> I would highly appreciate your help!
>
> Regards,
>
> Joe
Re: Updating the DocValues field doesn't seem to update its associated StoredField value
Updating the doc value will not update the stored field (what document.get returns). If you need to change stored fields you have to use the IW.updateDocuments API, where the old document is deleted and a new document is indexed, atomically (to refresh).

But also see Erick's solr-specific response (to the list) a week ago.

Mike McCandless
http://blog.mikemccandless.com
Re: Updating the DocValues field doesn't seem to update its associated StoredField value
Thanks Mike! My colleague only forwarded Erick's Solr reply today as it seems I didn't get any emails and may have been taken off the mailing list for some reason?

We're using Lucene core only (version 6.2.1 at the moment). So there's no link between the docValue and its associated stored field? Is there anything similar/equivalent to useDocValuesAsStored in Lucene core? We're trying to use docValues to avoid a full update (delete + create new)... Yet, we still need to retrieve the updated values.

Regards,
Joe
Re: Updating the DocValues field doesn't seem to update its associated StoredField value
Joe:

I have no reason to believe you were taken off the user's list intentionally. Maybe your spam filter is over-zealous or something? Or perhaps you registered with some no-longer-valid mail address and could register again?

Erick

- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: Updating the DocValues field doesn't seem to update its associated StoredField value
In pure Lucene you could just pull the doc values for the docIDs in your set of search results; MultiDocValues can be helpful sugar here, unless you need SORTED or SORTED_SET in which case it's best to go per-segment.

Or just track down where Solr does this and poach those sources.

Mike McCandless
http://blog.mikemccandless.com
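To illustrate what "going per-segment" involves: a top-level docID from the search results first has to be mapped to the segment that holds it and to a docID local to that segment, and only then can that segment's own doc values be read. Below is a minimal sketch of that mapping in plain Java, with no Lucene dependency. The class SegmentResolver, its methods, and the sample docBases are hypothetical names made up for illustration, and the sketch assumes no empty segments. In Lucene itself, ReaderUtil.subIndex and LeafReaderContext.docBase should cover the same ground.

```java
import java.util.Arrays;

/** Hypothetical helper: maps top-level docIDs to (segment, segment-local docID). */
class SegmentResolver {
    // docBases[i] is the first top-level docID of segment i.
    // Assumption: strictly ascending, starting at 0 (no empty segments).
    private final int[] docBases;

    SegmentResolver(int[] docBases) {
        this.docBases = docBases.clone();
    }

    /** Index of the segment that contains the given top-level docID. */
    int segmentOf(int docId) {
        int pos = Arrays.binarySearch(docBases, docId);
        // Exact hit: docId is the first document of segment pos.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // containing segment is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** The docID relative to the start of its segment. */
    int localDocId(int docId) {
        return docId - docBases[segmentOf(docId)];
    }

    public static void main(String[] args) {
        // Three segments holding 100, 50 and 200 documents.
        SegmentResolver resolver = new SegmentResolver(new int[] {0, 100, 150});
        for (int docId : new int[] {0, 99, 100, 149, 150, 349}) {
            System.out.println(docId + " -> segment " + resolver.segmentOf(docId)
                + ", local docID " + resolver.localDocId(docId));
        }
    }
}
```

With the segment index and local docID in hand, the value would be read from that segment's doc values rather than from a merged top-level view, which is the part MultiDocValues otherwise hides.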
Re: email field - analyzed and not analyzed in single field using custom analyzer
Hi Steve,

Thanks for the input. How can I apply WordDelimiterGraphFilter / WordDelimiterFilter to email tokens alone, using an email regex? I want only the analyzed tokens for other tokens with other types of special characters...

--
Kumaran R

On Thu, Jun 15, 2017 at 7:43 PM, Steve Rowe wrote:
> Hi Kumaran,
>
> WordDelimiterGraphFilter with PRESERVE_ORIGINAL should do what you want: <http://lucene.apache.org/core/6_6_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/WordDelimiterGraphFilter.html>.
>
> Here’s a test I added to TestWordDelimiterGraphFilter.java that passed for me:
>
> -
> public void testEmail() throws Exception {
>   final int flags = GENERATE_WORD_PARTS | GENERATE_NUMBER_PARTS | SPLIT_ON_CASE_CHANGE | SPLIT_ON_NUMERICS | PRESERVE_ORIGINAL;
>   Analyzer a = new Analyzer() {
>     @Override public TokenStreamComponents createComponents(String field) {
>       Tokenizer tokenizer = new MockTokenizer(MockTokenizer.WHITESPACE, false);
>       return new TokenStreamComponents(tokenizer, new WordDelimiterGraphFilter(tokenizer, flags, null));
>     }
>   };
>   assertAnalyzesTo(a, "will.sm...@yahoo.com",
>       new String[] { "will.sm...@yahoo.com", "will", "smith", "yahoo", "com" },
>       null, null, null,
>       new int[] { 1, 0, 1, 1, 1 },
>       null, false);
>   a.close();
> }
> -
>
> --
> Steve
> www.lucidworks.com
>
> > On Jun 15, 2017, at 8:53 AM, Kumaran Ramasubramanian wrote:
> >
> > Hi All,
> >
> > i want to index email fields as both analyzed and not analyzed using custom analyzer.
> >
> > for example,
> > sm...@yahoo.com
> > will.sm...@yahoo.com
> >
> > that is, indexing sm...@yahoo.com as single token as well as analyzed tokens in same email field...
> >
> > My existing custom analyzer,
> >
> > public class CustomSearchAnalyzer extends StopwordAnalyzerBase
> > {
> >
> >   public CustomSearchAnalyzer(Version matchVersion, Reader stopwords) throws Exception
> >   {
> >     super(matchVersion, loadStopwordSet(stopwords, matchVersion));
> >   }
> >
> >   @Override
> >   protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader)
> >   {
> >     final ClassicTokenizer src = new ClassicTokenizer(getVersion(), reader);
> >     src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);
> >     TokenStream tok = new ClassicFilter(src);
> >     tok = new LowerCaseFilter(getVersion(), tok);
> >     tok = new StopFilter(getVersion(), tok, stopwords);
> >     tok = new ASCIIFoldingFilter(tok); // to enable AccentInsensitive search
> >
> >     return new Analyzer.TokenStreamComponents(src, tok)
> >     {
> >       @Override
> >       protected void setReader(final Reader reader) throws IOException
> >       {
> >         src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);
> >         super.setReader(reader);
> >       }
> >     };
> >   }
> > }
> >
> > And so i want to achieve like,
> >
> > 1.if i search using query "sm...@yahoo.com", records with will.sm...@yahoo.com should not come...
> > 2.Also i should be able to search using query "smith" in that field
> > 3.if possible, should be able to detect email values in all other fields and apply the same type of tokenization
> >
> > How to achieve point 1 and 2 using UAX29URLEmailTokenizer? how to add UAX29URLEmailTokenizer in my existing custom analyzer without using email analyzer ( perfieldanalyzer ) for email field.. And so i can apply this tokenizer for email terms of all fields..
> >
> > -
> > Kumaran R
SpanNearQuery Class issue
Hi,

This is regarding the search limit of the SpanNearQuery class. I created a Lucene index consisting of 2 billion documents and search it using a SpanNearQuery object passed to Searcher.search(Query query, int n). But the search method returns results only if the search terms are within the first 60 million (6 crore) inserted documents. Am I missing anything during initialization that restricts the search, or is this a limitation of the SpanNearQuery class? I am using Apache Lucene version 6.5.0. Please let me know, since I am using this for a critical project.

Thanks,
Ranganath B. N.