Re: Need suggestions on implementing a custom query (offload R-tree filter to fully in-memory) on Lucene-8.3

2019-12-04 Thread
ocValuesQuery = > LatLonDocValuesField.newSlowBoxQuery("poi", www, xxx, yyy, zzz); > Query poiQuery = new IndexOrDocValuesQuery(latLonPointQuery, > latLonDocValuesQuery); > Query query = new BooleanQuery.Builder() > .add(textQuery, Occur.MUST) > .add(poiQuery, Occur.FIL

How can i specify a custom Analyzer for a Field of Document?

2019-12-09 Thread
Directory indexDataDir = FSDirectory.open(Paths.get("index_data")); Analyzer analyzer = MyLuceneAnalyzerFactory.newInstance(); IndexWriterConfig iwc = new IndexWriterConfig(analyzer); iwc.setOpenMode(OpenMode.CREATE); iwc.setRAMBufferSizeMB(256.0); IndexWriter indexWriter = new

Need suggestions on implementing a custom query (offload R-tree filter to fully in-memory) on Lucene-8.3

2019-12-03 Thread
Background: I need to implement document indexing and search for POIs (points of interest) in an LBS scenario. A POI has a name, an address, and a location (LatLonPoint), and I want to combine a text query with a geo-spatial 2D range filter. The problem is, when i first build a native in-memory index which

Re: Question about PhraseQuery's capacity...

2020-01-12 Thread
Hi, I have filed an issue against lucene-core: https://issues.apache.org/jira/browse/LUCENE-9130 I just wrote a test case and found that BooleanQuery in MUST filter mode is OK, but PhraseQuery fails. 小鱼儿 wrote on Fri, Jan 10, 2020 at 7:14 PM: > The explain API helps! Thanks for the hint! > I have found out that on

Re: Question about PhraseQuery's capacity...

2020-01-10 Thread
if there is any mismatch in analyzer's processing or if there is a capacity limit in PhraseQuery... Mikhail Khludnev wrote on Fri, Jan 10, 2020 at 6:21 PM: > Hello, > Sometimes IndexSearcher.explain(Query, int) helps to analyse mismatches. > > On Fri, Jan 10, 2020 at 1:13 PM 小鱼儿 wrote: > > >

Re: Question about PhraseQuery's capacity...

2020-01-10 Thread
alysis chain when creating the phrase query? Can you show > us how you build the phrase query? > > On Fri, Jan 10, 2020 at 9:24 AM 小鱼儿 wrote: > > > I use SmartChineseAnalyzer to do the indexing, and add a document with a > > TextField whose value is a long sentence, when anay

What's the difference between LatLonPoint and LatLonDocValuesField?

2020-01-10 Thread
As I understand it from reading the online documentation, LatLonPoint is used for BKD indexing, and LatLonDocValuesField is used as input for a Sort argument. But does that mean that if a POI has a GeoPoint-type "location" field, I must add the same location value to both fields, which makes me

Question about PhraseQuery's capacity...

2020-01-10 Thread
I use SmartChineseAnalyzer for indexing, and add a document with a TextField whose value is a long sentence which, when analyzed, yields 18 terms. I then use the same value to construct a PhraseQuery, setting slop to 2 and adding the 18 terms consecutively... I expect the search api to find

Quest about Lucene's IndexSearcher.search(Query query, int n) API's parameter n

2020-01-09 Thread
I'm doing a POI (point-of-interest) search using Lucene; each POI has a "location" of GeoPoint/LonLat type. I need to do a keyword-range search, but the resulting POIs need to be sorted by distance to a starting point. This "distance" is, in fact, a dynamically computed property which cannot be used
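The question above turns on ordering hits by a value that is computed at query time and keeping only the best n. As a rough illustration of that idea, here is a plain-Java sketch that does not use Lucene at all: the `NearestPois` class, the `Poi` record, and the `nearest` helper are hypothetical names invented for this example, and the haversine formula with a mean Earth radius is an assumption about how "distance" is measured. In Lucene 8.x itself, `LatLonDocValuesField.newDistanceSort(field, lat, lon)` produces a `SortField` that serves this purpose when the location is also indexed as a `LatLonDocValuesField`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class NearestPois {
    /** Hypothetical POI: a name plus latitude/longitude in degrees. */
    static final class Poi {
        final String name;
        final double lat;
        final double lon;

        Poi(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Mean Earth radius in meters (assumption: a spherical Earth is accurate enough).
    static final double EARTH_RADIUS_M = 6_371_008.8;

    /** Great-circle distance in meters between two points given in degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** Returns at most n hits, nearest first, using a bounded max-heap. */
    static List<Poi> nearest(List<Poi> hits, double lat, double lon, int n) {
        Comparator<Poi> byDistance =
                Comparator.comparingDouble((Poi p) -> haversineMeters(lat, lon, p.lat, p.lon));
        // Reversed comparator: the farthest kept candidate sits on top and is evicted first.
        PriorityQueue<Poi> heap = new PriorityQueue<>(byDistance.reversed());
        for (Poi p : hits) {
            heap.offer(p);
            if (heap.size() > n) {
                heap.poll();
            }
        }
        List<Poi> result = new ArrayList<>(heap);
        result.sort(byDistance);
        return result;
    }
}
```

The heap never holds more than n + 1 entries, so memory stays bounded by n regardless of how many hits the keyword query matches.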

Re: Question about PhraseQuery's capacity...

2020-01-10 Thread
the TokenStream interface? Adrien Grand wrote on Fri, Jan 10, 2020 at 4:53 PM: > It should match. My guess is that you might not be reusing the same positions > as set by the analysis chain when creating the phrase query? Can you show > us how you build the phrase query? > > On Fri, Jan 10, 2020

Question abount combining InvertedIndex and SortField

2019-12-30 Thread
Assume I first use a keyword search to get a DocIdSet from the inverted index, and then want to sort these doc IDs by some numeric field, such as `updateTime`. Can Lucene do this without loading the Document objects, using only a sorted index on `updateTime`? I call this "Index-Only Sort
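To make the idea in the question concrete, here is a conceptual plain-Java sketch, not Lucene code: it models a per-document numeric column as a `long[]` indexed by doc ID and sorts the matching doc IDs against it without touching any stored document. The `ColumnSort` class, the `topNByColumn` method, and the array layout are hypothetical and chosen only for illustration. In Lucene, the comparable setup is, to my understanding, indexing the value as a `NumericDocValuesField` and searching with a `Sort` built from `new SortField("updateTime", SortField.Type.LONG, true)`.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ColumnSort {
    /**
     * Sorts matching doc IDs by a per-document long column, largest value first,
     * and returns at most n of them. Only the column is read; no documents are loaded.
     */
    static int[] topNByColumn(int[] matchingDocIds, long[] updateTimeByDocId, int n) {
        // Box the IDs so a custom comparator can be used.
        Integer[] boxed = new Integer[matchingDocIds.length];
        for (int i = 0; i < matchingDocIds.length; i++) {
            boxed[i] = matchingDocIds[i];
        }
        // Descending order on the column value (newest first).
        Arrays.sort(boxed,
                Comparator.comparingLong((Integer id) -> updateTimeByDocId[id]).reversed());
        int k = Math.max(0, Math.min(n, boxed.length));
        int[] out = new int[k];
        for (int i = 0; i < k; i++) {
            out[i] = boxed[i];
        }
        return out;
    }
}
```

A full sort is used here for brevity; a bounded heap would avoid sorting all hits when n is much smaller than the number of matches.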

Re: Question abount combining InvertedIndex and SortField

2020-01-01 Thread
ucene should be able to do a high-perf Top-N query if SortField can support dynamically generated ranking scores... not only the native indexed numeric/String fields) Mikhail Khludnev wrote on Tue, Dec 31, 2019 at 4:41 PM: > Hello, 小鱼儿. > > On Tue, Dec 31, 2019 at 6:32 AM 小鱼儿 wrote: > > > As

Needs advice on auto-keyword-correction mode custom query

2020-01-05 Thread
Hi everybody, I want to implement a custom query with an auto-keyword-correction mode: suppose a user enters a keyword query A but, due to a typo or some other reason, A should have been B; A is not a valid term in Lucene's index, while B is. (I'm not considering NLP in high-dimensional semantic space

Why Lucene's Suggest API can ONLY load field terms which is Store.YES?

2019-12-27 Thread
I have a document `category` field, which is a string delimited by any of the separators "|,;". In the indexing phase, I manually split the value into atomic terms and index them as StringField, and I also add a StoredField of the same name which contains the original value: *List terms =

Re: Why Lucene's Suggest API can ONLY load field terms which is Store.YES?

2019-12-27 Thread
> That's it. > > On Fri, Dec 27, 2019 at 11:32 AM 小鱼儿 wrote: > > > I have a document `category` field, which is a "|,;" separator separated > > string, in indexing phase, i do manually split the value into atomic > terms > > and index as StringField, &

Re: Compatibility problems between AnalyzerWrapper api & MultiTerms.getTerms api

2020-04-15 Thread
4, 2020 at 9:52 AM 小鱼儿 wrote: > > > I'm using AnalyzerWrapper to apply per-field analyzers for special > indexing: > > > > PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(..); > > // PerFieldAnalyzerWrapper is a subclass of Lucene's AnalyzerWrapper >

Compatibility problems between AnalyzerWrapper api & MultiTerms.getTerms api

2020-04-14 Thread
I'm using AnalyzerWrapper to apply per-field analyzers for special indexing: PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(..); // PerFieldAnalyzerWrapper is a subclass of Lucene's AnalyzerWrapper IndexWriterConfig iwc = new IndexWriterConfig(analyzer); However, I found that later