Hi,
For Lucene 4.7.2 Facets, once we invoke FacetCollector and get the topNChildren
into FacetResult, is there any mechanism that for a particular search result, I
could get the docIds corresponding to any facet?
Say, I have a facet defined on Field1. Upon Search and FacetCollection, I get
I think you need to execute a DrillDownQuery to get the docIds.
On Mon, Jul 7, 2014 at 4:40 PM, Sandeep Khanzode
sandeep_khanz...@yahoo.com.invalid wrote:
Hi,
For Lucene 4.7.2 Facets, once we invoke FacetCollector and get the
topNChildren into FacetResult, is there any mechanism that for a
I think emitting two tokens for "vans" is the right (potentially only) way to
do it. You could also control the dictionary of terms that require this
special treatment.
Is there any reason you are not happy with this approach?
On Jul 06, 2014, at 11:48 AM, Arjen van der Meijden acmmail...@tweakers.net
Some of these anomalous cases are best handled by simply suppressing
stemming, using PatternKeywordMarkerFilter and SetKeywordMarkerFilter, to
set the keyword attribute for matching tokens and then most stemmers will
not change them.
You can create a list of words to ignore, like plurals of
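To make the idea concrete, here is a minimal plain-Java sketch of keyword protection. It is not Lucene code: the class name, the naive "strip a trailing s" stemmer, and the one-word protected list are all hypothetical stand-ins, assuming only that the protected set plays the role of the word set you would hand to SetKeywordMarkerFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only; none of these names come from Lucene.
class ProtectedStemSketch {
    // Hypothetical protected list: tokens here bypass the stemmer entirely.
    static final Set<String> PROTECTED = new HashSet<>(Arrays.asList("vans"));

    // Deliberately naive stand-in for a real stemmer: drops one trailing "s"
    // from tokens longer than three characters that do not end in "ss".
    static String naiveStem(String token) {
        if (token.length() > 3 && token.endsWith("s") && !token.endsWith("ss")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Protected tokens pass through unchanged; everything else is stemmed.
    static List<String> analyze(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(PROTECTED.contains(t) ? t : naiveStem(t));
        }
        return out;
    }
}
```

Calling ProtectedStemSketch.analyze on the tokens "vans", "cars", "glass" leaves "vans" untouched while "cars" is reduced to "car"; without the protected list, "vans" would collapse to "van".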
Hi Arjen,
You could also mark a token as keyword so the stemmer passes it through
unchanged. For example, per the Javadocs for PorterStemFilter:
http://lucene.apache.org/core/4_6_0/analyzers-common/org/apache/lucene/analysis/en/PorterStemFilter.html
Note: This filter is aware of the
Arjen,
An approach requiring less list maintenance could be more advanced
linguistic processing to distinguish the stop word from the content word,
such as lemmatization rather than stemming.
A commercial offering, Rosette Search Essentials from Basis
http://www.basistech.com/search-essentials/
Hi,
I tried to index bigrams from a document, and the system gave me
the following output with the frequencies of the bigrams (output 1):
array size:15
array terms are:{contents: /1, assist librarian/1, assist manjula/2, assist
sabaragamuwa/1, fine manjula/1, librari manjula/1,
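For cross-checking frequencies like the ones in this output, a minimal plain-Java sketch that counts adjacent token pairs may help. It does not use Lucene at all; the class and method names are hypothetical, and it assumes the tokens have already been lowercased and stemmed by whatever analysis chain produced terms such as "assist" and "librari".

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Independent bigram counter for sanity-checking index output; not Lucene code.
class BigramCountSketch {
    // Joins each token with its successor and counts how often each pair occurs.
    // A TreeMap keeps the bigrams in sorted order, like a term enumeration would.
    static Map<String, Integer> countBigrams(List<String> tokens) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String bigram = tokens.get(i) + " " + tokens.get(i + 1);
            counts.merge(bigram, 1, Integer::sum);
        }
        return counts;
    }
}
```

Feeding it the analyzed tokens of one document and comparing the map against the indexed term frequencies shows quickly whether a surprising count comes from the text itself or from the analysis chain.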