Hello! I still can't get the result I want, which is for the stop words not to
appear in the buckets.
Further testing showed that:
- if I filter the aggregation with a query for one of the stop words, I get an
empty aggregation result;
- the same analyzer replaces every :) and :( with SMILE and FROWN, and these
appear as such in the aggregation results;
- if I list all the stop words in the "exclude" option, it works.
So it appears that my analyzer is doing everything it should, except
filtering the stop words when getting the aggregations (it works for
search).
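
For reference, the "exclude" workaround from the last point above can be
sketched as follows. This is only a minimal sketch: build_terms_request is a
hypothetical helper name, and it assumes that "exclude" takes a regular
expression matched against whole terms and that the stop words contain no
regex metacharacters.

```python
import json

# Stop words copied from the fnstop filter in my original message.
STOP_WORDS = ["my", "it", "the", "likes"]


def build_terms_request(stop_words, field="Clean_Message", size=100):
    """Build a search body whose terms aggregation excludes the stop words.

    Hypothetical helper, for illustration only.
    """
    # Assumption: "exclude" is a regular expression matched against whole
    # terms, so a plain alternation of the stop words is enough here.
    exclude_pattern = "|".join(stop_words)
    return {
        "query": {"bool": {"must": [{"match_all": {}}]}},
        "aggs": {
            "Message": {
                "terms": {
                    "field": field,
                    "size": size,
                    "order": {"_count": "desc"},
                    "exclude": exclude_pattern,
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the search API.
    print(json.dumps(build_terms_request(STOP_WORDS), indent=2))
```

The drawback is that the stop word list then has to be kept in two places, the
analyzer settings and every aggregation request.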
And I am beginning to wonder if this could be, in fact, a bug... Any
thoughts?
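
To make the expected behaviour concrete, here is a rough local approximation
of the analyzer chain quoted below (whitespace tokenizer, lowercase,
asciifolding, stop filter), followed by a per-document term count. It is not
Elasticsearch code: the function names are hypothetical, and asciifolding is
only approximated by stripping combining marks.

```python
import unicodedata
from collections import Counter

# Stop words copied from the fnstop filter in my original message.
STOP_WORDS = {"my", "it", "the", "likes"}


def fold_ascii(token):
    """Approximate asciifolding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Mimic the "test" analyzer: whitespace, lowercase, folding, stop."""
    tokens = text.split()                      # whitespace tokenizer
    tokens = [t.lower() for t in tokens]       # lowercase filter
    tokens = [fold_ascii(t) for t in tokens]   # asciifolding (approximation)
    return [t for t in tokens if t not in STOP_WORDS]  # stop filter


def top_terms(messages, size=100):
    """Count, for each term, how many messages contain it."""
    counts = Counter()
    for message in messages:
        # A set, because each message contributes at most one count per term.
        counts.update(set(analyze(message)))
    return counts.most_common(size)


if __name__ == "__main__":
    print(top_terms(["My dog likes the café", "it likes THE cafe"]))
```

With the two sample messages, none of "my", "it", "the" or "likes" should show
up in the output, which is what I expect from the buckets as well.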
Thanks,
André Morais
On Thursday, 24 July 2014 16:57:59 UTC+1, André Morais wrote:
>
> Hi!
>
> I'm really enjoying all the possibilities brought about by the move from
> facets to aggregations. However, I still can't figure out the relationship
> between facets or buckets and analyzers. Is it not possible at all to get
> the buckets out of an analyzed field?
>
> Specifically, I need to get a list of the most common words, but I want to
> use my stop word list to exclude those that do not matter to me.
>
> I am using a stop word filter:
>
> index.analysis.filter.fnstop:
>     type: stop
>     stopwords: ["my", "it", "the", "likes"]
>
> And a custom analyzer:
>
> index.analysis.analyzer.test:
>     type: custom
>     tokenizer: whitespace
>     filter: lowercase, asciifolding, fnstop
>
> I then map my field with the custom analyzer:
> ...
> "Clean_Message" : {"type" : "string", "analyzer" : "test"}
>
> And I request a list of the top 100 most common terms, using the search API:
> {
>   "query": { "bool": { "must": [ { "match_all": {} } ] } },
>   "aggs": {
>     "Message": {
>       "terms": {
>         "field": "Clean_Message",
>         "size": 100,
>         "order": { "_count": "desc" }
>       }
>     }
>   }
> }
>
> However, some words in my stop filter appear in that list.
>
> Is this by design? Are we not supposed to run facets or aggregations against
> an analyzed field?
>
> Is it possible to get the list of most common terms against an analyzed
> field?
>
> Thank you very much for your attention and for your work!
>
> André Morais
>