Thanks, Ivan. Do you mean that what I obtain from a request such as

curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase,my_ascii_folding,my_stopwords' -d 'El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)'

is not what will be present in the index after the analysis process? If so, how can I check whether the stop words filter is being (or will be) applied to a sample phrase?

2014-08-28 14:03 GMT-05:00 Ivan Brusic <[email protected]>:

> Also note that the content returned will still contain the stop words.
> Only the inverted index will contain the stopword-less content.
>
> --
> Ivan
>
>
> On Thu, Aug 28, 2014 at 11:55 AM, Itamar Syn-Hershko <[email protected]>
> wrote:
>
>> What would be the usecase for such a process (removing stop words without
>> tokenization)?
>>
>> This may be a good read btw:
>> http://www.elasticsearch.org/blog/stop-stopping-stop-words-a-look-at-common-terms-query/
>>
>> --
>>
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Author of RavenDB in Action <http://manning.com/synhershko/>
>>
>>
>> On Thu, Aug 28, 2014 at 9:48 PM, German Carrillo <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>>
>>> I'm looking for a way to remove stop words from tokens returned by a
>>> keyword tokenizer, i.e., I'd like to obtain the original text without stop
>>> words after the analysis process.
>>>
>>> Sample data looks like: "El corregimiento de
>>> Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)"
>>> After the lowercase token filter: "el corregimiento de mulaló,
>>> jurisdicción del municipio de yumbo (valle del cauca)"
>>> After the ascii folding token filter: "el corregimiento de
>>> mulalo, jurisdiccion del municipio de yumbo (valle del cauca)"
>>> After removing stop words: "corregimiento mulalo,
>>> municipio yumbo (valle cauca)"
>>>
>>> The stop words (currently) are: ["la", "el", "de", "del", "los",
>>> "las", "jurisdiccion"]
>>>
>>> Is the pattern replace token filter the only (or best) way to go for
>>> such a task?
>>>
>>> I'd really like to avoid writing custom regular expressions rather than
>>> specifying a stop words list, which I know would work perfectly fine for
>>> other tokenizers.
>>>
>>>
>>> Regards,
>>>
>>> Germán
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/038ff037-ccf3-4aca-b0c0-bb421531c495%40googlegroups.com
>>> <https://groups.google.com/d/msgid/elasticsearch/038ff037-ccf3-4aca-b0c0-bb421531c495%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
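
For reference, the transformation described in the original message (lowercase, fold accents, then drop stop words while leaving the rest of the text in one piece) can be sketched outside Elasticsearch. This is only a rough Python illustration of the expected result, useful as something to compare a sample phrase against. It is not how Elasticsearch filters work internally. The helper names `ascii_fold` and `strip_stop_words` are made up for this example, and the NFKD-based folding is an approximation of the ASCII folding filter, not a copy of it.

```python
import string
import unicodedata

# Stop word list quoted in the original message.
STOP_WORDS = {"la", "el", "de", "del", "los", "las", "jurisdiccion"}


def ascii_fold(text):
    """Drop combining marks after NFKD decomposition (rough stand-in for ASCII folding)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase, fold accents, then drop stop words, keeping the other words as they are."""
    folded = ascii_fold(text.lower())
    kept = [
        word
        for word in folded.split()
        # Compare without surrounding punctuation, so "(valle" and "mulalo,"
        # are checked by their bare form but kept with their punctuation.
        if word.strip(string.punctuation) not in stop_words
    ]
    return " ".join(kept)


sample = "El corregimiento de Mulaló, jurisdicción del municipio de Yumbo (Valle del Cauca)"
print(strip_stop_words(sample))
# prints: corregimiento mulalo, municipio yumbo (valle cauca)
```

The output matches the "After removing stop words" line in the sample data above, so the same list of stop words is enough to describe the desired result without hand-written regular expressions.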
