On 12/11/2014 15:25, Nikolas Everett wrote:


On Wed, Nov 12, 2014 at 8:15 AM, Alessandro Bonfanti <[email protected]> wrote:
Hi, I'm very new to ElasticSearch.
I'm trying to index a set of biological data. Some fields, like 'gene_id' or 'gene_shortname', should be treated as literal strings.
When I search for 'ZNF6092' in a field containing 'linc-ZNF6092-6', I find nothing. When I search for 'linc', however, I do find the correct document.
This looks like a problem with the ES analyzer. I tried to set the fields to not be analyzed, but nothing seems to change.
I tried with:

curl -XPOST 'localhost:9200/a3' -d @tracking_map.json

where tracking_map.json is

{
  "mappings": {
    "tracking": {
      "properties": {
        "tracking_id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "nearest_ref_id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "gene_id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "gene_short_name": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}



Then I re-indexed all the documents. It still fails, but where is my mistake?
Thanks in advance,

Alessandro

It's an analyzer problem, certainly. You've turned off analyzers with "index":"not_analyzed". What you probably want is for gene_short_name to be analyzed so that dashes are considered "word separators". If you do that, you can find linc-ZNF6092-6 by performing a simple_query_string (or match) search for "ZNF6092" or "ZNF6092 6" or "6" or "linc". Have a look at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html and go from there. You may also want to use a lowercase filter so you can search for "znf6092" and still find it.
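
To make the difference concrete, here is a rough sketch in plain Python of what ends up in the index in each case. It does not talk to Elasticsearch at all; the function names are made up for illustration, and the split on "-" only approximates what a pattern tokenizer configured that way, followed by a lowercase filter, would be expected to produce.

```python
import re


def emulate_pattern_analyzer(text, pattern=r"-"):
    """Approximate a pattern tokenizer plus lowercase filter (illustration only)."""
    # Split on the separator pattern and drop empty pieces.
    tokens = [t for t in re.split(pattern, text) if t]
    # Lowercase every token, as a lowercase filter would.
    return [t.lower() for t in tokens]


def emulate_not_analyzed(text):
    """A not_analyzed field keeps the whole value as one single term."""
    return [text]


def term_matches(query_term, indexed_terms):
    """An exact term lookup only succeeds if the term is in the index as-is."""
    return query_term in indexed_terms


analyzed = emulate_pattern_analyzer("linc-ZNF6092-6")
literal = emulate_not_analyzed("linc-ZNF6092-6")

print(analyzed)                           # ['linc', 'znf6092', '6']
print(term_matches("znf6092", analyzed))  # True
print(term_matches("ZNF6092", literal))   # False: only the full string is a term
```

With the not_analyzed mapping, the only indexed term is the complete string "linc-ZNF6092-6", so a plain search for a fragment of it has nothing to match.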

This is a good read on how to change the mapping as well:
http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/
even if you don't need all the information in there it is nice to know.

Nik
--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/Y6I2qNZxR-s/unsubscribe.
To unsubscribe from this group and all its topics, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAPmjWd06sKTVS6JC8q7x7R37gUEnsHEiuar0-yy_ZdOJQhKYzQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Many thanks for your answer.
What I want is for ES to store the fields as literals, so that I can find ZNF6092 with a wildcard search (*ZNF6092*, for example).
I tried setting "pattern" to "*" for testing (* doesn't occur in gene_shortname, so I suppose that the entire string is stored), but I still find nothing.
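
For reference, a minimal sketch of the wildcard search described above, written in Python. It only builds the request body for Elasticsearch's wildcard query and emulates the match locally with fnmatch, which is an approximation (fnmatch also understands [...] character classes, which the wildcard query does not). The helper names are hypothetical, and the field name is taken from the mapping earlier in the thread. The sketch assumes the whole value is indexed as a single term and that the pattern is compared to it case-sensitively.

```python
import json
from fnmatch import fnmatchcase


def wildcard_query(field, pattern):
    """Build a request body for a wildcard query on one field."""
    return {"query": {"wildcard": {field: pattern}}}


def emulate_wildcard(pattern, indexed_term):
    """Case-sensitive local approximation of matching a pattern against a term."""
    return fnmatchcase(indexed_term, pattern)


body = wildcard_query("gene_short_name", "*ZNF6092*")
print(json.dumps(body))

# The whole value is one term, so the pattern must cover it from start to end.
print(emulate_wildcard("*ZNF6092*", "linc-ZNF6092-6"))  # True
print(emulate_wildcard("*znf6092*", "linc-ZNF6092-6"))  # False: case differs
```

If the pattern is lowercased anywhere on its way to the index, it will no longer match a term stored in its original case, so that is one thing worth checking.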
