On 11/06/2016 03:18 PM, sebb wrote:
> Fields such as message-id are stored as text strings, but they are
> only really intended to be used as ids. They don't contain independent
> text parts.
> 
> From what I have understood so far from reading the ES docs, such
> fields should be tagged as
> 
> "index": "not_analyzed"
> 
> AIUI this reduces the analysis overhead and storage requirements, and
> also makes it harder to find fields with
> This probably applies to other fields in "mbox":
> 
> mid
> possibly in-reply-to
> also references
> 
> And of course the auto-created fields such as attachments
> 
> Likewise the doc types currently missing from setup.py:
> 
> notifications
> account
> mailinglists
> 
> These are internal use only so are not intended for searching.
> 
> Or have I got this completely wrong?
> 

message-id is already set to not be analyzed by the setup script (it's
in the mappings it sends to ES when creating the index). mid and
in-reply-to should probably also be not analyzed, although mid is really
a copy of the doc ID, IIRC. The list ID is also not analyzed by default
(as list_raw), and neither is the raw from address.
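
For illustration only, here is a rough sketch of what a mapping fragment
for such ID-only fields might look like. It is not copied from setup.py:
the helper name build_id_mapping and the ID_FIELDS list are hypothetical,
the field names are simply the ones mentioned in this thread, and it
assumes the pre-5.0 ES mapping syntax ("type": "string" with
"index": "not_analyzed").

```python
import json

# Field names taken from this thread. Whether each one is already mapped
# this way by the real setup script is an assumption, not a confirmed fact.
ID_FIELDS = ["message-id", "mid", "in-reply-to", "references"]


def build_id_mapping(fields):
    """Return a mapping fragment that stores each field as an exact-match string.

    Hypothetical helper: "not_analyzed" tells pre-5.0 Elasticsearch to index
    the value as a single term instead of running it through an analyzer.
    """
    return {
        "properties": {
            name: {"type": "string", "index": "not_analyzed"}
            for name in fields
        }
    }


# Wrap the fragment under the "mbox" doc type discussed above.
mbox_mapping = {"mbox": build_id_mapping(ID_FIELDS)}

# Print the JSON body that would be sent as part of the index mappings.
print(json.dumps(mbox_mapping, indent=2, sort_keys=True))
```

A dict like this would go into the mappings sent when the index is
created; an existing field generally cannot be switched to not_analyzed
without reindexing.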
