On 6 November 2016 at 14:37, John D. Ament <[email protected]> wrote:
> On Sun, Nov 6, 2016 at 9:27 AM Daniel Gruno <[email protected]> wrote:
>
>> On 11/06/2016 03:18 PM, sebb wrote:
>> > Fields such as message-id are stored as text strings, but they are
>> > only really intended to be used as ids. They don't contain independent
>> > text parts.
>> >
>> > From what I have understood so far from reading the ES docs, such
>> > fields should be tagged as
>> >
>> > "index": "not_analyzed"
>> >
>> > AIUI this reduces the analysis overhead and storage requirements, and
>> > also makes it harder to find fields with
>> > This probably applies to other fields in "mbox":
>> >
>> > mid
>> > possibly in-reply-to
>> > also references
>> >
>> > And of course the auto-created fields such as attachments
>> >
>> > Likewise the doc types currently missing from setup.py:
>> >
>> > notifications
>> > account
>> > mailinglists
>> >
>> > These are internal use only so are not intended for searching.
>> >
>> > Or have I got this completely wrong?
>> >
>> message-id is set to not be analyzed, by the setup script (it's in the
>> mappings it sends to ES when creating the index). mid and in-reply-to
>> should probably also be not analyzed, although mid is really a copy of
>> the doc ID, IIRC. the list ID is also not analyzed by default (as
>> list_raw), neither is the raw from address
>>
>
> So I notice the query process is an arbitrary full text query, which runs
> against _all.
> https://github.com/apache/incubator-ponymail/blob/master/site/api/lib/elastic.lua#L44

Huh?

The query starts:

local url = config.es_url .. doc .. "/_search?q="..query

where es_url = "http://localhost:9200/ponymail/" and doc = "mbox" by default.

Where does the _all come in?

> unless
> I need to dig into it a bit further to see if there's something building up
> query a bit different.
>
> So... that means most of these mappings are moot.
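
For illustration, here is a minimal Python sketch of the URL that the quoted Lua line builds. This is not the code in elastic.lua: the function name and the percent-encoding are assumptions added for the sketch. Nothing in the resulting URL names _all. If _all is involved, it would be on the ES side: as far as I understand the ES docs of this era, a bare q= term with no field prefix is matched against the default field, which defaults to _all.

```python
from urllib.parse import quote

# Defaults taken from the values quoted in the message above.
ES_URL = "http://localhost:9200/ponymail/"
DEFAULT_DOC = "mbox"


def build_search_url(query, doc=DEFAULT_DOC, es_url=ES_URL):
    """Mirror the quoted Lua concatenation: es_url .. doc .. "/_search?q=" .. query.

    The percent-encoding is an addition for this sketch (hypothetical); the
    quoted Lua line does not show whether the query is escaped elsewhere.
    """
    return es_url + doc + "/_search?q=" + quote(query, safe="")


if __name__ == "__main__":
    # A bare term: no field is named anywhere in the URL.
    print(build_search_url("hello world"))
    # A field-qualified term: the field name travels inside the q= value.
    print(build_search_url("subject:release"))
```

So the question of which field a query hits depends on how ES treats the q= value, not on anything visible in the URL itself.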

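A sketch of what the "index": "not_analyzed" tagging discussed above could look like, written as a Python dict of the kind a setup script might send to ES when creating the index. The field list and the function name are assumptions for illustration only, and the actual mappings in setup.py may differ. The syntax shown is the ES 1.x/2.x "string" form.

```python
import json

# Fields named in the thread as id-like. Whether each one should really be
# not_analyzed is the open question under discussion, not a settled fact.
ID_FIELDS = ["message-id", "mid", "in-reply-to", "references"]


def id_field_mappings(fields=ID_FIELDS):
    """Return a 'properties' fragment marking each field as an exact-match string."""
    return {
        name: {"type": "string", "index": "not_analyzed"}
        for name in fields
    }


if __name__ == "__main__":
    # "mbox" is the doc type mentioned in the thread.
    body = {"mappings": {"mbox": {"properties": id_field_mappings()}}}
    print(json.dumps(body, indent=2))
```

With a mapping like this, the listed fields are indexed as single unmodified tokens, so they match only on the exact value.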