On 6 November 2016 at 14:37, John D. Ament <[email protected]> wrote:
> On Sun, Nov 6, 2016 at 9:27 AM Daniel Gruno <[email protected]> wrote:
>
>> On 11/06/2016 03:18 PM, sebb wrote:
>> > Fields such as message-id are stored as text strings, but they are
>> > only really intended to be used as ids. They don't contain independent
>> > text parts.
>> >
>> > From what I have understood so far from reading the ES docs, such
>> > fields should be tagged as
>> >
>> > "index": "not_analyzed"
>> >
>> > AIUI this reduces the analysis overhead and storage requirements, and
>> > also makes it harder to find fields with
>> > This probably applies to other fields in "mbox":
>> >
>> > mid
>> > possibly in-reply-to
>> > also references
>> >
>> > And of course the auto-created fields such as attachments
>> >
>> > Likewise the doc types currently missing from setup.py:
>> >
>> > notifications
>> > account
>> > mailinglists
>> >
>> > These are internal use only so are not intended for searching.
>> >
>> > Or have I got this completely wrong?
>> >
>>
>> message-id is set to not be analyzed, by the setup script (it's in the
>> mappings it sends to ES when creating the index). mid and in-reply-to
>> should probably also be not analyzed, although mid is really a copy of
>> the doc ID, IIRC. The list ID is also not analyzed by default (as
>> list_raw), and neither is the raw from address.
>>
>
> So I notice the query process is an arbitrary full text query, which runs
> against _all.
> https://github.com/apache/incubator-ponymail/blob/master/site/api/lib/elastic.lua#L44

Huh?

The query starts:

local url = config.es_url .. doc .. "/_search?q="..query

where

es_url = "http://localhost:9200/ponymail/";

and

doc = "mbox" by default.

Where does the _all come in?
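
To make that concrete, here is a rough Python sketch of the same URL
construction. The function names build_search_url and named_fields are
made up for illustration; only the two config values and the
concatenation come from the Lua above. It shows that the URL names an
index and a doc type but no field, and carries no df (default field)
parameter. As far as I can tell from the ES docs, the only place _all
could come from is whatever default field ES applies to a q= query that
gives neither a field prefix nor df, but that is my reading and not
something the code states.

```python
from urllib.parse import urlsplit, parse_qs

# Values copied from the message above.
ES_URL = "http://localhost:9200/ponymail/"
DEFAULT_DOC = "mbox"


def build_search_url(query, doc=DEFAULT_DOC):
    """Mirror the Lua line: config.es_url .. doc .. "/_search?q=" .. query

    Like the Lua, this does no escaping of the query string.
    """
    return ES_URL + doc + "/_search?q=" + query


def named_fields(url):
    """Report what the URL explicitly names (hypothetical helper)."""
    parts = urlsplit(url)
    # Path looks like /<index>/<doc type>/_search
    index, doc_type, _endpoint = parts.path.strip("/").split("/")
    params = parse_qs(parts.query)
    return {
        "index": index,
        "doc_type": doc_type,
        "q": params.get("q"),
        # df is the URI-search parameter for the default field; None if absent.
        "df": params.get("df"),
    }


if __name__ == "__main__":
    url = build_search_url("foo")
    print(url)
    print(named_fields(url))
```

A query such as message-id:something in q would target one field
explicitly; a bare term leaves the field choice to ES.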

> unless I need to dig into it a bit further to see if there's something
> building up the query a bit differently.
>
> So... that means most of these mappings are moot.
