On Tue, May 21, 2019, at 18:41, Dilyan Palauzov wrote:
> Hello,
> thanks, Bron, for your answer.
> I gave it a try.
> squatter does not remove .NEW directories when aborted (SIGINT); the
> directories have to be removed manually.
https://github.com/cyrusimap/cyrus-imapd/issues/2765
> squatter -t X -z X -o recognizes, when the directory structure behind
> tier X exists, that nothing has to be done, prints “Skipping X for
> user.ABC, only one” and quits, without updating the .xapianactive files.
Yeah right, that won't work. Glad to know :)
> squatter -t Y -z Y -o, when the directory structure behind tier Y
> does not exist, prints “compressing Y:1,Y:0 to Y:2 for user... (active
> Y:1,Y:0)”. As far as I remember, this did not update the .xapianactive
> files.
Yeah right, it won't add a new target unless you are compressing the
current first item in xapianactive.
> squatter -t X -z Y -o does add the defaultsearchtier to the
> .xapianactive files, but first it has to duplicate all existing files
> with rsync. This takes a while, but in the end it did what I wanted.
> Afterwards the directory structure for the new tier was not created.
> The directory structure was created once I started all the cyrus
> processes again.
That makes sense. We don't create a directory structure until a
document gets created in there.
> squatter -t X -z Y -o emits the message “undefined search partition
> X,Ysearchpartition-default” and then “compressing X:0,X,Y:0 to Y:2 for
> ... (active Y:0,X:0,X,Y:0,Y:1)”.
That sounds like a sanity checking failure! Good catch:
https://github.com/cyrusimap/cyrus-imapd/issues/2764
> Does squatter -t X -z Y append X to Y, or does it delete Y and copy X
> to Y? In the latter case, is there any (performance) difference between
> "squatter -t X,Y -z Y" and "squatter -t Y,X -z Y"?
There's no difference in what order you add items to -t. -t is a
comma-separated list of selectors for source items. You can even
explicitly say:

squatter -t X:0,X:2,Y:45 -z Y

and it will compact just those three sources into a new target in Y.
Under the hood it creates a new database and copies all the
documents over from the source databases, then compresses the end
result into the most compact and fastest xapian format, which is
designed never to be written again. This compressed file is then stored
into the target database name, and in an exclusively locked
operation the new database is moved into place and the old tiers are
removed from the xapianactive, such that all new searches look into
the single destination database instead of the multiple source
databases.
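
To make the bookkeeping concrete, here is a rough Python model of what
happens to the xapianactive list during a compact. It is a sketch of the
behaviour described in this thread, not Cyrus code: the function names,
the "highest number in the tier plus one" naming rule, and the position
where the new target lands in the list are all assumptions made for
illustration. Items are assumed to be well-formed "tier:number" strings.

```python
def parse_item(item):
    """Split an item such as 'Y:2' into ('Y', 2)."""
    tier, _, number = item.partition(":")
    return tier, int(number)


def matches(item, selector):
    """A selector names either one item ('X:0') or a whole tier ('X')."""
    if ":" in selector:
        return item == selector
    return parse_item(item)[0] == selector


def next_item(items, tier):
    """Assumed naming rule: highest number already used in the tier, plus one."""
    used = [parse_item(i)[1] for i in items if parse_item(i)[0] == tier]
    return f"{tier}:{max(used, default=-1) + 1}"


def compact(active, selectors, target_tier, default_tier):
    """Return the new xapianactive list after compacting the selected items."""
    sources = [i for i in active if any(matches(i, s) for s in selectors)]
    if not sources:
        return list(active)
    # Behaviour reported in this thread: a single source already in the
    # target tier is skipped ("only one") and nothing is updated.
    if len(sources) == 1 and parse_item(sources[0])[0] == target_tier:
        return list(active)

    target = next_item(active, target_tier)
    result = []
    for item in active:
        if item not in sources:
            result.append(item)
        elif target not in result:
            # Assumption: the target takes the place of the first source.
            result.append(target)

    # A new writable entry is only added when the current first item
    # was one of the sources being compacted.
    if active[0] in sources:
        result.insert(0, next_item(list(active) + [target], default_tier))
    return result
```

With this model, compact(["X:0"], ["X"], "Y", "Y") gives ["Y:1", "Y:0"]:
the old X:0 is compacted into Y:0 and a fresh entry in the default tier
is put in front. Explicit selectors that leave the first item alone only
swap the sources for the new target.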
> Can one xapian tier store a document, and another tier store the
> information that the address of the document has changed?
It doesn't work like that. The addresses of the documents never
change (they are the sha1 of the document contents, and Cyrus
documents are all immutable). The xapian engine searches across the
full set of databases listed in xapianactive in order to find
document ids, then maps them through the conversations.db file to
find the actual emails. A copy/move of an email updates the
conversations.db lookups, so the next search will find the new
location without anything changing in xapian.
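
As a minimal sketch of that two-step lookup, assuming plain dicts as
stand-ins for the xapian databases and for conversations.db (the names
guid, search and locate are hypothetical, and the real stores are of
course not Python dicts):

```python
import hashlib


def guid(message: bytes) -> str:
    """Document id: the sha1 of the (immutable) message contents."""
    return hashlib.sha1(message).hexdigest()


def search(term, databases):
    """Search every active database and collect matching document ids."""
    hits = []
    for db in databases:  # one dict per xapianactive entry: guid -> text
        for doc_id, text in db.items():
            if term in text and doc_id not in hits:
                hits.append(doc_id)
    return hits


def locate(doc_ids, conversations):
    """Map document ids to current (mailbox, uid) locations."""
    return [loc for d in doc_ids for loc in conversations.get(d, [])]


message = b"Subject: tiers\r\n\r\nhello xapian"
databases = [{guid(message): "hello xapian"}]
conversations = {guid(message): [("INBOX", 7)]}

before = locate(search("xapian", databases), conversations)

# A MOVE only rewrites the conversations lookup; the databases are untouched.
conversations[guid(message)] = [("Archive", 1)]
after = locate(search("xapian", databases), conversations)
```

Here before is [("INBOX", 7)] and after is [("Archive", 1)], while the
indexed document itself never changed.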
The cyrus.indexed.db file is just a convenience to allow rolling
squatter to avoid having to re-scan records that it knows are
already indexed.
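
Conceptually that is nothing more than a "seen" cache. A tiny sketch,
assuming a dict of mailbox name to set of UIDs as a stand-in (this is
not the real on-disk format of cyrus.indexed.db, and both function
names are hypothetical):

```python
def unindexed(mailbox, uids, indexed):
    """Return the UIDs in `mailbox` not yet recorded as indexed."""
    seen = indexed.get(mailbox, set())
    return [u for u in uids if u not in seen]


def mark_indexed(mailbox, uids, indexed):
    """Record UIDs as indexed so the next pass can skip them."""
    indexed.setdefault(mailbox, set()).update(uids)
```

On a second pass over the same mailbox only the records that arrived
since the last pass come back from unindexed().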
Bron.
Regards
Дилян
----- Message from Bron Gondwana <br...@fastmailteam.com> ---------
Date: Mon, 20 May 2019 18:52:07 +1000
From: Bron Gondwana <br...@fastmailteam.com>
Subject: Re: Prepending Xapian Tiers
To: Cyrus Devel <cyrus-devel@lists.andrew.cmu.edu>
> On Fri, May 17, 2019, at 23:52, Дилян Палаузов wrote:
>> Hello,
>>
>> I set up a Cyrus system with one tier. I think it works. The
>> .xapianactive files contain 'tiername: 0'.
>>
>> How can I insert a second tier?
>
> I have never tried this on a live server! Clearly the right thing to
> do is to build a cassandane search which implements doing this so
> that we can make sure it works.
>
>> Adding a XYZsearchpartition-default to imapd.conf, together with
>> defaultsearchtier: XYZ does not utilize the new directory: it stays
>> empty and the .xapianactive files do not get updated to mention the
>> new tier.
>
> That looks like it should work. I assume you have restarted your
> cyrus since making the change? I'm not certain that a rolling
> squatter will discover a new config in the way that imapd does.
>
> Also - you'll need to run squatter in compact mode in order to add a
> new xapianactive entry. The simplest could be:
>
> squatter -z tiername -t tiername -o
>
> I believe that given your current setup, this will just copy the
> entry from tiername:0 to tiername:1 and also create XYZ:0 in the
> xapianactive file at the same time.
>
>> Besides, if a message is MOVEd over IMAP, is any optimization
>> utilized, to avoid reindexing the message, but just change the
>> address of the document?
>
> Yes, both XAPINDEXED mode where the GUID is read from xapian, and
> CONVINDEXED mode where the GUID is looked up via user.conversations
> and then mapped into the cyrus.indexed.db files in each xapian tier
> allow Xapian to skip reindexing when a message is already indexed.
> This works for both MOVE and for re-uploading of an identical
> message file via IMAP.
>
> Cheers,
>
> Bron.
>
> --
> Bron Gondwana, CEO, FastMail Pty Ltd
> br...@fastmailteam.com
----- End message from Bron Gondwana <br...@fastmailteam.com> -----
--
Bron Gondwana, CEO, FastMail Pty Ltd
br...@fastmailteam.com