Re: Xapian searches of the body of an email

2019-01-08 Thread Egoitz Aurrekoetxea
Hi Robert! 

Thank you so much mate :) :) 

Best regards,

---

EGOITZ AURREKOETXEA
Systems Department
944 209 470
Parque Tecnológico. Edificio 103
48170 Zamudio (Bizkaia)
ego...@sarenet.es
www.sarenet.es [1]
Before printing this email, please consider whether it is necessary.

On 08-01-2019 09:26, Robert Stepanek wrote:

> Hi, 
> 
> On Mon, Jan 7, 2019, at 7:32 PM, Egoitz Aurrekoetxea wrote: 
> 
>> So, if I run squatter on the master in rolling mode, I assume there's
>> no need to launch the squatter command manually on the master, is there?
> 
> If you indexed all messages in the mailbox before, then there shouldn't be a 
> need to manually run squatter afterwards. The rolling squatter should pick up 
> and index all newly created messages. 
> 
>> I'm planning to upgrade some machines running 2.3 to either 2.4 or 3.0,
>> and that is a big responsibility, so I'm doing a lot of sandbox
>> testing. Is it known (as Michael stated) that the traditional squat
>> engine does not work in 3.0 if you don't want to use Xapian yet?
> 
> Sorry, I have no experience with the upgrades from version 2. 
> 
> Cheers, 
> Robert
 

Links:
--
[1] http://www.sarenet.es
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus

Re: Xapian searches of the body of an email

2019-01-08 Thread Robert Stepanek
Hi,

On Mon, Jan 7, 2019, at 7:32 PM, Egoitz Aurrekoetxea wrote:
> So, if I run squatter on the master in rolling mode, I assume there's
> no need to launch the squatter command manually on the master, is
> there?

If you indexed all messages in the mailbox before, then there shouldn't
be a need to manually run squatter afterwards. The rolling squatter
should pick up and index all newly created messages.

> I'm planning to upgrade some machines running 2.3 to either 2.4 or
> 3.0, and that is a big responsibility, so I'm doing a lot of sandbox
> testing. Is it known (as Michael stated) that the traditional squat
> engine does not work in 3.0 if you don't want to use Xapian yet?

Sorry, I have no experience with upgrades from version 2.

Cheers,
Robert


Re: Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Hi Robert! 

I see! I thought you only needed to run squatter in rolling mode on the
slave. I thought rolling mode was just for slaves, to pick up the index
changes caused by a normal squatter run on the master. So, if I run
squatter on the master in rolling mode, I assume there's no need to
launch the squatter command manually on the master, is there?

I'm planning to upgrade some machines running 2.3 to either 2.4 or 3.0,
and that is a big responsibility, so I'm doing a lot of sandbox testing.
Is it known (as Michael stated) that the traditional squat engine does
not work in 3.0 if you don't want to use Xapian yet?

I'll read the post right now :) :) 

Thank you so much for all your help mate :) 

Cheers!


On 07-01-2019 19:24, Robert Stepanek wrote:

> Hi Egon, 
> 
> Yes, the slave should index in conversations.db automatically AFAIK.  
> 
> You should run squatter in rolling mode on the master, too.   
> 
> BTW: in 2014, Bron wrote a blog post about the search setup at FastMail: 
> https://fastmail.blog/2014/12/01/email-search-system/ 
> It's quite technical, but should give you a good idea at how it's set up for 
> fast indexing and search  
> 
> Cheers, Robert  
> 
> On Mon, Jan 7, 2019, at 5:54 PM, Egoitz Aurrekoetxea wrote: 
> 
> Hi Robert! 
> 
> Thank you so much for helping us (mainly which is the one boring the list 
> with questions :) although I promise I've checked the doc before asking :) :) 
>  ). 
> 
> When you have a master/slave config... in the slave one, when running 
> Squatter in rolling mode... does it update the conversations db too?. By the 
> way, Squatter in rolling mode only makes sense in slave machines isn't it?. 
> 
> Many thanks! 
> 
> 
> On 07-01-2019 16:42, Robert Stepanek wrote:
> Hi, 
> 
> Sebastian is right: 
> 
> On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote: 
> 
> squatter is nowadays a bit of a misnomer, because it uses whatever index 
> you have configured. In cyrus 2.4, squatter would always create a SQUAT 
> index. When you run squatter with Xapian, it will build the index, but for 
> the index to actually work, you also need the conversationsdb. 
> 
> conversations.db is indeed a misnomer now. The database was only used to keep 
> track of mail threads (hence the name), but its role expanded. One of the 
> indexes it stores is the SHA1 hashes of every message, and separate hashes 
> for each of that message MIME parts. Such a hash is named the GUID, and for 
> each GUID we store a list of all mailbox:UID[bodypart] pairs where this 
> content occurs in. 
> 
> For search, we keep track of the indexed messages by GUID, so we can avoid 
> reindexing duplicate mails. To return a search result, we can now map that 
> GUID back to its mailbox:message pairs. That's why we need conversations.db 
> for search. 
> 
> I can't help with upgrading from 2.4, unfortunately, but if you re-index your 
> mailboxes once in conversations.db, you should be all set. 
> 
> Cheers, 
> Robert 
> 
>  

  


Re: Xapian searches of the body of an email

2019-01-07 Thread Robert Stepanek
Hi Egon,

Yes, the slave should index in conversations.db automatically AFAIK. 

You should run squatter in rolling mode on the master, too.  

BTW: in 2014, Bron wrote a blog post about the search setup at FastMail:
https://fastmail.blog/2014/12/01/email-search-system/
It's quite technical, but should give you a good idea of how it's set up
for fast indexing and search.

Cheers, Robert


On Mon, Jan 7, 2019, at 5:54 PM, Egoitz Aurrekoetxea wrote:
> Hi Robert!
>
> Thank you so much for helping us (mainly me, as I'm the one boring the
> list with questions :) although I promise I checked the docs before
> asking :) :) ).
>
> When you have a master/slave config and run squatter in rolling mode
> on the slave, does it update the conversations db too? By the way,
> squatter in rolling mode only makes sense on slave machines, doesn't
> it?
>
> Many thanks!
>


> On 07-01-2019 16:42, Robert Stepanek wrote:
>
>> Hi,
>>
>> Sebastian is right:
>>
>> On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote:
>>>
>>> squatter is nowadays a bit of a misnomer, because it uses whatever
>>> index you have configured. In cyrus 2.4, squatter would always
>>> create a SQUAT index. When you run squatter with Xapian, it will
>>> build the index, but for the index to actually work, you also need
>>> the conversationsdb.
>>
>> conversations.db is indeed a misnomer now. The database was only used
>> to keep track of mail threads (hence the name), but its role
>> expanded. One of the indexes it stores is the SHA1 hashes of every
>> message, and separate hashes for each of that message's MIME parts.
>> Such a hash is named the GUID, and for each GUID we store a list of
>> all mailbox:UID[bodypart] pairs where this content occurs.
>>
>> For search, we keep track of the indexed messages by GUID, so we can
>> avoid reindexing duplicate mails. To return a search result, we can
>> now map that GUID back to its mailbox:message pairs. That's why we
>> need conversations.db for search.
>>
>> I can't help with upgrading from 2.4, unfortunately, but if you
>> re-index your mailboxes once in conversations.db, you should be all
>> set.
>>
>> Cheers,
>> Robert



Re: Xapian searches of the body of an email

2019-01-07 Thread Michael Menge

Hi,

Quoting Egoitz Aurrekoetxea:

> And, by the way
>
> when using Squatter instead of Xapian as the search engine, what do we
> really lose? Just statistically worse results? Is Xapian faster than
> the squat engine?

I didn't test Xapian, but the Squatter search index is not used in
cyrus-imapd 3.0.

See https://github.com/cyrusimap/cyrus-imapd/issues/2598

> Sorry for having so many questions, but I suppose I don't have the
> implications of each one totally clear :)
>
> Cheers!






M.Menge                         Tel.: (49) 7071/29-70316
Universität Tübingen            Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung   mail: michael.me...@zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen



Re: Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Hi Robert! 

Thank you so much for helping us (mainly me, as I'm the one boring the
list with questions :) although I promise I checked the docs before
asking :) :) ).

When you have a master/slave config and run squatter in rolling mode on
the slave, does it update the conversations db too? By the way, squatter
in rolling mode only makes sense on slave machines, doesn't it?

Many thanks! 


On 07-01-2019 16:42, Robert Stepanek wrote:

> Hi, 
> 
> Sebastian is right: 
> 
> On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote: 
> 
>> squatter is nowadays a bit of a misnomer, because it uses whatever index 
>> you have configured. In cyrus 2.4, squatter would always create a SQUAT 
>> index. When you run squatter with Xapian, it will build the index, but for 
>> the index to actually work, you also need the conversationsdb.
> 
> conversations.db is indeed a misnomer now. The database was only used to keep 
> track of mail threads (hence the name), but its role expanded. One of the 
> indexes it stores is the SHA1 hashes of every message, and separate hashes 
> for each of that message MIME parts. Such a hash is named the GUID, and for 
> each GUID we store a list of all mailbox:UID[bodypart] pairs where this 
> content occurs in. 
> 
> For search, we keep track of the indexed messages by GUID, so we can avoid 
> reindexing duplicate mails. To return a search result, we can now map that 
> GUID back to its mailbox:message pairs. That's why we need conversations.db 
> for search. 
> 
> I can't help with upgrading from 2.4, unfortunately, but if you re-index your 
> mailboxes once in conversations.db, you should be all set. 
> 
> Cheers, 
> Robert 
> 
 


Re: Xapian searches of the body of an email

2019-01-07 Thread Robert Stepanek
On Mon, Jan 7, 2019, at 4:13 PM, Elías Halldór Ágústsson wrote:
> Regarding indexing and searching in body of emails; what if the body
> text is encoded in base64 or quoted-printable? It won't yield any
> unencoded search strings, or what?

If the MIME body part is of type text, then base64 and QP-encoded bodies
will get decoded for the search index. And they will get decoded before
presenting the search result in snippets.

Cheers,
Robert
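
As an illustration of the decoding step described above (this is not
Cyrus's own code), the sketch below uses Python's standard email package
to turn text parts into plain strings regardless of their transfer
encoding. The function name searchable_text and the sample message are
made up for this example.

```python
import base64
from email import message_from_bytes, policy


def searchable_text(raw_message: bytes) -> list:
    """Return the decoded content of every text/* part of a message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    texts = []
    for part in msg.walk():
        # Multipart containers have maintype "multipart" and are skipped.
        if part.get_content_maintype() == "text":
            # get_content() undoes base64 / quoted-printable and the charset.
            texts.append(part.get_content())
    return texts


# Hypothetical sample: a plain-text body sent base64-encoded.
body = base64.b64encode(b"Xapian body search").decode("ascii")
raw = (
    "From: sender@example.com\r\n"
    "Subject: demo\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "Content-Transfer-Encoding: base64\r\n"
    "\r\n" + body + "\r\n"
).encode("ascii")

print(searchable_text(raw))
```

An indexer working on the returned strings would see the readable words
rather than the base64 or quoted-printable form, which is why a search
for a plain word can still match an encoded body.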



Re: Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Hi Sebastian, 

I'll answer your replies inline below!! Thanks a lot for your
explanations mate :) :)


On 07-01-2019 15:57, Sebastian Hagedorn wrote:

> Hi,
>
> --On 7 January 2019 at 14:51:02 +0100 Egoitz Aurrekoetxea wrote:
>
>> This seems to take ages...
>
> why don't you run it for a single account first, to make sure that it
> actually helps?

Yes :) It worked as expected for a single account! After that I
launched:

/usr/local/cyrus/sbin/ctl_conversationsdb -r -z

(still running), and after that:

/usr/local/cyrus/sbin/ctl_conversationsdb -r -b

>> I'm trying to figure out the best way of implementing this and to
>> clarify the concepts. I'm running squatter in rolling replication
>> mode, and then there is the concept of conversations. What is the
>> exact role of each of them? Squatter seems to index the mailbox, but
>> when something is not properly indexed, instead of running squatter
>> you use ctl_conversationsdb to reindex some part again, or...?
>
> squatter is nowadays a bit of a misnomer, because it uses whatever index you
> have configured. In cyrus 2.4, squatter would always create a SQUAT index.
> When you run squatter with Xapian, it will build the index, but for the index
> to actually work, you also need the conversationsdb. squatter does not touch
> the conversationsdb! The index is only a pointer to the conversationsdb, not
> to actual messages.

I see!! This was the key point I was missing; I could not see how all
of them were related...

> Robert can probably explain this much better than me, but I think the problem
> is the following:
>
> * when you have conversations enabled in imapd.conf, normal deliveries to the
> mailboxes (e.g. using lmtpd) will update the conversationsdb
>
> * syncing (at least using the "old" mechanism, not sure about sync between
> instances running 3.0) does *not* update the conversationsdb

I assume that running squatter in rolling mode on a slave server updates
the index, but I'm not sure whether it also updates the conversationsdb.
How does a slave behave in this respect? Could anyone help us please :) ??

> Once you have a running 3.0 server, you probably won't need to run
> ctl_conversationsdb ever again. But when you are at the stage of syncing mail
> from 2.4 to 3.0, you *will* need to rebuild each user's conversationsdb at
> least once, after you have finished with syncing that user.

This is important, yes...

> Again, this is all based on my understanding and not an official answer.

Anyway, thanks a lot for all your help!! Many many thanks mate!!

Cheers!!

>> On 07-01-2019 10:19, Sebastian Hagedorn wrote:
>>
>>> That sounds like the conversationsdb issue I was talking about. Have you
>>> tried these steps?
>>>
>>> ctl_conversationsdb -z USER
>>> ctl_conversationsdb -b USER
>
> Kind regards,
>
> Sebastian Hagedorn


Re: Xapian searches of the body of an email

2019-01-07 Thread Robert Stepanek
Hi,

Sebastian is right:

On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote:
> squatter is nowadays a bit of a misnomer, because it uses whatever
> index you have configured. In cyrus 2.4, squatter would always create
> a SQUAT index. When you run squatter with Xapian, it will build the
> index, but for the index to actually work, you also need the
> conversationsdb.

conversations.db is indeed a misnomer now. The database was originally
only used to keep track of mail threads (hence the name), but its role
has expanded. One of the indexes it stores holds the SHA1 hash of every
message, plus a separate hash for each of that message's MIME parts.
Such a hash is named the GUID, and for each GUID we store a list of all
mailbox:UID[bodypart] pairs where this content occurs.

For search, we keep track of the indexed messages by GUID, so we can
avoid reindexing duplicate mails. To return a search result, we can now
map that GUID back to its mailbox:message pairs. That's why we need
conversations.db for search.

I can't help with upgrading from 2.4, unfortunately, but if you re-index
your mailboxes once in conversations.db, you should be all set.

Cheers,
Robert
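
To make the GUID idea above concrete, here is a small in-memory sketch
in Python. It is only a toy model under stated assumptions: the names
guid_index, indexed_guids, record_part and locate are hypothetical, and
the real conversations.db is an on-disk database with its own format.
The sketch only shows the two properties described: identical content is
indexed once, and a GUID maps back to every place the content occurs.

```python
import hashlib
from collections import defaultdict

# Hypothetical stand-in for the GUID index kept in conversations.db:
# GUID (SHA1 hex digest) -> list of (mailbox, uid, bodypart) occurrences.
guid_index = defaultdict(list)
# GUIDs whose content has already been handed to the search index.
indexed_guids = set()


def guid_of(content: bytes) -> str:
    """SHA1 of the content, used as the GUID in this sketch."""
    return hashlib.sha1(content).hexdigest()


def record_part(mailbox: str, uid: int, bodypart: str, content: bytes) -> bool:
    """Record one occurrence; return True if the content still needs indexing."""
    guid = guid_of(content)
    guid_index[guid].append((mailbox, uid, bodypart))
    if guid in indexed_guids:
        return False  # duplicate content, no need to index it again
    indexed_guids.add(guid)
    return True


def locate(guid: str) -> list:
    """Map a GUID found by the search engine back to its occurrences."""
    return list(guid_index.get(guid, []))


body = b"Quarterly report attached."
print(record_part("user.alice", 12, "1", body))
print(record_part("user.bob.Archive", 7, "1", body))
print(locate(guid_of(body)))
```

In this model a search hit is only a GUID; without the mapping there is
nothing to turn it into mailbox and UID pairs, which matches the point
that the Xapian index alone is not enough.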


Re: Xapian searches of the body of an email

2019-01-07 Thread Elías Halldór Ágústsson
Regarding indexing and searching in body of emails; what if the body text is 
encoded in base64 or quoted-printable? It won't yield any unencoded search 
strings, or what?


Re: Xapian searches of the body of an email

2019-01-07 Thread Sebastian Hagedorn
As I wrote before, this should only be necessary as long as you are in the 
syncing stage of the migration. Once all new mail is delivered to the 3.0 
server everything should just work – I think ;-)


--On 7 January 2019 at 15:41:59 +0100 Egoitz Aurrekoetxea wrote:

> Should ctl_conversationsdb -z and -b (with -r, I assume, for all users)
> be run only on new user accounts, or periodically for every user
> account? Or does squatter maintain the conversations database too?


--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


Re: Xapian searches of the body of an email

2019-01-07 Thread Sebastian Hagedorn
That's a really good question. I had been hoping that the performance of a 
Xapian search would be much better than a SQUAT search, but now I'm not so 
sure.


--On 7 January 2019 at 15:05:52 +0100 Egoitz Aurrekoetxea wrote:

> And, by the way
>
> when using Squatter instead of Xapian as the search engine, what do we
> really lose? Just statistically worse results? Is Xapian faster than
> the squat engine?

--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


Re: Xapian searches of the body of an email

2019-01-07 Thread Sebastian Hagedorn

Hi,

--On 7 January 2019 at 14:51:02 +0100 Egoitz Aurrekoetxea wrote:

> This seems to take ages...

why don't you run it for a single account first, to make sure that it
actually helps?

> I'm trying to figure out the best way of implementing this and to
> clarify the concepts. I'm running squatter in rolling replication mode,
> and then there is the concept of conversations. What is the exact role
> of each of them? Squatter seems to index the mailbox, but when something
> is not properly indexed, instead of running squatter you use
> ctl_conversationsdb to reindex some part again, or...?

squatter is nowadays a bit of a misnomer, because it uses whatever index
you have configured. In cyrus 2.4, squatter would always create a SQUAT
index. When you run squatter with Xapian, it will build the index, but for
the index to actually work, you also need the conversationsdb. squatter
does not touch the conversationsdb! The index is only a pointer to the
conversationsdb, not to actual messages.


Robert can probably explain this much better than me, but I think the 
problem is the following:


• when you have conversations enabled in imapd.conf, normal deliveries to 
the mailboxes (e.g. using lmtpd) will update the conversationsdb


• syncing (at least using the "old" mechanism, not sure about sync 
between instances running 3.0)  does *not* update the conversationsdb


Once you have a running 3.0 server, you probably won't need to run
ctl_conversationsdb ever again. But when you are at the stage of syncing
mail from 2.4 to 3.0, you *will* need to rebuild each user's conversationsdb
at least once, after you have finished with syncing that user.


Again, this is all based on my understanding and not an official answer.
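
The per-user rebuild described above could be scripted roughly as
follows. This is a hedged sketch, not an official procedure: the -z and
-b options and the binary path are the ones quoted in this thread, while
the helper names, the dry_run switch and the user name "alice" are
hypothetical and should be adapted to your installation.

```python
import subprocess

# Path as quoted elsewhere in this thread; an assumption for other installs.
CTL_CONVERSATIONSDB = "/usr/local/cyrus/sbin/ctl_conversationsdb"


def rebuild_commands(user: str) -> list:
    """Return the two commands from the thread, in order: -z, then -b."""
    return [
        [CTL_CONVERSATIONSDB, "-z", user],
        [CTL_CONVERSATIONSDB, "-b", user],
    ]


def rebuild_user(user: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on failure."""
    for argv in rebuild_commands(user):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Dry run for one hypothetical account, to be done after its sync finished.
rebuild_user("alice")
```

Trying a single account in dry-run mode first mirrors the advice given
above to test on one account before touching every user.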


> On 07-01-2019 10:19, Sebastian Hagedorn wrote:
>
>> That sounds like the conversationsdb issue I was talking about. Have you
>> tried these steps?
>>
>> ctl_conversationsdb -z USER
>> ctl_conversationsdb -b USER

Kind regards,

Sebastian Hagedorn
--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


Re: Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Sorry for asking again.

Should ctl_conversationsdb -z and -b (with -r, I assume, for all users)
be run only on new user accounts, or periodically for every user
account? Or does squatter maintain the conversations database too?

Best regards,


On 07-01-2019 15:05, Egoitz Aurrekoetxea wrote:

> And, by the way 
> 
> when using Squatter instead of Xapian as the search engine, what do we
> really lose? Just statistically worse results? Is Xapian faster than
> the squat engine?
> 
> Sorry for having so many questions but... I suppose I don't have the 
> implications of each one totally clear :) 
> 
> Cheers!
> 
 


Re: Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Hi mate! 

This seems to take ages... I'm trying to figure out the best way of
implementing this and to clarify the concepts. I'm running squatter in
rolling replication mode, and then there is the concept of
conversations. What is the exact role of each of them? Squatter seems to
index the mailbox, but when something is not properly indexed, instead
of running squatter you use ctl_conversationsdb to reindex some part
again, or...?

Thanks a lot!!


On 07-01-2019 10:19, Sebastian Hagedorn wrote:

> That sounds like the conversationsdb issue I was talking about. Have you 
> tried these steps?
> 
> ctl_conversationsdb -z USER
> ctl_conversationsdb -b USER
> 
>> I have been testing Xapian searches. Have seen, it's not able to find
>> strings inside the body of the email. If I set in imap.conf
>> "search_fuzzy_always: 1" no content is displayed in the searches of a
>> Roundcube stock webmail. If I remove that config value from imap.conf
>> and restart services, then search results appear. Does Xapian not index
>> the body of emails?. Does Xapian, just index the headers?. But this
>> affirmation does not seem to be possible in my case too... as I have in
>> the config "search_index_headers: no".
>> 
>> I'm using the following config :
>> 
>> conversations: 1
>> search_engine: xapian
>> search_index_headers: no
>> search_batchsize: 8192
>> defaultsearchtier: t1
>> t1searchpartition-default: /expert/search
>> t1searchpartition-expert2: /expert2/search
>> t1searchpartition-expert3: /expert3/search
>> 
>> Could anyone help me mates?.
> 
> --
> Sebastian Hagedorn - Weyertal 121, Zimmer 2.02
> Regionales Rechenzentrum (RRZK)
> Universität zu Köln / Cologne University - Tel. +49-221-470-89578
 


Re: Xapian searches of the body of an email

2019-01-07 Thread Sebastian Hagedorn
That sounds like the conversationsdb issue I was talking about. Have you 
tried these steps?


ctl_conversationsdb -z USER
ctl_conversationsdb -b USER


> I have been testing Xapian searches and have seen that it is not able
> to find strings inside the body of an email. If I set
> "search_fuzzy_always: 1" in imapd.conf, no results are displayed in
> the searches of a stock Roundcube webmail. If I remove that setting
> from imapd.conf and restart the services, search results appear. Does
> Xapian not index the body of emails? Does Xapian index just the
> headers? That does not seem possible in my case either, as I have
> "search_index_headers: no" in the config.
>
> I'm using the following config:
>
> conversations: 1
> search_engine: xapian
> search_index_headers: no
> search_batchsize: 8192
> defaultsearchtier: t1
> t1searchpartition-default: /expert/search
> t1searchpartition-expert2: /expert2/search
> t1searchpartition-expert3: /expert3/search
>
> Could anyone help me, mates?




--
Sebastian Hagedorn - Weyertal 121, Zimmer 2.02
Regionales Rechenzentrum (RRZK)
Universität zu Köln / Cologne University - Tel. +49-221-470-89578


Xapian searches of the body of an email

2019-01-07 Thread Egoitz Aurrekoetxea
Good morning, 

I have been testing Xapian searches and have seen that it is not able to
find strings inside the body of an email. If I set
"search_fuzzy_always: 1" in imapd.conf, no results are displayed in the
searches of a stock Roundcube webmail. If I remove that setting from
imapd.conf and restart the services, search results appear. Does Xapian
not index the body of emails? Does Xapian index just the headers? That
does not seem possible in my case either, as I have
"search_index_headers: no" in the config.

I'm using the following config : 

conversations: 1
search_engine: xapian
search_index_headers: no
search_batchsize: 8192
defaultsearchtier: t1
t1searchpartition-default: /expert/search
t1searchpartition-expert2: /expert2/search
t1searchpartition-expert3: /expert3/search 

Could anyone help me, mates?

Best regards,
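
A configuration like the one above can be sanity-checked with a few
lines of Python. This is a minimal sketch under stated assumptions: the
function names are hypothetical, the parser only understands plain
"key: value" lines, and the set of truthy values is an assumption about
how imapd.conf booleans are written. It checks only the configuration;
it cannot tell whether conversations.db has actually been built for a
user, which was the issue that came up in the replies.

```python
def parse_imapd_conf(text: str) -> dict:
    """Parse 'key: value' lines, skipping blank lines and '#' comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            options[key.strip()] = value.strip()
    return options


def xapian_warnings(options: dict) -> list:
    """Warn when Xapian is selected but conversations is not enabled."""
    warnings = []
    # Assumed spellings of "enabled" for a boolean option.
    truthy = {"1", "yes", "on", "true", "t"}
    if options.get("search_engine") == "xapian":
        if options.get("conversations", "0").lower() not in truthy:
            warnings.append("search_engine is xapian but conversations is off")
    return warnings


SAMPLE = """\
conversations: 1
search_engine: xapian
search_index_headers: no
defaultsearchtier: t1
t1searchpartition-default: /expert/search
"""

print(xapian_warnings(parse_imapd_conf(SAMPLE)))
```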

