Re: FTS-lucene errors : language not available for stemming

2020-05-21 Thread Joan Moreau
Hello,

The indexer does not run as root.

It runs as "mail_uid = xxx" (based on your config).


> dovecot-fts-xapian is easy to configure, but has a big downside compared
> to solr in that the indexer runs as root.

Re: FTS-lucene errors : language not available for stemming

2020-05-20 Thread David Gessel



On 2020-05-19 16:48, Stuart Henderson wrote:
> On 2020-05-19, David Gessel wrote:
> > I'm getting some log errors with clucene that I am having no luck tracking down
> > on the interwebs.
>
> This looks relevant:
>
> https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html


Thanks Stuart & Jan - no_snowball seems to have cleared up the errors.

The relevant config now reads:

plugin {
  fts = lucene
  # Lucene-specific settings, good ones are:
  fts_lucene = whitespace_chars=@. mime_parts no_snowball
}


May 20 04:40:50 indexer-worker(ges...@blackrosetech.com)<26130>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 20 04:40:50 indexer-worker: Error:
May 20 04:40:50 indexer-worker(ges...@blackrosetech.com)<26130>: Error: Mailbox Lists.Spamassassin: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-20 04:40:50]
May 20 04:40:50 indexer-worker(ges...@blackrosetech.com)<26130>: Error: Mailbox Lists.Spamassassin: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 2 messages (UIDs 7..8))
May 20 04:45:05 master: Warning: Killed with signal 15 (by pid=81740 uid=0 code=kill)
May 20 04:46:39 indexer-worker(ges...@blackrosetech.com)<87087><5jtvLp8YxV4tVAEA0J78UA:NexHM58YxV4vVAEA0J78UA>: Warning: fts-lucene: Settings have changed, rebuilding index for mailbox

(no further errors, various mailboxes being indexed.)



> > I am considering switch to xapian (solr and java... pls noe) as the
> > port is quite tempting from an ease of integration perspective, but the
> > easiest solution would be to resolve these odd indexing errors.  Anyone
> > have a clue?
>
> dovecot-fts-xapian is easy to configure, but has a big downside compared
> to solr in that the indexer runs as root.





-David



Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Benny Pedersen

On 2020-05-19 16:28, Aki Tuomi wrote:
> Also if you were looking carefully what happens, you'd notice dovecot
> calls seteuid() before actually doing the indexing work.


It would make more sense to have a dovecot shell that all commands must be
issued in, like postgresql. setuid is still nice, but it is not well known.
The same thing happens with apache, which starts as root to bind a port
under 1024 and then forks a process that does not run as root.


Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Benny Pedersen

On 2020-05-19 16:22, Aki Tuomi wrote:
> That's doveadm though, not indexer.


and dovecot allows it


Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Benny Pedersen

On 2020-05-19 16:18, Stuart Henderson wrote:
> It does in the not entirely uncommon case where you have setup
> dovecot-fts-xapian, have multiple system users rather than a single
> uid owning all mailboxes, and need to index all mailboxes.
>
>   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> 44468 root       2    0   11M   18M sleep/6   netio     1:41 68.36% doveadm index -A *
>
> With solr the indexing is done out-of-process and typically under a
> safe uid.


no doveconf -n, no problem

if you really have a bug, it should be solved

i think you could run

sudo --user=non-root-user doveadm index -A

so that it does not run as root. that works well for fangfrisch. dovecot should
not allow commands to run as root without thinking of the consequences


Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Aki Tuomi

On 19/05/2020 17:22 Aki Tuomi wrote:
> On 19/05/2020 17:18 Stuart Henderson wrote:
> > On 2020/05/19 17:04, Aki Tuomi wrote:
> > > On 19/05/2020 16:48 Stuart Henderson wrote:
> > > > On 2020-05-19, David Gessel wrote:
> > > > > I'm getting some log errors with clucene that I am having no luck
> > > > > tracking down on the interwebs.
> > > >
> > > > This looks relevant:
> > > >
> > > > https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
> > > >
> > > > > I am considering switch to xapian (solr and java... pls noe) as the
> > > > > port is quite tempting from an ease of integration perspective, but the
> > > > > easiest solution would be to resolve these odd indexing errors. Anyone
> > > > > have a clue?
> > > >
> > > > dovecot-fts-xapian is easy to configure, but has a big downside compared
> > > > to solr in that the indexer runs as root.
> > >
> > > Dovecot indexer does not run as root.
> > >
> > > ---
> > > Aki Tuomi
> >
> > It does in the not entirely uncommon case where you have setup
> > dovecot-fts-xapian, have multiple system users rather than a single
> > uid owning all mailboxes, and need to index all mailboxes.
> >
> >   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > 44468 root       2    0   11M   18M sleep/6   netio     1:41 68.36% doveadm index -A *
> >
> > With solr the indexing is done out-of-process and typically under a
> > safe uid.
>
> That's doveadm though, not indexer.
>
> ---
> Aki Tuomi

Also if you were looking carefully what happens, you'd notice dovecot calls seteuid() before actually doing the indexing work.

---
Aki Tuomi
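
For readers unfamiliar with the pattern referred to here: a process started as root can switch its effective uid to the mailbox owner for the duration of the work and switch back afterwards. What follows is only a minimal Python sketch of that general seteuid() pattern. It is not Dovecot's code, and the names effective_uid and run_as are made up for illustration. A real implementation would also switch the group id and handle failures.

```python
import os
from contextlib import contextmanager


@contextmanager
def effective_uid(uid):
    """Temporarily switch the effective uid; restore the previous one on exit."""
    previous = os.geteuid()
    os.seteuid(uid)  # raises PermissionError if the switch is not allowed
    try:
        yield
    finally:
        # seteuid() leaves the saved set-user-ID untouched, so a process
        # that started as root is still allowed to switch back here.
        os.seteuid(previous)


def run_as(uid, work):
    """Call work() while the effective uid is `uid` (hypothetical helper)."""
    with effective_uid(uid):
        return work()


if __name__ == "__main__":
    # Using the current uid works for any user and changes nothing.
    me = os.geteuid()
    print("effective uid during work:", run_as(me, os.geteuid))
```

Switching to a different uid only succeeds when the process started with root privileges; called with the current uid, as in the example, it works for any user.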
  
 



Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Aki Tuomi

On 19/05/2020 17:18 Stuart Henderson wrote:
> On 2020/05/19 17:04, Aki Tuomi wrote:
> > On 19/05/2020 16:48 Stuart Henderson wrote:
> > > On 2020-05-19, David Gessel wrote:
> > > > I'm getting some log errors with clucene that I am having no luck
> > > > tracking down on the interwebs.
> > >
> > > This looks relevant:
> > >
> > > https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
> > >
> > > > I am considering switch to xapian (solr and java... pls noe) as the
> > > > port is quite tempting from an ease of integration perspective, but the
> > > > easiest solution would be to resolve these odd indexing errors. Anyone
> > > > have a clue?
> > >
> > > dovecot-fts-xapian is easy to configure, but has a big downside compared
> > > to solr in that the indexer runs as root.
> >
> > Dovecot indexer does not run as root.
> >
> > ---
> > Aki Tuomi
>
> It does in the not entirely uncommon case where you have setup
> dovecot-fts-xapian, have multiple system users rather than a single
> uid owning all mailboxes, and need to index all mailboxes.
>
>   PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> 44468 root       2    0   11M   18M sleep/6   netio     1:41 68.36% doveadm index -A *
>
> With solr the indexing is done out-of-process and typically under a
> safe uid.

That's doveadm though, not indexer.

---
Aki Tuomi


Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Stuart Henderson
On 2020/05/19 17:04, Aki Tuomi wrote:
> On 19/05/2020 16:48 Stuart Henderson wrote:
> > On 2020-05-19, David Gessel wrote:
> > > I'm getting some log errors with clucene that I am having no luck
> > > tracking down on the interwebs.
> >
> > This looks relevant:
> >
> > https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
> >
> > > I am considering switch to xapian (solr and java... pls noe) as the
> > > port is quite tempting from an ease of integration perspective, but the
> > > easiest solution would be to resolve these odd indexing errors.  Anyone
> > > have a clue?
> >
> > dovecot-fts-xapian is easy to configure, but has a big downside compared
> > to solr in that the indexer runs as root.
>
> Dovecot indexer does not run as root.
>
> ---
> Aki Tuomi

It does in the not entirely uncommon case where you have setup
dovecot-fts-xapian, have multiple system users rather than a single
uid owning all mailboxes, and need to index all mailboxes.

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
44468 root       2    0   11M   18M sleep/6   netio     1:41 68.36% doveadm index -A *

With solr the indexing is done out-of-process and typically under a
safe uid.



Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Aki Tuomi

On 19/05/2020 16:48 Stuart Henderson wrote:
> On 2020-05-19, David Gessel wrote:
> > I'm getting some log errors with clucene that I am having no luck tracking
> > down on the interwebs.
>
> This looks relevant:
>
> https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html
>
> > I am considering switch to xapian (solr and java... pls noe) as the
> > port is quite tempting from an ease of integration perspective, but the
> > easiest solution would be to resolve these odd indexing errors.  Anyone
> > have a clue?
>
> dovecot-fts-xapian is easy to configure, but has a big downside compared
> to solr in that the indexer runs as root.

Dovecot indexer does not run as root.

---
Aki Tuomi


Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Stuart Henderson
On 2020-05-19, David Gessel  wrote:
> I'm getting some log errors with clucene that I am having no luck tracking 
> down on the interwebs.

This looks relevant:

https://www.mail-archive.com/dovecot@dovecot.org/msg66366.html

> I am considering switch to xapian (solr and java... pls noe) as the
> port is quite tempting from an ease of integration perspective, but the
> easiest solution would be to resolve these odd indexing errors.  Anyone
> have a clue?

dovecot-fts-xapian is easy to configure, but has a big downside compared
to solr in that the indexer runs as root.




Re: FTS-lucene errors : language not available for stemming

2020-05-19 Thread Jan Bramkamp

On 19.05.20 15:15, David Gessel wrote:
> I'm getting some log errors with clucene that I am having no luck
> tracking down on the interwebs.
>
> Errors:
>
> May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
> May 19 05:05:16 indexer-worker: Error:
> May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: Mailbox Security: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-19 05:05:16]
> May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: Mailbox Security: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 1 messages (UIDs 152736..152736))
>
> Config:
>
> FreeBSD 11.3-RELEASE-p8 #0 r360490
> dovecot-2.3.10_3
> clucene-2.3.3.4_19
> py37-pystemmer-2.0.0.1
> py37-snowballstemmer-1.2.1
> icu-67.1,1
>
> plugin {
>   #setting_name = value
>     expire = Trash
>     mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
>     mail_log_fields = uid box msgid size
>     fts_autoindex=yes
>     #zlib_save_level = 6 # 1..9
>     #zlib_save = gz # or bz2
> }
>
> plugin {
>   fts = lucene
>   # Lucene-specific settings, good ones are:
>   fts_lucene = whitespace_chars=@. mime_parts
> }
>
> I am considering switch to xapian (solr and java... pls noe) as the
> port is quite tempting from an ease of integration perspective, but
> the easiest solution would be to resolve these odd indexing errors.
> Anyone have a clue?


I ran into the same problem a few weeks back. The workaround I found was 
to add no_snowball to fts_lucene. It disables the snowball algorithm.


FTS-lucene errors : language not available for stemming

2020-05-19 Thread David Gessel

I'm getting some log errors with clucene that I am having no luck tracking down 
on the interwebs.


Errors:

May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 19 05:05:16 indexer-worker: Error:
May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: Mailbox Security: Mail search failed: Internal error occurred. Refer to server log for more information. [2020-05-19 05:05:16]
May 19 05:05:16 indexer-worker(ges...@blackrosetech.com)<62971>: Error: Mailbox Security: Transaction commit failed: FTS transaction commit failed: transaction context (attempted to index 1 messages (UIDs 152736..152736))
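
To see which mailboxes and UID ranges are affected, log lines of this shape can be scanned mechanically. This is a minimal Python sketch, assuming each log entry sits on a single line (the excerpts in this thread were wrapped by the mail client) and that the message format matches the "Transaction commit failed" lines above. The function name failed_index_ranges and the placeholder user in the sample line are illustrative only.

```python
import re

# Pattern derived from the "Transaction commit failed" lines quoted in this
# thread; other Dovecot versions may word the message differently.
COMMIT_FAILED = re.compile(
    r"Error: Mailbox (?P<mailbox>.+?): Transaction commit failed: .*"
    r"\(UIDs (?P<first>\d+)\.\.(?P<last>\d+)\)"
)


def failed_index_ranges(lines):
    """Yield (mailbox, first_uid, last_uid) for each failed FTS commit."""
    for line in lines:
        match = COMMIT_FAILED.search(line)
        if match:
            yield match["mailbox"], int(match["first"]), int(match["last"])


# Sample entry modelled on the log above, with a placeholder user name.
sample = [
    "May 19 05:05:16 indexer-worker(user@example.com)<62971>: Error: "
    "Mailbox Security: Transaction commit failed: FTS transaction commit "
    "failed: transaction context (attempted to index 1 messages "
    "(UIDs 152736..152736))",
]

if __name__ == "__main__":
    for mailbox, first, last in failed_index_ranges(sample):
        print(f"{mailbox}: UIDs {first}..{last}")
```

In practice the lines would come from the server log file rather than a hard-coded list.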


Config:

FreeBSD 11.3-RELEASE-p8 #0 r360490

dovecot-2.3.10_3

clucene-2.3.3.4_19

py37-pystemmer-2.0.0.1

py37-snowballstemmer-1.2.1

icu-67.1,1

plugin {
  #setting_name = value
    expire = Trash
    mail_log_events = delete undelete expunge copy mailbox_delete mailbox_rename
    mail_log_fields = uid box msgid size
    fts_autoindex=yes
    #zlib_save_level = 6 # 1..9
    #zlib_save = gz # or bz2
}

plugin {
  fts = lucene
  # Lucene-specific settings, good ones are:
  fts_lucene = whitespace_chars=@. mime_parts
}



I am considering switch to xapian (solr and java... pls noe) as the port is 
quite tempting from an ease of integration perspective, but the easiest 
solution would be to resolve these odd indexing errors.  Anyone have a clue?

-David