I'm running,
dovecot --version
2.3.11.3 (502c39af9)
solr -version
8.6.3
uname -rm
5.8.13-200.fc32.x86_64 x86_64
grep _NAME /etc/os-release
PRETTY_NAME="Fedora 32 (Server Edition)"
CPE_NAME="cpe:/o:fedoraproject:fedora:32"
Solr FTS plugin is enabled/configured,
mail_plugins = virtual acl fts fts_solr
plugin {
fts = solr
fts_autoindex = yes
fts_solr = url=https://solr.example.com:8984/solr/dovecot/
fts_enforced = body
fts_filters = normalizer-icu stopwords snowball
fts_language_config = /usr/share/libexttextcat/fpdb.conf
fts_languages = en es de fr it pt
soft_commit = yes
}
IMAP capability returns,
a OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE SORT
SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT MULTIAPPEND
URL-PARTIAL CATENATE UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED
I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH
LIST-STATUS BINARY MOVE SNIPPET=FUZZY PREVIEW=FUZZY STATUS=SIZE SAVEDATE
SPECIAL-USE LITERAL+ NOTIFY SPECIAL-USE QUOTA ACL RIGHTS=texk] Logged in
I've got two messages in my IMAP store,
cd /data/vmail/example.com/myuser/Maildir/cur/
ls -altr | grep S= | /bin/tail -n2
-rw------- 1 vmail vmail 1.3K Oct 11 14:05
1602450306.M393628P65260.mx.example.com,S=1278,W=1304:2,S
-rw------- 1 vmail vmail 1.3K Oct 11 14:05
1602450353.M756184P65260.mx.example.com,S=1277,W=1303:2,S
that differ only in BODY CONTENT --
-- one message has ASCII text with NO accented characters
-- the other has the same text, but with ONE accented character
cat "1602450306.M393628P65260.mx.example.com,S=1278,W=1304:2,S"
...
From: M User <[email protected]>
Subject: test
Reply-To: [email protected]
To: "User, My" <[email protected]>
Message-ID:
<[email protected]>
Date: Sun, 11 Oct 2020 14:05:06 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0)
Gecko/20100101
Thunderbird/78.3.2
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
!!!! también
cat 1602450353.M756184P65260.mx.example.com,S=1277,W=1303:2,S
...
From: M User <[email protected]>
Subject: test
Reply-To: [email protected]
To: "User, My" <[email protected]>
Message-ID:
<[email protected]>
Date: Sun, 11 Oct 2020 14:05:53 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0)
Gecko/20100101
Thunderbird/78.3.2
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
!!!! tambien
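For reference, a minimal Python sketch of how the two bodies differ at the byte level, and what "accent folding" would have to do for them to match. The one-byte size difference between the two files (S=1278 vs S=1277) is consistent with "é" taking two bytes in UTF-8. The helper name fold_accents is hypothetical, and this only illustrates the general idea; it makes no claim about what normalizer-icu or the Solr analyzer chain actually does in this setup.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Decompose to NFD and drop combining marks, so 'é' becomes 'e'.

    Hypothetical helper for illustration only; not Dovecot or Solr code.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


accented = "tambi\u00e9n"  # "también", with a precomposed U+00E9
plain = "tambien"

# The accented word is one byte longer in UTF-8 than the plain one.
print(len(accented.encode("utf-8")), len(plain.encode("utf-8")))

# Without folding the two terms are different strings; with folding they match.
print(accented == plain)
print(fold_accents(accented) == plain)
```

If the indexed token and the query token are not both folded (or both left unfolded) in the same way, a search for one form will not match the other.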
I manually rescan & index,
doveadm fts rescan -u [email protected]
doveadm index -u [email protected] -q '*'
...
==> /var/log/dovecot/dovecot-info.log <==
2020-10-11 15:06:34
indexer-worker([email protected])<OyUmLeqBg18fDAEA+IOfAw>: Info: Indexed 21
messages in accts (UIDs 14399..130699)
2020-10-11 15:06:34
indexer-worker([email protected])<6NnOMuqBg18fDAEA+IOfAw>: Info: Indexed 16
messages in accts/v007132 (UIDs 13414..14778)
...
with no errors.
Then I search in the mail client, here TBird 78, with
[X] Run Search on Server
For _un_accented "tambien", the match is correctly -- and quickly -- returned.
in logs,
==> /var/log/dovecot/dovecot-info.log <==
2020-10-11 14:57:05 imap-login: Info: Login: user=<[email protected]>,
method=PLAIN, rip=10.0.1.7, lip=10.0.1.50, mpid=67743, TLS
2020-10-11 14:57:16
indexer-worker([email protected])<3ZUzQ2yx2JKsHgsH:9gu0MbF/g1+hCAEA+IOfAw>:
Info: Indexed 4788 messages in INBOX (UIDs 135476..140263)
BUT, repeating search for ACCENTED "también" returns *no* match/result.
No errors in log, simply no match.
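One way to narrow this down would be to query the Solr core directly, bypassing the IMAP client, and compare the hit counts for the two spellings. The sketch below only builds the request URLs; it assumes the core from the fts_solr setting above, Solr's standard select handler, and a schema field named body (an assumption; check the schema actually installed on the core). The function name solr_select_url is hypothetical.

```python
from urllib.parse import urlencode, urljoin

# Taken from the fts_solr setting in the plugin block above.
SOLR_BASE = "https://solr.example.com:8984/solr/dovecot/"


def solr_select_url(term: str, field: str = "body") -> str:
    """Build a select-handler URL searching `field` for `term`.

    The field name "body" is an assumption about the installed schema.
    """
    params = {"q": f"{field}:{term}", "rows": "10"}
    return urljoin(SOLR_BASE, "select") + "?" + urlencode(params)


# Print one URL per spelling; the accented form is percent-encoded as UTF-8.
for term in ("tambien", "tambi\u00e9n"):
    print(solr_select_url(term))
```

Fetching both URLs (for example with curl) and comparing the results would show whether the accented term is missing from the index itself or is being lost somewhere between the client and Solr.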
Attempting to test/debug from the cmd line,
doveadm fts lookup -u [email protected] body "tambien"
causes a PANIC
doveadm([email protected]): Panic: file mail-storage.c: line 2112
(mailbox_get_open_status): assertion failed: (box->opened)
doveadm([email protected]): Error: Raw backtrace:
/usr/lib64/dovecot/libdovecot.so.0(backtrace_append+0x46) [0x7f3ee94accc6] ->
/usr/lib64/dovecot/libdovecot.so.0(backtrace_get+0x22) [0x7f3ee94acde2] ->
/usr/lib64/dovecot/libdovecot.so.0(+0x10025b) [0x7f3ee94b625b] ->
/usr/lib64/dovecot/libdovecot.so.0(+0x100297) [0x7f3ee94b6297] ->
/usr/lib64/dovecot/libdovecot.so.0(+0x59bc6) [0x7f3ee940fbc6] ->
/usr/lib64/dovecot/libdovecot-storage.so.0(+0x4779e) [0x7f3ee95c379e] ->
/usr/lib64/dovecot/lib21_fts_solr_plugin.so(+0x5849) [0x7f3ee9015849] ->
/usr/lib64/dovecot/lib20_fts_plugin.so(fts_backend_lookup+0x51)
[0x7f3ee8c37491] ->
/usr/lib64/dovecot/doveadm/lib20_doveadm_fts_plugin.so(+0x3280)
[0x7f3ee8ba9280] -> doveadm(+0x343cd) [0x5637e99443cd] -> doveadm(+0x34fe0)
[0x5637e9944fe0] -> doveadm(doveadm_cmd_ver2_to_mail_cmd_wrapper+0x22d)
[0x5637e9945e2d] -> doveadm(doveadm_cmd_run_ver2+0x4e8) [0x5637e99568d8] ->
doveadm(doveadm_cmd_try_run_ver2+0x3e) [0x5637e995692e] -> doveadm(main+0x1d4)
[0x5637e9934cf4] -> /lib64/libc.so.6(__libc_start_main+0xf2) [0x7f3ee9071042]
-> doveadm(_start+0x2e) [0x5637e99351ce]
Aborted
(1) What config -- dovecot &/or solr -- is needed to match on accented
characters?
(2) What add'l detail, if any, is needed for troubleshooting the panic?