We'll go through all your updates and try to update the wiki with what is there. Your effort is really appreciated here.
Aki

> On 04 January 2019 at 02:38 Joan Moreau via dovecot <[email protected]> wrote:
>
>
> Hi
>
> This is the summary of my work with SOLR-Dovecot, in my QUEST TO
> REPRODUCE THE PREVIOUSLY EXCELLENT WORK OF FTS_SQUAT
>
> @Aki : Based on the time I have spent on this, I would love to see you
> updating the Wiki with those improvements, and adding my name somewhere
>
> @All : Hope it helps
>
> - INSTALLATION:
>
> -> Create a clean install using the defaults (at least in the Archlinux
> package), and do a "sudo -u solr solr create -c dovecot". The config
> files are then in /opt/solr/server/solr/dovecot/conf and datafiles in
> /opt/solr/server/solr/dovecot/data
>
> -> In /opt/solr/server/solr/dovecot/conf/solrconfig.xml:
>
> * around line 313, change <openSearcher>false</openSearcher> to
> <openSearcher>true</openSearcher>
>
> * around line 147, set <writeLockTimeout>2000</writeLockTimeout>
> (or above)
>
> * around line 1127, before <updateProcessor
> class="solr.UUIDUpdateProcessorFactory" name="uuid"/>, add
> <schemaFactory class="ClassicIndexSchemaFactory"></schemaFactory>
>
> * around line 1161, delete the whole <updateProcessor
> class="solr.AddSchemaFieldsUpdateProcessorFactory"
> name="add-schema-fields">
>
> * around line 1192, remove the whole <updateRequestProcessorChain
> name="add-unknown-fields-to-the-schema" ... />
>
> -> Remove /opt/solr/server/solr/dovecot/conf/managed-schema
>
> -> Replace "schema.xml" with the one below to reproduce fts_squat behavior
> (equivalent to "fts_squat = partial=3 full=25" in dovecot.conf) (note:
> such a huge trouble to replace a single line setup, anyway...)
>
> -> Move /opt/solr/server/solr (or the subfolder data) to a partition
> with *space*, ideally ext4 or a faster file system (it looks like Solr is
> not considering using a simple mysql database, which would make sense to
> avoid all the fuss and let it transition to a non-java state, but that is
> another story)
>
> -> Config of dovecot.conf is as below
>
> -> The systemd unit should specify a high ulimit for files and processes
> (see below)
>
> -> Increase the memory available for the Java VM (I put 12 GB as I have
> quite a lot of space on my server, but you may adapt it to your specs): in
> /opt/solr/bin/solr.in.sh, set SOLR_HEAP="12288m"
>
> -> As Solr is complaining a lot, you may consider a filter for it in
> your syslog-ng or journald, as it greatly pollutes your audit files
>
> -> (Re)start solr (first) and dovecot with systemctl
>
> -> Launch the reindex ( doveadm fts rescan -u <username> )
>
> -> Wait for a long while to let the system re-index all your mailboxes
>
> - BUGS SO FAR
>
> -> Line 620 of the fts_solr dovecot plugin: the size of the header is
> improperly calculated ("huge header" warning for a simple email, which
> kills the index of that email, so basically MOST emails, as
> the calculation is wrong)
>
> -> The UID returned by Solr is to be considered as a STRING (and that is
> maybe the source of the "out of bound" errors in fts_solr
> dovecot, as "long" is not enough)
>
> -> Java errors: a lot of them make no sense to me, I am not an expert in Java.
> But with increased memory it does not seem to crash, even if it complains
> quite a lot in the logs
>
> ------- schema.xml in /opt/solr/server/solr/dovecot/conf
>
> <?xml version="1.0" encoding="UTF-8"?>
> <schema name="dovecot" version="2.0">
>   <uniqueKey>id</uniqueKey>
>   <fieldType name="dovecottext" class="solr.TextField"
>              autoGeneratePhraseQueries="true" positionIncrementGap="100">
>     <analyzer type="index">
>       <tokenizer class="solr.ClassicTokenizerFactory"/>
>       <filter class="solr.WordDelimiterGraphFilterFactory" catenateNumbers="1"
>               generateNumberParts="1" splitOnCaseChange="1" generateWordParts="1"
>               splitOnNumerics="1" catenateAll="1" catenateWords="1"
>               preserveOriginal="1"/>
>       <filter class="solr.FlattenGraphFilterFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>   </fieldType>
>   <fieldType name="dovecotfield" class="solr.TextField"
>              autoGeneratePhraseQueries="true">
>     <analyzer type="index">
>       <tokenizer class="solr.ClassicTokenizerFactory"/>
>       <filter class="solr.NGramFilterFactory" minGramSize="3"
>               maxGramSize="25"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>       <filter class="solr.TrimFilterFactory"/>
>       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
>   <fieldType name="string" class="solr.StrField"/>
>   <field name="_version_" type="string" indexed="true" stored="true"/>
>   <field name="bcc" type="string" indexed="false" stored="false"/>
>   <field name="body" type="dovecottext" indexed="true" stored="false"/>
>   <field name="box" type="string" indexed="true" required="true"
>          stored="true"/>
>   <field name="cc" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="from" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="hdr" type="string" indexed="false" stored="false"/>
>   <field name="id" type="string" indexed="true" required="true"
>          stored="true"/>
>   <field name="subject" type="dovecottext" indexed="true" stored="false"/>
>   <field name="to" type="dovecotfield" indexed="true" stored="false"/>
>   <field name="uid" type="string" indexed="true" required="true"
>          stored="true"/>
>   <field name="user" type="string" indexed="true" required="true"
>          stored="true"/>
> </schema>
>
> -- dovecot.conf
>
> mail_plugins = fts fts_solr
>
> plugin {
>   plugin = fts fts_solr managesieve sieve
>
>   fts = solr
>   fts_autoindex = yes
>   fts_enforced = yes
>   fts_solr = url=http://127.0.0.1:8983/solr/dovecot/
>
>   (replace 127.0.0.1 by your solr server if you want to use an external
>   server)
>   (...)
>
> }
>
> -- /etc/systemd/system/multi-user.target.wants/solr.service
>
> [Unit]
> Description=Solr full text search engine
> After=network.target
>
> [Service]
> Type=simple
> User=solr
> Group=solr
> PrivateTmp=yes
> WorkingDirectory=/opt/solr
> LimitNOFILE=65000
> LimitNPROC=65000
> ExecStart=/opt/solr/bin/solr start -f
>
> [Install]
> WantedBy=multi-user.target
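
The two simplest solrconfig.xml changes from the installation notes (openSearcher and writeLockTimeout) could probably be scripted rather than edited by hand. Below is a minimal Python sketch of that idea. The helper name patch_solrconfig is made up for illustration, and it assumes both elements are present in the plain, uncommented form quoted in the notes; if your stock file has writeLockTimeout commented out, the substitution would not take effect. The schemaFactory and updateProcessor edits are left out because they depend on the surrounding XML.

```python
import re


def patch_solrconfig(text, write_lock_timeout=2000):
    """Return solrconfig.xml text with the two simple edits applied.

    Hypothetical helper, not part of Solr or Dovecot. Assumes plain,
    uncommented <openSearcher> and <writeLockTimeout> elements.
    """
    # Switch openSearcher from false to true, as described in the notes.
    text = re.sub(
        r"<openSearcher>\s*false\s*</openSearcher>",
        "<openSearcher>true</openSearcher>",
        text,
    )
    # Set the write lock timeout to the requested value (2000 or above).
    text = re.sub(
        r"<writeLockTimeout>[^<]*</writeLockTimeout>",
        "<writeLockTimeout>%d</writeLockTimeout>" % write_lock_timeout,
        text,
    )
    return text


if __name__ == "__main__":
    sample = (
        "<config>"
        "<writeLockTimeout>1000</writeLockTimeout>"
        "<openSearcher>false</openSearcher>"
        "</config>"
    )
    print(patch_solrconfig(sample))
```

Working on the text rather than parsing the XML keeps comments and layout of the original file intact, at the cost of relying on the elements being written exactly as assumed.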

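
To make the "partial=3 full=25" equivalence a bit more concrete: the dovecotfield type in the schema above uses NGramFilterFactory with minGramSize="3" and maxGramSize="25", which indexes substrings of each token so that a query can match in the middle of a word. The Python sketch below only illustrates that idea. It is not Solr's implementation, the function name ngrams is made up, and the order in which Solr emits grams may differ.

```python
def ngrams(token, min_size=3, max_size=25):
    """Return every substring of token with length min_size..max_size.

    Illustration only; the token is lowercased here because the schema
    also applies a lowercase filter at index time.
    """
    token = token.lower()
    grams = []
    for start in range(len(token)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(token):
                break  # longer grams from this start would run past the token
            grams.append(token[start:end])
    return grams


if __name__ == "__main__":
    # An address fragment such as "moreau" becomes searchable by any
    # substring of three or more characters.
    print(ngrams("Moreau"))
```

This also hints at why the index grows quickly: a token of length n produces on the order of n squared grams until the 25-character cap applies, which is presumably part of the reason for the advice to put the data directory on a partition with plenty of space.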