Hi John,
Solr 4.4 is very old - if you upgrade to 8.8 (and you really should)
you'll be able to take very little with you. Your old
configuration can serve as inspiration but not as a direct source. It will
be a big task, but if you approach it the right way you'll be fine.
Solr 8.8 does support Danish in terms of stemming etc. (see
https://solr.apache.org/guide/8_0/language-analysis.html) and there are
plenty of people using Solr in Denmark (e.g. the Danish Library - see
Toke's excellent blog https://sbdevel.wordpress.com/author/eskildsen/
and an old client of mine, Infomedia). There seems to have been some work on a
Danish soundex algorithm (old blog from Findwise here
https://findwise.com/blog/things-to-consider-when-implementing-phonetic-search/),
although soundex itself was originally built for English. I'd think about what
soundex will actually help with though; there may be other ways to help
your users.
Spell suggestion is often done using the index, not a dictionary,
although you can use a dictionary if you like (e.g. via Hunspell). An
index-based spellchecker thus only corrects to words that you know
appear in items in your index.
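To make the index-based idea concrete, here is a minimal sketch in plain Python. It is not how Solr implements its spellcheckers, and every name in it (build_term_counts, edit_distance, suggest) is made up for illustration. It assumes the "index" is just a list of product titles split on whitespace, and it picks the closest known term by edit distance, preferring more frequent terms on ties.

```python
from collections import Counter


def build_term_counts(documents):
    """Count how often each lowercased, whitespace-separated term occurs.

    This stands in for the term dictionary of a real index (an assumption
    made for this sketch).
    """
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute cost 1 each)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suggest(word, term_counts, max_distance=2):
    """Return the closest indexed term, or None if the word is already
    indexed or nothing is within max_distance edits."""
    word = word.lower()
    if word in term_counts:
        return None  # already a known term, nothing to correct
    candidates = [
        (edit_distance(word, term), -count, term)
        for term, count in term_counts.items()
    ]
    candidates = [c for c in candidates if c[0] <= max_distance]
    # Smallest distance wins; among equals, the most frequent term wins.
    return min(candidates)[2] if candidates else None


if __name__ == "__main__":
    titles = ["rød cykel til børn", "sort cykel med kurv", "blå cykelhjelm"]
    term_counts = build_term_counts(titles)
    print(suggest("cykl", term_counts))
```

Because the candidates come from the indexed titles themselves, a suggestion can never point at a word that returns zero results, which is the main attraction over a static dictionary.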
Hope this helps. By the way, it looks like you're in ecommerce; we've
been writing a lot about ecommerce and Solr recently as we're
contributing to a reference implementation, Chorus:
https://opensourceconnections.com/blog/2020/10/29/a-tool-stack-for-open-source-ecommerce-site-search/
Cheers
Charlie
On 15/03/2021 12:07, John Nielsen wrote:
Hi all,
We are currently running solr 4.4 on a 3 node cluster. We have never had an
incentive to upgrade. That old solr version is heavily integrated into our
infrastructure, so we always considered an upgrade to be a monumental task.
We recently started looking at the new solr features and we are starting to
reconsider if it might be worth it to upgrade after all.
My google-fu has failed me on some points and I was hoping that someone
here might help with them.
We are still using the old static core loading system which was deprecated
in version 4.0. We will obviously need to redo that part.
Apart from that, would it be reasonably safe to assume that our current
schema and configuration would work out of the box with Solr 8.8, or
should we expect to need to redo the configuration?
Phonetic search is the most enticing of the new features for us. It doesn't
look like it has support for Danish, however. Does that mean that this
feature is a no-go for us, or are there other ways of making it work, like
a "generic" language setting? I couldn't find any information regarding
this.
We have looked at the new spell checking and search suggestions. The
documentation references dictionary support, yet I couldn't find anything
for Danish. Are the dictionaries something which is meant to be hand-crafted,
or is there an external source we might use? If not, would an index-based
spellchecker be a better choice for us?
--
Charlie Hull - Managing Consultant at OpenSource Connections Limited
<www.o19s.com>
Founding member of The Search Network <https://thesearchnetwork.com/>
and co-author of Searching the Enterprise
<https://opensourceconnections.com/about-us/books-resources/>
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828