Hi,
I’ve spent quite a lot of time working on a similar issue, but I have not
thought about it much since (at the time it was Solr 1.3), so some new features
could push me in some other direction. Here is what I remember: You cannot rely
on users entering a standardised address format even within one
Hi Yeikel,
I want to stress three things:
1. If you know which words are likely to be written in different ways
(like street), you can use synonyms.
2. Longer queries can have different mm values. The mm parameter supports
different values for different numbers of query terms. We
Thank you for jumping in @hastings.recurs...@gmail.com
I have an index with raw addresses in a nonstandardized format such as "123
main street" or "main street 123", and I am looking to search this index and
pull the closest addresses for another raw input in a similarly unpredictable
format.
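To make the matching problem concrete, here is a small Python sketch (outside Solr) of an order-insensitive comparison between two raw addresses. The helper names `tokens` and `jaccard`, and the choice of Jaccard similarity over token sets, are assumptions made for illustration only; nothing in this thread says this is how the index should be scored.

```python
import re


def tokens(address):
    """Lowercase an address and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", address.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two addresses (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        # Two empty inputs are treated as identical (an arbitrary choice).
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Same tokens in a different order compare as identical.
print(jaccard("123 main street", "main street 123"))  # 1.0
# One differing house number out of three tokens.
print(jaccard("123 main street", "125 main street"))  # 0.5
```

Because the comparison ignores word order, "123 main street" and "main street 123" are treated as the same address, which is roughly the behaviour wanted from the search.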
I’ll add to that since I’m up. Stopwords are in a practical sense useless and
serve no purpose. It’s an old way to save index size that’s not needed any
more. You’d need very specific use cases to want to use them. Maybe you do, but
generally you never do unless it’s for training a machine or
That makes sense, thank you for the clarification!
@wun...@wunderwood.org If you can, please build on your explanation, as it
sounds relevant.
-Original Message-
From: Dave
Sent: Monday, December 2, 2019 7:38 PM
To: solr-user@lucene.apache.org
Cc: jornfra...@gmail.com
Subject: Re: Is
It clarifies, yes. You need new fields. In this case something like
Address_us
Address_uk
And index and search them accordingly, with different stopword files used in
different field types, hence the copy field from “address” into as many new
fields as needed.
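As a rough illustration of the per-country field idea, here is a minimal Python sketch that fills the matching field on the indexing side, before the document is sent to Solr. The lower-cased field names `address_us` and `address_uk`, the `COUNTRY_FIELDS` mapping and the `route_address` helper are all hypothetical names chosen for this example; they are not part of Solr, and the schema would still need a field type with its own stopword file for each of them.

```python
# Hypothetical mapping from country code to a per-country Solr field.
# The field names mirror the Address_us / Address_uk suggestion above.
COUNTRY_FIELDS = {
    "US": "address_us",
    "UK": "address_uk",
}


def route_address(doc):
    """Return a copy of doc with the address also stored in the field
    configured for its country, if there is one."""
    routed = dict(doc)
    country = str(doc.get("country", "")).upper()
    field = COUNTRY_FIELDS.get(country)
    if field is not None and "address" in doc:
        routed[field] = doc["address"]
    return routed


doc = {"address": "123 main Street", "country": "US"}
print(route_address(doc))
```

Documents from a country without a configured field are passed through unchanged, so they can still be searched through the plain address field.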
> On Dec 2, 2019, at 7:33 PM,
To clarify, a document would look like this:
{
  "address": "123 main Street",
  "country": "US"
}
What I'd like to do when I configure my index is to apply a different set of
stop words to the address field depending on the value of the country field.
For example, something like this:
If (country
The best approach is to not use stop words at all. That gives better relevance
with less configuration, so it is a total win.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 2, 2019, at 12:24 PM, Jörn Franke wrote:
>
> You can have different
You can have different fields by country. I am not sure about your stop words,
but if they do not occur in the other languages then you do not have a
problem.
On the other hand, if you need more than stop words (e.g. lemmatizing, a
specialized way of tokenizing, etc.) then you need a different
Hi,
I have an index that stores addresses from different countries.
As every country has different stop words, I was wondering if it is possible to
apply a different set of stop words depending on the value of a field.
Or do I need different indexes, or to do it at the ETL step, to accomplish