Hello Trevor. Can you help me better understand this approach? If we have the text "wifi router" and inject "internet device" at indexing time, the injected terms reside at the same positions as the original ones. How do you avoid a false-positive match for the query "wifi device"?
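To make the concern concrete, here is a minimal sketch of a toy positional index in plain Java. It is not Lucene code and does not claim to reproduce how Lucene's analyzers or PhraseQuery work internally. The class and method names (StackedSynonymDemo, index, phraseMatches, stackedExample) are hypothetical. It assumes the synonym is injected token by token at the same positions as the original text (in Lucene terms, a position increment of 0), so "internet" shares position 0 with "wifi" and "device" shares position 1 with "router".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy positional index for a single document. NOT Lucene code; all names are hypothetical.
class StackedSynonymDemo {

    // Builds term -> positions. Each inner list holds the tokens stacked at one position.
    static Map<String, Set<Integer>> index(List<List<String>> tokensPerPosition) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int pos = 0; pos < tokensPerPosition.size(); pos++) {
            for (String term : tokensPerPosition.get(pos)) {
                postings.computeIfAbsent(term, t -> new HashSet<>()).add(pos);
            }
        }
        return postings;
    }

    // Exact phrase match: term i of the phrase must occur at position start + i.
    static boolean phraseMatches(Map<String, Set<Integer>> postings, String... phrase) {
        Set<Integer> starts = postings.getOrDefault(phrase[0], Set.of());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < phrase.length && ok; i++) {
                ok = postings.getOrDefault(phrase[i], Set.of()).contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    // Assumption: "wifi router" is indexed with "internet device" stacked on the same positions.
    static Map<String, Set<Integer>> stackedExample() {
        return index(List.of(
                List.of("wifi", "internet"),    // position 0
                List.of("router", "device")));  // position 1
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings = stackedExample();
        System.out.println(phraseMatches(postings, "wifi", "router"));     // true
        System.out.println(phraseMatches(postings, "internet", "device")); // true
        System.out.println(phraseMatches(postings, "wifi", "device"));     // true (the false positive)
        System.out.println(phraseMatches(postings, "device", "wifi"));     // false
    }
}
```

In this toy model the positions alone cannot tell "wifi device" apart from the two intended phrases, because nothing records which stacked tokens belong to the same synonym.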

On Mon, Jan 2, 2023 at 4:16 PM Trevor Nicholls <tre...@castingthevoid.com> wrote:

> Hi Anh
>
> The two links Michael shared relate to questions I asked when I was trying
> to get synonym matching with our application.
>
> I really do have multi-term synonym matching working at this point;
> there's always scope for improvement of course but with the hints supplied
> in those threads I was able to index our documents and search them using a
> variety of synonymous terms, both single words and phrases.
>
> Our application does not use either BooleanQuery or SynonymQuery; I have
> just used the standard QueryParser. Instead the synonym processing occurs
> in the indexing phase, which is not only simpler (one search pattern, one
> query), but I think you would also find it gives you superior
> performance (because the synonym processing occurs once at indexing time
> and not at all during searching - and I'm sure you'll be doing far more
> searching than indexing).
>
> cheers
> T
>
>
> -----Original Message-----
> From: Michael Wechner <michael.wech...@wyona.com>
> Sent: Thursday, 29 December 2022 08:56
> To: java-user@lucene.apache.org
> Subject: Re: Question for SynonymQuery
>
> Hi Anh
>
> The following Stackoverflow link might help
>
> https://stackoverflow.com/questions/73240494/can-someone-assist-me-with-a-multi-word-synonym-problem-in-lucene
>
> The following thread seems to confirm that escaping the space with a
> backslash does not help
>
> https://lists.apache.org/list?java-user@lucene.apache.org:2022-3
>
> HTH
>
> Michael
>
>
> On 27.12.22 at 20:22, Anh Dũng Bùi wrote:
> > Hi Lucene users,
> >
> > I recently came across SynonymQuery and found out that it only
> > supports single-term synonyms (since it accepts a list of Term which
> > will be considered as synonyms). We have some multi-term synonyms like
> > "internet device" <-> "wifi router" or "dns" <-> "domain name
> > service". Am I right that I need to use something like a BooleanQuery
> > for these cases?
> >
> > I have 2 other follow-up questions:
> > - Does SynonymQuery have any advantage over BooleanQuery? Or is it
> > only different in how scores are computed? As I understand
> > SynonymWeight will consider all terms as exactly the same while
> > BooleanQuery will favor the documents with more matched terms.
> > - Is it worth it to support multi-term synonyms in SynonymQuery? My
> > feeling is that it's better to just use BooleanQuery in those cases,
> > since to support multi-term synonyms it needs to accept a list of
> > Query, which would make it behave like a BooleanQuery. Also how
> > scoring works with multi-term is another problem.
> >
> > Thanks & Regards!
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

--
Sincerely yours
Mikhail Khludnev
https://t.me/MUST_SEARCH
A caveat: Cyrillic!