WordDelimiterGraphFactory with preserveOriginal issue

2019-05-20 Thread rodio
Hi everybody,

I'm using Solr 8.0.0 and I'm stuck on some odd behaviour that I cannot
resolve by myself.
This is my fieldType config:

[fieldType XML not preserved in the archive]

The problem is that I need preserveOriginal=1 in both analyzers, and the
results are wrong when I run a query that also involves another field.

For example, if I run this query:

idWeb: X AND name:(Leimhzolz 18x600x200 mm)

The parsed query is:

+idWeb:X +(name:leimhzolz (name:18x600x200 (+name:18 +name:x +name:600
+name:x +name:200)) name:mm)

Only docs containing "18x600x200" or "mm" are scored. Docs containing "18",
"x" or "600" get no score.

If I run this query:

name:(Leimhzolz 18x600x200 mm)

The parsed query is:

name:leimhzolz (name:18x600x200 (+name:18 +name:x +name:600 +name:x
+name:200)) name:mm

In this case, docs containing "18", "x" or "600" get a score > 0.

I have tried all kinds of combinations without success.

I would be very grateful if anyone has a solution for this.

Many thanks in advance

Kind regards






--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Duplicated tokens in search string

2019-04-09 Thread rodio
Hi all,

We are trying to reproduce the behaviour of Solr 3.6 in Solr 8.0, and we are
facing a problem that we cannot solve.

When the query contains duplicated tokens:

- Solr 8.0 scores the token only once, but applies a boost to it (2.0 in the
  explain output below)
- Solr 3.6 scores each occurrence individually, and the final score is lower

We are using the ClassicSimilarity algorithm, but we cannot prevent that
boosting.

Example: table 60 cm 50 cm

Solr 8.0

11.096966 = sum of:
  4.3195267 = sum of:
    4.3195267 = weight(name:table in 138556) [ClassicSimilarity], result of:
      4.3195267 = score(freq=1.0), product of:
        8.639053 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
          62381 = docFreq, number of documents containing term
          129615816 = docCount, total number of documents with field
        1.0 = tf(freq=1.0), with freq of:
          1.0 = freq, occurrences of term within document
        0.5 = fieldNorm
  2.7624812 = weight(name:60 in 138556) [ClassicSimilarity], result of:
    2.7624812 = score(freq=1.0), product of:
      5.5249624 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        1404402 = docFreq, number of documents containing term
        129615816 = docCount, total number of documents with field
      1.0 = tf(freq=1.0), with freq of:
        1.0 = freq, occurrences of term within document
      0.5 = fieldNorm
  4.0149584 = weight(name:cm in 138556) [ClassicSimilarity], result of:
    4.0149584 = score(freq=1.0), product of:
      2.0 = boost
      4.0149584 = idf, computed as log((docCount+1)/(docFreq+1)) + 1 from:
        6357381 = docFreq, number of documents containing term
        129615816 = docCount, total number of documents with field
      1.0 = tf(freq=1.0), with freq of:
        1.0 = freq, occurrences of term within document
      0.5 = fieldNorm
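
The arithmetic in the Solr 8.0 explain output above can be checked by hand. The Python sketch below is only an illustration and not Lucene code: the names `classic_idf` and `solr8_style_score` are made up for this example, and it assumes that the duplicated query term is collapsed into a single clause whose boost equals its number of occurrences (which is what the 2.0 boost on name:cm suggests). It also assumes that "50" does not occur in the document, since the explain has no clause for it.

```python
import math
from collections import Counter


def classic_idf(doc_freq, doc_count):
    """idf as printed in the explain: log((docCount+1)/(docFreq+1)) + 1."""
    return math.log((doc_count + 1) / (doc_freq + 1)) + 1


def solr8_style_score(query_terms, matched_doc_freq, doc_count,
                      field_norm=0.5, freq=1.0):
    """Sum boost * idf * tf * fieldNorm over the distinct matching terms.

    Assumption: a term repeated n times in the query becomes one clause
    with boost n. Terms missing from matched_doc_freq are treated as not
    present in the document and contribute nothing.
    """
    total = 0.0
    for term, occurrences in Counter(query_terms).items():
        if term not in matched_doc_freq:
            continue
        boost = float(occurrences)
        tf = math.sqrt(freq)  # ClassicSimilarity tf; 1.0 when freq is 1.0
        idf = classic_idf(matched_doc_freq[term], doc_count)
        total += boost * idf * tf * field_norm
    return total


# Figures copied from the explain output.
DOC_COUNT = 129615816
MATCHED_DOC_FREQ = {"table": 62381, "60": 1404402, "cm": 6357381}

score = solr8_style_score(["table", "60", "cm", "50", "cm"],
                          MATCHED_DOC_FREQ, DOC_COUNT)
print(round(score, 3))
```

Up to float rounding, the result should land close to the 11.096966 reported by Solr 8.0, with the name:cm clause contributing twice its unboosted weight.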

Solr 3.6

3.098446 = (MATCH) product of:
  3.8730574 = (MATCH) sum of:
    2.120801 = (MATCH) sum of:
      2.120801 = (MATCH) weight(name:table in 101441), product of:
        0.4913325 = queryWeight(name:table), product of:
          8.632854 = idf(docFreq=135231, maxDocs=279245306)
          0.05691426 = queryNorm
        4.316427 = (MATCH) fieldWeight(name:table in 101441), product of:
          1.0 = tf(termFreq(name:table)=1)
          8.632854 = idf(docFreq=135231, maxDocs=279245306)
          0.5 = fieldNorm(field=name, doc=101441)
    0.8427305 = (MATCH) weight(name:60 in 101441), product of:
      0.30972046 = queryWeight(name:60), product of:
        5.4418783 = idf(docFreq=3287778, maxDocs=279245306)
        0.05691426 = queryNorm
      2.7209392 = (MATCH) fieldWeight(name:60 in 101441), product of:
        1.0 = tf(termFreq(name:60)=1)
        5.4418783 = idf(docFreq=3287778, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
    0.45476305 = (MATCH) weight(name:cm in 101441), product of:
      0.22751924 = queryWeight(name:cm), product of:
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.05691426 = queryNorm
      1.9987894 = (MATCH) fieldWeight(name:cm in 101441), product of:
        1.0 = tf(termFreq(name:cm)=1)
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
    0.45476305 = (MATCH) weight(name:cm in 101441), product of:
      0.22751924 = queryWeight(name:cm), product of:
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.05691426 = queryNorm
      1.9987894 = (MATCH) fieldWeight(name:cm in 101441), product of:
        1.0 = tf(termFreq(name:cm)=1)
        3.9975789 = idf(docFreq=13936507, maxDocs=279245306)
        0.5 = fieldNorm(field=name, doc=101441)
  0.8 = coord(4/5)
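
For comparison, the Solr 3.6 explain output above follows a different formula: each query clause (duplicates included) contributes queryWeight * fieldWeight, and the sum is multiplied by coord. The Python sketch below only replays that arithmetic with the idf and queryNorm values copied from the explain; the function name `solr36_style_score` is hypothetical, and treating "50" as the one non-matching clause is an assumption based on coord(4/5).

```python
def solr36_style_score(query_terms, matched_idf, query_norm,
                       field_norm=0.5, tf=1.0):
    """Replay the 3.6-style explain: sum(queryWeight * fieldWeight) * coord.

    Every occurrence of a term in the query is scored as its own clause.
    Terms missing from matched_idf are assumed not to occur in the
    document; they add nothing to the sum but still count in coord.
    """
    total = 0.0
    matched_clauses = 0
    for term in query_terms:
        if term not in matched_idf:
            continue
        idf = matched_idf[term]
        query_weight = idf * query_norm
        field_weight = tf * idf * field_norm
        total += query_weight * field_weight
        matched_clauses += 1
    coord = matched_clauses / len(query_terms)
    return total * coord


# Figures copied from the explain output.
MATCHED_IDF = {"table": 8.632854, "60": 5.4418783, "cm": 3.9975789}
QUERY_NORM = 0.05691426

score_36 = solr36_style_score(["table", "60", "cm", "50", "cm"],
                              MATCHED_IDF, QUERY_NORM)
print(round(score_36, 3))
```

Up to float rounding, this should come out close to the 3.098446 in the explain. Here the two name:cm clauses are simply added, and the whole sum is scaled down by queryNorm and coord, neither of which appears in the Solr 8.0 output.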

Is it possible to configure Solr 8.0 so that duplicated tokens are scored
the way Solr 3.6 scores them?

Thanks in advance!






Re: Strange behaviour building a query: SynonymQuery

2019-04-04 Thread rodio
Hi all!

It's solved!

I have seen that we were using a deprecated filter!

Using WordDelimiterGraphFilterFactory instead of WordDelimiterFilterFactory
solves the problem.

https://lucene.apache.org/solr/guide/7_2/filter-descriptions.html#word-delimiter-filter
https://lucene.apache.org/solr/guide/7_2/filter-descriptions.html#word-delimiter-graph-filter

The parsed query is now:

name:whatever (name:7893ght23 (+name:7893 +name:ght +name:23))
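
The shape of this parsed query can be mimicked with a rough Python sketch. This is not how Solr or WordDelimiterGraphFilterFactory is implemented: the helpers `split_word_parts`, `parsed_clause` and `parsed_query` are hypothetical, the splitting rule (lowercase, then break on letter/digit transitions) is a simplification, and the output format is just copied from the parsed query above.

```python
import re


def split_word_parts(token):
    """Split a token into runs of letters and runs of digits (simplified)."""
    return re.findall(r"[a-z]+|[0-9]+", token.lower())


def parsed_clause(field, token, preserve_original=True):
    """Render one query token the way the parsed query above is printed."""
    original = token.lower()
    parts = split_word_parts(token)
    if len(parts) <= 1:
        # Nothing to split: a plain term clause.
        return f"{field}:{original}"
    # All sub-parts are required together, as in (+name:7893 +name:ght ...).
    required = " ".join(f"+{field}:{part}" for part in parts)
    if preserve_original:
        # The original token is an alternative to the group of sub-parts.
        return f"({field}:{original} ({required}))"
    return f"({required})"


def parsed_query(field, text):
    """Join the clauses of all whitespace-separated tokens."""
    return " ".join(parsed_clause(field, token) for token in text.split())


print(parsed_query("name", "Whatever 7893GHT23"))
```

With preserveOriginal the original token and the group of its sub-parts end up as alternatives inside one clause, which is the structure shown in the parsed query above.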

Regards








Strange behaviour building a query: SynonymQuery

2019-04-04 Thread rodio
Hi all,

This is my first question on this forum. I'm a newbie with Solr, so I would
be very grateful if someone could answer it.

We are evaluating the new version, Solr 8.

The problem is that when we build a query using WordDelimiterFilterFactory
with preserveOriginal = 1, the parsed query is not what we expect.

For example:

name:(Whatever 7893GHT23)

The parsed query is:

name:whatever Synonym(name:7893 name:7893ght23) name:ght name:23

The expected parsed query is:

name:whatever Synonym(name:7893 name:7893ght23 name:ght name:23)

We have tried the synonymQueryStyle attribute with different values in the
fieldType without success; the Synonym clause always contains only two terms!


[fieldType XML not preserved in the archive]

We have an old Solr (version 3.6) that builds the query properly:

name:whatever (name:7893ght23 name:7893 name:ght name:23)

It would be great if anyone knows what could be happening.

Thanks in advance!!!





