[
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582142#comment-13582142
]
Markus Jelsma commented on SOLR-1632:
-------------------------------------
It doesn't really seem to work: we're seeing lots of NPEs, and when a response
does come through, IDF is not consistent for all terms. Most requests return one
of the NPEs below. Sometimes a request works, and then the next one fails.
{code}
java.lang.NullPointerException
    at org.apache.solr.search.stats.ExactStatsCache.sendGlobalStats(LRUStatsCache.java:202)
    at org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
    at org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
    at...
{code}
{code}
java.lang.NullPointerException
    at org.apache.solr.search.stats.LRUStatsCache.sendGlobalStats(LRUStatsCache.java:228)
    at org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
    at org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
    at...
{code}
We also see this one from time to time; it looks like it is thrown if there
are `no servers hosting shard`:
{code}
java.lang.NullPointerException
    at org.apache.solr.search.stats.LRUStatsCache.mergeToGlobalStats(LRUStatsCache.java:112)
    at org.apache.solr.handler.component.QueryComponent.updateStats(QueryComponent.java:743)
    at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:659)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:634)
    at ..
{code}
It also imposes a huge performance penalty with both LRUStatsCache and
ExactStatsCache: if you're used to 40ms response times, you'll see the average
jump to 2 seconds with very frequent 5 second spikes. Performance stays poor
even with logging disabled.
The logs are also swamped with warnings like:
{code}
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] - : ## Missing global colStats info: <FIELD>, using local
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] - : ## Missing global termStats info: <FIELD>:<TERM>, using local
{code}
Both StatsCacheImpls behave like this, and each query logs lines like the ones
above. Maybe performance is poor because it tries to look up terms every time,
but I'm not sure yet.
Finally, something crazy I'd like to share :)
{code}
-Infinity = (MATCH) sum of:
  -Infinity = (MATCH) max plus 0.35 times others of:
    -Infinity = (MATCH) weight(content_nl:amsterdam^1.6 in 449) [], result of:
      -Infinity = score(doc=449,freq=1.0 = termFreq=1.0), product of:
        1.6 = boost
        -Infinity = idf(docFreq=29800090, docCount=-1)
        1.0 = tfNorm, computed from:
          1.0 = termFreq=1.0
          1.2 = parameter k1
          0.0 = parameter b (norms omitted for field)
{code}
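A possible reading of the -Infinity above, assuming the idf is computed with the
usual BM25 formula log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)): with
docCount=-1 the fraction comes out as exactly -1, so the argument to log is 0.
The sketch below only reproduces that arithmetic. The class and method names
are made up for illustration, and 50000000 is a hypothetical document count,
not a figure from our index.

```java
public class IdfSketch {

    // BM25-style idf. This is an assumption about how the value in the
    // explain output was computed, not code taken from the patch.
    public static float idf(long docFreq, long docCount) {
        return (float) Math.log(1 + (docCount - docFreq + 0.5D) / (docFreq + 0.5D));
    }

    public static void main(String[] args) {
        long docFreq = 29800090L; // docFreq from the explain output

        // docCount=-1 as in the explain output:
        // (-1 - 29800090 + 0.5) / (29800090 + 0.5) = -1, and log(1 + -1) = log(0)
        System.out.println(idf(docFreq, -1L)); // prints -Infinity

        // Hypothetical docCount larger than docFreq: the result is finite and positive
        System.out.println(idf(docFreq, 50000000L));
    }
}
```

If that reading is right, the -1 would be a "no collection statistics available"
marker reaching the similarity unchanged, which would fit the "Missing global
colStats info" warnings, but that is a guess.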
If someone happens to recognize the issues above, I'm all ears :)
> Distributed IDF
> ---------------
>
> Key: SOLR-1632
> URL: https://issues.apache.org/jira/browse/SOLR-1632
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.5
> Reporter: Andrzej Bialecki
> Fix For: 5.0
>
> Attachments: 3x_SOLR-1632_doesntwork.patch, distrib-2.patch,
> distrib.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch,
> SOLR-1632.patch
>
>
> Distributed IDF is a valuable enhancement for distributed search across
> non-uniform shards. This issue tracks the proposed implementation of an API
> to support this functionality in Solr.