[jira] [Commented] (SOLR-1632) Distributed IDF

2016-11-04 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636656#comment-15636656
 ] 

Erick Erickson commented on SOLR-1632:
--

Please ask usage questions on the users' list; see "mailing lists" here: 
http://lucene.apache.org/solr/resources.html


You'll get a lot more eyeballs on the question and likely a much faster answer.

> Distributed IDF
> ---
>
> Key: SOLR-1632
> URL: https://issues.apache.org/jira/browse/SOLR-1632
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.5
>Reporter: Andrzej Bialecki 
>Assignee: Anshum Gupta
> Fix For: 5.0, 6.0
>
> Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, 
> distrib.patch
>
>
> Distributed IDF is a valuable enhancement for distributed search across 
> non-uniform shards. This issue tracks the proposed implementation of an API 
> to support this functionality in Solr.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1632) Distributed IDF

2016-11-04 Thread blackwing (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15635553#comment-15635553
 ] 

blackwing commented on SOLR-1632:
-

I've activated distributed IDF. I have two shards in my collection: shard1 
contains 1000 docs and shard2 contains 800 docs.

So is the maxDoc used to calculate IDF for a particular doc's score 1000+800?
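
A rough sketch of the arithmetic behind this question. It assumes that a global stats cache such as ExactStatsCache sums the per-shard statistics, and that scoring uses a BM25-style IDF. The per-shard docFreq values (30 and 10) and all class and method names are made up for illustration; this is not Solr's code.

```java
// Illustrative sketch only; not taken from Solr's stats cache implementation.
final class GlobalIdfSketch {

    /** Sums per-shard counts, as a global stats cache is assumed to do. */
    static long sum(long... perShard) {
        long total = 0;
        for (long n : perShard) {
            total += n;
        }
        return total;
    }

    /** BM25-style IDF (assumed): ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). */
    static double bm25Idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        long docCount = sum(1000, 800); // shard sizes from the question
        long docFreq = sum(30, 10);     // hypothetical per-shard docFreq values

        System.out.println("global docCount = " + docCount);
        System.out.println("global IDF      = " + bm25Idf(docFreq, docCount));
        System.out.println("shard1-only IDF = " + bm25Idf(30, 1000));
        System.out.println("shard2-only IDF = " + bm25Idf(10, 800));
    }
}
```

Under these assumptions the document count is 1800 on every shard, so a term gets the same IDF wherever its document lives, instead of one value per shard.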




[jira] [Commented] (SOLR-1632) Distributed IDF

2015-09-18 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805205#comment-14805205
 ] 

Varun Thacker commented on SOLR-1632:
-

I think the check should be changed from {{controlScore.floatValue() > 
shardScore.floatValue()}} to {{controlScore.floatValue() >= 
shardScore.floatValue()}}.

I understand the motivation here: once a term starts getting 'rare', the 
score will be higher because the stats come only from the individual shards. 

The first part of the test doesn't seem to be triggering this though:

{code}
del("*:*");
for (int i = 0; i < clients.size(); i++) {
  int shard = i + 1;
  for (int j = 0; j <= i; j++) {
    index_specific(i, id, docId++, "a_t", "one two three",
        "shard_i", shard);
  }
}
{code}
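
Why {{>=}} may be needed can be sketched numerically. In the loop above every document carries the same text, so on each shard (and on the control index) docFreq equals the document count. Assuming a classic TF-IDF similarity whose IDF is 1 + ln((docCount + 1) / (docFreq + 1)), the IDF component is then exactly 1 everywhere and control and shard tie on it. An older variant, 1 + ln(numDocs / (docFreq + 1)), would still grow with index size. Both formulas, the shard count, and all names below are assumptions for illustration, not taken from the test.

```java
// Illustrative sketch only; the formulas are assumptions, not read from the test.
final class ClassicIdfSketch {

    /** Older variant (assumed): 1 + ln(numDocs / (docFreq + 1)). */
    static double oldIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /** Newer variant (assumed): 1 + ln((docCount + 1) / (docFreq + 1)). */
    static double newIdf(long docFreq, long docCount) {
        return 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
    }

    public static void main(String[] args) {
        int shards = 3; // hypothetical number of clients
        long total = 0;
        for (int i = 0; i < shards; i++) {
            long docs = i + 1; // shard i receives i + 1 identical docs
            total += docs;
            System.out.printf("shard %d: docs=%d oldIdf=%.4f newIdf=%.4f%n",
                    i + 1, docs, oldIdf(docs, docs), newIdf(docs, docs));
        }
        // The control index holds every document, so docFreq == total there too.
        System.out.printf("control: docs=%d oldIdf=%.4f newIdf=%.4f%n",
                total, oldIdf(total, total), newIdf(total, total));
    }
}
```

With the newer variant a strict {{>}} comparison cannot hold for this data, because no term is rarer on one shard than on another.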



> Distributed IDF
> ---
>
> Key: SOLR-1632
> URL: https://issues.apache.org/jira/browse/SOLR-1632
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.5
>Reporter: Andrzej Bialecki 
>Assignee: Anshum Gupta
> Fix For: 5.0, Trunk
>
> Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
> SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, 
> distrib.patch
>
>
> Distributed IDF is a valuable enhancement for distributed search across 
> non-uniform shards. This issue tracks the proposed implementation of an API 
> to support this functionality in Solr.






[jira] [Commented] (SOLR-1632) Distributed IDF

2015-09-09 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737856#comment-14737856
 ] 

Yonik Seeley commented on SOLR-1632:


LUCENE-6758 removed part of the test of this issue:

{code}
--- lucene/dev/trunk/solr/core/src/test/org/apache/solr/search/stats/TestDefaultStatsCache.java 2015/09/09 03:13:44 1701894
+++ lucene/dev/trunk/solr/core/src/test/org/apache/solr/search/stats/TestDefaultStatsCache.java 2015/09/09 03:16:15 1701895
@@ -79,10 +79,6 @@
     if (clients.size() == 1) {
       // only one shard
       assertEquals(controlScore, shardScore);
-    } else {
-      assertTrue("control:" + controlScore.floatValue() + " shard:"
-          + shardScore.floatValue(),
-          controlScore.floatValue() > shardScore.floatValue());
     }
   }
{code}

http://svn.apache.org/viewvc?view=revision&revision=1701895

Was it testing something important, and can it be replaced with something else?




[jira] [Commented] (SOLR-1632) Distributed IDF

2015-01-18 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282225#comment-14282225
 ] 

Anshum Gupta commented on SOLR-1632:


[~ysee...@gmail.com]: I did give it a thought, but it would be tricky to support 
something like stats=implementation for each request. We could, however, have 
something like 'stats=local' or 'stats=global' where, in the latter case, it 
uses the implementation specified in the config. But yes, we could evaluate 
that more.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-29 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260436#comment-14260436
 ] 

ASF subversion and git services commented on SOLR-1632:
---

Commit 1648428 from [~anshumg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1648428 ]

SOLR-1632: Distributed IDF, finally. (merge from trunk)




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-26 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259225#comment-14259225
 ] 

Yonik Seeley commented on SOLR-1632:


bq. This isn't switched on by default as it certainly comes at some cost

What would be really nice is to enable this on a per-request basis, perhaps 
via globalStats=true.
We can open up a new issue if it's difficult enough...




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-22 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255813#comment-14255813
 ] 

Erick Erickson commented on SOLR-1632:
--

WhoooHo!




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-22 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256107#comment-14256107
 ] 

Shawn Heisey commented on SOLR-1632:


The commit is too large to digest easily.  I assume this is on by default?  Can 
it be enabled and disabled?

I will likely be using this once it's available, but do we have any idea what 
the performance impact is?





[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-22 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256121#comment-14256121
 ] 

Anshum Gupta commented on SOLR-1632:


This isn't switched on by default as it certainly comes at some cost (there are 
no free lunches, remember?) :)

It can be switched on by specifying what implementation you want via top-level 
solrconfig setting or System property i.e.:
{code}
 <statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
{code}

About the performance impact: I tested it on my machine (which is not really a 
great test, as there's barely any possibility of network issues here) with 
about 6 million docs (a real and mocked-up Jeopardy questions dataset) and regular 
queries, and the performance impact was barely noticeable.

I still need to document this (which I'll add to the ref guide once this makes 
it into 5x) and I suppose things would be easier to understand for the end user 
then.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256378#comment-14256378
 ] 

Mark Miller commented on SOLR-1632:
---

We should get some results across real machines, but I also turned my micro 
bench work onto this. I didn't confirm that the settings are actually taking 
effect, or review the latest work, but I ran the benchmark twice, once with 
LocalStatsCache and once with ExactStatsCache. 

bq.  <statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
bq.  <statsCache class="org.apache.solr.search.stats.LocalStatsCache"/>

The test uses two machines: one to create and send the docs/queries, another to 
run the Solr JVMs. I ran a query test using a ton of Wikipedia data across 6 
JVM instances, 6 shards, no replication. I indexed a ton of docs, and then used 
a bunch of threads and a bunch of CloudSolrServer instances to pound in some queries. 
Performance appeared nearly identical.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-22 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256394#comment-14256394
 ] 

Anshum Gupta commented on SOLR-1632:


Right, I saw similar behavior in my tests. I think the impact would really show 
when there's a ton of query terms across multiple shards that actually use the 
network.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255502#comment-14255502
 ] 

ASF subversion and git services commented on SOLR-1632:
---

Commit 1647253 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1647253 ]

SOLR-1632: Distributed IDF, finally.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-21 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14255504#comment-14255504
 ] 

Anshum Gupta commented on SOLR-1632:


Thanks to everyone who's contributed to this one! The list is long :)
I've committed this to trunk; if all stays well, I'll commit it to 5x later 
in the (coming) week.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-17 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250963#comment-14250963
 ] 

Anshum Gupta commented on SOLR-1632:


I plan on committing this sometime over the weekend.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
Assignee: Anshum Gupta
 Fix For: 5.0, Trunk

 Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.






[jira] [Commented] (SOLR-1632) Distributed IDF

2014-12-15 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246473#comment-14246473
 ] 

Anshum Gupta commented on SOLR-1632:


I think we should get this in now. This would not be enabled by default i.e. 
LocalStatsCache impl would be used anyways.




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-09-29 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152329#comment-14152329
 ] 

Anshum Gupta commented on SOLR-1632:


Thanks for updating the patch [~vzhovtiuk].
The tests pass now. I'm looking at the updated patch.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 4.9, Trunk

 Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.






[jira] [Commented] (SOLR-1632) Distributed IDF

2014-09-26 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149902#comment-14149902
 ] 

Anshum Gupta commented on SOLR-1632:


I've uploaded an updated patch to the review board. It applies to current trunk 
but has a failing TestLRUStatsCache.
[~vzhovtiuk] Can you have a look at it too if you have time?

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 4.9, Trunk

 Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-5488.patch, 
 distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.






[jira] [Commented] (SOLR-1632) Distributed IDF

2014-09-22 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142964#comment-14142964
 ] 

Anshum Gupta commented on SOLR-1632:


I'd created a reviewboard request to look and compare the last few patches. 
Thought I'd share that here.
https://reviews.apache.org/r/25855/




[jira] [Commented] (SOLR-1632) Distributed IDF

2014-03-11 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13930406#comment-13930406
 ] 

Markus Jelsma commented on SOLR-1632:
-

No, but I think this happened when the QueryCommand code
{code}
public StatsSource getStatsSource() { return statsSource; }
public QueryCommand setStatsSource(StatsSource dfSource) {
  this.statsSource = dfSource;
  return this;
}
{code}

got removed.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 4.7, 5.0

 Attachments: 3x_SOLR-1632_doesntwork.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, 
 SOLR-1632.patch, SOLR-1632.patch, SOLR-1632.patch, distrib-2.patch, 
 distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.






[jira] [Commented] (SOLR-1632) Distributed IDF

2014-03-10 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925596#comment-13925596
 ] 

Markus Jelsma commented on SOLR-1632:
-

Hi Vitaly, are you sure it still works? I tried your patch and a few older patches 
again, but docCounts are no longer the sum across the cluster. The GET_STATS 
query is executed, though.

Two node test cluster:

{code}
384841 [qtp1175813699-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={distrib=false&debug=track&wt=javabin&requestPurpose=GET_TERM_STATS&version=2&rows=10&debugQuery=false&shard.url=http://127.0.1.1:8983/solr/collection1/&NOW=139039677&rid=-collection1-139039677-12&shards.purpose=2&q=wiki&isShard=true} status=0 QTime=1
384848 [qtp1175813699-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={distrib=false&debug=track&wt=javabin&requestPurpose=GET_TOP_IDS,GET_STATS,GET_TERMS,GET_MLT_RESULTS,SET_TERM_STATS&version=2&rows=10&org.apache.solr.stats.colStats=content_nl,121630,115956,16436279,11372267&org.apache.solr.stats.terms=content_nl:wiki&NOW=139039677&shard.url=http://127.0.1.1:8983/solr/collection1/&debugQuery=false&fl=id,score&shards.purpose=5636&rid=-collection1-139039677-12&start=0&q=wiki&org.apache.solr.stats.termStats=content_nl:wiki,284,645&isShard=true&fsv=true} hits=138 status=0 QTime=1
384863 [qtp1175813699-17] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={ids=http://nl.wikipedia.org/wiki/Overleg_sjabloon:Infobox_film,http://nl.wikipedia.org/wiki/Overleg_sjabloon:Navigatie_Bijbel,http://nl.wikipedia.org/wiki/Overleg_help:Gebruik_van_sjablonen,http://nl.wikipedia.org/wiki/Overleg_sjabloon:Citeer_boek,http://nl.wikipedia.org/wiki/Overleg_sjabloon:Wikt&distrib=false&debug=track&wt=javabin&requestPurpose=GET_FIELDS,GET_DEBUG&version=2&rows=10&debugQuery=true&shard.url=http://127.0.1.1:8983/solr/collection1/&NOW=139039677&rid=-collection1-139039677-12&shards.purpose=320&q=wiki&isShard=true} status=0 QTime=7
384870 [qtp1175813699-13] INFO  org.apache.solr.core.SolrCore  – [collection1] webapp=/solr path=/select params={debugQuery=true&q=wiki} rid=-collection1-139039677-12 hits=284 status=0 QTime=33

{code}

{code}
380242 [qtp1175813699-16] INFO  org.apache.solr.core.SolrCore  – [collection1] 
webapp=/solr path=/select 
params={distrib=false&debug=track&wt=javabin&requestPurpose=GET_TERM_STATS&version=2&rows=10&debugQuery=false&shard.url=http://127.0.1.1:7574/solr/collection1/&NOW=139039677&rid=-collection1-139039677-12&shards.purpose=2&q=wiki&isShard=true}
 status=0 QTime=0 
380249 [qtp1175813699-16] INFO  org.apache.solr.core.SolrCore  – [collection1] 
webapp=/solr path=/select 
params={distrib=false&debug=track&wt=javabin&requestPurpose=GET_TOP_IDS,GET_STATS,GET_TERMS,GET_MLT_RESULTS,SET_TERM_STATS&version=2&rows=10&org.apache.solr.stats.colStats=content_nl,121630,115956,16436279,11372267&org.apache.solr.stats.terms=content_nl:wiki&NOW=139039677&shard.url=http://127.0.1.1:7574/solr/collection1/&debugQuery=false&fl=id,score&shards.purpose=5636&rid=-collection1-139039677-12&start=0&q=wiki&org.apache.solr.stats.termStats=content_nl:wiki,284,645&isShard=true&fsv=true}
 hits=146 status=0 QTime=2 
380263 [qtp1175813699-16] INFO  org.apache.solr.core.SolrCore  – [collection1] 
webapp=/solr path=/select 
params={ids=http://nl.wikipedia.org/wiki/Overleg_sjabloon:Navigatie,http://nl.wikipedia.org/wiki/Overleg_help:Waarom_staat_mijn_bestand_op_de_beoordelingslijst,http://nl.wikipedia.org/wiki/Overleg_help:Wikipediachat,http://nl.wikipedia.org/wiki/Overleg_sjabloon:Coördinaten,http://nl.wikipedia.org/wiki/Overleg_sjabloon:Sjabdoc/doc&distrib=false&debug=track&wt=javabin&requestPurpose=GET_FIELDS,GET_DEBUG&version=2&rows=10&debugQuery=true&shard.url=http://127.0.1.1:7574/solr/collection1/&NOW=139039677&rid=-collection1-139039677-12&shards.purpose=320&q=wiki&isShard=true}
 status=0 QTime=6 
{code}

But I get these scores:

{code}

12.8123455 = (MATCH) weight(content_nl:wiki in 18636) [], result of:
  12.8123455 = score(doc=18636,freq=33.0 = termFreq=33.0
), product of:
6.0355678 = idf(docFreq=138, docCount=57897)
2.122807 = tfNorm, computed from:
  33.0 = termFreq=33.0
  1.2 = parameter k1
  0.0 = parameter b (norms omitted for field)
{code}

{code}

12.558066 = (MATCH) weight(content_nl:wiki in 60634) [], result of:
  12.558066 = score(doc=60634,freq=25.0 = termFreq=25.0
), product of:
5.982207 = idf(docFreq=146, docCount=58059)
2.0992365 = tfNorm, computed from:
  25.0 = termFreq=25.0
  1.2 = parameter k1
  0.0 = parameter b (norms omitted for field)
{code}

Did it work for you?
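
For reference, the idf values in the two explains above can be reproduced with the BM25 idf formula, ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). The sketch below is not Solr code: the class and method names are made up for illustration, and it assumes that the third value in the colStats parameter (115956) is the global docCount, which does match 57897 + 58059. With the summed statistics both shards would report the same idf, about 6.01, instead of the two different local values.

```java
// Illustrative sketch only; not part of any SOLR-1632 patch.
final class Bm25IdfCheck {
    /** BM25 idf as shown in the explain output. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5d) / (docFreq + 0.5d));
    }

    public static void main(String[] args) {
        // Local (per-shard) statistics, taken from the two explains.
        System.out.println("shard 1 local idf: " + idf(138, 57897));
        System.out.println("shard 2 local idf: " + idf(146, 58059));
        // Summed statistics, assuming docFreq and docCount simply add up across shards.
        System.out.println("global idf:        " + idf(138 + 146, 57897 + 58059));
    }
}
```

Since the explains show docFreq=138 and docFreq=146 rather than 284, the local numbers are apparently still being used for scoring even though the global ones are sent in the request.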


[jira] [Commented] (SOLR-1632) Distributed IDF

2014-03-10 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13925920#comment-13925920
 ] 

Mark Miller commented on SOLR-1632:
---

bq.  I tried yours and a few older patches again, but the docCounts are no longer 
the sum across the cluster. 

Do you see what is missing in the tests to catch this?




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-12-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843025#comment-13843025
 ] 

Markus Jelsma commented on SOLR-1632:
-

It is much faster now, even usable. But I haven't tried it in a larger cluster 
yet.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-12-07 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842271#comment-13842271
 ] 

Mark Miller commented on SOLR-1632:
---

[~markus17], how was performance with your most recent patch compared to what 
you first reported?




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-12-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842077#comment-13842077
 ] 

Mark Miller commented on SOLR-1632:
---

I've got two main concerns: the thread local, and the statscache, which looks 
like it is not thread safe but is shared across threads.

The threadlocal is concerning because you can have thousands of threads and 
each will cache how many stats? I wish we could do something better.
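
One possible direction for both concerns, sketched under assumptions: keep a single cache shared by all request threads and back it with a concurrent map, instead of one copy per thread. The class below is hypothetical and is not the patch's StatsCache API; the key format and the TermStats holder are invented for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch; not the StatsCache classes from the patch.
final class SharedTermStatsCache {
    /** Immutable holder, safe to share between threads. */
    static final class TermStats {
        final long docFreq;
        final long totalTermFreq;

        TermStats(long docFreq, long totalTermFreq) {
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    // One map for all threads: memory grows with distinct terms, not with thread count.
    private final ConcurrentMap<String, TermStats> stats = new ConcurrentHashMap<>();

    /** Adds one shard's counts to the entry for the term; the update is atomic per key. */
    void merge(String term, long docFreq, long totalTermFreq) {
        stats.merge(term, new TermStats(docFreq, totalTermFreq),
            (a, b) -> new TermStats(a.docFreq + b.docFreq, a.totalTermFreq + b.totalTermFreq));
    }

    /** Returns the merged stats, or null if nothing was received for this term. */
    TermStats get(String term) {
        return stats.get(term);
    }
}
```

The map here is unbounded, so a real version would still need some eviction policy; the sketch only shows that sharing and thread safety do not require a thread local.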




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-11-26 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833392#comment-13833392
 ] 

Mark Miller commented on SOLR-1632:
---

I'm looking at a couple of the test fails before I go to bed tonight:

{quote}
   [junit4] Tests with failures:
   [junit4]   - org.apache.solr.handler.component.QueryElevationComponentTest.testGroupedQuery
   [junit4]   - org.apache.solr.TestDistributedSearch.testDistribSearch
   [junit4]   - org.apache.solr.search.stats.TestLRUStatsCache.testDistribSearch
   [junit4]   - org.apache.solr.TestGroupingSearch.testGroupingGroupSortingScore_basicWithGroupSortEqualToSort
   [junit4]   - org.apache.solr.TestGroupingSearch.testGroupingGroupSortingScore_withTotalGroupCount
   [junit4]   - org.apache.solr.TestGroupingSearch.testGroupingGroupSortingScore_basic
   [junit4]   - org.apache.solr.search.stats.TestExactStatsCache.testDistribSearch
   [junit4]   - org.apache.solr.update.AddBlockUpdateTest.testXML
   [junit4]   - org.apache.solr.update.AddBlockUpdateTest.testSolrJXML
{quote}




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-11-26 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833478#comment-13833478
 ] 

Mark Miller commented on SOLR-1632:
---

The config you need to use to turn this on is now:

<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>

It needs to go in the top level config section.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-11-26 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833483#comment-13833483
 ] 

Mark Miller commented on SOLR-1632:
---

The thread local still scares me ... need to look closer at that.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-10-23 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803043#comment-13803043
 ] 

David commented on SOLR-1632:
-

Is this patch currently working in 5.0?




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-10-23 Thread David (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803040#comment-13803040
 ] 

David commented on SOLR-1632:
-

It seems like this task should have a much higher priority. Distributed IDF is 
very important for scoring across non-uniform shards. I am currently using Solr 
Cloud with grouping, and without distributed IDF my boost functions are rendered 
nearly useless in terms of the result ordering expected.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-10-23 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803064#comment-13803064
 ] 

Markus Jelsma commented on SOLR-1632:
-

No, it does not work at all. I did spend some time on it but had other things 
to do. In the end I removed my (not working) changes and uploaded a patch that 
at least compiles against the revision of that time.






[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582022#comment-13582022
 ] 

Markus Jelsma commented on SOLR-1632:
-

No, not yet. Please let me do some real tests, there must be issues, the patch 
is over a year old! :)




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582142#comment-13582142
 ] 

Markus Jelsma commented on SOLR-1632:
-

It doesn't really seem to work: we're seeing lots of NPEs, and when a response 
does come through, IDF is not consistent for all terms. Most requests return one 
of the NPEs below. Sometimes it works, and then the second request just fails.

{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.ExactStatsCache.sendGlobalStats(ExactStatsCache.java:202)
at 
org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
at 
org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
at...
{code}

{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.LRUStatsCache.sendGlobalStats(LRUStatsCache.java:228)
at 
org.apache.solr.handler.component.QueryComponent.createMainQuery(QueryComponent.java:783)
at 
org.apache.solr.handler.component.QueryComponent.regularDistributedProcess(QueryComponent.java:618)
at...
{code}

We also see this one from time to time; it looks like it is thrown if there 
are `no servers hosting shard`:
{code}
java.lang.NullPointerException
at 
org.apache.solr.search.stats.LRUStatsCache.mergeToGlobalStats(LRUStatsCache.java:112)
at 
org.apache.solr.handler.component.QueryComponent.updateStats(QueryComponent.java:743)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:659)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:634)
at ..
{code}

It also imposes a huge performance penalty with both LRUStatsCache and 
ExactStatsCache: if you're used to 40ms response times, you'll see the average 
jump to 2 seconds with very frequent 5 second spikes. Performance stays poor if 
logging is disabled.

The logs are also swamped with lines like:
{code}
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] 
- : ## Missing global colStats info: FIELD, using local
2013-02-20 11:54:48,091 WARN [search.stats.LRUStatsCache] - [http-8080-exec-5] 
- : ## Missing global termStats info: FIELD:TERM, using local
{code}

Both StatsCacheImpls behave like this. Each query logs lines like the above. Maybe 
performance is poor because it tries to look up terms every time, but I'm not 
sure yet.


Finally, something crazy I'd like to share :)
{code}
-Infinity = (MATCH) sum of:
  -Infinity = (MATCH) max plus 0.35 times others of:
-Infinity = (MATCH) weight(content_nl:amsterdam^1.6 in 449) [], result of:
  -Infinity = score(doc=449,freq=1.0 = termFreq=1.0
), product of:
1.6 = boost
-Infinity = idf(docFreq=29800090, docCount=-1)
1.0 = tfNorm, computed from:
  1.0 = termFreq=1.0
  1.2 = parameter k1
  0.0 = parameter b (norms omitted for field)
{code}

If someone happens to recognize the issues above, I'm all ears :)
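
The -Infinity in that explain is what the BM25 idf formula yields when docCount is -1: with docFreq=29800090 the fraction becomes exactly -1, so the logarithm is taken of 0. A minimal sketch follows; the class name, the maxDoc value used in main, and the fallback to maxDoc are illustrative assumptions, not code from the patch.

```java
// Illustrative sketch only; not part of any SOLR-1632 patch.
final class NegativeDocCountIdf {
    /** BM25 idf: ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5d) / (docFreq + 0.5d));
    }

    /** Guarded variant: treats docCount == -1 as "not available" and uses maxDoc instead. */
    static double idfWithFallback(long docFreq, long docCount, long maxDoc) {
        long effectiveDocCount = (docCount == -1) ? maxDoc : docCount;
        return idf(docFreq, effectiveDocCount);
    }

    public static void main(String[] args) {
        // Values from the explain: docFreq=29800090, docCount=-1.
        System.out.println(idf(29800090L, -1L));
        // Hypothetical maxDoc, chosen only to show a finite result.
        System.out.println(idfWithFallback(29800090L, -1L, 40000000L));
    }
}
```

This suggests the merged colStats carried -1 for docCount (the "statistic not available" marker) while docFreq was summed normally, so the place to look is probably where the per-shard docCount values are added up.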




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582178#comment-13582178
 ] 

Mark Miller commented on SOLR-1632:
---

Hmm, that makes it look like the current tests for this must be pretty weak 
then.




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-20 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582188#comment-13582188
 ] 

Markus Jelsma commented on SOLR-1632:
-

Things have changed a lot in the past 13 months and I haven't figured it all 
out yet. I'll try to make sense of it, but some expert opinion and trial of 
the patch would be more than helpful. Is Andrzej not around? 




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-02-19 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581945#comment-13581945
 ] 

Mark Miller commented on SOLR-1632:
---

Nice. I mentioned this to AB not too long ago, but I'm of the mind to simply 
commit this. It will default to off, and we can continue to work on it.

So unless someone steps in, I'll commit what Markus has put up.

Markus, have you tried this out at all beyond the unit tests - eg on a cluster?




[jira] [Commented] (SOLR-1632) Distributed IDF

2013-01-03 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543008#comment-13543008
 ] 

Markus Jelsma commented on SOLR-1632:
-

Any progress to report or does anyone have a patch that is updated for trunk?




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195619#comment-13195619
 ] 

Yonik Seeley commented on SOLR-1632:


bq. There is nothing different from a MTQ generated BQ than a huge BQ a solr 
user submits.

Multi-term queries like range query, prefix query, etc, do not depend on term 
stats, and can consist of millions of terms.  It's a waste to attempt to return 
term stats for them (estimated or not).

It would also be a shame to use estimates rather than exact numbers for what 
will be the common case (i.e. when there's really only a couple of terms you 
need stats for):
 +title:blue whale  +title_whole:[a TO g}
  or
 +title:blue whale  +date:[2001-01-01 TO 2010-01-01}

Ideally, we wouldn't even do a rewrite in order to collect terms - rewrite 
itself has gotten much more expensive in some circumstances (i.e. iterating the 
first 350 terms to determine what style of rewrite should be used)




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195632#comment-13195632
 ] 

Robert Muir commented on SOLR-1632:
---

{quote}
Multi-term queries like range query, prefix query, etc, do not depend on term 
stats, and can consist of millions of terms. 
{quote}
No, they cannot.

It can't be millions of terms, because a million exceeds the
boolean max clause count, in which case it will always use a filter.

{quote}
Ideally, we wouldn't even do a rewrite in order to collect terms
{quote}

You don't have to, Lucene's test case (ShardSearchingTestBase) doesn't do an 
extra rewrite to collect terms.
{code}
@Override
public Query rewrite(Query original) throws IOException {
  final Query rewritten = super.rewrite(original);
  final Set<Term> terms = new HashSet<Term>();
  rewritten.extractTerms(terms);

  // Make a single request to remote nodes for term
  // stats:
  ...
  return rewritten;
}
{code}

{quote}
 - rewrite itself has gotten much more expensive in some circumstances (i.e. 
iterating the first 350 terms to determine what style of rewrite should be used)
{quote}
Got any benchmarks to back this up with?

It's incorrect to say rewrite has gotten more expensive. More expensive than 
what? 
It's the opposite: it's actually much faster when rewriting to boolean queries in 
4.0 because it always works per-segment.





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195635#comment-13195635
 ] 

Yonik Seeley commented on SOLR-1632:


bq. it can't be millions of terms because a million exceeds the boolean max 
clause count, in which case it will always use a filter.

So depending on exactly how many terms the range query covers, extractTerms may 
or may not return any.
So extractTerms() may return 300 terms the first time, and then after someone 
adds some docs to the index it may suddenly return 0.
This just strengthens the case that we should be consistent and just always 
ignore the terms from these MTQs.
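
The flip described above can be reproduced with a toy model. This is only a sketch under stated assumptions: a sorted set of strings stands in for the term dictionary, `termsAfterRewrite` and the `cutoff` parameter are hypothetical names, and the real rewrite logic in Lucene is more involved than a single size check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model only: a sorted set of strings stands in for the term dictionary.
final class AutoRewriteSketch {

  // Returns the terms a range would expose after rewrite: the matched terms
  // while they fit under the cutoff, nothing once the range is treated as a filter.
  static List<String> termsAfterRewrite(NavigableSet<String> termDict,
                                        String lower, String upper, int cutoff) {
    NavigableSet<String> matched = termDict.subSet(lower, true, upper, true);
    if (matched.size() > cutoff) {
      return Collections.emptyList(); // filter-style rewrite: no terms to extract
    }
    return new ArrayList<>(matched);  // boolean-style rewrite: one term per clause
  }

  public static void main(String[] args) {
    NavigableSet<String> dict = new TreeSet<>();
    dict.add("a1");
    dict.add("a2");
    dict.add("a3");
    System.out.println(termsAfterRewrite(dict, "a", "b", 3).size()); // prints 3
    dict.add("a4"); // one more indexed term pushes the range over the cutoff
    System.out.println(termsAfterRewrite(dict, "a", "b", 3).size()); // prints 0
  }
}
```

The same query yields three terms before the extra document and none after it, which is the inconsistency that argues for always ignoring these terms.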

bq. It's incorrect to say rewrite has gotten more expensive. More expensive than 
what?

Sorry, I wasn't specific enough. I meant compared to back when Solr had its 
own RangeFilter and PrefixFilter that it would wrap in a ConstantScoreQuery.  
There never was any rewrite-to-boolean-query or consulting the index, so that was 
obviously a faster rewrite().

But back to the original question - I still see no reason to 
request/return/cache terms/stats from these multi-term queries when by 
definition they should not change the results of the request.





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195637#comment-13195637
 ] 

Uwe Schindler commented on SOLR-1632:
-

bq. Sorry, I wasn't specific enough. I meant compared to back when Solr had 
its own RangeFilter and PrefixFilter that it would wrap in a 
ConstantScoreQuery. There never was any rewrite-to-boolean-query or consulting 
the index, so it's obviously a faster rewrite().

Just set in Solr the rewrite mode of MTQ to CONSTANT_SCORE_FILTER_REWRITE - 
done. There is no discussion needed and no custom RangeQuery in Solr.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195638#comment-13195638
 ] 

Yonik Seeley commented on SOLR-1632:


bq. Just set in Solr the rewrite mode of MTQ to CONSTANT_SCORE_FILTER_REWRITE - 
done.

Right - I was considering the best way to do this (passing that info around 
solr about when to use what method).
It solves both issues - relatively expensive rewrites that are not needed, and  
ignoring the MTQ terms.





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-28 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195644#comment-13195644
 ] 

Robert Muir commented on SOLR-1632:
---

{quote}
But back to the original question - I still see no reason to 
request/return/cache terms/stats from these multi-term queries when by 
definition they should not change the results of the request.
{quote}

My original point (setting aside the specifics of MTQ, how things are being 
scored, or anything else) is still that it's a general case of Query that can 
have lots of Terms.

So if there are concerns about lots of terms, I still think it's worth 
considering having some limits on how many Terms would be exchanged. Maybe 
BooleanQuery's max clause count is already good enough, but another way to do 
it would be to have an implementation that approximates once the term count 
for a query gets too high.





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194814#comment-13194814
 ] 

Robert Muir commented on SOLR-1632:
---

Thanks Andrzej: I think it will be nice that all of lucene's scoring algorithms 
can work in distributed mode.

Just one question about the patch: in StatsUtil I can't tell if termFromString 
matches termToString?
termToString seems to base64 encode the term text (a good idea, since terms can 
be binary), but I don't
see the corresponding decode in termFromString (there is an XXX: comment 
though).
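
The symmetry being asked about can be sketched with the JDK alone. The method names below mirror the ones mentioned above but are hypothetical stand-ins, not the code in the patch; the `field:base64` wire layout is an assumption, and `java.util.Base64` requires Java 8 or later.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical stand-in for the patch's StatsUtil; not the actual Solr code.
final class TermCodecSketch {

  // Simple holder for a decoded term: field name plus raw term bytes.
  static final class DecodedTerm {
    final String field;
    final byte[] bytes;

    DecodedTerm(String field, byte[] bytes) {
      this.field = field;
      this.bytes = bytes;
    }
  }

  // Encodes as "field:base64(termBytes)" so binary terms survive transport.
  static String termToString(String field, byte[] termBytes) {
    return field + ":" + Base64.getEncoder().encodeToString(termBytes);
  }

  // Exact inverse of termToString. The base64 alphabet contains no ':',
  // so splitting on the last ':' is safe even if the field name has one.
  static DecodedTerm termFromString(String encoded) {
    int sep = encoded.lastIndexOf(':');
    if (sep < 0) {
      throw new IllegalArgumentException("missing ':' in " + encoded);
    }
    String field = encoded.substring(0, sep);
    byte[] bytes = Base64.getDecoder().decode(encoded.substring(sep + 1));
    return new DecodedTerm(field, bytes);
  }

  public static void main(String[] args) {
    byte[] raw = {0x00, (byte) 0xff, 0x2c, 0x3a}; // not valid text, includes ',' and ':'
    DecodedTerm back = termFromString(termToString("title", raw));
    System.out.println(back.field + " " + Arrays.equals(raw, back.bytes));
  }
}
```

Whatever encoding is chosen, a round-trip check like the one in `main` is a cheap way to catch an encoder and decoder drifting apart.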






[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Andrzej Bialecki (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194842#comment-13194842
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

Hmm, indeed...  I must have switched to toString() for debugging (it's easier to 
eyeball an ascii string than a base64 string ;) ). This should use base64 
throughout. I'll prepare a patch shortly.

(BTW, I'm aware that passing around blobs of base64 inside SolrParams is ugly. 
I'm open to suggestions how to handle this better).




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194860#comment-13194860
 ] 

Yonik Seeley commented on SOLR-1632:


bq. (BTW, I'm aware that passing around blobs of base64 inside SolrParams is 
ugly. I'm open to suggestions how to handle this better).

I'd prefer non-base64 at the Solr transport level (e.g. 
termStats=how,now,brown,cow).  It will be both smaller, and much easier to 
debug other things.

Although Lucene can technically index arbitrary binary now, Solr does not use 
that anywhere (and won't for 4.0).  It would take a good amount of 
infrastructure work all over to truly allow that.  If/when we allow arbitrary 
binary terms, it should be relatively easy to extend the syntax we pick today 
to allow selectively base64 encoded terms.

There are already a number of places in Solr where we use StrUtil.join (a comma 
separated list of strings) to specify a list of terms (both in distrib faceting 
and distrib search for example).
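
A minimal sketch of such a comma-separated term list follows. Solr's own join helper is not reproduced here: `joinTerms` and `splitTerms` are hypothetical names, and the backslash-escaping rule for commas inside a term is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes a non-empty list of character (non-binary) terms.
final class TermListSketch {

  // Joins terms with ',' and backslash-escapes any ',' or '\' inside a term.
  static String joinTerms(List<String> terms) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < terms.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      for (char c : terms.get(i).toCharArray()) {
        if (c == ',' || c == '\\') {
          sb.append('\\');
        }
        sb.append(c);
      }
    }
    return sb.toString();
  }

  // Inverse of joinTerms: an unescaped ',' ends a term, '\' protects the next char.
  static List<String> splitTerms(String joined) {
    List<String> out = new ArrayList<>();
    StringBuilder cur = new StringBuilder();
    for (int i = 0; i < joined.length(); i++) {
      char c = joined.charAt(i);
      if (c == '\\' && i + 1 < joined.length()) {
        cur.append(joined.charAt(++i));
      } else if (c == ',') {
        out.add(cur.toString());
        cur.setLength(0);
      } else {
        cur.append(c);
      }
    }
    out.add(cur.toString());
    return out;
  }

  public static void main(String[] args) {
    String wire = joinTerms(Arrays.asList("how", "now", "brown", "cow"));
    System.out.println("termStats=" + wire); // prints termStats=how,now,brown,cow
    System.out.println(splitTerms(wire));
  }
}
```

For ordinary text terms the parameter stays readable in logs, which is the debugging advantage argued for above.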






[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194878#comment-13194878
 ] 

Robert Muir commented on SOLR-1632:
---

{quote}
Although Lucene can technically index arbitrary binary now, Solr does not use 
that anywhere (and won't for 4.0).
{quote}

That's not actually true. Collation uses it already.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194893#comment-13194893
 ] 

Yonik Seeley commented on SOLR-1632:


bq. That's not actually true. Collation uses it already.

Hmmm, that's normally just for sorting though.  I wonder if that works with 
distributed search today?

Anyway, we have a schema - that can allow us to do what makes sense depending 
on the field (i.e. only use base64 or \x?? for fields where there will be 
non-character terms)




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194899#comment-13194899
 ] 

Robert Muir commented on SOLR-1632:
---

It's also used for locale-sensitive range queries (and of course termquery etc. 
works too, but that's not interesting).





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Andrzej Bialecki (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194907#comment-13194907
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

\x or %xx escaping could be ok, I guess - it's safe, and in most cases it's 
still readable, unlike base64.
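
A rough sketch of the %xx variant, to show why it stays readable for ordinary terms. The class name, the choice of which bytes count as safe, and the uppercase hex form are all assumptions for illustration, not anything taken from the patch.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: printable ASCII passes through, everything else becomes %XX.
final class PercentEscapeSketch {
  private static final String HEX = "0123456789ABCDEF";

  static String escape(byte[] term) {
    StringBuilder sb = new StringBuilder();
    for (byte b : term) {
      int v = b & 0xff;
      // '%' is the escape marker and ',' is assumed to be the list separator.
      boolean plain = v > 0x20 && v < 0x7f && v != '%' && v != ',';
      if (plain) {
        sb.append((char) v);
      } else {
        sb.append('%').append(HEX.charAt(v >> 4)).append(HEX.charAt(v & 0x0f));
      }
    }
    return sb.toString();
  }

  // Inverse of escape; assumes well-formed input (every '%' is followed by two hex digits).
  static byte[] unescape(String s) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '%') {
        out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
        i += 2;
      } else {
        out.write(c);
      }
    }
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] term = {'c', 'o', 'w', 0x00, (byte) 0xff};
    String wire = escape(term);
    System.out.println(wire); // prints cow%00%FF
    System.out.println(Arrays.equals(term, unescape(wire))); // prints true
  }
}
```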




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194910#comment-13194910
 ] 

Yonik Seeley commented on SOLR-1632:


bq. It's also used for locale-sensitive range queries

Given that range queries (and other multi-term queries) are constant scoring 
and may contain *many* terms, hopefully we avoid requesting term stats for 
these?




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Andrzej Bialecki (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194915#comment-13194915
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

bq. hopefully we avoid requesting term stats for these?
There is no provision for this yet in the current patch.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-27 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194921#comment-13194921
 ] 

Robert Muir commented on SOLR-1632:
---

{quote}
There is no provision for this yet in the current patch.
{quote}

An MTQ-generated BQ is no different from a huge BQ that a Solr user submits.
In my opinion, instead of saying screw scoring for certain types of queries, 
this stuff should be done by InExact implementations (and maybe that should be 
the default, fine). E.g. a nice heuristic could look at the local stats and 
say: sure, there are 100 terms, but 50 are low-freq, so let's assume an additive 
constant C for those, batch the other terms into e.g. 5 ranges, and only 
request stats on 5 surrogate terms representative of those groups.

Just make sure any heuristic is always *added* to what is surely present 
locally, i.e. distributed docfreq is always >= local docfreq. Then no scoring 
algorithms will break.
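
One way to read this heuristic is sketched below, heavily simplified: it uses a single surrogate group rather than several ranges, and `lowFreqCutoff`, `constantC` and `surrogateRemoteDf` are hypothetical knobs, not parameters from any patch. The one property it tries to preserve is that the estimate is only ever added to the local count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified sketch of an inexact docFreq estimate.
final class InexactDocFreqSketch {

  // localDf maps each term to its local docFreq. Low-frequency terms get a
  // constant remote contribution; the rest share one surrogate's remote docFreq.
  static Map<String, Long> estimate(Map<String, Long> localDf, long lowFreqCutoff,
                                    long constantC, long surrogateRemoteDf) {
    Map<String, Long> out = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : localDf.entrySet()) {
      long local = e.getValue();
      long remote = local < lowFreqCutoff ? constantC : surrogateRemoteDf;
      // Never subtract: the estimate can only raise the local count.
      out.put(e.getKey(), local + Math.max(0L, remote));
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Long> local = new LinkedHashMap<>();
    local.put("rare", 2L);
    local.put("common", 500L);
    System.out.println(estimate(local, 10L, 3L, 400L)); // prints {rare=5, common=900}
  }
}
```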





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-24 Thread Shawn Heisey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192357#comment-13192357
 ] 

Shawn Heisey commented on SOLR-1632:


Is this something that can be added to branch_3x? With high fuzz and ignore 
whitespace, the patch applies, but then fails to compile.  It also fails to 
compile when I set fuzz to zero, pay attention to whitespace, and manually fix 
the patch rejects.  I couldn't figure out how to fix the problems.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-24 Thread Andrzej Bialecki (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192374#comment-13192374
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

bq. Is this something that can be added to branch_3x?

Not without porting - Lucene / Solr APIs have changed significantly, and this 
patch uses some low-level APIs that differ between trunk and 3x.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-24 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192394#comment-13192394
 ] 

Yonik Seeley commented on SOLR-1632:


Haven't had time to look this over that closely, but this did jump out at me:

+public class CollectionStats {
+  public String field;
+  public int maxDoc;
+  public int docCount;

Shouldn't we be using longs here so we can support more than 2B docs?
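
To make the concern concrete, here is a small sketch of what happens when per-shard counts are summed into an int. The class and method names are made up for illustration, and it assumes each individual shard's count still fits in an int.

```java
// Hypothetical sketch: aggregating per-shard maxDoc values across a collection.
final class MaxDocSumSketch {

  // Sums into an int, which silently wraps once the total passes Integer.MAX_VALUE.
  static int sumAsInt(int[] shardMaxDocs) {
    int total = 0;
    for (int n : shardMaxDocs) {
      total += n;
    }
    return total;
  }

  // Sums into a long, which has ample headroom for any realistic shard count.
  static long sumAsLong(int[] shardMaxDocs) {
    long total = 0L;
    for (int n : shardMaxDocs) {
      total += n;
    }
    return total;
  }

  public static void main(String[] args) {
    int[] shards = {1_500_000_000, 1_500_000_000};
    System.out.println(sumAsInt(shards));  // prints -1294967296
    System.out.println(sumAsLong(shards)); // prints 3000000000
  }
}
```

Two shards of 1.5B docs each are already enough to turn an int total negative, which would corrupt any statistic derived from it.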





[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-24 Thread Andrzej Bialecki (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192407#comment-13192407
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

Yeah, I was curious about this too. However, this is how CollectionStatistics 
is defined in Lucene, so it's something that we have to change in Lucene too.




[jira] [Commented] (SOLR-1632) Distributed IDF

2012-01-24 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192438#comment-13192438
 ] 

Robert Muir commented on SOLR-1632:
---

{quote}
However, this is how CollectionStatistics is defined in Lucene, so it's 
something that we have to change in Lucene too.
{quote}

TermStatistics too. Let's open a separate issue for this.




[jira] [Commented] (SOLR-1632) Distributed IDF

2011-11-02 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13142603#comment-13142603
 ] 

Mark Miller commented on SOLR-1632:
---

Recently I updated this patch to trunk and got rid of the threadlocal usage and 
Query rewriting that was the reason we had to pull this from trunk long ago. 
Then I attempted to override stats on IndexSearcher with global stats, and that 
is when I realized that had no effect on scoring anymore - this will now be 
addressed in LUCENE-3555. Unfortunately, I didn't pay attention and lost that 
code. That's a pity, because it would have been a nice head start on this 
issue - I think we may want to make other changes/improvements, but it would 
have been a start with something working. It was kind of a pain to do since the 
patch has to be manually applied, but perhaps doing it a second time is faster...



[jira] [Commented] (SOLR-1632) Distributed IDF

2011-11-02 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142634#comment-13142634
 ] 

Mark Miller commented on SOLR-1632:
---

Correction: I got rid of the rewrite that was added for the multi-searcher type 
behavior. I hadn't solved the issue of the rewrite needed to get the terms to 
retrieve stats for, so that patch was not yet going to work with multiterm 
queries.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] [Commented] (SOLR-1632) Distributed IDF

2011-11-02 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142641#comment-13142641
 ] 

Mark Miller commented on SOLR-1632:
---

Although, actually I'm not even sure if that rewrite is really a problem - I 
almost don't think it will tickle the same issue as the rewrite that was 
happening before the search. I didn't have a chance to test it or look into it 
in depth or anything yet though.

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2011-02-23 Thread Thorsten Scherler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998262#comment-12998262
 ] 

Thorsten Scherler commented on SOLR-1632:
-

Regarding the comment "Perhaps one idea is to use a visitor pattern to decouple 
tree traversal from the operations being performed": can you please explain 
where to implement the listener/visitor? I had a quick look at the patch and it 
seems to me that the main functionality is in 
trunk/src/java/org/apache/solr/search/SolrIndexSearcher.java and the rest is 
more about caching concerns, right?

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2010-07-25 Thread LiLi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892167#action_12892167
 ] 

LiLi commented on SOLR-1632:


My Solr version is 1.4. I applied the patch, but it failed.
For SolrCache<String, Integer> cache = perShardCache.get(shard); the reported 
error is:
The type SolrCache is not generic; it cannot be parameterized with arguments 
<String, Integer>

SolrCache is an interface: public interface SolrCache extends SolrInfoMBean 
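
Editorial sketch, not part of the original comment: when a cache interface is not generic, callers cannot parameterize it and have to cast at the call site instead. The interface and class names below (RawCache, MapBackedCache, DfLookup) are hypothetical stand-ins and are not Solr's SolrCache or anything in the patch; they only illustrate the shape of code that compiles against a non-generic interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical non-generic cache, standing in for a pre-generics cache interface. */
interface RawCache {
    Object get(Object key);
    Object put(Object key, Object value);
}

/** Trivial map-backed implementation, used only to make the sketch runnable. */
final class MapBackedCache implements RawCache {
    private final Map<Object, Object> map = new HashMap<>();
    public Object get(Object key) { return map.get(key); }
    public Object put(Object key, Object value) { return map.put(key, value); }
}

final class DfLookup {
    /** Reads a document frequency through the raw interface, casting at the call site. */
    static int docFreq(RawCache cache, String term) {
        Object value = cache.get(term);
        // A missing term is treated as df = 0 in this sketch.
        return value == null ? 0 : (Integer) value;
    }
}
```

Whether the patch should be adapted this way or the target tree should be moved to a version with a generic SolrCache is a separate decision that this sketch does not address.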

patching file src/common/org/apache/solr/common/params/ShardParams.java
patching file src/java/org/apache/solr/core/SolrConfig.java
Hunk #1 succeeded at 30 with fuzz 2 (offset 2 lines).
Hunk #2 FAILED at 197.
1 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/solr/core/SolrConfig.java.rej
patching file src/java/org/apache/solr/core/SolrCore.java
Hunk #5 succeeded at 821 (offset 3 lines).
patching file src/java/org/apache/solr/handler/component/QueryComponent.java
Hunk #1 succeeded at 40 with fuzz 2 (offset -2 lines).
Hunk #6 succeeded at 302 (offset 13 lines).
Hunk #7 succeeded at 324 with fuzz 2 (offset 12 lines).
Hunk #8 succeeded at 343 (offset 21 lines).
Hunk #9 succeeded at 367 (offset 21 lines).
Hunk #10 succeeded at 423 (offset 28 lines).
patching file src/java/org/apache/solr/handler/component/SearchHandler.java
patching file src/java/org/apache/solr/handler/component/ShardRequest.java
Hunk #1 FAILED at 37.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/handler/component/ShardRequest.java.rej
patching file src/java/org/apache/solr/search/DFCache.java
patching file src/java/org/apache/solr/search/DFSource.java
patching file src/java/org/apache/solr/search/DefaultDFCache.java
patching file src/java/org/apache/solr/search/ExactDFCache.java
patching file src/java/org/apache/solr/search/LRUDFCache.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 succeeded at 77 (offset 3 lines).
Hunk #2 succeeded at 149 (offset 3 lines).
Hunk #3 succeeded at 699 (offset 46 lines).
Hunk #4 succeeded at 927 (offset 59 lines).
Hunk #5 succeeded at 1041 (offset 59 lines).
Hunk #6 succeeded at 1190 with fuzz 1 (offset 180 lines).
Hunk #7 FAILED at 1276.
Hunk #8 FAILED at 1311.
Hunk #9 succeeded at 1608 (offset 104 lines).
Hunk #10 succeeded at 1716 (offset 113 lines).
Hunk #11 succeeded at 1774 (offset 113 lines).
2 out of 11 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/util/SolrPluginUtils.java
can't find file to patch at input line 1206
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--
|Index: trunk/src/test/org/apache/solr/BaseDistributedSearchTestCase.java
|===
|--- trunk/src/test/org/apache/solr/BaseDistributedSearchTestCase.java  (revision 893413)
|+++ trunk/src/test/org/apache/solr/BaseDistributedSearchTestCase.java  (working copy)
--
File to patch:
Skip this patch? [y] n
File to patch:
Skip this patch? [y]
Skipping patch.
4 out of 4 hunks ignored
patching file src/test/org/apache/solr/search/TestDefaultDFCache.java
patching file src/test/org/apache/solr/search/TestExactDFCache.java
patching file src/test/org/apache/solr/search/TestLRUDFCache.java
patching file src/test/test-files/solr/conf/solrconfig-defaultdfcache.xml
patching file src/test/test-files/solr/conf/solrconfig-exactdfcache.xml
patching file src/test/test-files/solr/conf/solrconfig-lrudfcache.xml

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2010-04-13 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856517#action_12856517
 ] 

Yonik Seeley commented on SOLR-1632:


Rewrite not working through function queries is not the end of the problems 
either... there is also stuff like extractTerms.

There is also the issue of Lucene changing rapidly... and the difficulty of 
adding new methods to ValueSource and making sure that all implementations 
correctly propagate them through to sub ValueSources.  Perhaps one idea is to 
use a visitor pattern to decouple tree traversal from the operations being 
performed.
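
Editorial sketch, not part of the original comment: one possible shape for the visitor idea. Every type here (Node, NodeVisitor, FieldNode, QueryNode, FunctionNode, EmbeddedQueryCollector) is hypothetical and is not Lucene's or Solr's ValueSource API. The point is only that composite nodes implement traversal once, while each operation (collecting embedded queries, extracting terms, and so on) lives in its own visitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a node in a value-source tree. */
interface Node {
    void accept(NodeVisitor v);
}

/** One callback per leaf kind; new operations are new visitor implementations. */
interface NodeVisitor {
    void visitQuery(QueryNode q);   // a real query embedded in a function
    void visitField(FieldNode f);   // a leaf that reads a field value
}

final class FieldNode implements Node {
    final String field;
    FieldNode(String field) { this.field = field; }
    public void accept(NodeVisitor v) { v.visitField(this); }
}

final class QueryNode implements Node {
    final String queryText;
    QueryNode(String queryText) { this.queryText = queryText; }
    public void accept(NodeVisitor v) { v.visitQuery(this); }
}

final class FunctionNode implements Node {
    final List<Node> args;
    FunctionNode(Node... args) { this.args = Arrays.asList(args); }
    // Traversal is written once here; it forwards the visitor to every argument.
    public void accept(NodeVisitor v) {
        for (Node n : args) {
            n.accept(v);
        }
    }
}

/** Example operation: collect every embedded query so it can be handled later. */
final class EmbeddedQueryCollector implements NodeVisitor {
    final List<String> queries = new ArrayList<>();
    public void visitQuery(QueryNode q) { queries.add(q.queryText); }
    public void visitField(FieldNode f) { /* fields carry no embedded query */ }
}
```

With this split, adding a new operation does not require touching the composite node classes, which is the propagation problem the comment describes.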

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.





[jira] Commented: (SOLR-1632) Distributed IDF

2010-04-12 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856220#action_12856220
 ] 

Yonik Seeley commented on SOLR-1632:


Was looking into this a little offline with Mark, who noticed that some queries 
were not being rewritten, and would thus throw an exception during weighting.

It looks like the issue is this: rewrite() doesn't work for function queries 
(there is no propagation mechanism to go through value sources).  This is a 
problem when real queries are embedded in function queries.

Solr function queries do have a mechanism to weight (via 
ValueSource.createWeight()).
QueryValueSource does Weight w = q.weight(searcher); and that implementation 
of weight calls Query query = searcher.rewrite(this);

This patch calls rewrite explicitly (which does nothing for embedded queries), 
and then when using the DFSource implementation of searcher, rewrite does 
nothing, and hence the embedded query is never rewritten and the subsequent 
createWeight() throws an exception.
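
Editorial sketch, not part of the original comment: a toy model of the failure mode described above. None of these classes (ToyQuery, ToyFunctionQuery) are Lucene or Solr classes, and the real code paths are more involved; the model only shows why an explicit top-level rewrite that does not reach an embedded query leaves that query unrewritten when it is later weighted.

```java
/** Toy query: weighting is only allowed after rewrite() has been called on it. */
class ToyQuery {
    boolean rewritten = false;

    /** Marks only this query as rewritten; it knows nothing about nested queries. */
    ToyQuery rewrite() {
        rewritten = true;
        return this;
    }

    /** Weighting requires a rewritten query, mirroring the exception in the report. */
    String createWeight() {
        if (!rewritten) {
            throw new IllegalStateException("query was not rewritten before weighting");
        }
        return "weight";
    }
}

/** Toy function query that embeds a real query but does not propagate rewrite() to it. */
final class ToyFunctionQuery extends ToyQuery {
    final ToyQuery embedded;

    ToyFunctionQuery(ToyQuery embedded) {
        this.embedded = embedded;
    }

    @Override
    String createWeight() {
        super.createWeight();                          // checks the outer query only
        return "fn(" + embedded.createWeight() + ")";  // throws if embedded was never rewritten
    }
}
```

In this model, calling rewrite() on the outer ToyFunctionQuery and then createWeight() throws, because the embedded query was never rewritten; rewriting the embedded query first makes weighting succeed.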


 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib-2.patch, distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.





[jira] Commented: (SOLR-1632) Distributed IDF

2009-12-21 Thread Marc Sturlese (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793283#action_12793283
 ] 

Marc Sturlese commented on SOLR-1632:
-

What should the value of the shard.purpose parameter be to enable or disable 
the exact version of global IDF? 

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2009-12-11 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789174#action_12789174
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

I'm not sure what approach you are referring to. Following the terminology in 
that thread, this implementation follows the approach where there is a single 
merged big idf map at the master, and it's sent out to slaves on each query. 
However, when exactly this merging and sending happens is 
implementation-specific - in the ExactDFSource it happens on every query, but I 
hope the API can support other scenarios as well.
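
Editorial sketch, not part of the original comment: the "single merged big idf map at the master" can be pictured as summing per-shard document counts and per-term document frequencies. The class GlobalStats and its methods are hypothetical and are not taken from the patch; the idf formula is the classic Lucene-style 1 + ln(maxDoc / (df + 1)), used here as an assumption about the similarity in play.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for collection statistics; not a class from the patch. */
final class GlobalStats {
    final long maxDoc;                 // number of documents covered by these stats
    final Map<String, Long> docFreq;   // term -> document frequency

    GlobalStats(long maxDoc, Map<String, Long> docFreq) {
        this.maxDoc = maxDoc;
        this.docFreq = docFreq;
    }

    /** Merge step at the master: sums maxDoc and per-term df across all shards. */
    static GlobalStats merge(List<GlobalStats> shards) {
        long totalDocs = 0;
        Map<String, Long> merged = new HashMap<>();
        for (GlobalStats s : shards) {
            totalDocs += s.maxDoc;
            for (Map.Entry<String, Long> e : s.docFreq.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return new GlobalStats(totalDocs, merged);
    }

    /** Classic Lucene-style idf (assumed here): 1 + ln(maxDoc / (df + 1)). */
    double idf(String term) {
        long df = docFreq.getOrDefault(term, 0L);
        return 1.0 + Math.log((double) maxDoc / (double) (df + 1));
    }
}
```

Each shard would then score with the merged values instead of its local ones; when and how often the merged statistics are shipped to the shards is exactly the implementation-specific part mentioned in the comment.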

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2009-12-11 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789379#action_12789379
 ] 

Otis Gospodnetic commented on SOLR-1632:


I didn't look at the patch, but from your comments it looks like you already 
have that one big merged idf map, which is really what I was aiming at, so 
that's good!

I was just thinking that this map (file) would be periodically updated and 
pushed to slaves, so that slaves can compute the global IDF *locally* instead 
of making any kind of extra requests.


 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2009-12-11 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789607#action_12789607
 ] 

Andrzej Bialecki  commented on SOLR-1632:
-

I believe the API that I propose would support such an implementation as well. 
Please note that it's usually not feasible to compute and distribute the 
complete IDF table for all terms - you would have to replicate a union of all 
term dictionaries across the cluster. In practice, you limit the amount of 
information by various means, e.g. only distributing data related to the 
current request (this implementation) or reducing the frequency of updates 
(e.g. LRU caching), or approximating global DF with a constant for frequent 
terms (where the contribution of their IDF to the score would be negligible 
anyway).
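
Editorial sketch, not part of the original comment: one of the options listed above is to bound the amount of cached global DF data with LRU eviction. The class below is a hypothetical illustration built on java.util.LinkedHashMap; it is not the patch's LRUDFCache, and the capacity is an arbitrary assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache of global document frequencies with LRU eviction. */
final class LruDfCache extends LinkedHashMap<String, Long> {
    private final int capacity;

    LruDfCache(int capacity) {
        // accessOrder = true: entries are ordered from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by LinkedHashMap after each insertion; evicts once the bound is exceeded. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
        return size() > capacity;
    }
}
```

A term whose DF has been evicted would need to be fetched again or approximated, which is the trade-off between update frequency and exactness that the comment points out.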

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.




[jira] Commented: (SOLR-1632) Distributed IDF

2009-12-10 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789120#action_12789120
 ] 

Otis Gospodnetic commented on SOLR-1632:


What about this approach: http://markmail.org/message/mjfmpzfspguepixx ?

 Distributed IDF
 ---

 Key: SOLR-1632
 URL: https://issues.apache.org/jira/browse/SOLR-1632
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.5
Reporter: Andrzej Bialecki 
 Attachments: distrib.patch


 Distributed IDF is a valuable enhancement for distributed search across 
 non-uniform shards. This issue tracks the proposed implementation of an API 
 to support this functionality in Solr.
