Re: Getting the distribution information of scores from query

2012-09-27 Thread Amit Nithian
Thanks! That did the trick, although it required some more work at the
component level: I had to generate the same query key as the index
searcher. Otherwise, when I tried to fetch scores for a cached query
result, I got a lot of NPEs, because the stats are computed at the
collector level, and the collector is never set when a cache hit
bypasses the Lucene level. I'll write up what I did and will probably
try to open source the work for others to see. The PostFiltering stuff
is nice but needs some examples and documentation; hopefully mine will
help the cause.
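
A minimal sketch of the query-key problem described above, using a plain map instead of Solr's real caches. Every name in it (ScoreStatsRegistry, record, lookup, the key strings) is hypothetical and not part of Solr. It only illustrates two points: stats recorded under one key cannot be found under a different key, and a lookup should tolerate missing stats rather than dereference null.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for illustration only; this is not a Solr class.
public class ScoreStatsRegistry {
    // Maps a query key to {mean, stdDev}.
    private final Map<String, double[]> statsByQueryKey = new ConcurrentHashMap<>();

    // Would be called from the collector path, which only runs on a cache miss.
    public void record(String queryKey, double mean, double stdDev) {
        statsByQueryKey.put(queryKey, new double[] {mean, stdDev});
    }

    // Would be called from the component; empty when nothing was recorded for this key.
    public Optional<double[]> lookup(String queryKey) {
        return Optional.ofNullable(statsByQueryKey.get(queryKey));
    }

    public static void main(String[] args) {
        ScoreStatsRegistry registry = new ScoreStatsRegistry();
        registry.record("q=solr&fq=type:doc", 1.5, 0.25);

        // Same key as the one used when the stats were recorded.
        System.out.println(registry.lookup("q=solr&fq=type:doc").isPresent());
        // A differently built key finds nothing, which is where an NPE would come from.
        System.out.println(registry.lookup("q=solr").isPresent());
    }
}
```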

Thanks again
Amit

On Wed, Sep 26, 2012 at 5:13 AM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
 I suggest creating a component and putting it after QueryComponent. In prepare
 it should add its own PostFilter to the list of request filters. Your post
 filter will then be able to inject its own DelegatingCollector, and you can
 add the collected histogram to the result named list.
  http://searchhub.org/dev/2012/02/10/advanced-filter-caching-in-solr/



Re: Getting the distribution information of scores from query

2012-09-26 Thread Mikhail Khludnev
I suggest creating a component and putting it after QueryComponent. In prepare
it should add its own PostFilter to the list of request filters. Your post
filter will then be able to inject its own DelegatingCollector, and you can
add the collected histogram to the result named list.
 http://searchhub.org/dev/2012/02/10/advanced-filter-caching-in-solr/
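
A minimal sketch of the delegating-collector idea, written without any Solr or Lucene dependency so that it runs on its own. The ScoreCollector interface and the StatsCollector class are stand-ins invented for this example; they are not Solr's PostFilter or DelegatingCollector API. The sketch accumulates the count, mean and standard deviation of the scores with Welford's running update and passes every hit on to the wrapped collector unchanged.

```java
import java.util.ArrayList;
import java.util.List;

public class ScoreStatsCollectorSketch {

    // Stand-in for a collector chain; NOT the Solr/Lucene collector API.
    interface ScoreCollector {
        void collect(int doc, float score);
    }

    // Records running statistics for each score, then delegates the hit.
    static final class StatsCollector implements ScoreCollector {
        private final ScoreCollector delegate;
        private long count;
        private double mean;
        private double m2; // sum of squared deviations from the running mean

        StatsCollector(ScoreCollector delegate) {
            this.delegate = delegate;
        }

        @Override
        public void collect(int doc, float score) {
            // Welford's update: numerically stable single-pass mean and variance.
            count++;
            double delta = score - mean;
            mean += delta / count;
            m2 += delta * (score - mean);
            delegate.collect(doc, score); // every hit passes through unchanged
        }

        long count() { return count; }

        double mean() { return mean; }

        // Population standard deviation; 0 when nothing was collected.
        double stdDev() { return count > 0 ? Math.sqrt(m2 / count) : 0.0; }
    }

    public static void main(String[] args) {
        List<Integer> collectedDocs = new ArrayList<>();
        StatsCollector stats = new StatsCollector((doc, score) -> collectedDocs.add(doc));

        float[] scores = {1.0f, 2.0f, 3.0f, 4.0f};
        for (int doc = 0; doc < scores.length; doc++) {
            stats.collect(doc, scores[doc]);
        }
        System.out.printf("n=%d mean=%.3f stdDev=%.3f docs=%s%n",
                stats.count(), stats.mean(), stats.stdDev(), collectedDocs);
    }
}
```

In a real component the accumulated values would be read after the search has run and added to the response, as suggested above.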

On Tue, Sep 25, 2012 at 10:03 PM, Amit Nithian anith...@gmail.com wrote:

 We have a federated search product that issues multiple parallel
 queries to Solr cores, fetches the results and blends them. The
 approach we were investigating was to take the scores, normalize them
 based on some distribution (a normal distribution seems reasonable) and
 use the resulting z-score to blend the results (otherwise you'd be
 blending scores on different scales). To accomplish this, I was
 looking to get the distribution of the scores for the query, as an
 analog to the stats component. It seems the only way to
 accomplish this would be to create a custom collector that would
 accumulate and store this information (mean, std-dev, etc.), since the
 stats component only operates on indexed fields.
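
 A minimal sketch of the z-score blending described above. The class and method names (ZScoreBlendSketch, Hit, normalize, blend) are made up for this example. As a simplification it computes the mean and standard deviation from the hits in hand, whereas the question asks for statistics over the whole result set of each query, which is why a collector is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ZScoreBlendSketch {

    // One hit from one core; z is filled in by normalize().
    static final class Hit {
        final String id;
        final double score;
        double z;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Converts raw scores to z-scores using this list's own mean and population std-dev.
    static void normalize(List<Hit> hits) {
        if (hits.isEmpty()) {
            return;
        }
        double mean = hits.stream().mapToDouble(h -> h.score).average().orElse(0.0);
        double variance = hits.stream()
                .mapToDouble(h -> (h.score - mean) * (h.score - mean))
                .average().orElse(0.0);
        double stdDev = Math.sqrt(variance);
        for (Hit h : hits) {
            // All-equal scores carry no ranking information, so map them to 0.
            h.z = stdDev == 0.0 ? 0.0 : (h.score - mean) / stdDev;
        }
    }

    // Normalizes each core's hits separately, then merges by descending z-score.
    static List<Hit> blend(List<List<Hit>> perCoreHits) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> hits : perCoreHits) {
            normalize(hits);
            merged.addAll(hits);
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.z).reversed());
        return merged;
    }

    public static void main(String[] args) {
        // Two cores whose raw scores live on very different scales.
        List<Hit> coreA = Arrays.asList(new Hit("a1", 10), new Hit("a2", 20), new Hit("a3", 30));
        List<Hit> coreB = Arrays.asList(new Hit("b1", 0.1), new Hit("b2", 0.2), new Hit("b3", 0.9));
        for (Hit h : blend(Arrays.asList(coreA, coreB))) {
            System.out.printf("%s raw=%.2f z=%.3f%n", h.id, h.score, h.z);
        }
    }
}
```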

 Is there an easy way to tell Solr to use a custom collector without
 having to modify the SolrIndexSearcher class? Or is there an
 alternative way to get this information?

 Thanks
 Amit




-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: Distribution Information?

2007-09-10 Thread Bill Au
I guess your Solr home isn't configured correctly.  FYI, you can set
master_status_dir to a full path name (i.e. /opt/solr/logs/clients in your
case).

Bill

On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

 OK. I made the change, but it seemed not to pick up the files.

 When I changed distributiondump.jsp to say...

 File masterdir = new File("/opt/solr/logs/clients");

 it worked. Thank you for your help!

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++



Re: Distribution Information?

2007-09-07 Thread Bill Au
In that case, definitely take a look at SOLR-333:

http://issues.apache.org/jira/browse/SOLR-333

On the master there should be a logs/clients directory.  Do you have any
files in there?

Bill

On 9/6/07, Matthew Runo [EMAIL PROTECTED] wrote:

 Well, I do get...

 Distribution Info
 Master Server

 No distribution info present

 ...

 But there appears to be no information filled in.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++



Re: Distribution Information?

2007-09-07 Thread Matthew Runo

Actually I don't have the clients directory...

[EMAIL PROTECTED]: .../logs]$ pwd
/opt/solr/logs
[EMAIL PROTECTED]: .../logs]$ ls
rsyncd-enabled  rsyncd.log  rsyncd.pid  snapcleaner.log
snapshooter.log  snapshot.current.search2  snapshot.status.search2
[EMAIL PROTECTED]: .../logs]$


It does look like it could be a path issue. I wonder why, though, no  
clients sub directory was created.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 7, 2007, at 7:43 AM, Bill Au wrote:


In that case, definitely take a look at SOLR-333:

http://issues.apache.org/jira/browse/SOLR-333

On the master there should be a logs/clients directory.  Do you have any
files in there?

Bill


Re: Distribution Information?

2007-09-07 Thread Bill Au
I just double checked distribution.jsp.  The directory where it looks for
status files is hard coded to logs/clients.  So for now, master_status_dir in
your solr/conf/scripts.conf has to be set to that so the scripts will put
the status files there.  It looks like they are currently in your logs
directory.  The status files are snapshot.current.search2 and
snapshot.status.search2.
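
A small sketch for checking which directory actually holds the status files. The class name StatusFileCheck is made up, the two paths are the ones mentioned in this thread, and the snapshot.status. / snapshot.current. prefixes are inferred from the two file names above rather than taken from the Solr scripts.

```java
import java.io.File;
import java.util.Arrays;

public class StatusFileCheck {

    // Returns the sorted names of snapshot.status.* and snapshot.current.* files in dir.
    static String[] statusFiles(File dir) {
        String[] names = dir.list((d, name) ->
                name.startsWith("snapshot.status.") || name.startsWith("snapshot.current."));
        if (names == null) {
            return new String[0]; // dir does not exist or is not a directory
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Paths from this thread; adjust them for your own install.
        for (String path : new String[] {"/opt/solr/logs", "/opt/solr/logs/clients"}) {
            System.out.println(path + " -> " + Arrays.toString(statusFiles(new File(path))));
        }
    }
}
```

If the files show up under logs rather than logs/clients, the scripts are writing somewhere other than where the page reads.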

Bill

On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

 Actually I don't have the clients directory...

 [EMAIL PROTECTED]: .../logs]$ pwd
 /opt/solr/logs
 [EMAIL PROTECTED]: .../logs]$ ls
 rsyncd-enabled  rsyncd.log  rsyncd.pid  snapcleaner.log
 snapshooter.log  snapshot.current.search2  snapshot.status.search2
 [EMAIL PROTECTED]: .../logs]$


 It does look like it could be a path issue. I wonder why, though, no
 clients sub directory was created.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++



Re: Distribution Information?

2007-09-07 Thread Matthew Runo

OK. I made the change, but it seemed not to pick up the files.

When I changed distributiondump.jsp to say...

File masterdir = new File("/opt/solr/logs/clients");

it worked. Thank you for your help!
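
A small sketch of why an absolute path can work where a relative one does not. java.io.File resolves a relative pathname against the JVM's working directory (the user.dir system property), which for a servlet container is wherever the container was started. That this is what happened here is an assumption, not something confirmed in this thread. The class name RelativePathCheck is made up for the example.

```java
import java.io.File;

public class RelativePathCheck {

    // Describes how the JVM resolves a path, without touching the file system.
    static String describe(String path) {
        File f = new File(path);
        return path + " (absolute=" + f.isAbsolute() + ") -> " + f.getAbsolutePath();
    }

    public static void main(String[] args) {
        System.out.println("user.dir = " + System.getProperty("user.dir"));
        // Relative: resolved against user.dir, so it depends on where the JVM started.
        System.out.println(describe("logs/clients"));
        // Absolute (on Unix-like systems): independent of the working directory.
        System.out.println(describe("/opt/solr/logs/clients"));
    }
}
```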

++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 7, 2007, at 2:21 PM, Bill Au wrote:

I just double checked distribution.jsp.  The directory where it looks for
status files is hard coded to logs/clients.  So for now, master_status_dir in
your solr/conf/scripts.conf has to be set to that so the scripts will put
the status files there.  It looks like they are currently in your logs
directory.  The status files are snapshot.current.search2 and
snapshot.status.search2.

Bill


Re: Distribution Information?

2007-09-06 Thread Bill Au
That is very strange.  Even if there is something wrong with the config or
code, the static HTML contained in distributiondump.jsp should show up.

Are you using the latest version of the JSP?  There has been a recent fix:

http://issues.apache.org/jira/browse/SOLR-333

Bill

On 9/5/07, Matthew Runo [EMAIL PROTECTED] wrote:

 When I load distributiondump.jsp, there is no output in my
 catalina.out file.

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++



Re: Distribution Information?

2007-09-06 Thread Matthew Runo

Well, I do get...

Distribution Info
Master Server

No distribution info present

...

But there appears to be no information filled in.

++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 6, 2007, at 6:09 AM, Bill Au wrote:

That is very strange.  Even if there is something wrong with the config or
code, the static HTML contained in distributiondump.jsp should show up.

Are you using the latest version of the JSP?  There has been a recent fix:

http://issues.apache.org/jira/browse/SOLR-333

Bill


Re: Distribution Information?

2007-09-05 Thread Matthew Runo
Not that I've noticed. I'll do a more careful grep soon here - I just  
got back from a long weekend.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Aug 31, 2007, at 6:12 PM, Bill Au wrote:


Are there any error messages in your appserver log files?

Bill


Distribution Information?

2007-08-31 Thread Matthew Runo

Hello!

/solr/admin/distributiondump.jsp

This server is set up as a master server, and other servers use the  
replication scripts to pull updates from it every few minutes. My  
distribution information screen is blank.. and I couldn't find any  
information on fixing this in the wiki.


Any chance someone would be able to explain how to get this page  
working, or what I'm doing wrong?


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++




Re: Distribution Information?

2007-08-31 Thread Bill Au
Are there any error messages in your appserver log files?

Bill

On 8/31/07, Matthew Runo [EMAIL PROTECTED] wrote:
 Hello!

 /solr/admin/distributiondump.jsp

 This server is set up as a master server, and other servers use the
 replication scripts to pull updates from it every few minutes. My
 distribution information screen is blank.. and I couldn't find any
 information on fixing this in the wiki.

 Any chance someone would be able to explain how to get this page
 working, or what I'm doing wrong?

 ++
   | Matthew Runo
   | Zappos Development
   | [EMAIL PROTECTED]
   | 702-943-7833
 ++