Re: Getting the distribution information of scores from query
Thanks! That did the trick, although it took some more work at the component level to generate the same query key as the index searcher. Otherwise, when fetching scores for a cached query result, I got a lot of NPEs: the stats are computed at the collector level, and in my case the collector is never set because a cache hit bypasses the Lucene level. I'll write up what I did and will probably try to open source the work for others to see. The PostFilter mechanism is nice but needs some examples and documentation; hopefully mine will help the cause.

Thanks again
Amit

On Wed, Sep 26, 2012 at 5:13 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

I suggest creating a component and putting it after QueryComponent. In prepare it should add its own PostFilter to the list of request filters. Your post filter will then be able to inject its own DelegatingCollector, and you can simply add the collected histogram to the result named list.

http://searchhub.org/dev/2012/02/10/advanced-filter-caching-in-solr/

On Tue, Sep 25, 2012 at 10:03 PM, Amit Nithian anith...@gmail.com wrote:

We have a federated search product that issues multiple parallel queries to Solr cores, fetches the results, and blends them. The approach we were investigating was to take the scores, normalize them based on some distribution (a normal distribution seems reasonable), and use the resulting z-score to blend the results (otherwise you would be blending scores on different scales).

To accomplish this, I was looking to get the distribution of the scores for the query, as an analog to the stats component. However, the only way I can see to do this is to create a custom collector that accumulates and stores this information (mean, std-dev, etc.), since the stats component only operates on indexed fields.

Is there an easy way to tell Solr to use a custom collector without having to modify the SolrIndexSearcher class? Or is there an alternative way to get this information?

Thanks
Amit

--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
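For readers following the z-score idea discussed in this thread, here is a minimal sketch in plain Java of the two pieces involved: accumulating mean and standard deviation one score at a time (the kind of bookkeeping a custom collector could do per hit), and converting raw scores from different cores to z-scores before blending. This is not the code Amit wrote and it does not use any Solr or Lucene API; the class ScoreStats, its methods, the core names, and the sample scores are all hypothetical. It uses Welford's online algorithm and the population standard deviation, both of which are choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for illustration; not a Solr or Lucene class. */
final class ScoreStats {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    /** Welford's online update. A custom collector could call this once per collected hit. */
    void add(double score) {
        count++;
        double delta = score - mean;
        mean += delta / count;
        m2 += delta * (score - mean);
    }

    double mean() { return mean; }

    /** Population standard deviation; 0 until at least two scores have been added. */
    double stdDev() { return count < 2 ? 0.0 : Math.sqrt(m2 / count); }

    /** z-score of a raw score; 0 when the distribution has no spread. */
    double zScore(double score) {
        double sd = stdDev();
        return sd == 0.0 ? 0.0 : (score - mean) / sd;
    }

    /** A hit from one core, carrying its normalized score. */
    static final class Hit {
        final String core;
        final String id;
        final double z;
        Hit(String core, String id, double z) { this.core = core; this.id = id; this.z = z; }
    }

    /** Converts one core's raw scores to z-scores using that core's own statistics. */
    static List<Hit> normalize(String core, String[] ids, double[] scores, ScoreStats stats) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < ids.length; i++) {
            hits.add(new Hit(core, ids[i], stats.zScore(scores[i])));
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up scores on two different scales, standing in for two Solr cores.
        String[] idsA = {"a1", "a2", "a3"};
        double[] scoresA = {10.0, 7.0, 6.5};
        String[] idsB = {"b1", "b2", "b3"};
        double[] scoresB = {0.9, 0.8, 0.1};

        // Simplification: the stats here come from the returned hits only.
        // In the approach from the thread they would cover every matching document.
        ScoreStats statsA = new ScoreStats();
        for (double s : scoresA) statsA.add(s);
        ScoreStats statsB = new ScoreStats();
        for (double s : scoresB) statsB.add(s);

        // Blend by sorting all hits on their z-score, highest first.
        List<Hit> blended = new ArrayList<>(normalize("coreA", idsA, scoresA, statsA));
        blended.addAll(normalize("coreB", idsB, scoresB, statsB));
        blended.sort(Comparator.comparingDouble((Hit h) -> h.z).reversed());

        for (Hit h : blended) {
            System.out.printf("%s/%s z=%.3f%n", h.core, h.id, h.z);
        }
    }
}
```

The online update matters because a collector sees scores one at a time and should not have to buffer them. Note also the cache problem Amit describes: if the stats only exist when the collector runs, a cached result needs the stats stored alongside it or recomputed some other way.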
Re: Distribution Information?
I guess your Solr home isn't configured correctly. FYI, you can set master_status_dir to a full path name (i.e. /opt/solr/logs/clients in your case).

Bill

On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

OK. I made the change, but it did not seem to pick up the files. When I changed distributiondump.jsp to say...

File masterdir = new File("/opt/solr/logs/clients");

it worked. Thank you for your help!

++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
++

On Sep 7, 2007, at 2:21 PM, Bill Au wrote:

I just double checked distribution.jsp. The directory where it looks for status files is hard coded to logs/clients. So for now master_status_dir in your solr/conf/scripts.conf has to be set to that, so that the scripts will put the status files there. It looks like they are currently in your logs directory. The status files are snapshot.current.search2 and snapshot.status.search2.

Bill

On 9/7/07, Matthew Runo [EMAIL PROTECTED] wrote:

Actually I don't have the clients directory...

[EMAIL PROTECTED]: .../logs]$ pwd
/opt/solr/logs
[EMAIL PROTECTED]: .../logs]$ ls
rsyncd-enabled  rsyncd.log  rsyncd.pid  snapcleaner.log  snapshooter.log  snapshot.current.search2  snapshot.status.search2
[EMAIL PROTECTED]: .../logs]$

It does look like it could be a path issue. I wonder why, though, no clients subdirectory was created.

On Sep 7, 2007, at 7:43 AM, Bill Au wrote:

In that case, definitely take a look at SOLR-333:

http://issues.apache.org/jira/browse/SOLR-333

On the master there should be a logs/clients directory. Do you have any files in there?

Bill

On 9/6/07, Matthew Runo [EMAIL PROTECTED] wrote:

Well, I do get...

Distribution Info
Master Server
No distribution info present

... But there appears to be no information filled in.

On Sep 6, 2007, at 6:09 AM, Bill Au wrote:

That is very strange. Even if there is something wrong with the config or code, the static HTML contained in distributiondump.jsp should show up.

Are you using the latest version of the JSP? There has been a recent fix:

http://issues.apache.org/jira/browse/SOLR-333

Bill

On 9/5/07, Matthew Runo [EMAIL PROTECTED] wrote:

When I load distributiondump.jsp, there is no output in my catalina.out file.

On Sep 5, 2007, at 1:55 PM, Matthew Runo wrote:

Not that I've noticed. I'll do a more careful grep soon here - I just got back from a long weekend.

On Aug 31, 2007, at 6:12 PM, Bill Au wrote:

Are there any error messages in your appserver log files?

Bill

On 8/31/07, Matthew Runo [EMAIL PROTECTED] wrote:

Hello!

/solr/admin/distributiondump.jsp

This server is set up as a master server, and other servers use the replication scripts to pull updates from it every few minutes. My distribution information screen is blank, and I couldn't find any information on fixing this in the wiki.

Any chance someone would be able to explain how to get this page working, or what I'm doing wrong?
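A note on why a hard-coded logs/clients can miss the files while an absolute path works: in Java, a relative java.io.File path is resolved against the JVM's working directory (the user.dir system property), which for a servlet container is usually wherever the app server was started and not necessarily the Solr home. The thread does not show the original line in the JSP, so the sketch below only assumes it built a relative path; StatusDirDemo and resolve are hypothetical names used for illustration.

```java
import java.io.File;

/** Illustration only; this is not the actual distributiondump.jsp code. */
final class StatusDirDemo {
    /** Returns the absolute location java.io.File would use for the given path. */
    static File resolve(String path) {
        // Relative paths are resolved against the user.dir system property.
        return new File(path).getAbsoluteFile();
    }

    public static void main(String[] args) {
        System.out.println("working dir: " + System.getProperty("user.dir"));
        // Depends on where the JVM was started.
        System.out.println("relative   : " + resolve("logs/clients"));
        // Independent of the working directory (on a Unix-like system).
        System.out.println("absolute   : " + resolve("/opt/solr/logs/clients"));
    }
}
```

Running this from different directories shows the relative line changing while the absolute one stays put, which is consistent with both fixes mentioned in the thread: setting master_status_dir to a full path, or using a full path in the JSP.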