Re: Field Collapsing (was Re: Schema for group/child entity setup)
All work and progress on this patch is done under the JIRA issue: https://issues.apache.org/jira/browse/SOLR-236 R. Tan wrote: The patch which will be committed soon will add this functionality. Where can I follow the progress of this patch? On Mon, Sep 7, 2009 at 3:38 PM, Uri Boness ubon...@gmail.com wrote: Great. Nice site and very similar to my requirements. thanks. So, right now, you get all field values by default? Right now, no field values are returned for the collapsed documents. The patch which will be committed soon will add this functionality. R. Tan wrote: Great. Nice site and very similar to my requirements. There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. So, right now, you get all field values by default? On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness ubon...@gmail.com wrote: You can check out http://www.ilocal.nl. If you search for a bank in Amsterdam then you'll see that a lot of the results are collapsed. For this we used an older version of this patch (which works on 1.3) but a lot has changed since then. We're currently using this patch on another project, but it's not live yet. Uri R. Tan wrote: Thanks Uri. Your personal suggestion is appreciated and I think I'll follow your advice. We're still early in development and 1.4 would be a good choice. I hope I can get field collapsing to work with my requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness ubon...@gmail.com wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. 
There is of course a drawback to that as well: the set of collapsed documents can be very large (depending on your data), in which case the returned result, which includes the field values, can be rather large and will impact performance. This is why this feature will be enabled only if you specify this extra parameter; by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. Martijn (who is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the latter. 1.4 is supposed to be released in the upcoming week or two and it brings loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents into account (ignoring the collapsed ones). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hard to tell.
All the latest patches are certainly *not* 1.3 compatible (I think they also depend on some changes in Lucene which are not available for Solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for a single Solr instance, but distributed search (i.e. shards) is not yet supported. Using this patch you can group search results based on a specific field. There are two flavors of field collapsing: adjacent and non-adjacent. The former collapses only documents which happen to be located next to each other in the otherwise-non-collapsed result set. The latter (non-adjacent) collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed result set). Note that non-adjacent
Re: Field Collapsing (was Re: Schema for group/child entity setup)
The current patch definitely supports facet before and after the collapsing. Stephen Weiss wrote: I just noticed this and it reminded me of an issue I've had with collapsed faceting with an older version of the patch in Solr 1.3. Would it be possible, if we can get the terms for all the collapsed documents on a field, to then facet each collapsed document on the unique terms it has collectively? What I mean is for example: Doc 1, 2, 3 collapse together on some other field Doc 1 is the main document and has the colors blue and red Doc 2 has red Doc 3 has green For the purposes of faceting, it would be ideal in our case for faceting on color to count one each for blue, red, and green on this document (the user drills down on this value to yet another collapsed set). Right now, when you facet after collapse you just get blue and red (green is dropped because it collapses out). To the user it makes the counts seem inaccurate, like they're missing something. Instead we facet before collapsing and get an inflated value (which ticks 2 for red - but when you drill down, you still only get 1 because Doc 1 and Doc 2 collapse together again). Either way it's not ideal. At the time (many months ago) there was no way to account for this but it sounds like this patch could make it possible, maybe. Thanks! -- Steve On Sep 5, 2009, at 5:57 AM, Uri Boness wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. 
AFAIK, the latest patch should work fine with the latest build. Martijn (who is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the latter. 1.4 is supposed to be released in the upcoming week or two and it brings loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri
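To make Steve's faceting example concrete, here is a minimal Python sketch of the three ways of counting he describes. It is only an illustration of the arithmetic, not code from the SOLR-236 patch; the document ids, the `group` key and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical data mirroring the example: three documents that collapse
# into one group, with doc1 as the main (master) document.
docs = {
    "doc1": {"group": "g1", "colors": {"blue", "red"}},
    "doc2": {"group": "g1", "colors": {"red"}},
    "doc3": {"group": "g1", "colors": {"green"}},
}
masters = {"g1": "doc1"}  # group -> main document kept after collapsing


def facet_before_collapse(docs):
    """Count terms over every document; terms shared within a group are inflated."""
    counts = Counter()
    for doc in docs.values():
        counts.update(doc["colors"])
    return dict(counts)


def facet_after_collapse(docs, masters):
    """Count terms of the main documents only; terms of collapsed documents drop out."""
    counts = Counter()
    for master_id in masters.values():
        counts.update(docs[master_id]["colors"])
    return dict(counts)


def facet_on_group_union(docs):
    """Count each distinct term once per collapsed group (the behaviour asked for)."""
    terms_by_group = {}
    for doc in docs.values():
        terms_by_group.setdefault(doc["group"], set()).update(doc["colors"])
    counts = Counter()
    for terms in terms_by_group.values():
        counts.update(terms)
    return dict(counts)
```

Faceting before collapsing counts red twice, faceting after collapsing loses green, and the per-group union counts blue, red and green once each, which is the result Steve is asking for.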
Re: Field Collapsing (was Re: Schema for group/child entity setup)
The patch which will be committed soon will add this functionality. Where can I follow the progress of this patch? On Mon, Sep 7, 2009 at 3:38 PM, Uri Boness ubon...@gmail.com wrote: Great. Nice site and very similar to my requirements. thanks. So, right now, you get all field values by default? Right now, no field values are returned for the collapsed documents. The patch which will be committed soon will add this functionality. R. Tan wrote: Great. Nice site and very similar to my requirements. There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. So, right now, you get all field values by default? On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness ubon...@gmail.com wrote: You can check out http://www.ilocal.nl. If you search for a bank in Amsterdam then you'll see that a lot of the results are collapsed. For this we used an older version of this patch (which works on 1.3) but a lot has changed since then. We're currently using this patch on another project, but it's not live yet. Uri R. Tan wrote: Thanks Uri. Your personal suggestion is appreciated and I think I'll follow your advice. We're still early in development and 1.4 would be a good choice. I hope I can get field collapsing to work with my requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness ubon...@gmail.com wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. 
There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. Martijn (which is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the later. 1.4 is supposed to be released in the upcoming week or two and it bring loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hart to tell. 
All latest patch are certainly *not* 1.3 compatible (I think they're also depending on some changes in lucene which are not available for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for single solr instance, but distributed search (ie. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the former collapses only document which happen to be located next to each other in the otherwise-non-collapsed results set. The later (the non-adjacent) one collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed results set). Note, that non-adjacent performs better than adjacent one. There's currently discussion to extend this support so in addition to collapsing the documents, extra information
Re: Field Collapsing (was Re: Schema for group/child entity setup)
I just noticed this and it reminded me of an issue I've had with collapsed faceting with an older version of the patch in Solr 1.3. Would it be possible, if we can get the terms for all the collapsed documents on a field, to then facet each collapsed document on the unique terms it has collectively? What I mean is for example: Doc 1, 2, 3 collapse together on some other field Doc 1 is the main document and has the colors blue and red Doc 2 has red Doc 3 has green For the purposes of faceting, it would be ideal in our case for faceting on color to count one each for blue, red, and green on this document (the user drills down on this value to yet another collapsed set). Right now, when you facet after collapse you just get blue and red (green is dropped because it collapses out). To the user it makes the counts seem inaccurate, like they're missing something. Instead we facet before collapsing and get an inflated value (which ticks 2 for red - but when you drill down, you still only get 1 because Doc 1 and Doc 2 collapse together again). Either way it's not ideal. At the time (many months ago) there was no way to account for this but it sounds like this patch could make it possible, maybe. Thanks! -- Steve On Sep 5, 2009, at 5:57 AM, Uri Boness wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. 
Martijn (who is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the latter. 1.4 is supposed to be released in the upcoming week or two and it brings loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Great. Nice site and very similar to my requirements. thanks. So, right now, you get all field values by default? Right now, no field values are returned for the collapsed documents. The patch which will be committed soon will add this functionality. R. Tan wrote: Great. Nice site and very similar to my requirements. There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. So, right now, you get all field values by default? On Sun, Sep 6, 2009 at 3:58 AM, Uri Boness ubon...@gmail.com wrote: You can check out http://www.ilocal.nl. If you search for a bank in Amsterdam then you'll see that a lot of the results are collapsed. For this we used an older version of this patch (which works on 1.3) but a lot has changed since then. We're currently using this patch on another project, but it's not live yet. Uri R. Tan wrote: Thanks Uri. Your personal suggestion is appreciated and I think I'll follow your advice. We're still early in development and 1.4 would be a good choice. I hope I can get field collapsing to work with my requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness ubon...@gmail.com wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. 
Martijn (which is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the later. 1.4 is supposed to be released in the upcoming week or two and it bring loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hart to tell. All latest patch are certainly *not* 1.3 compatible (I think they're also depending on some changes in lucene which are not available for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. 
It works well for a single Solr instance, but distributed search (i.e. shards) is not yet supported. Using this patch you can group search results based on a specific field. There are two flavors of field collapsing: adjacent and non-adjacent. The former collapses only documents which happen to be located next to each other in the otherwise-non-collapsed result set. The latter (non-adjacent) collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed result set). Note that non-adjacent performs better than the adjacent one. There's currently discussion to extend this support so that, in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri R. Tan wrote: I think this is what I'm looking for. What is the status of this patch? On Thu, Sep 3, 2009 at 12:00 PM, R. Tan
Re: Field Collapsing (was Re: Schema for group/child entity setup)
There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. Martijn (which is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the later. 1.4 is supposed to be released in the upcoming week or two and it bring loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). 
As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hart to tell. All latest patch are certainly *not* 1.3 compatible (I think they're also depending on some changes in lucene which are not available for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for single solr instance, but distributed search (ie. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the former collapses only document which happen to be located next to each other in the otherwise-non-collapsed results set. The later (the non-adjacent) one collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed results set). Note, that non-adjacent performs better than adjacent one. There's currently discussion to extend this support so in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri R. Tan wrote: I think this is what I'm looking for. What is the status of this patch? On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be group into one parent business (such as 7-eleven having several locations). 
On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward, but showing the locations within a result is quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R
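As a rough illustration of the behaviour described in this thread (adjacent versus non-adjacent collapsing, the highest-scoring document acting as the master, and pagination over master documents only), here is a small Python sketch. It is not the patch's implementation; the sample data, the `collapse_count` key and the function names are assumptions made for the example.

```python
from itertools import groupby

# Hypothetical scored results, already sorted by descending score.
results = [
    {"id": 1, "business": "7-eleven", "score": 9.0},
    {"id": 2, "business": "7-eleven", "score": 8.5},
    {"id": 3, "business": "acme", "score": 8.0},
    {"id": 4, "business": "7-eleven", "score": 7.0},
]


def collapse_adjacent(results, field):
    """Collapse only runs of neighbouring documents that share a field value."""
    collapsed = []
    for _, run in groupby(results, key=lambda d: d[field]):
        run = list(run)
        master = max(run, key=lambda d: d["score"])  # highest score represents the run
        collapsed.append({"master": master, "collapse_count": len(run) - 1})
    return collapsed


def collapse_non_adjacent(results, field):
    """Collapse all documents sharing a field value, wherever they appear."""
    groups = {}
    for doc in results:
        groups.setdefault(doc[field], []).append(doc)
    collapsed = []
    for members in groups.values():
        master = max(members, key=lambda d: d["score"])
        collapsed.append({"master": master, "collapse_count": len(members) - 1})
    # Order the groups by the score of their master document.
    collapsed.sort(key=lambda g: g["master"]["score"], reverse=True)
    return collapsed


def page(collapsed, start, rows):
    """Paginate over master documents only, ignoring the collapsed ones."""
    return collapsed[start:start + rows]
```

With this sample data, adjacent collapsing leaves 7-eleven in the list twice (ids 1 and 4), because document 3 sits between the two runs, while non-adjacent collapsing shows it once with a `collapse_count` of 2, which matches the "show 7-eleven once plus the number of matching locations" requirement from the original question.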
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Thanks Uri. Your personal suggestion is appreciated and I think I'll follow your advice. We're still early in development and 1.4 would be a good choice. I hope I can get field collapsing to work with my requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness ubon...@gmail.com wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. Martijn (which is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the later. 1.4 is supposed to be released in the upcoming week or two and it bring loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. 
On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hart to tell. All latest patch are certainly *not* 1.3 compatible (I think they're also depending on some changes in lucene which are not available for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for single solr instance, but distributed search (ie. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the former collapses only document which happen to be located next to each other in the otherwise-non-collapsed results set. The later (the non-adjacent) one collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed results set). Note, that non-adjacent performs better than adjacent one. There's currently discussion to extend this support so in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri R. Tan wrote: I think this is what I'm looking for. What is the status of this patch? On Thu, Sep 3, 2009 at 12:00 PM, R. 
Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be grouped into one parent business (such as 7-eleven having several locations). On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward, but showing the locations within a result is quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R
Re: Field Collapsing (was Re: Schema for group/child entity setup)
You can check out http://www.ilocal.nl. If you search for a bank in Amsterdam then you'll see that a lot of the results are collapsed. For this we used an older version of this patch (which works on 1.3) but a lot has changed since then. We're currently using this patch on another project, but it's not live yet. Uri R. Tan wrote: Thanks Uri. Your personal suggestion is appreciated and I think I'll follow your advice. We're still early in development and 1.4 would be a good choice. I hope I can get field collapsing to work with my requirements. Do you know any live site using field collapsing already? On Sat, Sep 5, 2009 at 5:57 PM, Uri Boness ubon...@gmail.com wrote: There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. This work is not committed yet to the latest patch, but will be very soon. There is of course a drawback to that as well, the collapsed documents set can be very large (depends on your data of course) in which case the returned result which includes the fields values can be rather large, which will impact performance, this is why this feature will be enabled only if you specify this extra parameter - by default no field values will be returned. AFAIK, the latest patch should work fine with the latest build. Martijn (which is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would give it a try first with the latest build). BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the later. 1.4 is supposed to be released in the upcoming week or two and it bring loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion. cheers, Uri R. Tan wrote: Okay. 
Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much. On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness ubon...@gmail.com wrote: The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning taking only the returned documents in account (ignoring the collapsed ones). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hart to tell. All latest patch are certainly *not* 1.3 compatible (I think they're also depending on some changes in lucene which are not available for solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for single solr instance, but distributed search (ie. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing - adjacent and non-adjacent, the former collapses only document which happen to be located next to each other in the otherwise-non-collapsed results set. The later (the non-adjacent) one collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed results set). Note, that non-adjacent performs better than adjacent one. 
There's currently discussion to extend this support so that, in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri
R. Tan wrote: I think this is what I'm looking for. What is the status of this patch?
On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be grouped into one parent business (such as 7-eleven having several locations). On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward but the locations within a result are quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Great. Nice site and very similar to my requirements. There's work on the patch that is being done now which will enable you to ask for specific field values of the collapsed documents using a dedicated request parameter. So, right now, you get all field values by default?
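The opt-in behaviour discussed in this message (field values of collapsed documents are returned only when a dedicated request parameter asks for them, and nothing is returned by default) could be pictured roughly as follows. This is only an illustrative Python sketch, not the patch's code: the function name `build_collapse_info`, the `include_fields` argument and the `collapse_count` / `docs` keys are all hypothetical stand-ins, since the thread does not name the actual parameter or response layout.

```python
def build_collapse_info(hidden_by_value, include_fields=None):
    """Summarise collapsed (hidden) documents per collapse value.

    hidden_by_value: dict mapping a collapse field value to the list of
    documents that were collapsed away for that value.
    include_fields: hypothetical opt-in list of field names; when it is
    None or empty, only counts are returned (the default behaviour).
    """
    info = {}
    for value, hidden_docs in hidden_by_value.items():
        entry = {"collapse_count": len(hidden_docs)}
        if include_fields:
            # Copy only the requested fields to keep the response small.
            entry["docs"] = [
                {name: doc[name] for name in include_fields if name in doc}
                for doc in hidden_docs
            ]
        info[value] = entry
    return info


hidden = {
    "7-eleven": [
        {"id": 2, "city": "Austin", "phone": "555-0100"},
        {"id": 3, "city": "Dallas"},
    ]
}
default_info = build_collapse_info(hidden)
with_city = build_collapse_info(hidden, include_fields=["city"])
```

Keeping the default response down to counts reflects the performance concern raised in the thread: a large collapsed set with full field values can make the response very large.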
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much.
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Anybody using it on public site? Would love to see some live examples. On Sat, Sep 5, 2009 at 12:50 AM, R. Tan tanrihae...@gmail.com wrote: Okay. Thanks for giving an insight on how it works in general. Without trying it myself, are the field values for the collapsed ones also part of the results data? What is the latest build that is safe to use on a production environment? I'd probably go for that and use field collapsing. Thank you very much.
Re: Field Collapsing (was Re: Schema for group/child entity setup)
The development on this patch is quite active. It works well for a single Solr instance, but distributed search (i.e. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing: adjacent and non-adjacent. The former collapses only documents which happen to be located next to each other in the otherwise-non-collapsed result set. The latter (the non-adjacent one) collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed result set). Note that non-adjacent performs better than the adjacent one. There's currently discussion to extend this support so that, in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri
R. Tan wrote: I think this is what I'm looking for. What is the status of this patch?
On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be grouped into one parent business (such as 7-eleven having several locations). On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward but the locations within a result are quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R
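The difference between the two flavors described in this message can be illustrated with a small stand-alone Python sketch. It operates on an already-ranked list of plain dicts and is not the patch's implementation; the function names, the `business` field and the `collapsed_count` key are assumptions made up for the example.

```python
from itertools import groupby


def collapse_adjacent(docs, field):
    """Collapse only runs of neighbouring documents sharing a field value."""
    collapsed = []
    # groupby() groups consecutive items only, which is exactly "adjacent".
    for _, run in groupby(docs, key=lambda d: d[field]):
        run = list(run)
        collapsed.append({"doc": run[0], "collapsed_count": len(run) - 1})
    return collapsed


def collapse_non_adjacent(docs, field):
    """Collapse all documents sharing a field value, wherever they appear."""
    by_value = {}
    for doc in docs:
        entry = by_value.get(doc[field])
        if entry is None:
            # First document seen for this value represents the group.
            by_value[doc[field]] = {"doc": doc, "collapsed_count": 0}
        else:
            entry["collapsed_count"] += 1
    # Dicts preserve insertion order, so result order follows first appearance.
    return list(by_value.values())


ranked = [
    {"id": 1, "business": "7-eleven"},
    {"id": 2, "business": "7-eleven"},
    {"id": 3, "business": "Shell"},
    {"id": 4, "business": "7-eleven"},
]
adjacent = collapse_adjacent(ranked, "business")
non_adjacent = collapse_non_adjacent(ranked, "business")
```

With this input, adjacent collapsing still shows 7-eleven twice (documents 1 and 4), because document 3 sits between the two runs, while non-adjacent collapsing shows it once.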
Re: Field Collapsing (was Re: Schema for group/child entity setup)
Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R
Re: Field Collapsing (was Re: Schema for group/child entity setup)
The collapsed documents are represented by one master document which can be part of the normal search result (the doc list), so pagination just works as expected, meaning only the returned documents are taken into account (the collapsed ones are ignored). As for the scoring, the master document is actually the document with the highest score in the collapsed group. As for Solr 1.3 compatibility... well... it's very hard to tell. All the latest patches are certainly *not* 1.3 compatible (I think they also depend on some changes in Lucene which are not available for Solr 1.3). I guess you'll have to try some of the old patches, but I'm not sure about their stability. cheers, Uri
R. Tan wrote: Thanks Uri. How does paging and scoring work when using field collapsing? What patch works with 1.3? Is it production ready? R
On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness ubon...@gmail.com wrote: The development on this patch is quite active. It works well for a single Solr instance, but distributed search (i.e. shards) is not yet supported. Using this page you can group search results based on a specific field. There are two flavors of field collapsing: adjacent and non-adjacent. The former collapses only documents which happen to be located next to each other in the otherwise-non-collapsed result set. The latter (the non-adjacent one) collapses all documents with the same field value (regardless of their position in the otherwise-non-collapsed result set). Note that non-adjacent performs better than the adjacent one. There's currently discussion to extend this support so that, in addition to collapsing the documents, extra information will be returned for the collapsed documents (see the discussion on the issue page). Uri
R. Tan wrote: I think this is what I'm looking for. What is the status of this patch?
On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be grouped into one parent business (such as 7-eleven having several locations). On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward but the locations within a result are quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R
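The paging and scoring behaviour described in this message (each group is represented by its highest-scoring document, and pagination counts only those master documents) can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the patch's code: the names `collapse_with_master` and `page`, the `score` and `business` keys, and the choice to order masters by score are all hypothetical.

```python
def collapse_with_master(docs, field):
    """Group documents by a field and keep the highest-scoring one as master."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)

    masters = []
    for members in groups.values():
        master = max(members, key=lambda d: d["score"])
        masters.append({"master": master, "collapsed_count": len(members) - 1})

    # Assumption for this sketch: results are ordered by the master's score.
    masters.sort(key=lambda g: g["master"]["score"], reverse=True)
    return masters


def page(masters, start, rows):
    """Paginate over master documents only; collapsed ones are not counted."""
    return masters[start:start + rows]


matches = [
    {"id": "a", "business": "7-eleven", "score": 0.4},
    {"id": "b", "business": "7-eleven", "score": 0.9},
    {"id": "c", "business": "Shell", "score": 0.7},
    {"id": "d", "business": "ABN", "score": 0.2},
]
masters = collapse_with_master(matches, "business")
first_page = page(masters, start=0, rows=2)
second_page = page(masters, start=2, rows=2)
```

Here four matching documents become three results, so a page size of two yields two pages, and the `collapsed_count` on the 7-eleven entry gives the "how many other locations matched" figure asked about in the original question.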
Field Collapsing (was Re: Schema for group/child entity setup)
I think this is what I'm looking for. What is the status of this patch?
On Thu, Sep 3, 2009 at 12:00 PM, R. Tan tanrihae...@gmail.com wrote: Hi Solrers, I would like to get your opinion on how to best approach a search requirement that I have. The scenario is I have a set of business listings that may be grouped into one parent business (such as 7-eleven having several locations). On the results page, I only want 7-eleven to show up once but also show how many locations matched the query (facet filtered by state, for example) and maybe a preview of some of the locations. Searching for the business name is straightforward but the locations within a result are quite tricky. I can do the opposite, searching for the locations and faceting on business names, but it will still basically be the same thing and repeat results with the same business name. Any advice? Thanks, R