[jira] [Commented] (SOLR-6312) CloudSolrServer doesn't honor updatesToLeaders constructor argument
[ https://issues.apache.org/jira/browse/SOLR-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208362#comment-16208362 ] Jeff Wartes commented on SOLR-6312: --- As of 7.0.1, three years later, yes, I think this is still an open issue. CloudSolrServer.Builder has two functions that have no effect: sendUpdatesOnlyToShardLeaders and sendUpdatesToAllReplicasInShard. They are not marked deprecated, and the javadoc implies they work. > CloudSolrServer doesn't honor updatesToLeaders constructor argument > --- > > Key: SOLR-6312 > URL: https://issues.apache.org/jira/browse/SOLR-6312 > Project: Solr > Issue Type: Bug > Affects Versions: 4.9 > Reporter: Steve Davids > Fix For: 4.10 > > Attachments: SOLR-6312.patch > > > The CloudSolrServer doesn't use the updatesToLeaders property - all SolrJ > requests are being sent to the shard leaders. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7269) ZK as truth for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838581#comment-15838581 ] Jeff Wartes commented on SOLR-7269: --- Any life still here? I've always thought it was strange that Solr effectively has two sources of truth (disk and ZK). > ZK as truth for SolrCloud > - > > Key: SOLR-7269 > URL: https://issues.apache.org/jira/browse/SOLR-7269 > Project: Solr > Issue Type: Improvement > Reporter: Varun Thacker > > We have been wanting to do this for a long time. > Mark listed out what should go into this here: > https://issues.apache.org/jira/browse/SOLR-7248?focusedCommentId=14363441=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14363441 > The best approach, as Mark suggested, would be to work on these under > legacyCloud=false and, once we are confident, switch over to it as the default.
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816053#comment-15816053 ] Jeff Wartes commented on SOLR-5170: --- Well, yes, I'm interested. I've got enough other work projects going at the moment that I'm not sure I'll be able to dedicate much time in the next month or two, but I wouldn't mind trying to chip away at it. I don't want to pollute this issue, so if you have a few minutes and could drop me an email with any pointers about the code areas involved, or references to any prior art you're aware of, I expect that'd accelerate things a lot. Thanks. > Spatial multi-value distance sort via DocValues > --- > > Key: SOLR-5170 > URL: https://issues.apache.org/jira/browse/SOLR-5170 > Project: Solr > Issue Type: New Feature > Components: spatial > Reporter: David Smiley > Assignee: David Smiley > Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, > SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, > SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt > > > The attached patch implements spatial multi-value distance sorting. In other > words, a document can have more than one point per field, and using a > provided function query, it will return the distance to the closest point. > The data goes into binary DocValues, and as such it's pretty friendly to > realtime search requirements, and it only uses 8 bytes per point.
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812871#comment-15812871 ] Jeff Wartes commented on SOLR-5170: --- It's coming up on two years, and I'm aware there have been some significant changes to areas like docvalues and geospatial since the last update to this issue. What's the state of the world now? If you have entities with multiple locations, and you want to filter and sort, is this patch still the highest-performance option available? I'm more willing to give up on the real-time-friendliness these days, if that changes the answer.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743325#comment-15743325 ] Jeff Wartes commented on SOLR-4735: --- Understood, and not all cores are part of a collection. But if it matches the solrcloud convention, it would be pretty nice to use it. (and the node name if it doesn't) I could've sworn I saw an existing function for picking a node name apart somewhere, but I can't seem to find it now - maybe it was in a patch I read or something. > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Alan Woodward >Assignee: Andrzej Bialecki >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, > SOLR-4735.patch, screenshot-1.png > > > Following on from a discussion on the mailing list: > http://search-lucene.com/m/IO0EI1qdyJF1/codahale=Solr+metrics+in+Codahale+metrics+and+Graphite+ > It would be good to make Solr play more nicely with existing devops > monitoring systems, such as Graphite or Ganglia. Stats monitoring at the > moment is poll-only, either via JMX or through the admin stats page. I'd > like to refactor things a bit to make this more pluggable. > This patch is a start. It adds a new interface, InstrumentedBean, which > extends SolrInfoMBean to return a > [[Metrics|http://metrics.codahale.com/manual/core/]] MetricRegistry, and a > couple of MetricReporters (which basically just duplicate the JMX and admin > page reporting that's there at the moment, but which should be more > extensible). The patch includes a change to RequestHandlerBase showing how > this could work. The idea would be to eventually replace the getStatistics() > call on SolrInfoMBean with this instead. > The next step would be to allow more MetricReporters to be defined in > solrconfig.xml. 
The Metrics library comes with ganglia and graphite > reporting modules, and we can add contrib plugins for both of those. > There's some more general cleanup that could be done around SolrInfoMBean > (we've got two plugin handlers at /mbeans and /plugins that basically do the > same thing, and the beans themselves have some weirdly inconsistent data on > them - getVersion() returns different things for different impls, and > getSource() seems pretty useless), but maybe that's for another issue.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743207#comment-15743207 ] Jeff Wartes commented on SOLR-4735: --- That's almost perfect. Can we replace those underscores with dots? That would mean the dashboard doesn't need to regex the "name" in order to group similar metrics.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743201#comment-15743201 ] Jeff Wartes commented on SOLR-4735: --- Oh, one thing just occurred to me, though. There are essentially two classes of request to a collection: the top-level request, and the per-shard fan-out requests. I guess you can sort of derive the metrics of the top-level request from the per-core metrics, but that requires knowing the number of shards, and it still only works if the two classes of request are not mixed together.
[jira] [Comment Edited] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742740#comment-15742740 ] Jeff Wartes edited comment on SOLR-4735 at 12/12/16 6:38 PM: - I've fallen behind keeping up with your changes, but for what it's worth, I agree with this. Collection-level metrics are at the cluster level, in aggregate. It's up to the thing you're reporting the metrics into to do the aggregation. For example, what I really want on my dashboard in grafana is a line, something like: AVG(solr.[all nodes].[all cores belonging to a particular collection].latency.p95) Then I can drill into a particular node, or core, in my reporting tool if I want. There's a requirement that the metric namespaces being reported allow for aggregation like this, which might mean a core needs to know the collection to which it belongs, but I don't think the node itself should need to report collection metrics.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742740#comment-15742740 ] Jeff Wartes commented on SOLR-4735: --- I've fallen behind keeping up with your changes, but for what it's worth, I agree with this. Collection-level metrics are at the cluster level, in aggregate. It's up to the thing you're reporting the metrics into to do the aggregation. For example, what I really want on my dashboard in grafana is a line, something like: AVG(solr.{all nodes}.{all cores belonging to a particular collection}.latency.p95) Then I can drill into a particular node, or core, in my reporting tool if I want. There's a requirement that the metric namespaces being reported allow for aggregation like this, which might mean a core needs to know the collection to which it belongs, but I don't think the node itself should need to report collection metrics.
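The aggregation idea in the comment above is easy to sketch: if metric names are dot-separated with the core (and, by convention, its collection) embedded in the path, a reporting tool can average a percentile across cores without regexes. This is a minimal illustration only; the metric keys and values below are hypothetical, not Solr's actual metric names.

```python
# Sketch: why dot-separated metric names make cross-node aggregation easy.
# Metric keys and p95 values are made up for illustration.
metrics = {
    "solr.node1.shard1_replica1.latency.p95": 120.0,
    "solr.node2.shard2_replica1.latency.p95": 80.0,
    "solr.node1.other_core.latency.p95": 300.0,  # different collection, excluded
}

def avg_p95(metrics, cores):
    """Average p95 latency over the cores belonging to one collection."""
    vals = [v for k, v in metrics.items()
            if k.split(".")[2] in cores and k.endswith("latency.p95")]
    return sum(vals) / len(vals)

print(avg_p95(metrics, {"shard1_replica1", "shard2_replica1"}))  # 100.0
```

A tool like Graphite does the same thing with a wildcard target such as `avg(solr.*.<cores>.latency.p95)`, which is why the namespace layout matters more than any aggregation done inside Solr.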
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15716772#comment-15716772 ] Jeff Wartes commented on SOLR-4735: --- That seems pretty viable too. As I mentioned, the memory overhead of a registry is pretty low: just a concurrent map and a list. Plus, the actual metric objects in the map would be shared by both registries, so I'd be more concerned about the work involved in keeping them synchronized than with just having multiple registries. I confess, though, I don't have a clear idea whether that's more or less overhead than multiple identically-configured reporters. It feels like most of the possible performance issues here are linear, so it may not matter. Two reporters iterating through 10 metrics each sounds pretty much the same as one reporter iterating over 20, all else being equal.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15716247#comment-15716247 ] Jeff Wartes commented on SOLR-4735: --- Yeah, I get that. I like this line of thought because it means we can create as many registries as make sense (cores, collections, logical code sections, etc.) without worrying about how to get everything reported; we only have to pick some names. What about a class that extends MetricRegistry and also implements MetricRegistryListener? Call it a ListeningMetricRegistry or something. When the configuration asks for a reporter on some set of (registry) names, we create a new, perhaps non-shared ListeningMetricRegistry, use registerAll to scoop the metrics in the desired registries into it, and then call addListener on all the desired registries with the ListeningMetricRegistry so everything stays in sync. So that could still mean a single registry with a ton of metrics, but only in cases where there's been an explicit request for a reporter on a ton of metrics.
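The ListeningMetricRegistry mechanism described above can be sketched language-agnostically. Real code would build on Dropwizard's com.codahale.metrics.MetricRegistry and MetricRegistryListener; the tiny classes below are stand-ins that only mirror the relevant parts of that API (registerAll to copy existing metrics, addListener to catch later additions), and the metric names are invented for the example.

```python
# Sketch of the ListeningMetricRegistry idea: a combined registry copies the
# existing metrics out of each source registry, then registers itself as a
# listener so metrics added later stay in sync. Metric objects are shared,
# not duplicated, so the combined view is cheap.
class MetricRegistry:
    def __init__(self):
        self.metrics = {}    # name -> metric object (shared references)
        self.listeners = []

    def register(self, name, metric):
        self.metrics[name] = metric
        for listener in self.listeners:
            listener.on_metric_added(name, metric)

    def add_listener(self, listener):
        self.listeners.append(listener)

class ListeningMetricRegistry(MetricRegistry):
    """A registry built only to have a reporter attached to it."""
    def attach_to(self, source):
        for name, metric in source.metrics.items():  # registerAll equivalent
            self.register(name, metric)
        source.add_listener(self)                    # stay in sync afterwards

    def on_metric_added(self, name, metric):
        self.register(name, metric)

core_reg, node_reg = MetricRegistry(), MetricRegistry()
core_reg.register("core1.requests", object())

combined = ListeningMetricRegistry()
combined.attach_to(core_reg)
combined.attach_to(node_reg)

node_reg.register("jvm.heap.used", object())  # added later, still mirrored
print(sorted(combined.metrics))  # ['core1.requests', 'jvm.heap.used']
```

As the comment notes, the source registries' names must not collide, since the combined registry flattens them into one namespace.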
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715880#comment-15715880 ] Jeff Wartes commented on SOLR-4735: --- `MetricRegistry` is really just a bunch of convenience methods and thread-safety around a `MetricSet`. There isn't much overhead difference between the two. But really, when I think of a MetricRegistry, I think of it as "a set of metrics I want to attach a reporter to", nothing more. It's a bit disappointing that reporters take a Registry instead of a MetricSet, since a Registry is a MetricSet. With that in mind, one strategy would be to have every logical grouping of metrics use its own dedicated (probably shared) registry, and then bind the reporter-registry concept together at reporter definition time. That is, create a non-shared registry explicitly for the purpose of attaching a reporter to it, and only when asked to define a reporter. The reporter definition would then include the names of the registries to be reported. Under the hood, a new registry would be created as the union of the requested registries, and the reporter instantiated and attached to that. We'd have to make sure the namespace of all the metrics in the metric groups is unique, so that arbitrary groups can be combined without conflict, but that sounds desirable regardless.
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707000#comment-15707000 ] Jeff Wartes commented on SOLR-4735: --- Heh, I wondered whether something like that would happen if I commented on GitHub. Should I constrain myself to talking in Jira?
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696479#comment-15696479 ] Jeff Wartes commented on SOLR-4735: --- I had a scheme for collapsible namespaced registries in my original PR for SOLR-8785. > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Andrzej Bialecki >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693874#comment-15693874 ] Jeff Wartes commented on SOLR-4735: --- From what I see, SolrMetricManager only needs the SolrCore for the config-based reporter instantiation, but that's a pretty nice thing to have. How about SolrMetricManager takes, as an optional second parameter to its constructor, the name of a SharedMetricRegistry? If absent, it creates a new, isolated registry. With a name, though, the config-based reporters you attach are actually attached to the shared registry, and they pull in whatever else happens to be in there too. Of course, the core unregister action then needs to be careful to replace/reset only the metrics it had added to the registry, instead of all of them as currently written. It could remove/replace the reporters on every core reload with no real issue, though (aside from possibly a blip in the reporting interval). > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Andrzej Bialecki >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690992#comment-15690992 ] Jeff Wartes commented on SOLR-4735: --- For what it's worth, this looks like really great stuff to me. I'm still unconvinced that metrics should always get reset on core reload, which is a source of some complexity, but doing so is certainly consistent with the prior behavior, so I can hardly complain. I think I can see a path to providing reportable metrics outside of the RequestHandler. I'd be interested in Kelvin's thoughts on that subject though, since he chose not to use SharedMetricRegistries. > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-8785) Use Metrics library for core metrics
[ https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15684725#comment-15684725 ] Jeff Wartes commented on SOLR-8785: --- Understood, I'm all for incremental change, and I don't see "how to make a Reporter" as part of this issue. I will be slightly disappointed, though, if we convert to the library without also providing a recommended access path for the use of that library. Gathering metrics you can't report on is useless, and one of the things I liked about the original patch was this:
{code}
if (this.pluginInfo == null) {
  // if a request handler has a name, use a persistent, reportable timer under that name
  if (pluginInfo.name != null) {
    requestTimes = Metrics.namedTimer(Metrics.mkName(this.getClass(), pluginInfo.name), REGISTRY_NAME);
  }
  this.pluginInfo = pluginInfo;
}
{code}
This meant that I automatically got access to all the relevant metrics for any named request handler, using any Reporters (Log, JMX, Graphite, whatever) I cared to attach. This, in turn, was only possible because all those metrics were in a well-defined and accessible location. > Use Metrics library for core metrics > > > Key: SOLR-8785 > URL: https://issues.apache.org/jira/browse/SOLR-8785 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.1 >Reporter: Jeff Wartes > Labels: patch, patch-available > Attachments: SOLR-8785-increment.patch, SOLR-8785.patch, > SOLR-8785.patch > > > The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a > well-known way to track metrics about applications. > In SOLR-1972, latency percentile tracking was added. The comment list is > long, so here’s my synopsis: > 1. An attempt was made to use the Metrics library > 2. That attempt failed due to a memory leak in Metrics v2.1.1 > 3. Large parts of Metrics were then copied wholesale into the > org.apache.solr.util.stats package space and that was used instead. > Copy/pasting Metrics code into Solr may have been the correct solution at the > time, but I submit that it isn’t correct any more. > The leak in Metrics was fixed even before SOLR-1972 was released, and by > copy/pasting a subset of the functionality, we miss access to other important > things that the Metrics library provides, particularly the concept of a > Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters) > Further, Metrics v3.0.2 is already packaged with Solr anyway, because it’s > used in two contrib modules. (map-reduce and morphlines-core) > I’m proposing that: > 1. Metrics as bundled with Solr be upgraded to the current v3.1.2 > 2. Most of the org.apache.solr.util.stats package space be deleted outright, > or gutted and replaced with simple calls to Metrics. Due to the copy/paste > origin, the concepts should mostly map 1:1. > I’d further recommend a usage pattern like: > SharedMetricRegistries.getOrCreate(System.getProperty(“solr.metrics.registry”, “solr-registry”)) > There are all kinds of areas in Solr that could benefit from metrics tracking > and reporting. This pattern allows diverse areas of code to track metrics > within a single, named registry. This well-known name then becomes a handle > you can use to easily attach a Reporter and ship all of those metrics off-box. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
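The "well-known name as a handle" pattern proposed above can be sketched with a stdlib stand-in. This is an illustration only: the map-of-named-registries below mimics what Dropwizard's SharedMetricRegistries.getOrCreate provides, and the registry shape and metric name ("requests") are assumptions, not Solr code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class NamedRegistries {
    // Stand-in for SharedMetricRegistries: one global map of named registries,
    // so any code path can reach the same registry through its well-known name.
    static final Map<String, Map<String, LongAdder>> REGISTRIES = new ConcurrentHashMap<>();

    static Map<String, LongAdder> getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    public static void main(String[] args) {
        // The registry name comes from a system property, as the proposal suggests.
        String name = System.getProperty("solr.metrics.registry", "solr-registry");
        Map<String, LongAdder> registry = getOrCreate(name);
        registry.computeIfAbsent("requests", k -> new LongAdder()).increment();

        // A "reporter" elsewhere looks up the same registry by name and sees the metric.
        Map<String, LongAdder> sameRegistry = getOrCreate(name);
        System.out.println(sameRegistry.get("requests").sum());
    }
}
```

The point of the pattern is exactly this last step: a reporter needs no reference to the code that recorded the metric, only the registry's well-known name.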
[jira] [Commented] (SOLR-4735) Improve Solr metrics reporting
[ https://issues.apache.org/jira/browse/SOLR-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626442#comment-15626442 ] Jeff Wartes commented on SOLR-4735: --- I have, and am, by instantiating a SharedMetricRegistry and GraphiteReporter directly in the jetty.xml. (Which is hacky, but in the absence of SOLR-8785 it does work fine.) I'm also using the logging and JVM metrics plugins quite happily. > Improve Solr metrics reporting > -- > > Key: SOLR-4735 > URL: https://issues.apache.org/jira/browse/SOLR-4735 > Project: Solr > Issue Type: Improvement >Reporter: Alan Woodward >Assignee: Alan Woodward >Priority: Minor > Attachments: SOLR-4735.patch, SOLR-4735.patch, SOLR-4735.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-8785) Use Metrics library for core metrics
[ https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595671#comment-15595671 ] Jeff Wartes commented on SOLR-8785: --- For the record, it looks like I wrote this patch against master, at around version 6.1. I recall I had some concern at the time that the metrics namespace generation was too flexible (complicated), so that's something to look at. > Use Metrics library for core metrics > > > Key: SOLR-8785 > URL: https://issues.apache.org/jira/browse/SOLR-8785 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.1 >Reporter: Jeff Wartes > Labels: patch, patch-available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-4449: -- Labels: patch patch-available (was: patch-available) > Enable backup requests for the internal solr load balancer > -- > > Key: SOLR-4449 > URL: https://issues.apache.org/jira/browse/SOLR-4449 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: philip hoy >Priority: Minor > Labels: patch, patch-available > Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, > patch-4449.txt, solr-back-request-lb-plugin.jar > > > Add the ability to configure the built-in solr load balancer such that it > submits a backup request to the next server in the list if the initial > request takes too long. Employing such an algorithm could improve the latency > of the 9xth percentile albeit at the expense of increasing overall load due > to additional requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8785) Use Metrics library for core metrics
[ https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-8785: -- Labels: patch patch-available (was: patch-available) > Use Metrics library for core metrics > > > Key: SOLR-8785 > URL: https://issues.apache.org/jira/browse/SOLR-8785 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.1 >Reporter: Jeff Wartes > Labels: patch, patch-available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-4449: -- Labels: patch-available (was: ) > Enable backup requests for the internal solr load balancer > -- > > Key: SOLR-4449 > URL: https://issues.apache.org/jira/browse/SOLR-4449 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: philip hoy >Priority: Minor > Labels: patch-available > Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, patch-4449.txt, solr-back-request-lb-plugin.jar -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SOLR-8785) Use Metrics library for core metrics
[ https://issues.apache.org/jira/browse/SOLR-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-8785: -- Labels: patch-available (was: ) > Use Metrics library for core metrics > > > Key: SOLR-8785 > URL: https://issues.apache.org/jira/browse/SOLR-8785 > Project: Solr > Issue Type: Improvement >Affects Versions: 4.1 >Reporter: Jeff Wartes > Labels: patch-available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SOLR-6581) Efficient DocValues support and numeric collapse field implementations for Collapse and Expand
[ https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392504#comment-15392504 ] Jeff Wartes commented on SOLR-6581: --- For what it's worth, I recall having a bad experience with that hint in a Solr 5.4 cluster late last year. I never did dig into why though. I had a similar case where I was collapsing on a highly distinct field, and as Joel indicates, the memory allocation rate was bad enough I had to give up on the whole thing. Joel and I discussed this a little in SOLR-9125 if you're curious. > Efficient DocValues support and numeric collapse field implementations for > Collapse and Expand > -- > > Key: SOLR-6581 > URL: https://issues.apache.org/jira/browse/SOLR-6581 > Project: Solr > Issue Type: Bug >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Minor > Fix For: 5.0, 6.0 > > Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, > SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, > SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, > SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, > renames.diff > > > The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent > are optimized to work with a top level FieldCache. Top level FieldCaches have > a very fast docID to top-level ordinal lookup. Fast access to the top-level > ordinals allows for very high performance field collapsing on high > cardinality fields. > LUCENE-5666 unified the DocValues and FieldCache api's so that the top level > FieldCache is no longer in regular use. Instead all top level caches are > accessed through MultiDocValues. > This ticket does the following: > 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the > default approach when collapsing on String fields > 2) Provides an option to use a top level FieldCache if the performance of > MultiDocValues is a blocker. 
The mechanism for switching to the FieldCache is > a new "hint" parameter. If the hint parameter is set to "top_fc" then the > top-level FieldCache would be used for both Collapse and Expand. > Example syntax: > {code} > fq={!collapse field=x hint=TOP_FC} > {code} > 3) Adds numeric collapse field implementations. > 4) Resolves issue SOLR-6066 > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9335) Move solr stats collections to use LongAdder
[ https://issues.apache.org/jira/browse/SOLR-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392297#comment-15392297 ] Jeff Wartes commented on SOLR-9335: --- fwiw, SOLR-8241 involves cache implementations that (among other improvements) use LongAdder, and the author has been having trouble getting committer attention. > Move solr stats collections to use LongAdder > > > Key: SOLR-9335 > URL: https://issues.apache.org/jira/browse/SOLR-9335 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Varun Thacker >Assignee: Varun Thacker >Priority: Minor > Fix For: 6.2, master (7.0) > > Attachments: SOLR-9335.patch, SOLR-9335.patch > > > With Java 8 we can use LongAdder, which has more throughput under high > contention. > These classes of Solr should benefit from LongAdder: > - Caches ( ConcurrentLRUCache / LRUCache ) > - Searches ( RequestHandlerBase ) > - Updates ( DirectUpdateHandler2 ) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
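As a quick illustration of why LongAdder beats a single AtomicLong for hot counters like the ones above: increments are striped across internal cells under contention and only summed on read. A minimal sketch (thread and iteration counts are arbitrary):

```java
import java.util.concurrent.atomic.LongAdder;

public class LongAdderDemo {
    public static void main(String[] args) throws InterruptedException {
        LongAdder requests = new LongAdder();
        // Eight threads hammering the same counter; LongAdder avoids the
        // CAS retry loop a contended AtomicLong would spin on.
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) requests.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(requests.sum()); // 800000
    }
}
```

The trade-off is that sum() is not a snapshot under concurrent writes, which is fine for throughput stats.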
[jira] [Created] (SOLR-9133) UUID FieldType shouldn't be stored as a String
Jeff Wartes created SOLR-9133: - Summary: UUID FieldType shouldn't be stored as a String Key: SOLR-9133 URL: https://issues.apache.org/jira/browse/SOLR-9133 Project: Solr Issue Type: Improvement Reporter: Jeff Wartes This came up in passing on SOLR-6741 last year, but as far as I can tell, the solr UUIDField still indexes those UUIDs as strings, not as a 128-bit number. So really, the only point of the UUIDField over a StringField is that there's some validation and the possibility of a newly-generated value. Seems a little misleading. From what I can tell, Lucene has added a bunch of support for arbitrary-sized numbers and binary primitives (LUCENE-7043?), so it seems like the Solr UUID field should save some space and actually index UUIDs as what they are. Of course, since this would change the encoding of an existing field type, it might take the form of a new "CompressedUUIDField" or something instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
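To make the space argument concrete, here is a minimal sketch of indexing "UUIDs as what they are": 16 raw bytes instead of a 36-character string. This is an illustration of the encoding only, not the UUIDField implementation, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytes {
    // Pack a UUID into its canonical 16-byte big-endian form,
    // versus the 36 characters of its string representation.
    static byte[] toBytes(UUID u) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(u.getMostSignificantBits());
        buf.putLong(u.getLeastSignificantBits());
        return buf.array();
    }

    public static void main(String[] args) {
        UUID u = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(u.toString().length()); // 36
        System.out.println(toBytes(u).length);     // 16
    }
}
```

At 16 bytes per value the binary form is less than half the size of the string, before any index-level prefix compression is considered.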
[jira] [Commented] (SOLR-9125) CollapseQParserPlugin allocations are index based, not query based
[ https://issues.apache.org/jira/browse/SOLR-9125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15287339#comment-15287339 ] Jeff Wartes commented on SOLR-9125: --- Isn't there a chicken-and-egg situation there? You need the set of matching docs to figure out the HLL.cardinality to specify the initial size of the map you're going to save the set of matching docs in? Or maybe collect() would just throw every doc in the FBS, and finish() would do all the finding group heads and collapsing? > CollapseQParserPlugin allocations are index based, not query based > -- > > Key: SOLR-9125 > URL: https://issues.apache.org/jira/browse/SOLR-9125 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Jeff Wartes >Priority: Minor > Labels: collapsingQParserPlugin > > Among other things, CollapsingQParserPlugin’s OrdScoreCollector allocates > space per-query for: > 1 int (doc id) per ordinal > 1 float (score) per ordinal > 1 bit (FixedBitSet) per document in the index > > So the higher the cardinality of the thing you’re grouping on, and the more > documents in the index, the more memory gets consumed per query. Since high > cardinality and large indexes are the use-cases CollapseQParserPlugin was > designed for, I thought I'd point this out. > My real issue is that this does not vary based on the number of results in > the query, either before or after collapsing, so a query that results in one > doc consumes the same amount of memory as one that returns all of them. All > of the Collectors suffer from this to some degree, but I think OrdScore is > the worst offender. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9125) CollapseQParserPlugin allocations are index based, not query based
[ https://issues.apache.org/jira/browse/SOLR-9125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286940#comment-15286940 ] Jeff Wartes commented on SOLR-9125: --- I messed around a little bit, but I don't have a solution for this. I thought I'd file the issue anyway just to shine some light. I had attempted to use CollapseQParserPlugin on a very large index, collapsing on a field whose cardinality was about 1/7th the doc count... it didn't go well. Worse, the issue didn't come up until pretty late in the game, because at low query rates and/or on smaller indexes, the problem isn't evident. I abandoned the attempt. Some stuff I tried: - I thought about replacing the FBS with a DocIdSetBuilder, but DelegatingCollector.finish() gets called twice, and you can't DocIdSetBuilder.build() twice on the same builder. We'd need to save the first build() result and use it to initialize a new builder for the second, but I wasn't convinced I understood the distinction between the two passes. - I did one quick test where I replaced the "ords" and "scores" arrays with an IntIntScatterMap and an IntFloatScatterMap, thinking those would work better for small result sets. That ended up being worse (from a total-allocations standpoint) for the queries I was trying, probably due to the map resizing necessary. It might be possible to set initial size values from statistics and help this case that way. It would also be possible to encode the docId/score into a long and just use one IntLongScatterMap, but I didn't try that. 
> CollapseQParserPlugin allocations are index based, not query based > -- > > Key: SOLR-9125 > URL: https://issues.apache.org/jira/browse/SOLR-9125 > Project: Solr > Issue Type: Improvement > Components: query parsers >Reporter: Jeff Wartes >Priority: Minor > Labels: collapsingQParserPlugin > > Among other things, CollapsingQParserPlugin’s OrdScoreCollector allocates > space per-query for: > 1 int (doc id) per ordinal > 1 float (score) per ordinal > 1 bit (FixedBitSet) per document in the index > > So the higher the cardinality of the thing you’re grouping on, and the more > documents in the index, the more memory gets consumed per query. Since high > cardinality and large indexes are the use-cases CollapseQParserPlugin was > designed for, I thought I'd point this out. > My real issue is that this does not vary based on the number of results in > the query, either before or after collapsing, so a query that results in one > doc consumes the same amount of memory as one that returns all of them. All > of the Collectors suffer from this to some degree, but I think OrdScore is > the worst offender. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
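The single-map variant mentioned above (one IntLongScatterMap, with the doc id and score packed into the long value) wasn't tried, but the packing itself is mechanical. A sketch, with illustrative class and method names that are not from any actual patch:

```java
// Sketch of the single-map idea: pack a doc id and its score into one
// long, so a single int->long map could replace the parallel "ords" and
// "scores" arrays. Names are illustrative, not from the plugin.
public class DocScorePacker {
    // High 32 bits: doc id. Low 32 bits: raw IEEE-754 bits of the score.
    public static long pack(int docId, float score) {
        return ((long) docId << 32) | (Float.floatToRawIntBits(score) & 0xFFFFFFFFL);
    }

    public static int docId(long packed) {
        return (int) (packed >>> 32);
    }

    public static float score(long packed) {
        return Float.intBitsToFloat((int) packed);
    }
}
```

An HPPC IntLongScatterMap keyed by collapse ordinal would then hold these packed values; the round trip is lossless for non-negative doc ids and any float score.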
[jira] [Created] (SOLR-9125) CollapseQParserPlugin allocations are index based, not query based
Jeff Wartes created SOLR-9125: - Summary: CollapseQParserPlugin allocations are index based, not query based Key: SOLR-9125 URL: https://issues.apache.org/jira/browse/SOLR-9125 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Jeff Wartes Priority: Minor Among other things, CollapsingQParserPlugin’s OrdScoreCollector allocates space per-query for: 1 int (doc id) per ordinal 1 float (score) per ordinal 1 bit (FixedBitSet) per document in the index So the higher the cardinality of the thing you’re grouping on, and the more documents in the index, the more memory gets consumed per query. Since high cardinality and large indexes are the use-cases CollapseQParserPlugin was designed for, I thought I'd point this out. My real issue is that this does not vary based on the number of results in the query, either before or after collapsing, so a query that results in one doc consumes the same amount of memory as one that returns all of them. All of the Collectors suffer from this to some degree, but I think OrdScore is the worst offender. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8697) Fix LeaderElector issues
[ https://issues.apache.org/jira/browse/SOLR-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282958#comment-15282958 ] Jeff Wartes commented on SOLR-8697: --- Does this fix SOLR-6498? > Fix LeaderElector issues > > > Key: SOLR-8697 > URL: https://issues.apache.org/jira/browse/SOLR-8697 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 5.4.1 >Reporter: Scott Blum >Assignee: Mark Miller > Labels: patch, reliability, solrcloud > Fix For: 5.5.1, 6.0 > > Attachments: OverseerTestFail.log, SOLR-8697-followup.patch, > SOLR-8697.patch > > > This patch is still somewhat WIP for a couple of reasons: > 1) Still debugging test failures. > 2) This will more scrutiny from knowledgable folks! > There are some subtle bugs with the current implementation of LeaderElector, > best demonstrated by the following test: > 1) Start up a small single-node solrcloud. it should be become Overseer. > 2) kill -9 the solrcloud process and immediately start a new one. > 3) The new process won't become overseer. The old process's ZK leader elect > node has not yet disappeared, and the new process fails to set appropriate > watches. > NOTE: this is only reproducible if the new node is able to start up and join > the election quickly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15267176#comment-15267176 ] Jeff Wartes commented on LUCENE-7258: - Ok, yeah, that’s a reasonable thing to assume. We usually think of it in terms of cpu work, but filter caches would be an equally great way to mitigate allocations. But a cache is really only useful when you’ve got non-uniform query distributions, or enough time-locality at your query rate that your rare queries haven’t faced a cache eviction yet. I’m indexing address-type data. Not uncommon. I think that if my typical geospatial search were based on some hyper-local phone location, we’d be done talking, since a filter cache would be useless. So maybe we should assume I’m not doing that. Let’s assume I can get away with something coarse. Let’s assume I can convert all location-based queries to the center point of a city. Let’s further assume that I only care about one radius per city. Finally, let’s assume I’m only searching in the US. There are some 40,000 cities in the US, so those assumptions yield 40,000 possible queries. That’s not too bad. With a 100M-doc core, I think that’s about 12.5MB per filter cache entry. It could be less, I think, particularly with the changes in SOLR-8922, but since we’re only going with coarse queries, it’s reasonable to assume there’s going to be a lot of hits. I don’t need every city in the cache, of course, so maybe… 5%? That’s only some 25G of heap. Doable, especially since it saves allocation size and you could probably trade in more of the eden space. (Although this would make warmup more of a pain) I’d probably have to cross the CompressedOops boundary at 32G of heap to do that too though, so add another 16G to get back to baseline. Fortunately, the top 5% of cities probably maps to more than 5% of queries. More populated cities are also more likely targets for searching in most query corpuses. 
So assuming it’s the biggest 5% that are in the cache, maybe we can assume a 15% hit rate? 20%? Ok, so now I’ve spent something like 41G of heap, and I’ve reduced allocations by 20%. Is this pretty good? I suppose it’s worth noting that this also assumes a perfect cache eviction policy, (I’m pretty interested in SOLR-8241) and that there’s no other filter cache pressure. (At the least, I’m using facets - SOLR-8171) > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
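The back-of-envelope numbers above can be checked mechanically: a cached FixedBitSet filter costs one bit per document in the core. This is just that arithmetic, with illustrative method names:

```java
// Sanity check of the filter-cache sizing estimates in the comment above.
public class FilterCacheMath {
    // FixedBitSet cost: one bit per document in the core.
    public static long bytesPerEntry(long docsPerCore) {
        return docsPerCore / 8;
    }

    // Heap needed to cache some fraction of the possible queries.
    public static long cacheHeapBytes(long possibleQueries, double cachedFraction, long docsPerCore) {
        return (long) (possibleQueries * cachedFraction) * bytesPerEntry(docsPerCore);
    }

    public static void main(String[] args) {
        // 100M-doc core -> 12,500,000 bytes, i.e. the "12.5MB per entry" above.
        System.out.println(bytesPerEntry(100_000_000L));
        // 5% of 40,000 cities cached -> 25,000,000,000 bytes, the "25G of heap".
        System.out.println(cacheHeapBytes(40_000L, 0.05, 100_000_000L));
    }
}
```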
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265829#comment-15265829 ] Jeff Wartes commented on LUCENE-7258: - There are actually three threads going on this ticket right now: there’s the “what threshold and expansion to use for geospatial” one that I’d originally intended and provided a patch for, there’s the “what expansion for DocIdSetBuilder is generically optimal” one, and there’s the “FBS is 50% of my allocation rate, can we pool” conversation. I think the latter is a worthy conversation, and I don’t have a better place for it, so I’m going to continue to respond to the comments along those lines (with apologies for the book I’m writing here), but I wanted to point out the divergence. So, I certainly understand a knee-jerk reaction against using object pools of any kind. Yes, this IS what the JVM is for. It’s easier and simpler and lower maintenance to just use what’s provided. But I could also argue that Arrays.sort has all those same positive attributes, and that hasn’t stopped several hand-written sort algorithms from getting into this codebase. The question is whether the easy and simple thing is good enough, or whether the harder thing has a sufficient offsetting benefit. Everyone on this thread is a highly experienced programmer; we all know this. In this case, that means the question is whether the allocation rate is “good enough” or whether there's a sufficiently offsetting opportunity for improvement, and arguments should ideally come from that analysis. I can empirically state that for my large Solr index, GC pause is the single biggest detriment to my 90+th percentile query latency. Put another way, Lucene is fantastically fast, at least when the JVM isn’t otherwise occupied. Because of shard fan-out, a per-shard p90 latency very quickly becomes a p50 latency for queries overall. 
(Even with mitigations like SOLR-4449) I don’t think there’s anything particularly unique to my use-case in anything I just said, except possibly the word “large”. As such, I consider this an opportunity for improvement, so I’ve suggested a mitigation strategy. It clearly has some costs. I’d be delighted to entertain any alternative strategies. Actually, [~dsmiley] did bring up one alternative suggestion for improvement, so let’s talk about -Xmn: First, let’s assume that Lucene’s policy on G1 hasn’t changed, and we’re still talking about ParNew/CMS. Second, with the exception of a few things like cache, most of the allocations in a Solr/Lucene index are very short-lived. So it follows that given a young generation of sufficient size, the tenured generation would actually see very little activity. The major disadvantage to just using a huge young generation then is that there aren’t any concurrent young-generation collectors. The bigger it is, the less frequently you need to collect, but the longer the stop-the-world GC pause when you do. On the other end of the scale, a very small young space means shorter pauses, but far more frequent. Since almost all garbage is short-lived, maybe now you're doing young-collections so often that you’ve got the tenured collector doing a bunch of the work cleaning up short-lived objects too. (This can actually be a good thing, since the CMS collector is mostly concurrent) There’s some theoretical size that optimizes frequency vs pause for averaged latency. Perhaps even by deliberately allowing some premature overflow into tenured simply because tenured can be collected concurrently. This kind of thing is extremely delicate to tune for though, especially since query rate (and query type distribution) can fluctuate. It’s easy to get it wrong, such that a sudden large-allocation slams past the rate CMS was expecting and triggers a full-heap stop-the-world pause. I’m focusing on FBS here because: 1. _Fifty Percent_. 2. 
These are generally larger objects, so reducing those allocations seemed like a good way to mitigate unexpected changes in allocation rate and allow more stable tuning. There’s probably also at least one Jira issue in looking at object count allocation rate (vs size), since I suspect the single biggest factor in collector pause is the object count. Certainly I can point to objects that get allocated (by count) with orders of magnitude greater frequency than the next highest count. But since I don’t have a good understanding of the use cases, let alone any suggestions yet, I’ve left that for another time. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: >
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264309#comment-15264309 ] Jeff Wartes commented on LUCENE-7258: - I'm not sure I understand how the dangers of large FBS size would be any different with a pooling mechanism than they are right now. If a query needs several of them, then it needs several of them, whether they're freshly allocated or not. The only real difference I see might be whether that memory exists in the tenured space, rather than thrashing the eden space every time. I don't think it'd need to be per-thread. I don't mind points of synchronization if they're tight and well understood. Allocation rate by count is generally lower here. One thought: https://gist.github.com/randomstatistic/87caefdea8435d6af4ad13a3f92d2698 To anticipate some objections, there are likely lockless data structures you could use, and yes, you might prefer to control size in terms of memory instead of count. I can think of a dozen improvements per minute I spend looking at this. But you get the idea. Anyone anywhere who knows for *sure* they're done with a FBS can offer it up for reuse, and anyone can potentially get some reuse by just changing their "new" to "request". If everybody does this, you end up with a fairly steady pool of FBS instances large enough for most uses. If only some places use it, there's no chance of an unbounded leak, you might get some gain, and worst-case you haven't lost much. If nobody uses it, you've lost nothing. Last I checked, something like a full 50% of (my) allocations by size were FixedBitSets despite a low allocation rate by count, or I wouldn't be harping on the subject. As a matter of principle, I'd gladly pay heap to reduce GC. The fastest search algorithm in the world doesn't help me if I'm stuck waiting for the collector to finish all the time. 
> Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
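A minimal version of the loose-pool idea described above might look like the following. It is a sketch over raw long[] word arrays rather than Lucene's FixedBitSet, purely to keep it self-contained, and the class and method names are hypothetical rather than taken from the linked gist:

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of the opt-in pool idea: callers that know for sure they are done
// with a bitset's backing storage offer it back; callers that want one may
// get a recycled array large enough for their needs. If the pool is empty,
// the worst case is the same as today: a fresh allocation.
public class BitSetPool {
    private static final int MAX_POOLED = 8; // bound the pool by count
    private final ConcurrentLinkedQueue<long[]> pool = new ConcurrentLinkedQueue<>();

    // Request backing storage for at least numBits bits.
    // A larger recycled array is fine; a small request can reuse a big array.
    public long[] request(int numBits) {
        int neededWords = (numBits + 63) >>> 6; // 64 bits per long word
        for (long[] candidate : pool) {
            if (candidate.length >= neededWords && pool.remove(candidate)) {
                Arrays.fill(candidate, 0L); // must clear before reuse
                return candidate;
            }
        }
        return new long[neededWords]; // pool miss: no worse than "new"
    }

    // Offer storage back for reuse; silently drops it if the pool is full,
    // so partial adoption can never leak unboundedly.
    public void release(long[] bits) {
        if (pool.size() < MAX_POOLED) {
            pool.offer(bits);
        }
    }
}
```

The design matches the comment's framing: anyone can swap "new" for "request", adoption is incremental, and not using the pool costs nothing.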
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15260960#comment-15260960 ] Jeff Wartes commented on LUCENE-7258: - I'd be interested in trying TimSort, or something like [~yo...@apache.org] suggested where an ExpandingIntArray-style array of arrays is fed directly into the Radix sort, but I'm not sure I'm going to be able to commit much more time to this for a bit. That said, in the process of thinking about this, I do have a few git stashes saved off with sketches for things like using TimSort and using ExpandingIntArray that I could try to clean and post if anyone is interested. I also have one sketch I started for using a loose pool mechanism to front acquiring a FixedBitSet, but I didn't get deep enough to be able to tell with confidence that a FBS was actually not being used anymore. Things like the public FixedBitSet.getBits() method make it scary, although I'm convinced even a very small pool of large FixedBitSets could be extremely advantageous. There aren't that many in use at any given time, and a large FBS can still be used for a small use-case. If anyone has some pointers around the lifecycle here, I'd love to hear them. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. 
> See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated LUCENE-7258: Attachment: LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch After 24 hours, I found I could discern a penalty to cpu on my patched node. I removed the change in sort algorithm, and that seems to have resolved it without too significantly changing the allocation savings. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258929#comment-15258929 ] Jeff Wartes commented on LUCENE-7258: - Random aside: I did do one test run where I changed all usages ArrayUtil.oversize to use an expansion of 2x. I recall this increased overall allocations on my test query corpus by about 4%, when compared to the 256th/2x applied to only the IntersectsPrefixTreeQuery. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated LUCENE-7258: Attachment: allocation_plot.jpg Attaching the graph directly. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch, > allocation_plot.jpg > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258848#comment-15258848 ] Jeff Wartes commented on LUCENE-7258: - I put this patch on a production node this morning, looks like allocation rate went down about 10%, which I think is pretty good considering only about 15% of my queries even have a geospatial component. CPU usage has not changed enough for me to notice. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated LUCENE-7258: Attachment: LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch This patch does the following: 1. Moves the FBS threshold from 1/128th to 1/256th for IntersectsPrefixTreeQuery. 2. Changes the expansion policy to 2x when used by IntersectsPrefixTreeQuery 3. Changed the sort algorithm in DocIdSetBuilder (for ALL usages) to InPlaceMergeSorter, since LSBRadixSorter requires allocating a new array of size N. 4. In order to do #1 & #2, I had to add parameter support for the threshold and expansion policies. Justifications: 1. Since Geospatial data is typically non-uniform, a smaller threshold seemed reasonable. 2. A more aggressive expansion policy results in less wasted allocations, particularly for short-lived data structures. 3. This one might be controversial since it affects more than just geospatial search, but I thought I'd see what happened if I saved the memory. I also considered TimSort, which has a configurable memory cost, but LUCENE-5140 gave me some pause. > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > Attachments: > LUCENE-7258-Tune-memory-allocation-rate-for-Intersec.patch > > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. 
(The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
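The garbage cost of the 1/8th growth factor quoted above is easy to simulate: repeatedly grow a buffer, discarding the old array each time (as growBuffer-style expansion does), and total the allocations. This toy simulation uses illustrative names and ignores ArrayUtil.oversize's extra rounding:

```java
// Rough simulation of cumulative buffer allocation under two growth
// policies, illustrating why a 1/8th growth factor generates so much
// garbage for a short-lived, frequently-grown buffer.
public class GrowthGarbage {
    // Total elements allocated growing from 'start' to at least 'target',
    // where each growth step discards the previous array.
    public static long totalAllocated(long start, long target, double factor) {
        long size = start;
        long total = size;
        while (size < target) {
            size = Math.max(size + 1, (long) (size * factor));
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        // 1/8th growth (1.125x): cumulative allocation approaches ~9x the
        // final buffer size, since the geometric sum is factor/(factor-1)...
        System.out.println(totalAllocated(1024, 1_000_000, 1.125));
        // ...while doubling keeps cumulative allocation near 2x the final size.
        System.out.println(totalAllocated(1024, 1_000_000, 2.0));
    }
}
```

The same factor/(factor-1) reasoning is why the comment's 2x expansion policy wastes so much less than repeated 1/8th expansions on a structure that is thrown away after one query.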
[jira] [Commented] (LUCENE-7258) Tune DocIdSetBuilder allocation rate
[ https://issues.apache.org/jira/browse/LUCENE-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258834#comment-15258834 ] Jeff Wartes commented on LUCENE-7258: - The "eia" label represents using the ExpandingIntArray approach from SOLR-8922. It suffered somewhat in my plot because I accounted for the fact that when you're done collecting, you need to convert it to a single array for sorting purposes. (if you haven't overflowed into a FBS, anyway, and want to use the usual Sorters.) > Tune DocIdSetBuilder allocation rate > > > Key: LUCENE-7258 > URL: https://issues.apache.org/jira/browse/LUCENE-7258 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/spatial >Reporter: Jeff Wartes > > LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but > didn't actually reduce garbage generation for my Solr index. > Since something like 40% of my garbage (by space) is now attributed to > DocIdSetBuilder.growBuffer, I charted a few different allocation strategies > to see if I could tune things more. > See here: http://i.imgur.com/7sXLAYv.jpg > The jump-then-flatline at the right would be where DocIdSetBuilder gives up > and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index > curve/cutoff looked similar) > Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is > terrible from an allocation standpoint if you're doing a lot of expansions, > and is especially terrible when used to build a short-lived data structure > like this one. > By the time it goes with the FBS, it's allocated around twice as much memory > for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-7258) Tune Spatial RPT Intersects allocation rate
Jeff Wartes created LUCENE-7258: --- Summary: Tune Spatial RPT Intersects allocation rate Key: LUCENE-7258 URL: https://issues.apache.org/jira/browse/LUCENE-7258 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: Jeff Wartes LUCENE-7211 converted IntersectsPrefixTreeQuery to use DocIdSetBuilder, but didn't actually reduce garbage generation for my Solr index. Since something like 40% of my garbage (by space) is now attributed to DocIdSetBuilder.growBuffer, I charted a few different allocation strategies to see if I could tune things more. See here: http://i.imgur.com/7sXLAYv.jpg The jump-then-flatline at the right would be where DocIdSetBuilder gives up and allocates a FixedBitSet for a 100M-doc index. (The 1M-doc index curve/cutoff looked similar) Perhaps unsurprisingly, the 1/8th growth factor in ArrayUtil.oversize is terrible from an allocation standpoint if you're doing a lot of expansions, and is especially terrible when used to build a short-lived data structure like this one. By the time it goes with the FBS, it's allocated around twice as much memory for the buffer as it would have needed for just the FBS. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8944) Improve geospatial garbage generation
[ https://issues.apache.org/jira/browse/SOLR-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237853#comment-15237853 ] Jeff Wartes commented on SOLR-8944: --- Results from applying this patch were quite positive, but for more subtle reasons than I'd expected. To my surprise, the quantity of garbage generated (by size) over my test run was mostly unchanged, as was the frequency of collections. However, the garbage collector (ParNew) seemed to have a *much* easier time with what was being generated. Avg GC pause went down 45%, and max GC pause for the run was cut in half. I'm not sure I can even speculate on what makes for easier work within ParNew. From an allocation rate standpoint, I'm guessing that my test run sits near the edge of where the DocIdSetBuilder's buffer remains efficient from an allocation size perspective. Naively that looks like about a hit rate threshold of 25%, but I suspect it's a lot more complicated than that, since DocIdSetBuilder grows the buffer in 1/8th increments and throws away the old allocations, which generates more garbage. (By contrast, SOLR-8922 uses 1/64 as the threshold instead of 1/128, but allocates additional space in 2x increments, and doesn't throw away what's already been allocated) Looking at some before/after memory snapshots, the allocation size attributed to long[] in FixedBitSet is indeed down, but mostly replaced by lots of int[] allocations attributed to DocIdSetBuilder.growBuffer, as we might expect given that overall allocation size didn't change much. In general, this is a desirable enough patch for my index that I'd be willing to move it into a Lucene issue just on its face, but it still feels like there is some room for improvement. I suppose I should have made this a Lucene issue in the first place, but given that I'm running with and testing with Solr I wasn't sure how that fit. 
> Improve geospatial garbage generation > - > > Key: SOLR-8944 > URL: https://issues.apache.org/jira/browse/SOLR-8944 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Labels: spatialrecursiveprefixtreefieldtype > Attachments: > SOLR-8944-Use-DocIdSetBuilder-instead-of-FixedBitSet.patch > > > I’ve been continuing some analysis into JVM garbage sources in my Solr index. > (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus) > After applying SOLR-8922, I find my biggest source of garbage by a literal > order of magnitude (by size) is the long[] allocated by FixedBitSet. From the > backtraces, it appears the biggest source of FixBitSet creation in my case > (by two orders of magnitude) is my use of queries that involve geospatial > filtering. > Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here: > https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60 > Has this been considered for optimization? I can think of a few paths: > 1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc, > which presumably changes less frequently than queries are issued. If an > existing FixedBitSet were not available from a pool, the worst case (create a > new one) would be no worse than the current behavior. The complication would > be enforcement around when to return the object to the pool, but it looks > like this has some lifecycle hooks already. > 2. I note that a thing called a SparseFixedBitSet already exists, and puts > considerable effort into allocating smaller chunks only as necessary. Is this > not usable for this purpose? How significant is the performance difference? > I'd be happy to spend some time on a patch, but I was hoping for a little > more data around the current choices before choosing an approach. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8944) Improve geospatial garbage generation
[ https://issues.apache.org/jira/browse/SOLR-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-8944: -- Attachment: SOLR-8944-Use-DocIdSetBuilder-instead-of-FixedBitSet.patch [~dsmiley]'s suggestion was almost too trivial a change to create a patch for, but here it is. This was against 5.4. The path of the class has changed in master, but the contents have not, so the patch should apply there too. > Improve geospatial garbage generation > - > > Key: SOLR-8944 > URL: https://issues.apache.org/jira/browse/SOLR-8944 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Labels: spatialrecursiveprefixtreefieldtype > Attachments: > SOLR-8944-Use-DocIdSetBuilder-instead-of-FixedBitSet.patch > > > I’ve been continuing some analysis into JVM garbage sources in my Solr index. > (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus) > After applying SOLR-8922, I find my biggest source of garbage by a literal > order of magnitude (by size) is the long[] allocated by FixedBitSet. From the > backtraces, it appears the biggest source of FixBitSet creation in my case > (by two orders of magnitude) is my use of queries that involve geospatial > filtering. > Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here: > https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60 > Has this been considered for optimization? I can think of a few paths: > 1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc, > which presumably changes less frequently than queries are issued. If an > existing FixedBitSet were not available from a pool, the worst case (create a > new one) would be no worse than the current behavior. The complication would > be enforcement around when to return the object to the pool, but it looks > like this has some lifecycle hooks already. > 2. 
I note that a thing called a SparseFixedBitSet already exists, and puts > considerable effort into allocating smaller chunks only as necessary. Is this > not usable for this purpose? How significant is the performance difference? > I'd be happy to spend some time on a patch, but I was hoping for a little > more data around the current choices before choosing an approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8241) Evaluate W-TinyLfu cache
[ https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235935#comment-15235935 ] Jeff Wartes commented on SOLR-8241: --- Since Solr requires Java 8 as of 6.0, it seems like this patch could be applied pretty easily now? > Evaluate W-TinyLfu cache > > > Key: SOLR-8241 > URL: https://issues.apache.org/jira/browse/SOLR-8241 > Project: Solr > Issue Type: Wish > Components: search >Reporter: Ben Manes >Priority: Minor > Attachments: SOLR-8241.patch > > > SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). > The discussions seem to indicate that the higher hit rate (vs LRU) is offset > by the slower performance of the implementation. An original goal appeared to > be to introduce ARC, a patented algorithm that uses ghost entries to retain > history information. > My analysis of Window TinyLfu indicates that it may be a better option. It > uses a frequency sketch to compactly estimate an entry's popularity. It uses > LRU to capture recency and operate in O(1) time. When using available > academic traces the policy provides a near optimal hit rate regardless of the > workload. > I'm getting ready to release the policy in Caffeine, which Solr already has a > dependency on. But, the code is fairly straightforward and a port into Solr's > caches instead is a pragmatic alternative. More interesting is what the > impact would be in Solr's workloads and feedback on the policy's design. > https://github.com/ben-manes/caffeine/wiki/Efficiency -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229260#comment-15229260 ] Jeff Wartes commented on SOLR-8922: --- I stumbled onto SJK recently, which provides me a more lightweight way to measure allocation rate on my production nodes, and also eliminate startup noise from the measurement. According to this tool, the node with this patch is allocating heap space at roughly 60% of the rate that the others are. That's reasonably consistent with my other measurements, and a pretty big improvement. If anyone decides to pull this in, I'd appreciate it getting applied to the 5.5 branch as well, in case there's a 5.5.1 release. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8944) Improve geospatial garbage generation
[ https://issues.apache.org/jira/browse/SOLR-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15226948#comment-15226948 ] Jeff Wartes commented on SOLR-8944: --- I hadn't refreshed and didn't see this comment before I added mine, but thanks for the info, I appreciate the references and context. I'll take a look at what would be involved with DocIdSetBuilder. I also feel like I should mention though, that class will be the third case of a hardcoded magic fraction of maxDoc I've come across in the context of investigating allocations this last week. It might be worth considering whether the gyrations around avoiding the creation of these BitSets is more or less complicated than managing a pool would be. > Improve geospatial garbage generation > - > > Key: SOLR-8944 > URL: https://issues.apache.org/jira/browse/SOLR-8944 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Labels: spatialrecursiveprefixtreefieldtype > > I’ve been continuing some analysis into JVM garbage sources in my Solr index. > (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus) > After applying SOLR-8922, I find my biggest source of garbage by a literal > order of magnitude (by size) is the long[] allocated by FixedBitSet. From the > backtraces, it appears the biggest source of FixBitSet creation in my case > (by two orders of magnitude) is my use of queries that involve geospatial > filtering. > Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here: > https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60 > Has this been considered for optimization? I can think of a few paths: > 1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc, > which presumably changes less frequently than queries are issued. 
If an > existing FixedBitSet were not available from a pool, the worst case (create a > new one) would be no worse than the current behavior. The complication would > be enforcement around when to return the object to the pool, but it looks > like this has some lifecycle hooks already. > 2. I note that a thing called a SparseFixedBitSet already exists, and puts > considerable effort into allocating smaller chunks only as necessary. Is this > not usable for this purpose? How significant is the performance difference? > I'd be happy to spend some time on a patch, but I was hoping for a little > more data around the current choices before choosing an approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8944) Improve geospatial garbage generation
[ https://issues.apache.org/jira/browse/SOLR-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15226709#comment-15226709 ] Jeff Wartes commented on SOLR-8944: --- It was an easy test, so I tried simply using a SparseFixedBitSet instead. That only bought me about a 5% overall reduction in allocation rate. (Again, this is after applying SOLR-8922) Since I don't have any data on the performance impact (cpu/latency) of SparseFixedBitSet vs FixedBitSet, the relatively low difference in allocation rate makes it feel like an object pool approach might be worth the extra work. > Improve geospatial garbage generation > - > > Key: SOLR-8944 > URL: https://issues.apache.org/jira/browse/SOLR-8944 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Labels: spatialrecursiveprefixtreefieldtype > > I’ve been continuing some analysis into JVM garbage sources in my Solr index. > (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus) > After applying SOLR-8922, I find my biggest source of garbage by a literal > order of magnitude (by size) is the long[] allocated by FixedBitSet. From the > backtraces, it appears the biggest source of FixBitSet creation in my case > (by two orders of magnitude) is my use of queries that involve geospatial > filtering. > Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here: > https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60 > Has this been considered for optimization? I can think of a few paths: > 1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc, > which presumably changes less frequently than queries are issued. If an > existing FixedBitSet were not available from a pool, the worst case (create a > new one) would be no worse than the current behavior. 
The complication would > be enforcement around when to return the object to the pool, but it looks > like this has some lifecycle hooks already. > 2. I note that a thing called a SparseFixedBitSet already exists, and puts > considerable effort into allocating smaller chunks only as necessary. Is this > not usable for this purpose? How significant is the performance difference? > I'd be happy to spend some time on a patch, but I was hoping for a little > more data around the current choices before choosing an approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-8944) Improve geospatial garbage generation
Jeff Wartes created SOLR-8944: - Summary: Improve geospatial garbage generation Key: SOLR-8944 URL: https://issues.apache.org/jira/browse/SOLR-8944 Project: Solr Issue Type: Improvement Reporter: Jeff Wartes I’ve been continuing some analysis into JVM garbage sources in my Solr index. (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus) After applying SOLR-8922, I find my biggest source of garbage by a literal order of magnitude (by size) is the long[] allocated by FixedBitSet. From the backtraces, it appears the biggest source of FixedBitSet creation in my case (by two orders of magnitude) is my use of queries that involve geospatial filtering. Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here: https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60 Has this been considered for optimization? I can think of a few paths: 1. Persistent object pools - FixedBitSet size is allocated based on maxDoc, which presumably changes less frequently than queries are issued. If an existing FixedBitSet were not available from a pool, the worst case (create a new one) would be no worse than the current behavior. The complication would be enforcement around when to return the object to the pool, but it looks like this has some lifecycle hooks already. 2. I note that a thing called a SparseFixedBitSet already exists, and puts considerable effort into allocating smaller chunks only as necessary. Is this not usable for this purpose? How significant is the performance difference? I'd be happy to spend some time on a patch, but I was hoping for a little more data around the current choices before choosing an approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220342#comment-15220342 ] Jeff Wartes commented on SOLR-8922: --- Both of those appear to add capacity by declaring a new array and doing a System.arraycopy. Wouldn't that just result in more space allocated and then thrown away? > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220299#comment-15220299 ] Jeff Wartes commented on SOLR-8922: --- With some tweaking, I was able to get G1 pause to about the same ballpark as I get with ParNew/CMS. But without a compelling difference, the Lucene recommendation against G1 keeps me away. This issue is more about garbage generation though. Less garbage should be a benefit regardless of the collector you choose. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220136#comment-15220136 ] Jeff Wartes commented on SOLR-8922: --- Ok, so after a little more than 12 hours on one of my production nodes, there was no noticeable change in CPU usage. Running before/after GC logs through GCViewer, it's a little hard to compare rate, since the logs were for different intervals and the "after" log included startup. That said, "Freed mem/minute" was down by 44%, and "Throughput" went from 87% to 93%. I also see noticeably reduced average pause time, and increased average pause interval. All positive signs. The only irritation I'm finding here is that it looks like the CMS collector is running more often. I expect that's simply because I changed the footing of a fairly tuned set of GC parameters though. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220052#comment-15220052 ] Jeff Wartes commented on SOLR-8922: --- Incidentally, I had one or two other findings from my garbage analysis. Solutions are less obvious there though, and probably involve some conversation. Is Jira the right place for that, or is there another medium more appropriate? > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220045#comment-15220045 ] Jeff Wartes commented on SOLR-8922: --- Absolutely. Memory pools were my first thought, between when I saw that 60% and when I looked at my hit rates and realized the allocation size could just be changed. I had started poking around the internet for terms like "slab allocators" and "direct byte buffers", but even an on-heap persistent pool sounded good to me. Or, if you had persistent tracking of hit rates for the optimization, perhaps the size of the scratch array could optimize itself over time. All of that would be more complicated, of course. I did look one other place worth mentioning though. In Heliosearch the way the DocSetCollector handles the "scratch" array isn't any different, but it's interesting because it added a lifecycle with a close() method to the class, to support the native bitset implementation. Knowing that it's possible to impose a lifecycle on the class, checking things out and back into a persistent memory pool should be easy. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. 
> Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
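The check-out/check-in lifecycle discussed above might look something like the sketch below. Every name in it is hypothetical (this is not Solr or Heliosearch API); the point is only that the worst case, an empty pool, degenerates to the current allocate-per-query behavior, while steady state allocates nothing.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Hypothetical sketch of a persistent pool of maxDoc-sized scratch arrays.
// Arrays are checked out per query and zeroed when returned, so repeated
// queries reuse the same backing storage instead of generating garbage.
public class BitSetScratchPool {
    private final ArrayDeque<long[]> pool = new ArrayDeque<>();
    private final int numWords; // a FixedBitSet-style array: one long per 64 docs

    public BitSetScratchPool(int maxDoc) {
        this.numWords = (maxDoc + 63) >>> 6;
    }

    public synchronized long[] checkOut() {
        long[] bits = pool.poll();
        // Empty pool: allocate fresh, which is no worse than today's behavior.
        return bits != null ? bits : new long[numWords];
    }

    public synchronized void checkIn(long[] bits) {
        Arrays.fill(bits, 0L); // must be cleared before the next query reuses it
        pool.push(bits);
    }
}
```

The enforcement problem mentioned in the issue description is visible here too: nothing stops a caller from forgetting checkIn(), which is exactly where a close() lifecycle hook would help.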
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219066#comment-15219066 ] Jeff Wartes commented on SOLR-8922: --- Not yet. The major risk area would be the new ExpandingIntArray class, but it looked reasonable. It expands along powers of two, and although the add() and copyTo() calls are certainly more work than simple array assignment/retrieval, it still all looks like pretty simple stuff. A few ArrayList calls and some simple numeric comparisons mostly. I'm more worried about bugs in there than performance; I don't know how well [~steff1193] tested this, although I got the impression he was using it in production at the time. There may be better approaches, but this one was handy and I'm excited enough that I'm going to be doing a production test. I'll have more info in a day or two. As a side note, I got a similar garbage-related improvement on an earlier test by simply hard-coding the smallSetSize to 10 - the expanding arrays approach only bought me another 3%. But of course, that 10 is very index and query set dependent, so I didn't want to offer it as a general case. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. 
> Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
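As described above, the expanding-array approach keeps earlier chunks rather than copying and discarding them, and only flattens once when collection finishes. A simplified sketch of that idea (illustrative names and chunk sizing, not the actual SOLR-5444 patch):

```java
import java.util.ArrayList;

// Simplified sketch of a chunked int collector: capacity grows along powers
// of two by appending new chunks, old chunks are never discarded, and the
// one unavoidable copy happens in copyTo() when the caller needs a flat
// array for sorting.
public class ExpandingIntArraySketch {
    private final ArrayList<int[]> chunks = new ArrayList<>();
    private int[] current;
    private int indexInCurrent;
    private int size;

    public void add(int value) {
        if (current == null || indexInCurrent == current.length) {
            // Each new chunk equals the current total size, doubling capacity.
            current = new int[Math.max(16, size)];
            chunks.add(current);
            indexInCurrent = 0;
        }
        current[indexInCurrent++] = value;
        size++;
    }

    public int size() {
        return size;
    }

    public int[] copyTo() {
        int[] out = new int[size];
        int pos = 0;
        for (int[] chunk : chunks) {
            int n = Math.min(chunk.length, size - pos);
            System.arraycopy(chunk, 0, out, pos, n);
            pos += n;
        }
        return out;
    }
}
```

The add() path is a bounds check plus an array store, so the extra cost over a plain scratch array is small relative to the allocation it avoids.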
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218974#comment-15218974 ] Jeff Wartes commented on SOLR-8922: --- For my index, (86M-doc shards and a per-shard 99.9th percentile query hit count of 56k) this reduced total garbage generation by 33%, which naturally also brought significant improvements in gc pause and frequency. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-8922: -- Attachment: SOLR-8922.patch This is essentially the same patch as in SOLR-5444, but applies cleanly against (at least) 5.4 where I did some GC testing, and master. > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218910#comment-15218910 ] Jeff Wartes commented on SOLR-8922: --- SOLR-5444 had a patch to help with this (SOLR-5444_ExpandingIntArray_DocSetCollector_4_4_0.patch), but it was mixed in with some other things and didn't get picked up with the other parts of the issue.
[jira] [Created] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
Jeff Wartes created SOLR-8922: - Summary: DocSetCollector can allocate massive garbage on large indexes Key: SOLR-8922 URL: https://issues.apache.org/jira/browse/SOLR-8922 Project: Solr Issue Type: Improvement Reporter: Jeff Wartes After reaching a point of diminishing returns tuning the GC collector, I decided to take a look at where the garbage was coming from. To my surprise, it turned out that for my index and query set, almost 60% of the garbage was coming from this single line: https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 This is due to the simple fact that I have 86M documents in my shards. Allocating a scratch array big enough to track a result set 1/64th of my index (1.3M) is also almost certainly excessive, considering my 99.9th percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
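The fix later attached to this issue follows the ExpandingIntArray idea from SOLR-5444: rather than allocating a maxDoc/64 scratch array up front, start with a small buffer and grow geometrically so the allocation tracks the actual hit count. A minimal sketch of that idea (class and method names here are mine, not Solr's):

```java
import java.util.ArrayList;
import java.util.List;

/** Collects doc ids without allocating a maxDoc/64 scratch array up front.
 *  Chunks grow geometrically, so total allocation is proportional to the
 *  number of collected ids, not the index size. */
class ExpandingIntBuffer {
    private final List<int[]> full = new ArrayList<>(); // filled chunks
    private int[] current = new int[32];                // small initial chunk
    private int indexInCurrent = 0;
    private int size = 0;

    void add(int docId) {
        if (indexInCurrent == current.length) {
            full.add(current);
            current = new int[current.length * 2];      // geometric growth
            indexInCurrent = 0;
        }
        current[indexInCurrent++] = docId;
        size++;
    }

    int size() { return size; }

    /** Flattens the chunks into a single exactly-sized array. */
    int[] toArray() {
        int[] out = new int[size];
        int pos = 0;
        for (int[] c : full) {
            System.arraycopy(c, 0, out, pos, c.length);
            pos += c.length;
        }
        System.arraycopy(current, 0, out, pos, indexInCurrent);
        return out;
    }
}
```

For the numbers quoted above, a query matching 56k docs would allocate on the order of 2x56k ints across chunks instead of a single 1.3M-entry array.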
[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212210#comment-15212210 ] Jeff Wartes commented on SOLR-7887: --- SOLR-7698 and SOLR-6377 both have attempts at logback watchers, I think. I'll also second the desire for async appenders, and I'd go further to suggest it as the default. Solr does a lot of logging, and this gets that work out of the critical path for query latency. > Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-8785) Use Metrics library for core metrics
Jeff Wartes created SOLR-8785: - Summary: Use Metrics library for core metrics Key: SOLR-8785 URL: https://issues.apache.org/jira/browse/SOLR-8785 Project: Solr Issue Type: Improvement Affects Versions: 4.1 Reporter: Jeff Wartes The Metrics library (https://dropwizard.github.io/metrics/3.1.0/) is a well-known way to track metrics about applications. In SOLR-1972, latency percentile tracking was added. The comment list is long, so here's my synopsis: 1. An attempt was made to use the Metrics library. 2. That attempt failed due to a memory leak in Metrics v2.1.1. 3. Large parts of Metrics were then copied wholesale into the org.apache.solr.util.stats package space, and that was used instead. Copy/pasting Metrics code into Solr may have been the correct solution at the time, but I submit that it isn't correct any more. The leak in Metrics was fixed even before SOLR-1972 was released, and by copy/pasting a subset of the functionality, we lose access to other important things that the Metrics library provides, particularly the concept of a Reporter. (https://dropwizard.github.io/metrics/3.1.0/manual/core/#reporters) Further, Metrics v3.0.2 is already packaged with Solr anyway, because it's used in two contrib modules (map-reduce and morphlines-core). I'm proposing that: 1. Metrics as bundled with Solr be upgraded to the current v3.1.2. 2. Most of the org.apache.solr.util.stats package space be deleted outright, or gutted and replaced with simple calls to Metrics. Due to the copy/paste origin, the concepts should mostly map 1:1. I'd further recommend a usage pattern like: SharedMetricRegistries.getOrCreate(System.getProperty("solr.metrics.registry", "solr-registry")) There are all kinds of areas in Solr that could benefit from metrics tracking and reporting. This pattern allows diverse areas of code to track metrics within a single, named registry. This well-known name then becomes a handle you can use to easily attach a Reporter and ship all of those metrics off-box. 
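The getOrCreate pattern proposed above relies on Metrics' SharedMetricRegistries, which is essentially a global, name-keyed map of registries. As a stdlib-only illustration of that pattern (a simplified stand-in, not the actual Metrics API), the essential behavior is a name-keyed computeIfAbsent:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Simplified stand-in for Metrics' SharedMetricRegistries: a single global,
 *  name-keyed map of registries that any code path can resolve by name.
 *  Here a "registry" is just a map of counter name to counter. */
class SharedRegistries {
    private static final ConcurrentMap<String, ConcurrentMap<String, LongAdder>> REGISTRIES =
            new ConcurrentHashMap<>();

    /** Two callers asking for the same name always get the same registry. */
    static ConcurrentMap<String, LongAdder> getOrCreate(String name) {
        return REGISTRIES.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    /** Increment a named counter in the given registry, creating it on first use. */
    static void mark(ConcurrentMap<String, LongAdder> registry, String metric) {
        registry.computeIfAbsent(metric, m -> new LongAdder()).increment();
    }
}
```

Because lookup is by well-known name, a Reporter attached once (e.g. in jetty.xml) sees every metric that any code path later registers under that name.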
[jira] [Commented] (SOLR-8725) Invalid name error with core names with hyphens
[ https://issues.apache.org/jira/browse/SOLR-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169572#comment-15169572 ] Jeff Wartes commented on SOLR-8725: --- Well, I'm glad I haven't tried moving to 5.5 yet, this would have been an unpleasant migration discovery. I use hyphens in both collection names and alias names. (although not as a leading character) Generally, I prefer to avoid using underscores anyplace that ends up in a URL. > Invalid name error with core names with hyphens > --- > > Key: SOLR-8725 > URL: https://issues.apache.org/jira/browse/SOLR-8725 > Project: Solr > Issue Type: Bug >Affects Versions: 5.5 >Reporter: Chris Beer > > In SOLR-8642, hyphens are no longer considered valid identifiers for cores > (and collections?). Our solr instance was successfully using hyphens in our > core names, and our affected cores now error with: > marc-profiler_shard1_replica1: > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: > Invalid name: 'marc-profiler_shard1_replica1' Identifiers must consist > entirely of periods, underscores and alphanumerics > Before starting to rename all of our collections, I wonder if this decision > could be revisited to be backwards compatible with previously created > collections. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117568#comment-15117568 ] Jeff Wartes commented on SOLR-8531: --- Looks like this was resolved in SOLR-8561 > ZK leader path changed in 5.4 > - > > Key: SOLR-8531 > URL: https://issues.apache.org/jira/browse/SOLR-8531 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 5.4 >Reporter: Jeff Wartes > > While doing a rolling upgrade from 5.3 to 5.4 of a solrcloud cluster, I > observed that upgraded nodes would not register their shards as active unless > they were elected the leader for the shard. > There were no errors, the shards were fully up and responsive, but would not > publish any change from the "down" state. > This appears to be because the recovery process never happens, because the ZK > node containing the current leader can't be found, because the ZK path has > changed. > Specifically, the leader data node changed from: > /leaders/ > to > /leaders//leader > It looks to me like this happened during SOLR-7844, perhaps accidentally. > At the least, the "Migrating to Solr 5.4" section of the README should get > updated with this info, since it means a rolling upgrade of a collection with > multiple replicas will suffer serious degradation in the number of active > replicas as nodes are upgraded. It's entirely possible this will reduce some > shards to a single active replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes resolved SOLR-8531. --- Resolution: Fixed Fix Version/s: 5.4.1
[jira] [Closed] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes closed SOLR-8531. -
[jira] [Commented] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15091427#comment-15091427 ] Jeff Wartes commented on SOLR-8531: --- I just looked again, and 5.4 is indeed writing the leader data to both places. Perhaps 5.4 is only looking in the new place? This is speculation, but if so, a possible upgrade path might have been to try to get the first 5.4 node for each shard to be the leader (preferredLeader property?), and then the rest of the rollout would work. As I mentioned, I didn't check what happened when I restarted a 5.3 node while 5.4 was leader though.
[jira] [Commented] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090732#comment-15090732 ] Jeff Wartes commented on SOLR-8531: --- 1. A fully upgraded cluster behaves normally. 2. The problem only occurs for collections with replicationFactor > 1, but by definition, this means you only have problems if you're trying an HA upgrade. Upgraded nodes got in line for leader election as normal, but could not figure out the current leader on start, and never executed replication recovery and became active. If I restarted the 5.3 nodes for a given shard, the 5.4 shard would eventually get elected leader and publish active state without intervention, but restarting the 5.4 shard again meant a 5.3 shard got elected, and the 5.4 node would be stuck in the 'down' state again. I did not test restarting a 5.3 shard while the 5.4 shard was leader. In my case I had sufficient production capacity to upgrade half my cluster, create a new collection in 5.4, copy the data into it, and then upgrade the rest of the cluster, so I did that. As mentioned, taking downtime and upgrading the whole cluster at once would also have worked.
[jira] [Commented] (SOLR-8531) ZK leader path changed in 5.4
[ https://issues.apache.org/jira/browse/SOLR-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15090706#comment-15090706 ] Jeff Wartes commented on SOLR-8531: --- I was imagining a note at https://lucene.apache.org/solr/5_4_0/changes/Changes.html#v5.4.0.upgrading_from_solr_5.3 but I could understand that being driven off of an immutable release tag. I haven't fully read the SOLR-7844 patch for comprehension, but the change to ZkStateReader.java looks like the reason: https://github.com/apache/lucene-solr/commit/65cb72631b0833f8ddcf34dfa3d4a91f2c5091c4#diff-8f54b814c3da916328992910b1ad9163 I don't immediately see the change being necessary, so I suspect it could be reverted or made reverse-compatible without too much trouble. If it's the former, then I'll presumably hit the same issue again in reverse moving from 5.4 to 5.4.1, which could be OK now that I know to expect it.
[jira] [Created] (SOLR-8531) ZK leader path changed in 5.4
Jeff Wartes created SOLR-8531: - Summary: ZK leader path changed in 5.4 Key: SOLR-8531 URL: https://issues.apache.org/jira/browse/SOLR-8531 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 5.4 Reporter: Jeff Wartes While doing a rolling upgrade from 5.3 to 5.4 of a solrcloud cluster, I observed that upgraded nodes would not register their shards as active unless they were elected the leader for the shard. There were no errors, the shards were fully up and responsive, but would not publish any change from the "down" state. This appears to be because the recovery process never happens, because the ZK node containing the current leader can't be found, because the ZK path has changed. Specifically, the leader data node changed from: /leaders/<shard> to /leaders/<shard>/leader It looks to me like this happened during SOLR-7844, perhaps accidentally. At the least, the "Migrating to Solr 5.4" section of the README should get updated with this info, since it means a rolling upgrade of a collection with multiple replicas will suffer serious degradation in the number of active replicas as nodes are upgraded. It's entirely possible this will reduce some shards to a single active replica.
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053711#comment-15053711 ] Jeff Wartes commented on SOLR-4449: --- For what it's worth, if I were a solr committer, I probably wouldn't just merge this issue as-is. BackupRequestLBHttpSolrClient still has a certain amount of copy/paste code from the parent LBHttpSolrClient class that'd become extra long-term maintenance load. (As it will be every time I update this issue for a new solr version) Instead, I'd do something like: 1. Pull the asynchronous ExecutorCompletionService-based query approach into the LBHttpSolrClient itself. This would be interesting and useful functionality in its own right. 2. Add the concept of a shardTimeout. (Distinct from timeAllowed) 3. Add extendable support for how to handle a shardTimeout. If a strategy ends up making a request to another server in the list, that request must be submitted to the same ExecutorCompletionService so that in all cases, LBHttpSolrClient would return the first response among the submitted requests. 4. The backup-request functionality could still then be a class extending LBHttpSolrClient, but the only real code there would be defining the shardTimeout for a given request, and how to handle a shardTimeout if there was one. I'd probably audit the access restrictions in LBHttpSolrClient while I was at it though, since solrconfig.xml provides such an easy way to use alternate implementations of that class. A lot of the existing code in BackupRequestLBHttpSolrClient is only necessary due to not having sufficient access to the parent class. 
(isTimeExceeded/getTimeAllowedInNanos seem generally useful to have, for example, and I'm not sure why doRequest is protected) > Enable backup requests for the internal solr load balancer > -- > > Key: SOLR-4449 > URL: https://issues.apache.org/jira/browse/SOLR-4449 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: philip hoy >Priority: Minor > Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, > patch-4449.txt, solr-back-request-lb-plugin.jar > > > Add the ability to configure the built-in solr load balancer such that it > submits a backup request to the next server in the list if the initial > request takes too long. Employing such an algorithm could improve the latency > of the 9xth percentile albeit at the expense of increasing overall load due > to additional requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
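The scheme described in points 1 and 3, where every request, primary or backup, is submitted to one ExecutorCompletionService so the first response wins, can be sketched as follows. The class and method names are hypothetical, not LBHttpSolrClient's:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** First-response-wins requests: submit the primary, wait up to the backup
 *  delay (the "shardTimeout" in the comment above), then race a backup
 *  request against it and return whichever completes first. */
class BackupRequester {
    static <T> T requestWithBackup(Callable<T> primary, Callable<T> backup,
                                   long backupDelayMs, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        cs.submit(primary);
        Future<T> first = cs.poll(backupDelayMs, TimeUnit.MILLISECONDS);
        if (first != null) {
            return first.get();   // primary answered within the delay
        }
        cs.submit(backup);        // primary is slow: race a backup request
        return cs.take().get();   // first of the two responses wins
    }
}
```

Because the backup goes through the same completion service, the caller never waits on the slow request once either one has answered, which is exactly the tail-latency improvement this issue is after.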
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053665#comment-15053665 ] Jeff Wartes commented on SOLR-4449: --- I looked around a bit, and unless I'm missing something, it looks like solr-core doesn't really use metrics-core. At the end of SOLR-1972, the necessary classes were just copy/pasted into the solr codeline. It sounds like this was mostly due to being nervous after encountering some problems in the metrics-core version at the time, and an aversion to a global registry approach. Unfortunately, this means that although requesthandlers have statistics, they cannot be attached to a metrics Reporter, and instead you have to develop something to interrogate JMX or some such. Solr does include metrics-core 3.0.1, but there are only a few places it actually gets used, and only in contrib modules. I didn't have the negative experience with metrics-core. In fact, my experiences with 3.1.2 over the last year and a half have been universally positive. So when I added backup-percentile support to this issue I relied heavily on the global SharedMetricsRegistry and the assumption that the library was thread-safe in general. My scattershot code reviews of the metrics library have generally reinforced my opinion that this is ok, and I'm using my version of this issue in production now. Initializing a well-known-named shared registry with an attached reporter in jetty.xml has yielded all kinds of great performance data. This might be a useful point of information if anyone gets back to SOLR-4735. I'll mention here if I do encounter any metrics-core related issues in the future. 
[jira] [Created] (SOLR-8171) Facet query filterCache usage is psychic
Jeff Wartes created SOLR-8171: - Summary: Facet query filterCache usage is psychic Key: SOLR-8171 URL: https://issues.apache.org/jira/browse/SOLR-8171 Project: Solr Issue Type: Bug Components: faceting Affects Versions: 5.3 Reporter: Jeff Wartes From this thread: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201510.mbox/%3cd23998fc.6fa32%25jwar...@whitepages.com%3E There are really a few points here, which may be different issues: 1. Either facet queries aren't using the filterCache correctly, or the stats don't reflect actual usage. (Or it's psychic.) Somehow, "lookups" only ever gets incremented when "hits" does, yielding a 100% cache hit rate at all times. 2. Facet queries appear to use the filterCache as a queryResultCache. Meaning, only identical facet queries cause filterCache "hits" to increase. Interestingly, disabling the queryResultCache still results in facet queries doing *inserts* into the filterCache, but no longer allows stats-reported *usage* of those entries. If the stats are right and facet queries *aren't* actually using the filterCache for anything except possible future searches, then there should be a mechanism for disabling facet query filterCache usage to avoid filling the filterCache with low usage queries. Honestly though, that sounds more like something for the queryResultCache than filterCache anyway. If facet queries *are* using the filterCache for performance within a single query, I'd suggest that facet queries should have their own named cache specifically for that use, rather than try to share a task load (size, regenerator) with the generic filterCache.
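For contrast with the stats anomaly in point 1: a healthy cache increments "lookups" on every probe and "hits" only when the probe succeeds, so a permanent 100% hit rate suggests lookups are being counted only alongside hits. A minimal sketch of the expected accounting (a hypothetical class, not Solr's actual cache implementation):

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal cache-statistics sketch. Every get() counts as a lookup,
 *  hit or miss; hits only increment when the probe actually succeeds. */
class CountingCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    long lookups, hits, inserts;

    V get(K key) {
        lookups++;                 // counted on every probe, hit or miss
        V v = map.get(key);
        if (v != null) hits++;     // counted only on success
        return v;
    }

    void put(K key, V value) {
        inserts++;
        map.put(key, value);
    }

    double hitRate() {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }
}
```

With this accounting, one hit and one miss report a 50% rate; stats that can never report below 100% are measuring something else.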
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949141#comment-14949141 ] Jeff Wartes commented on SOLR-4449: --- Ah, great. Yeah, same thing. I knew it had been discussed in SOLR-4735, but since that hadn't been merged, I didn't even bother checking if it already existed. Thanks for the reference. After reading through the comment history in SOLR-1972, it seems like I should look closely at that integration and see if I can leverage anything existing there.
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14947724#comment-14947724 ] Jeff Wartes commented on SOLR-4449: --- I've added performance tracking to this, so that you can trigger a backup request at (say) the 95th percentile latency for a given performance class of query. I'm likely going to continue on this path, but this adds a dependency on metrics-core, so I've dropped a tag (5.3_port_complete) just prior to those changes. Anyone interested in merging something like this may prefer to work from that.
[jira] [Commented] (SOLR-8059) NPE distributed DebugComponent
[ https://issues.apache.org/jira/browse/SOLR-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933631#comment-14933631 ] Jeff Wartes commented on SOLR-8059: --- I've seen this too. I assumed it was related to https://issues.apache.org/jira/browse/SOLR-1880, but I've never investigated.
> NPE distributed DebugComponent
> ------------------------------
>
> Key: SOLR-8059
> URL: https://issues.apache.org/jira/browse/SOLR-8059
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.3
> Reporter: Markus Jelsma
> Assignee: Shalin Shekhar Mangar
> Fix For: 5.4
>
> The following URL select?debug=true&q=*:*&fl=id,score yields
> {code}
> java.lang.NullPointerException
> at org.apache.solr.handler.component.DebugComponent.finishStage(DebugComponent.java:229)
> at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:416)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:210)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
> at org.eclipse.jetty.server.Server.handle(Server.java:499)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
> at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> I can reproduce it every time. Strangely enough, fl=*,score or any other content field does not! I have seen this happening in the Highlighter as well on the same code path. It makes little sense: how would fl influence that piece of code? The id is requested in fl after all. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901555#comment-14901555 ] Jeff Wartes commented on SOLR-4449: --- I pulled this patch out into a freestanding jar and ported it to Solr 5.3. I tried to pull in all the things that had changed since they were copied from the parent class in 4.4, and added per-request backup time support. Sadly, there were still a few places where package-protected restrictions got in the way (Rsp.server and LBHttpSolrClient.doRequest in particular), so even as a separate jar, this must be loaded by the same classloader as LBHttpSolrClient, not via Solr's lib inclusion mechanism. After this long, it feels unlikely this feature will get merged, but if there's any interest, it should still be pretty simple to copy the files back into the Solr source tree; I didn't change any paths or package names, and I'd be happy to upload another patch file. My version can be found here: https://github.com/whitepages/SOLR-4449 For those wondering about the effect of this feature: in one test today I cut my median query response time in half, at a cost of about 15% more cluster-wide CPU, simply by using this and setting the backupRequestDelay to half my observed ParNew GC pause. The next logical step would be performance-aware backup request settings, like "issue a backup request when you exceed your 95th percentile latency for a given requestHandler or queryPerformanceClass". My thanks to [~phloy] for authoring this. 
> Enable backup requests for the internal solr load balancer > -- > > Key: SOLR-4449 > URL: https://issues.apache.org/jira/browse/SOLR-4449 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: philip hoy >Priority: Minor > Attachments: SOLR-4449.patch, SOLR-4449.patch, SOLR-4449.patch, > patch-4449.txt, solr-back-request-lb-plugin.jar > > > Add the ability to configure the built-in solr load balancer such that it > submits a backup request to the next server in the list if the initial > request takes too long. Employing such an algorithm could improve the latency > of the 9xth percentile albeit at the expense of increasing overall load due > to additional requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
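The backup-request scheme described above can be sketched in isolation. This is an illustrative toy, not the patch's LBHttpSolrClient code; the class, method, and parameter names here are all hypothetical, and each `Function` stands in for "send this query to one server and return its response":

```java
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class BackupRequestLb {

    // Send the query to the first server; if it hasn't answered within
    // backupDelayMs, also send it to the next server, and return whichever
    // response arrives first.
    public static <T> T request(List<Function<String, T>> servers,
                                String query, long backupDelayMs) {
        ExecutorService pool = Executors.newCachedThreadPool();
        CompletionService<T> done = new ExecutorCompletionService<>(pool);
        try {
            // Primary request goes out immediately.
            done.submit(() -> servers.get(0).apply(query));
            // Wait up to backupDelayMs for the primary to answer.
            Future<T> first = done.poll(backupDelayMs, TimeUnit.MILLISECONDS);
            if (first == null) {
                // Primary is slow: issue a backup request to the next server
                // and take whichever of the two completes first.
                done.submit(() -> servers.get(1).apply(query));
                first = done.take();
            }
            return first.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow(); // abandon the straggler, if any
        }
    }
}
```

With the delay set near a typical GC pause (as in the test described above), most requests never trigger the backup, which is why the extra cluster-wide cost stays modest relative to the tail-latency win.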
[jira] [Commented] (SOLR-7698) solr alternative logback contrib
[ https://issues.apache.org/jira/browse/SOLR-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703723#comment-14703723 ] Jeff Wartes commented on SOLR-7698: --- Also see SOLR-6377
solr alternative logback contrib
-
Key: SOLR-7698 URL: https://issues.apache.org/jira/browse/SOLR-7698 Project: Solr Issue Type: New Feature Affects Versions: 5.2.1 Reporter: Linbin Chen Labels: logback Fix For: 5.3 Attachments: SOLR-7698.patch
An alternative that uses logback. Supports solr.xml configuration like:
{code:xml}
<solr>
  <!-- ... -->
  <logging>
    <str name="class">org.apache.solr.logging.logback.LogbackWatcher</str>
    <bool name="enabled">true</bool>
    <watcher>
      <int name="size">50</int>
      <str name="threshold">WARN</str>
    </watcher>
  </logging>
  <!-- ... -->
</solr>
{code}
In solr-X.X.X/server/lib/ext, remove:
* log4j-1.2.X.jar
* slf4j-log4j12-1.7.X.jar
and add:
* log4j-over-slf4j-1.7.7.jar
* logback-classic-1.1.3.jar
* logback-core-1.1.3.jar
Example: https://github.com/chenlb/vootoo/wiki/Logback-for-solr-logging
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7493) Requests aren't distributed evenly if the collection isn't present locally
Jeff Wartes created SOLR-7493: - Summary: Requests aren't distributed evenly if the collection isn't present locally Key: SOLR-7493 URL: https://issues.apache.org/jira/browse/SOLR-7493 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 5.0 Reporter: Jeff Wartes I had a SolrCloud cluster where every node is behind a simple round-robin load balancer. This cluster had two collections (A, B), and the slices of each were partitioned such that one collection (A) used two thirds of the nodes, and the other collection (B) used the remaining third of the nodes. I observed that every request for collection B that the load balancer sent to a node with (only) slices for collection A got proxied to one *specific* node hosting a slice for collection B. This node started running pretty hot, for obvious reasons. This meant that one specific node was handling the fan-out for slightly more than two-thirds of the requests against collection B. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7493) Requests aren't distributed evenly if the collection isn't present locally
[ https://issues.apache.org/jira/browse/SOLR-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-7493: -- Labels: (was: pat) Requests aren't distributed evenly if the collection isn't present locally -- Key: SOLR-7493 URL: https://issues.apache.org/jira/browse/SOLR-7493 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 5.0 Reporter: Jeff Wartes Attachments: SOLR-7493.patch I had a SolrCloud cluster where every node is behind a simple round-robin load balancer. This cluster had two collections (A, B), and the slices of each were partitioned such that one collection (A) used two thirds of the nodes, and the other collection (B) used the remaining third of the nodes. I observed that every request for collection B that the load balancer sent to a node with (only) slices for collection A got proxied to one *specific* node hosting a slice for collection B. This node started running pretty hot, for obvious reasons. This meant that one specific node was handling the fan-out for slightly more than two-thirds of the requests against collection B. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7493) Requests aren't distributed evenly if the collection isn't present locally
[ https://issues.apache.org/jira/browse/SOLR-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-7493: -- Attachment: SOLR-7493.patch It looks like this happens because SolrDispatchFilter's getRemoteCoreURL eventually takes the first viable entry from a HashMap.values list of cores. HashMap.values ordering is always the same, if you load the HashMap with the same data in the same order. So if the list from ZK is presented in the same order on every node, every node will use the same ordering on every request. There might be a better solution, but this patch would randomize that ordering per-request. My environment is a bit messed up at the moment, so I haven't done much more than verify this compiles. Requests aren't distributed evenly if the collection isn't present locally -- Key: SOLR-7493 URL: https://issues.apache.org/jira/browse/SOLR-7493 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 5.0 Reporter: Jeff Wartes Attachments: SOLR-7493.patch I had a SolrCloud cluster where every node is behind a simple round-robin load balancer. This cluster had two collections (A, B), and the slices of each were partitioned such that one collection (A) used two thirds of the nodes, and the other collection (B) used the remaining third of the nodes. I observed that every request for collection B that the load balancer sent to a node with (only) slices for collection A got proxied to one *specific* node hosting a slice for collection B. This node started running pretty hot, for obvious reasons. This meant that one specific node was handling the fan-out for slightly more than two-thirds of the requests against collection B. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
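The idea behind the attached patch can be sketched in a few lines. This uses hypothetical names, not the actual SolrDispatchFilter code: because HashMap.values() iterates in a stable order, always taking the first viable entry pins all proxied traffic to one node; shuffling a copy of the collection per request spreads the load.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RemoteCorePicker {
    private static final Random RANDOM = new Random();

    // Copy the candidate cores and shuffle the copy, so the "first viable
    // entry" differs from request to request instead of always being the
    // same node on every host.
    public static <T> T pickRandom(Collection<T> cores) {
        List<T> shuffled = new ArrayList<>(cores);
        Collections.shuffle(shuffled, RANDOM);
        return shuffled.get(0);
    }
}
```

Shuffling a copy (rather than the shared collection) keeps the fix thread-safe and leaves the ZK-derived ordering untouched for other callers.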
[jira] [Updated] (SOLR-7493) Requests aren't distributed evenly if the collection isn't present locally
[ https://issues.apache.org/jira/browse/SOLR-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-7493: -- Labels: pat (was: ) Requests aren't distributed evenly if the collection isn't present locally -- Key: SOLR-7493 URL: https://issues.apache.org/jira/browse/SOLR-7493 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 5.0 Reporter: Jeff Wartes Labels: pat Attachments: SOLR-7493.patch I had a SolrCloud cluster where every node is behind a simple round-robin load balancer. This cluster had two collections (A, B), and the slices of each were partitioned such that one collection (A) used two thirds of the nodes, and the other collection (B) used the remaining third of the nodes. I observed that every request for collection B that the load balancer sent to a node with (only) slices for collection A got proxied to one *specific* node hosting a slice for collection B. This node started running pretty hot, for obvious reasons. This meant that one specific node was handling the fan-out for slightly more than two-thirds of the requests against collection B. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509521#comment-14509521 ] Jeff Wartes commented on SOLR-5170: --- I got tired of maintaining a custom solr build process for the sole purpose of this patch at my work, especially given the deployment changes in Solr 5.0. Since this patch really just adds new classes, I pulled those files out into a freestanding repository that builds a jar, copied the necessary infrastructure to allow the tests to run, and posted that here: https://github.com/randomstatistic/SOLR-5170 This repo contains the necessary API changes to the patch to support Solr 5.0. I have not bothered to update the patch in Jira here with those changes, and going forward, I'll probably continue to only push changes to that repo unless someone asks otherwise. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
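To illustrate what the sort computes (this is not the patch's implementation, and it uses plain doubles rather than the 8-bytes-per-point packed docvalues encoding the issue describes): for a multi-valued point field, the sort value is the distance from the query point to the document's *closest* point.

```java
public class ClosestPointDistance {
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two lat/lon points (haversine formula).
    public static double haversineKm(double lat1, double lon1,
                                     double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // points is a flat array of lat,lon pairs for one document; the sort
    // value is the minimum distance over all of them.
    public static double closestKm(double qLat, double qLon, double[] points) {
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < points.length; i += 2) {
            best = Math.min(best, haversineKm(qLat, qLon, points[i], points[i + 1]));
        }
        return best;
    }
}
```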
[jira] [Updated] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-5170: -- Attachment: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch Updated to work with Solr 4.9 LUCENE-5703. Any chance of realtime-friendly multi-value distance sorting getting into the mainline anytime soon? I've been building with this patch for getting close to a year now. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5917) Allow dismax wildcard field specifications
Jeff Wartes created SOLR-5917: - Summary: Allow dismax wildcard field specifications Key: SOLR-5917 URL: https://issues.apache.org/jira/browse/SOLR-5917 Project: Solr Issue Type: Improvement Components: query parsers Reporter: Jeff Wartes Priority: Minor The dynamic field schema specification is handy for when you want a bunch of fields of a given type, but don't know how many there will be. You do currently need to know how many there will be (and the exact names) if you want to query them, however. If edismax supported a similar wildcard specification like qf=dynfield_*, this would allow easy search across a given field type. It would also provide a convenient alternative to multi-value fields without the fieldNorm implications of having multiple values in a single field. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
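The proposed expansion might look like the following, sketched against an in-memory list of schema field names. This is hypothetical: SOLR-5917 is a feature request, and edismax does not do this today.

```java
import java.util.List;
import java.util.stream.Collectors;

public class WildcardQf {
    // Expand a single qf entry: "dynfield_*" becomes every concrete schema
    // field sharing that prefix; a plain field name passes through unchanged.
    public static List<String> expandQf(String qfEntry, List<String> schemaFields) {
        if (!qfEntry.endsWith("*")) {
            return List.of(qfEntry);
        }
        String prefix = qfEntry.substring(0, qfEntry.length() - 1);
        return schemaFields.stream()
                .filter(f -> f.startsWith(prefix))
                .collect(Collectors.toList());
    }
}
```

For example, with dynamic fields dynfield_a and dynfield_b indexed, qf=dynfield_* would behave as if the user had written qf=dynfield_a dynfield_b.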
[jira] [Updated] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Wartes updated SOLR-5170: -- Attachment: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt Adds recipDistance scoring, lat/long is one param. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch, SOLR-5170_spatial_multi-value_sort_via_docvalues.patch.txt The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5170) Spatial multi-value distance sort via DocValues
[ https://issues.apache.org/jira/browse/SOLR-5170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864738#comment-13864738 ] Jeff Wartes commented on SOLR-5170: --- I've been using this patch with some minor tweaks and solr 4.3.1 in production for about six months now. Since I was applying it again against 4.6 this morning, I figured I should attach my tweaks, and mention it passes tests against 4.6. This does NOT address the design issues David raises in the initial comment. The changes vs the initial patchfile allow it to be applied against a greater range of solr versions, and brings it a little closer to feeling the same as geofilt's params. Spatial multi-value distance sort via DocValues --- Key: SOLR-5170 URL: https://issues.apache.org/jira/browse/SOLR-5170 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Attachments: SOLR-5170_spatial_multi-value_sort_via_docvalues.patch The attached patch implements spatial multi-value distance sorting. In other words, a document can have more than one point per field, and using a provided function query, it will return the distance to the closest point. The data goes into binary DocValues, and as-such it's pretty friendly to realtime search requirements, and it only uses 8 bytes per point. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org