[jira] [Commented] (SOLR-11767) Please create SolrCloud Helm Chart or Controller for Kubernetes
[ https://issues.apache.org/jira/browse/SOLR-11767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344104#comment-16344104 ]

Keith Laban commented on SOLR-11767:
------------------------------------

Hey Rodney, building a CRD for Solr is something I've been investigating for a while. I've built some small POCs which are far from production-ready. We use bare metal and local storage, and unfortunately the state of local storage in Kubernetes just isn't there yet, although there are some things in the works that [look exciting|https://github.com/kubernetes/features/issues/490#issuecomment-359508997]. This could potentially work if you use network storage, as that is slightly more solved for in the Kubernetes world; unfortunately that won't work for us and isn't a path we researched. In my original POC I built a CRD controller which used empty-dir with the idea of deploying ephemeral sandboxes of SolrCloud, again not really a production solution.

> Please create SolrCloud Helm Chart or Controller for Kubernetes
> ---------------------------------------------------------------
>
>                 Key: SOLR-11767
>                 URL: https://issues.apache.org/jira/browse/SOLR-11767
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: AutoScaling
>    Affects Versions: 7.1
>        Environment: Azure AKS, On-Prem Kubernetes 1.8
>           Reporter: Rodney Aaron Stainback
>           Priority: Blocker
>  Original Estimate: 168h
> Remaining Estimate: 168h
>
> Please create a highly available auto-scaling Kubernetes Helm Chart or
> Controller/Custom Resource for easy deployment of SolrCloud in Kubernetes in
> any environment. Thanks.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199087#comment-16199087 ]

Keith Laban commented on SOLR-7887:
-----------------------------------

I haven't looked at this since my last update. It looks like the Hadoop issue has not been resolved, so I imagine we would still be blocked on that integration.

> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
> ----------------------------------------------------------------
>
>                 Key: SOLR-7887
>                 URL: https://issues.apache.org/jira/browse/SOLR-7887
>             Project: Solr
>          Issue Type: Task
>    Affects Versions: 5.2.1
>           Reporter: Shawn Heisey
>        Attachments: SOLR-7887-WIP.patch
>
> The logging services project has officially announced the EOL of log4j 1:
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> In the official binary jetty deployment, we use log4j 1.2 as our final
> logging destination, so the admin UI has a log watcher that actually uses
> log4j and java.util.logging classes. That will need to be extended to add
> log4j2. I think that might be the largest pain point to this upgrade.
> There is some crossover between log4j2 and slf4j. Figuring out exactly which
> jars need to be in the lib/ext directory will take some research.
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006516#comment-16006516 ]

Keith Laban commented on SOLR-10654:
------------------------------------

Seems like there are a few things here:

1.
- a: Pull reporter framework. An interesting idea, but is it over-engineering for an initial effort? I'm not aware of any other mainstream metrics framework that pulls metrics in a very specific format. Any home-rolled thing can consume the default format we expose.
- b: Additionally, we can expose these metrics under {{/metrics/prometheus}} as suggested above, to avoid having to change the API if we later decide there is a need for a more generic framework.

2. Response writers. It might be interesting to expose response writers dynamically with a plugin-style interface, or to add a Function to the response object that can (optionally) dictate the writer to be used. Either way, I think this is separate enough from metrics, and useful in other places, that it should be pursued in a separate issue.

3. Dropwizard exports. Yes, there is not feature parity with default Solr metrics, but is it required for an initial patch? To me it seems like a lot of work that isn't required on day one, but I agree it should be added at some point.

I propose that we tackle #2 and #1.b for the initial patch, and circle back to #3 and #1.a if we find it necessary.

> Expose Metrics in Prometheus format
> -----------------------------------
>
>                 Key: SOLR-10654
>                 URL: https://issues.apache.org/jira/browse/SOLR-10654
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: metrics
>           Reporter: Keith Laban
>
> Expose metrics via a `wt=prometheus` response type.
> Example scrape_config in prometheus.yml:
> {code}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code}
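Since the issue proposes a response writer that emits Prometheus's text exposition format, here is a small illustrative sketch of the kind of translation such a writer would perform. This is plain Python, not Solr code; the metric names, label names, and the normalization rules are invented for illustration.

```python
def to_prometheus(metrics, labels=None):
    """Render {name: value} pairs as Prometheus text exposition lines.

    Prometheus metric names only allow [a-zA-Z0-9_:], so dotted Solr-style
    metric names are normalized by replacing '.', '/', and '-' with '_'.
    """
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    lines = []
    for name, value in sorted(metrics.items()):
        safe = name.replace(".", "_").replace("/", "_").replace("-", "_")
        lines.append(f"{safe}{label_str} {value}")
    return "\n".join(lines) + "\n"

# Hypothetical metrics map; real Solr metric names differ.
example = {"QUERY./select.requests": 1542, "jvm.memory.heap.used": 123456789}
print(to_prometheus(example, labels={"core": "collection1"}))
```

A real implementation would also emit `# HELP` and `# TYPE` comment lines per metric family, which this sketch omits.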
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004814#comment-16004814 ]

Keith Laban commented on SOLR-10654:
------------------------------------

Jan, I think the problem would still be that each handler would most likely need its own very specific custom response format. If we were to do this we would need to expose them as a raw Filter or Servlet instead of a SolrRequestHandler, and I'm not aware of anywhere else in Solr where this is happening. The other option would be to add a custom response writer format for each metrics type, kind of like I did here.
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003379#comment-16003379 ]

Keith Laban commented on SOLR-10654:
------------------------------------

Prometheus uses a pull model. I thought of setting up a reporter which also sets up a servlet to serve stats in Prometheus format, but that seemed more cumbersome, and I wasn't sure if it was possible to expose an arbitrary servlet easily. Using a Solr request handler would still require the custom response format.
[jira] [Updated] (SOLR-10654) Expose Metrics in Prometheus format
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Laban updated SOLR-10654:
-------------------------------
    Description:
Expose metrics via a `wt=prometheus` response type.

Example scrape_config in prometheus.yml:
{code}
scrape_configs:
  - job_name: 'solr'
    metrics_path: '/solr/admin/metrics'
    params:
      wt: ["prometheus"]
    static_configs:
      - targets: ['localhost:8983']
{code}

  was: Expose metrics via a `wt=prometheus` response type.
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003328#comment-16003328 ]

Keith Laban commented on SOLR-10654:
------------------------------------

This exposes Prometheus metrics by creating a new response writer type. I couldn't think of a more graceful way to handle this than creating a new response format.
[jira] [Created] (SOLR-10654) Expose Metrics in Prometheus format
Keith Laban created SOLR-10654:
----------------------------------

             Summary: Expose Metrics in Prometheus format
                 Key: SOLR-10654
                 URL: https://issues.apache.org/jira/browse/SOLR-10654
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Keith Laban

Expose metrics via a `wt=prometheus` response type.
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981427#comment-15981427 ]

Keith Laban commented on SOLR-10047:
------------------------------------

Funny, I did the same thing. I closed the old PR and put in a new one -- https://github.com/apache/lucene-solr/pull/195/files . This can be closed if you find your patch to be satisfactory.

> Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per
> segment
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-10047
>                 URL: https://issues.apache.org/jira/browse/SOLR-10047
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>           Reporter: Keith Laban
>           Assignee: Shalin Shekhar Mangar
>             Fix For: 6.6, master (7.0)
>
>        Attachments: SOLR_10047_test.patch
>
> The configuration of UninvertingReader in SolrIndexSearcher creates a global
> mapping for the directory for fields to uninvert. If docvalues are enabled on
> a field, the creation of a new segment will cause the query to fail when
> faceting/sorting on the recently docvalue-enabled field. This happens because
> the UninvertingReader is configured globally across the entire directory, and
> a single segment containing DVs for a field will incorrectly indicate that
> all segments contain DVs.
> This patch addresses the incorrect behavior by determining the fields to be
> uninverted on a per-segment basis.
> With the fix, it is still recommended that a reindex occur, as data loss
> will occur when a DV and non-DV segment are merged; SOLR-10046 addresses this
> behavior. This fix is to be a stop gap for the time between enabling
> docvalues and the duration of a reindex.
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981389#comment-15981389 ]

Keith Laban commented on SOLR-10047:
------------------------------------

Thanks Hoss. [~shalinmangar] -- no need, I will take a look at it today.
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975006#comment-15975006 ]

Keith Laban commented on SOLR-10047:
------------------------------------

Thanks Steve, I am able to reproduce this using this seed. The test commits three documents and relies on them not being merged into a single segment in order to verify that the behavior of this patch works correctly. It seems like with this seed the merge policy is merging these three segments. What is the best way to set up this test to ensure that no merges occur?
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964985#comment-15964985 ]

Keith Laban commented on SOLR-10047:
------------------------------------

Shalin, I wasn't sure exactly how you wanted me to submit the diff between 6.x and master. In this PR:

- [Commit|https://github.com/kelaban/lucene-solr/commit/806f33e092491cc6a2ee292d2934c76171e40dc7] adds the new and old interfaces, and modifies all the tests to use the new interface
- [Commit|https://github.com/kelaban/lucene-solr/commit/c38f4cabc2828ee83b53b931dd829e29a3e1701c] reverts the tests to using the old interface

If you want to keep both interfaces as a convenience method and leave the tests unmodified, you can squash them all down. Otherwise use HEAD for 6.x (tests not updated) and reset to HEAD^ for master (tests updated).

I did not write specific tests to explicitly use the new/old signature because they all use the new signature under the hood. The test I added in the original commit tests the updated intended behavior.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961212#comment-15961212 ]

Keith Laban commented on LUCENE-7671:
-------------------------------------

[~mikemccand], I pushed an update which at the moment applies cleanly. However, it looks like LUCENE-7756 breaks {{addIndexes}} again because of this check which was added:

{code}
+  private void validateMergeReader(CodecReader leaf) {
+    LeafMetaData segmentMeta = leaf.getMetaData();
+    if (segmentInfos.getIndexCreatedVersionMajor() != segmentMeta.getCreatedVersionMajor()) {
+      throw new IllegalArgumentException("Cannot merge a segment that has been created with major version "
+          + segmentMeta.getCreatedVersionMajor() + " into this index which has been created by major version "
+          + segmentInfos.getIndexCreatedVersionMajor());
+    }
+
+    if (segmentInfos.getIndexCreatedVersionMajor() >= 7 && segmentMeta.getMinVersion() == null) {
+      throw new IllegalStateException("Indexes created on or after Lucene 7 must record the created version major, but " + leaf + " hides it");
+    }
+
+    Sort leafIndexSort = segmentMeta.getSort();
+    if (config.getIndexSort() != null && leafIndexSort != null
+        && config.getIndexSort().equals(leafIndexSort) == false) {
+      throw new IllegalArgumentException("cannot change index sort from " + leafIndexSort + " to " + config.getIndexSort());
+    }
+  }
{code}

Is this merge policy working against future goals of Lucene, such that it will be impossible to upgrade major versions without re-indexing?

> Enhance UpgradeIndexMergePolicy with additional options
> -------------------------------------------------------
>
>                 Key: LUCENE-7671
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7671
>             Project: Lucene - Core
>          Issue Type: Improvement
>           Reporter: Keith Laban
>
> Enhance UpgradeIndexMergePolicy to be a MergePolicy that can be used outside
> the scope of the IndexUpgrader.
> The enhancement aims to allow the UpgradeIndexMergePolicy to:
> 1) Delegate normal force merges to the underlying merge policy
> 2) Enable a flag that will explicitly tell UpgradeIndexMergePolicy when it
> should start looking for upgrades.
> 3) Allow new segments to be considered to be merged with old segments,
> depending on the underlying MergePolicy.
> 4) Be configurable for backwards compatibility such that only segments
> needing an upgrade would be considered when merging, no explicit upgrades.
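The compatibility concern in the comment above can be modeled in miniature. This is a toy restatement in plain Python (not Lucene code; names and version numbers are invented for illustration) of the kind of check {{validateMergeReader}} performs: a segment created under a different major version than the index is rejected at merge time.

```python
class IncompatibleSegment(Exception):
    """Raised when a segment cannot be merged into the index."""


def validate_merge_segment(index_created_major, segment_created_major,
                           segment_min_version):
    """Mimic the version checks quoted above on simplified inputs."""
    if index_created_major != segment_created_major:
        raise IncompatibleSegment(
            f"Cannot merge a segment created with major version "
            f"{segment_created_major} into an index created by major version "
            f"{index_created_major}")
    # Post-7.0 indexes must record the version that created each segment.
    if index_created_major >= 7 and segment_min_version is None:
        raise IncompatibleSegment(
            "Indexes created on or after Lucene 7 must record the created "
            "version major")


validate_merge_segment(7, 7, "7.0.0")   # same major version: accepted
try:
    validate_merge_segment(7, 6, "6.6.0")  # older segment: rejected
except IncompatibleSegment as e:
    print(e)
```

Under this rule, an upgrade-oriented merge policy can only rewrite segments within one major version, which is exactly the tension the comment raises.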
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959865#comment-15959865 ]

Keith Laban commented on SOLR-10047:
------------------------------------

Shalin, based on your comment I did a bit of an overhaul to the original PR. Instead of deleting UninvertingReader#wrap, I changed the interface from accepting a {{Map}} to a {{Function}}. With this change I updated SolrIndexSearcher to use this interface instead of creating a new static class. I also updated all of the places where the original static function {{wrap}} was being used. If you think updating all of the tests to the new interface is overkill or not backwards compatible, I can overload the {{UninvertingReader#wrap}} function to accept the original static mapping and delegate to the new impl.
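The per-segment idea behind the patch can be sketched abstractly. This toy Python example (invented names; not the actual Solr/Lucene API) contrasts a single global uninversion map with a per-segment function that skips any field already carrying docvalues in that segment, which is the bug described in the issue:

```python
# Hypothetical global configuration: fields to uninvert and their types.
GLOBAL_UNINVERT = {"price": "SORTED", "popularity": "INTEGER"}


def fields_to_uninvert(segment_fields_with_docvalues):
    """Compute the uninvert map for ONE segment.

    A field that already has real docvalues in this segment must not be
    uninverted there; a global map cannot express this, because one
    DV-bearing segment would make the whole directory look DV-enabled.
    """
    return {f: t for f, t in GLOBAL_UNINVERT.items()
            if f not in segment_fields_with_docvalues}


old_segment = set()       # indexed before docvalues were enabled
new_segment = {"price"}   # has real docvalues for "price"

print(fields_to_uninvert(old_segment))  # both fields need uninverting
print(fields_to_uninvert(new_segment))  # only "popularity" needs it
```

This mirrors the interface change described above: passing a function of the segment rather than one precomputed map lets each segment get the correct answer.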
[jira] [Commented] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Facting; Uninvert per segment
[ https://issues.apache.org/jira/browse/SOLR-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957449#comment-15957449 ]

Keith Laban commented on SOLR-10047:
------------------------------------

- updated javadoc
- renamed class to {{UninvertingDirectoryReaderPerSegmentMapping}}
- added IndexReader#getReaderCacheHelper (copied note from UninvertingDirectoryReader)
- removed the old, now unused UninvertingDirectoryReader
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957388#comment-15957388 ]

Keith Laban commented on LUCENE-7671:
-------------------------------------

The test now passes using the CodecReader interface. I also renamed the class to {{LiveUpgradeSegmentsMergePolicy}} because it upgrades on a segment-by-segment basis, versus the {{UpgradeIndexMergePolicy}}, which tries to upgrade all segments.

It looks like someone has been touching the TestBackwardsCompatibility class and this no longer cleanly lands on master. I'll take a look at catching it up later.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949261#comment-15949261 ]

Keith Laban commented on LUCENE-7671:
-------------------------------------

[~mikemccand] I updated the PR with some extra changes:

- Fixed typo in testUpgradeWithExcplicitUpgrades
- Added usage for the -include-new-segments option
- Also added a -num-segments option for IndexUpgrader, and usage
- Added a random toggle for the new options to be added in tests

Still outstanding: see my earlier [comment|https://issues.apache.org/jira/browse/LUCENE-7671?focusedCommentId=15925030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15925030] about the failing test.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933483#comment-15933483 ]

Keith Laban commented on LUCENE-7671:
-------------------------------------

Pushed a new version that breaks out the behavior outlined above into a new class called {{LiveUpgradeIndexMergePolicy}}. The name indicates that a different IndexWriter need not be used. Additionally, because the new MP does not have an option to ignore new segments, I've changed the IndexUpgrader script to accept an additional parameter, "include-new-segments", allowing the switch between the old behavior and the new behavior. The tests are randomized between old/new behavior for coverage.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930202#comment-15930202 ]

Keith Laban commented on LUCENE-7671:
-------------------------------------

How would you feel about removing some of the complexity if we are going to split it out into a separate merge policy? A lot of the added complexity here was to make it compatible with the previous version; e.g. ignoreNewSegments conforms to the old behavior of only considering segments needing upgrade as merge candidates.

Ideally this merge policy will have the options:
- setEnableUpgrades
- setMaxUpgradesAtATime

And the behavior should be:
- Delegate to the wrapped MP UNLESS enableUpgrades is set
- When enableUpgrades is set, first delegate to the wrapped MP, then rewrite (no merge) old segments in the new format
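The behavior proposed in the comment above can be sketched as a toy selection function. This is plain Python with invented names, not the actual merge policy API: it returns the old-format segments that would be rewritten as singleton "merges" once the wrapped policy has been consulted.

```python
CURRENT_VERSION = 7  # hypothetical major version of the running code


def find_upgrade_merges(segments, enable_upgrades, max_upgrades_at_a_time=2):
    """segments: list of (name, created_major) tuples.

    Returns the names of segments to rewrite in the current format.
    With enable_upgrades off, the policy purely delegates and proposes
    no upgrades; with it on, at most max_upgrades_at_a_time old-format
    segments are selected per pass.
    """
    if not enable_upgrades:
        return []
    old = [name for name, ver in segments if ver < CURRENT_VERSION]
    return old[:max_upgrades_at_a_time]


segs = [("_0", 6), ("_1", 7), ("_2", 6), ("_3", 6)]
print(find_upgrade_merges(segs, enable_upgrades=False))  # []
print(find_upgrade_merges(segs, enable_upgrades=True))   # ['_0', '_2']
```

Capping the number of concurrent upgrades is what keeps a live writer responsive while old segments are gradually rewritten, which is the point of making this usable outside IndexUpgrader.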
[jira] [Comment Edited] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925030#comment-15925030 ] Keith Laban edited comment on LUCENE-7671 at 3/14/17 9:21 PM: -- Because I've added some {{oldSingleSegmentNames}} with this commit, {{TestBackwardsCompatibility.testUpgradeOldSingleSegmentIndexWithAdditions}} now fails after rebasing LUCENE-7703, because of this line: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java#L2657 Prior to this commit the test effectively did nothing, as there were no old single-segment test files. This failure should be addressed outside the scope of this ticket. was (Author: k317h): Because I've added some {{oldSingleSegmentNames}} with this commit, {{TestBackwardsCompatibility.testUpgradeOldSingleSegmentIndexWithAdditions}} now fails after rebasing LUCENE-7703, because of this line: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java#L2657 Prior to this commit this effectively did nothing, as there were no old single-segment files. This failure should be addressed outside the scope of this ticket.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925030#comment-15925030 ] Keith Laban commented on LUCENE-7671: - Because I've added some {{oldSingleSegmentNames}} with this commit, {{TestBackwardsCompatibility.testUpgradeOldSingleSegmentIndexWithAdditions}} now fails after rebasing LUCENE-7703, because of this line: https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java#L2657 Prior to this commit this effectively did nothing, as there were no old single-segment files. This failure should be addressed outside the scope of this ticket.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925000#comment-15925000 ] Keith Laban commented on LUCENE-7671: - Rebased with master to keep up-to-date. [~mikemccand] did you ever have an opportunity to review my previous post?
[jira] [Commented] (SOLR-10046) Create UninvertDocValuesMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-10046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924860#comment-15924860 ] Keith Laban commented on SOLR-10046: I rebased with master and added in the new {{get*CacheHelper}} methods which were added in LUCENE-7410, although I think the delegations I added should live in the abstract {{FilterCodecReader}} instead. > Create UninvertDocValuesMergePolicy > --- > > Key: SOLR-10046 > URL: https://issues.apache.org/jira/browse/SOLR-10046 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban >Assignee: Christine Poerschke > > Create a merge policy that can detect schema changes and use > UninvertingReader to uninvert fields and write docvalues into merged segments > when a field has docvalues enabled. > The current behavior is to write null values in the merged segment, which can > lead to data integrity problems when sorting or faceting pending a full > reindex. > With this patch it would still be recommended to reindex when adding > docvalues, for performance reasons, as it is not guaranteed that all segments will be > merged with docvalues turned on.
[jira] [Commented] (SOLR-10046) Create UninvertDocValuesMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-10046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897964#comment-15897964 ] Keith Laban commented on SOLR-10046: Thanks Christine, I missed this last comment. I merged your pull request.
[jira] [Commented] (SOLR-10046) Create UninvertDocValuesMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-10046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876323#comment-15876323 ] Keith Laban commented on SOLR-10046: Hi Christine, I was able to do the above. - I created a new commit on top of master to clean up the working branch - Added javadocs and removed TODOs - {{ant precommit}} passes
[jira] [Commented] (LUCENE-7688) add a OneMergeWrappingMergePolicy class
[ https://issues.apache.org/jira/browse/LUCENE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872030#comment-15872030 ] Keith Laban commented on LUCENE-7688: - Hi Michael, the main use case is to easily overload the {{wrapForMerge}} function in {{OneMerge}} without having to write a whole merge policy. The dependent ticket SOLR-10046 uses it to wrap the CodecReader with one that has access to the FieldCache and will add docvalues when merging segments, if required. But generally you can do anything you can do by wrapping a CodecReader: add/remove fields, etc. > add a OneMergeWrappingMergePolicy class > --- > > Key: LUCENE-7688 > URL: https://issues.apache.org/jira/browse/LUCENE-7688 > Project: Lucene - Core > Issue Type: Task >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Attachments: LUCENE-7688.patch > > > This ticket splits out the lucene part of the changes proposed in SOLR-10046 > for a conversation on whether or not the {{OneMergeWrappingMergePolicy}} > class would best be located in Lucene or in Solr. > (As an aside, the proposed use of > [java.util.function.UnaryOperator|https://docs.oracle.com/javase/8/docs/api/java/util/function/UnaryOperator.html] > causes {{ant documentation-lint}} to fail, I have created LUCENE-7689 > separately for that.)
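The reader-wrapping idea can be modeled with a plain {{java.util.function.UnaryOperator}}, which is the operator type the ticket proposes. The types below are simplified stand-ins (the real API wraps a CodecReader via {{OneMerge#wrapForMerge}}); here a "reader" is just a map of field names to values, and the wrapper injects docvalues for a missing field, mimicking the SOLR-10046 use case.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Simplified stand-in for a per-segment reader: field name -> docvalues.
class SegmentView {
    final Map<String, int[]> docValues = new HashMap<>();
}

// Stand-in for a merge-policy hook that rewrites each reader fed to a merge,
// analogous to supplying a wrapping operator to a wrapping merge policy.
class MergeReaderHook {
    private final UnaryOperator<SegmentView> wrap;
    MergeReaderHook(UnaryOperator<SegmentView> wrap) { this.wrap = wrap; }
    SegmentView readerForMerge(SegmentView segmentReader) { return wrap.apply(segmentReader); }
}
```

A wrapper that "uninverts" a missing field would then be a one-liner passed at construction time; anything the operator can do to the view (add fields, drop fields) flows into the merged segment in this model.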
[jira] [Commented] (SOLR-10046) Create UninvertDocValuesMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-10046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15861472#comment-15861472 ] Keith Laban commented on SOLR-10046: Christine, I merged your second PR; I'm not sure whether that was required. Thanks for your suggestions, it's looking good.
[jira] [Comment Edited] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852346#comment-15852346 ] Keith Laban edited comment on LUCENE-7671 at 2/3/17 11:49 PM: -- Yes, in a nutshell that is my goal. Ultimately I'd like to build a request handler in Solr that can enable a core to upgrade its segments without first taking it offline or reconfiguring the index writer. When not engaged it should have no effect, and when it is engaged it should do minimal work at a time. The additional bells and whistles of this PR are for backwards compatibility. Previously the behavior was: 1) Determine all old segments 2) Ask the wrapped merge policy what to do with the old segments 3) Merge segments specified by the wrapped merge policy 4) Merge the remaining old segments into a single new segment Meaning, if you were to upgrade an old index using TieredMergePolicy or similar as the wrapped MP and called {{w.forceMerge(Integer.MAX_VALUE)}}, the TMP would say there is nothing to do, but the UpgradeIndexMergePolicy would then take it upon itself to merge everything down into a single segment. Ideally, if {{max number of segments > number of segments}} and the wrapped MP is happy, the UIMP should not take it upon itself to make any merge decisions and should only upgrade segments needing upgrade by rewriting each segment. An additional decision to rely on cascading calls from the IW was made so that if this was being used as the default MP and an upgrade was in progress, old segments could still be candidates for merges issued during a commit. The idea is loosely based on Elasticsearch's ElasticsearchMergePolicy. There should probably also be support for a Predicate to be passed for determining whether a segment should be upgraded (rewritten); then this MP could be used for things such as deciding to rewrite segments with a different codec. was (Author: k317h): Yes, in a nutshell that is my goal. Ultimately I'd like to build a request handler in Solr that can enable a core to upgrade its segments without first taking it offline or reconfiguring the index writer. When not engaged it should have no effect, and when it is engaged it should do minimal work at a time. The additional bells and whistles of this PR are for backwards compatibility. Previously the behavior was: 1) Determine all old segments 2) Ask the wrapped merge policy what to do with the old segments 3) Merge segments specified by the wrapped merge policy 4) Merge the remaining old segments into a single new segment Meaning, if you were to upgrade an old index using TieredMergePolicy or similar as the wrapped MP and called {{w.forceMerge(Integer.MAX_VALUE)}}, the TMP would say there is nothing to do, but the UpgradeIndexMergePolicy would then take it upon itself to merge everything down into a single segment. Ideally, if {{max number of segments > number of segments}} and the wrapped MP is happy, the UIMP should not take it upon itself to make any merge decisions and should only upgrade segments needing upgrade. An additional decision to rely on cascading calls from the IW was made so that if this was being used as the default MP and an upgrade was in progress, old segments could still be candidates for merges issued during a commit. The idea is loosely based on Elasticsearch's ElasticsearchMergePolicy. There should probably also be support for a Predicate to be passed for determining whether a segment should be upgraded (rewritten); then this MP could be used for things such as deciding to rewrite segments with a different codec.
[jira] [Commented] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
[ https://issues.apache.org/jira/browse/LUCENE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852346#comment-15852346 ] Keith Laban commented on LUCENE-7671: - Yes, in a nutshell that is my goal. Ultimately I'd like to build a request handler in Solr that can enable a core to upgrade its segments without first taking it offline or reconfiguring the index writer. When not engaged it should have no effect, and when it is engaged it should do minimal work at a time. The additional bells and whistles of this PR are for backwards compatibility. Previously the behavior was: 1) Determine all old segments 2) Ask the wrapped merge policy what to do with the old segments 3) Merge segments specified by the wrapped merge policy 4) Merge the remaining old segments into a single new segment Meaning, if you were to upgrade an old index using TieredMergePolicy or similar as the wrapped MP and called {{w.forceMerge(Integer.MAX_VALUE)}}, the TMP would say there is nothing to do, but the UpgradeIndexMergePolicy would then take it upon itself to merge everything down into a single segment. Ideally, if {{max number of segments > number of segments}} and the wrapped MP is happy, the UIMP should not take it upon itself to make any merge decisions and should only upgrade segments needing upgrade. An additional decision to rely on cascading calls from the IW was made so that if this was being used as the default MP and an upgrade was in progress, old segments could still be candidates for merges issued during a commit. The idea is loosely based on Elasticsearch's ElasticsearchMergePolicy. There should probably also be support for a Predicate to be passed for determining whether a segment should be upgraded (rewritten); then this MP could be used for things such as deciding to rewrite segments with a different codec.
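The Predicate idea at the end of the comment could look roughly like this. All names are hypothetical (the real MergePolicy API differs); the point is that upgrade selection becomes an arbitrary test over segment metadata, so "old format version" and "non-default codec" are just two possible predicates.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical segment metadata: name plus the codec it was written with.
record SegmentMeta(String name, String codec) {}

// Choose which segments to rewrite based on an arbitrary predicate, e.g. an
// old index-format version or, as suggested above, a non-default codec.
class UpgradeSelector {
    static List<SegmentMeta> needingRewrite(List<SegmentMeta> segments,
                                            Predicate<SegmentMeta> shouldUpgrade) {
        return segments.stream().filter(shouldUpgrade).collect(Collectors.toList());
    }
}
```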
[jira] [Created] (LUCENE-7671) Enhance UpgradeIndexMergePolicy with additional options
Keith Laban created LUCENE-7671: --- Summary: Enhance UpgradeIndexMergePolicy with additional options Key: LUCENE-7671 URL: https://issues.apache.org/jira/browse/LUCENE-7671 Project: Lucene - Core Issue Type: Improvement Reporter: Keith Laban Enhance UpgradeIndexMergePolicy to be a MergePolicy that can be used outside the scope of the IndexUpgrader. The enhancement aims to allow the UpgradeIndexMergePolicy to: 1) Delegate normal force merges to the underlying merge policy 2) Enable a flag that will explicitly tell UpgradeIndexMergePolicy when it should start looking for upgrades. 3) Allow new segments to be considered for merging with old segments, depending on the underlying MergePolicy. 4) Be configurable for backwards compatibility such that only segments needing an upgrade would be considered when merging, with no explicit upgrades.
[jira] [Created] (SOLR-10047) Mismatched Docvalue segments cause exception in Sorting/Faceting; Uninvert per segment
Keith Laban created SOLR-10047: -- Summary: Mismatched Docvalue segments cause exception in Sorting/Faceting; Uninvert per segment Key: SOLR-10047 URL: https://issues.apache.org/jira/browse/SOLR-10047 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban The configuration of UninvertingReader in SolrIndexSearcher creates a global mapping across the directory for the fields to uninvert. If docvalues are enabled on a field, the creation of a new segment will cause queries to fail when faceting/sorting on the recently docvalue-enabled field. This happens because the UninvertingReader is configured globally across the entire directory, and a single segment containing DVs for a field will incorrectly indicate that all segments contain DVs. This patch addresses the incorrect behavior by determining the fields to be uninverted on a per-segment basis. With the fix, it is still recommended that a reindex occur, as data loss will occur when a DV and non-DV segment are merged; SOLR-10046 addresses this behavior. This fix is meant to be a stop gap for the time between enabling docvalues and the completion of a reindex.
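The per-segment fix described above can be sketched as follows: instead of one directory-wide map, compute the set of fields to uninvert separately for each segment, from that segment's own field infos. This is a simplified stand-in for the real UninvertingReader wiring, with sets of field names in place of FieldInfos.

```java
import java.util.HashSet;
import java.util.Set;

class PerSegmentUninversion {
    // For one segment: uninvert only the fields the schema says should have
    // docvalues but which this particular segment was written without.
    // A field with real DVs in this segment is left alone; a global map would
    // instead wrongly assume every segment matches the first one seen.
    static Set<String> fieldsToUninvert(Set<String> schemaDvFields,
                                        Set<String> segmentDvFields) {
        Set<String> toUninvert = new HashSet<>(schemaDvFields);
        toUninvert.removeAll(segmentDvFields);
        return toUninvert;
    }
}
```

Under this scheme a new segment written after the schema change contributes an empty uninversion set, while older segments are uninverted on the fly, so mixed indexes stop throwing during sorting/faceting.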
[jira] [Updated] (SOLR-10046) Create UninvertDocValuesMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-10046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-10046: --- Issue Type: Improvement (was: Bug)
[jira] [Created] (SOLR-10046) Create UninvertDocValuesMergePolicy
Keith Laban created SOLR-10046: -- Summary: Create UninvertDocValuesMergePolicy Key: SOLR-10046 URL: https://issues.apache.org/jira/browse/SOLR-10046 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban Create a merge policy that can detect schema changes and use UninvertingReader to uninvert fields and write docvalues into merged segments when a field has docvalues enabled. The current behavior is to write null values in the merged segment, which can lead to data integrity problems when sorting or faceting pending a full reindex. With this patch it would still be recommended to reindex when adding docvalues, for performance reasons, as it is not guaranteed that all segments will be merged with docvalues turned on.
[jira] [Commented] (SOLR-9648) Wrap all solr merge policies with SolrMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602256#comment-15602256 ] Keith Laban commented on SOLR-9648: --- Hi Christine, let me try to address each of these: bq. currently force-merge happens only when externally triggered True. bq. the force-merge behaviour added by the wrap is (proposed to be) executed only on startup This is just where force merge is explicitly called, in an effort to upgrade segments. bq. the configured merge policy could (at least theoretically) disallow force merges Not true; this implementation will fall through to the delegate if there are no segments to upgrade. bq. The {{MAX_UPGRADES_AT_A_TIME = 5;}} sounds similar to what the MergeScheduler does (unless merge-on-startup bypasses the merge scheduler somehow?) Not sure whether force merge abides by the MergeScheduler. bq. IndexWriter has a UNBOUNDED_MAX_MERGE_SEGMENTS==-1 which if made non-private could perhaps be used in the cmd.maxOptimizeSegments = Integer.MAX_VALUE; Could be an interesting approach. bq. UpgradeIndexMergePolicy also sounds very similar Actually, I saw this but chose not to use it because the implementation doesn't fall back to the delegating merge policy. bq. The SolrMergePolicy has no solr dependencies, might it be renamed to something else and be part of the lucene code base? That is true right now, but I hope we can use the same approach to add in hooks for other Solr-specific things if we need them later, and hopefully also use this for things like adding/removing docvalues when the schema changes. > Wrap all solr merge policies with SolrMergePolicy > - > > Key: SOLR-9648 > URL: https://issues.apache.org/jira/browse/SOLR-9648 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > Attachments: SOLR-9648-WIP.patch > > > Wrap the entry point for all merge policies with a single entry-point merge > policy for more fine-grained control over merging with minimal configuration. > The main benefit will be to allow upgrading of segments on startup when the > lucene version changes. Ideally we can use the same approach for adding and > removing of doc values when the schema changes, and hopefully other index type > changes such as Trie -> Point types, or even analyzer changes.
[jira] [Commented] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586530#comment-15586530 ] Keith Laban commented on SOLR-9506: --- How expensive would it be to check numDocs (#4 in Yonik's comment earlier)? I think this would be the most straightforward and understandable approach. > cache IndexFingerprint for each segment > --- > > Key: SOLR-9506 > URL: https://issues.apache.org/jira/browse/SOLR-9506 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul > Attachments: SOLR-9506.patch, SOLR-9506.patch, SOLR-9506_POC.patch > > > The IndexFingerprint is cached per index searcher. It is quite useless during > high-throughput indexing. If the fingerprint is cached per segment it will > make it vastly more efficient to compute the fingerprint.
[jira] [Commented] (SOLR-9659) Add zookeeper DataWatch API
[ https://issues.apache.org/jira/browse/SOLR-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586137#comment-15586137 ] Keith Laban commented on SOLR-9659: --- I've used the *Cache recipes Scott is talking about pretty extensively for various projects. It makes doing what you describe pretty trivial. No resetting watches, dealing with timing, or dealing with client connections. Basically: 1) Create a client 2) Create a PathChildrenCache or NodeCache for a path 3) Add a listener for cache changes 4) Start the cache Everything else is maintained by Curator, which has become a pretty battle-tested piece of software. > Add zookeeper DataWatch API > --- > > Key: SOLR-9659 > URL: https://issues.apache.org/jira/browse/SOLR-9659 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Alan Woodward >Assignee: Alan Woodward > Attachments: SOLR-9659.patch > > > We have several components which need to set up watches on ZooKeeper nodes > for various aspects of cluster management. At the moment, all of these > components do this themselves, leading to large amounts of duplicated code, > and complicated logic for dealing with reconnections, etc, scattered across > the codebase. We should replace this with a simple API controlled by > SolrZkClient, which should make the code more robust, and testing > considerably easier.
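The four steps above look roughly like this with Curator's PathChildrenCache recipe. This sketch assumes a Curator dependency on the classpath and a ZooKeeper ensemble reachable at localhost:2181; the watched path "/collections" is made up for illustration.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CacheRecipeSketch {
    public static void main(String[] args) throws Exception {
        // 1) Create a client; Curator handles retries and reconnects.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // 2) Create a PathChildrenCache for a path (true = also cache node data).
        PathChildrenCache cache = new PathChildrenCache(client, "/collections", true);

        // 3) Add a listener for cache changes; Curator re-registers watches itself.
        cache.getListenable().addListener((c, event) -> {
            switch (event.getType()) {
                case CHILD_ADDED:
                case CHILD_UPDATED:
                case CHILD_REMOVED:
                    System.out.println(event.getType() + ": " + event.getData().getPath());
                    break;
                default:
                    break; // connection-state events
            }
        });

        // 4) Start the cache; from here on, no manual watch management is needed.
        cache.start();
    }
}
```

Everything after step 4 (watch re-registration, session expiry, reconnect replay) is handled by the recipe, which is the point of the comment above.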
[jira] [Commented] (SOLR-9659) Add zookeeper DataWatch API
[ https://issues.apache.org/jira/browse/SOLR-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585924#comment-15585924 ] Keith Laban commented on SOLR-9659: --- I'm not implying a full cutover. But if we were to build a generic API for talking to zk and getting events we might be able to borrow some ideas from Curator.
[jira] [Commented] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585694#comment-15585694 ] Keith Laban commented on SOLR-9506: --- Are you implying that if you add a document, commit it, compute the index fingerprint, and cache the segment fingerprints, then delete that document, commit that change, and compute the fingerprint again with the cached segment fingerprint, you will end up with the same index fingerprint?
[jira] [Issue Comment Deleted] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9506: -- Comment: was deleted (was: Are you implying that if you add a document. commit it, compute the index fingerprint and cache the segments. Then delete that document and commit that change, and compute the fingerprint again with the cached segment fingerprint, you will end up with the same index fingerprint?) > cache IndexFingerprint for each segment > --- > > Key: SOLR-9506 > URL: https://issues.apache.org/jira/browse/SOLR-9506 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul > Attachments: SOLR-9506.patch, SOLR-9506.patch, SOLR-9506_POC.patch > > > The IndexFingerprint is cached per index searcher. it is quite useless during > high throughput indexing. If the fingerprint is cached per segment it will > make it vastly more efficient to compute the fingerprint -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9506) cache IndexFingerprint for each segment
[ https://issues.apache.org/jira/browse/SOLR-9506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585693#comment-15585693 ] Keith Laban commented on SOLR-9506: --- Are you implying that if you add a document, commit it, compute the index fingerprint, and cache the segment fingerprints, then delete that document, commit that change, and compute the fingerprint again with the cached segment fingerprint, you will end up with the same index fingerprint? > cache IndexFingerprint for each segment > --- > > Key: SOLR-9506 > URL: https://issues.apache.org/jira/browse/SOLR-9506 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul > Attachments: SOLR-9506.patch, SOLR-9506.patch, SOLR-9506_POC.patch > > > The IndexFingerprint is cached per index searcher. it is quite useless during > high throughput indexing. If the fingerprint is cached per segment it will > make it vastly more efficient to compute the fingerprint -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
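The concern in this question can be made concrete with a toy model: Lucene segments are immutable and a delete only flips a live-docs bit, so a fingerprint cached per segment at commit time goes stale once a delete lands. A minimal sketch under invented names (`segment_fingerprint`, integer versions standing in for version hashes — none of this is Solr's actual IndexFingerprint code):

```python
# Toy per-segment fingerprint: sum of versions over *live* documents.
# A "segment" is an immutable list of (doc_id, version) pairs; deletions
# are tracked in a separate set, mirroring Lucene's liveDocs.

def segment_fingerprint(segment, live_docs):
    """Fingerprint contribution of one segment, counting only live docs."""
    return sum(version for doc_id, version in segment if doc_id in live_docs)

segment = [(1, 100)]          # add doc 1 with version 100, then commit
live = {1}
cached = segment_fingerprint(segment, live)    # value cached at commit time

live = set()                  # delete doc 1, commit: segment bytes unchanged
recomputed = segment_fingerprint(segment, live)

stale = cached != recomputed  # a cache keyed on segment identity alone
                              # would return the pre-delete fingerprint
```

So a per-segment cache would have to be invalidated (or keyed) on the segment's delete state as well, not just its identity — presumably what this question is probing.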
[jira] [Commented] (SOLR-9659) Add zookeeper DataWatch API
[ https://issues.apache.org/jira/browse/SOLR-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585540#comment-15585540 ] Keith Laban commented on SOLR-9659: --- It might be worth looking at Apache Curator, if not for the dependency at least to model it after. Their API is very nice and easy to work with. > Add zookeeper DataWatch API > --- > > Key: SOLR-9659 > URL: https://issues.apache.org/jira/browse/SOLR-9659 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Alan Woodward >Assignee: Alan Woodward > Attachments: SOLR-9659.patch > > > We have several components which need to set up watches on ZooKeeper nodes > for various aspects of cluster management. At the moment, all of these > components do this themselves, leading to large amounts of duplicated code, > and complicated logic for dealing with reconnections, etc, scattered across > the codebase. We should replace this with a simple API controlled by > SolrZkClient, which should make the code more robust, and testing > considerably easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
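Curator's appeal here is mostly its listener-style recipes (e.g. NodeCache: register a callback, receive the current data and every subsequent change, with reconnection handled underneath). The API shape being suggested can be sketched without ZooKeeper at all — `DataWatch` and `set_data` below are invented names for illustration, not Curator's or Solr's actual API:

```python
class DataWatch:
    """Stand-in for a NodeCache-style watch on a single znode path."""

    def __init__(self, path):
        self.path = path
        self.data = None
        self._listeners = []

    def add_listener(self, callback):
        # Deliver the current state immediately, then on every change.
        self._listeners.append(callback)
        if self.data is not None:
            callback(self.path, self.data)

    def set_data(self, data):
        # In a real client this would be driven by a ZK watch event,
        # re-registered after each trigger and after reconnects.
        self.data = data
        for cb in self._listeners:
            cb(self.path, data)

seen = []
watch = DataWatch("/collections/c1/state.json")
watch.add_listener(lambda path, data: seen.append(data))
watch.set_data(b"{}")
watch.set_data(b'{"replicas": 2}')
```

The value of the recipe is that callers never touch raw watch re-registration or session handling; they only see a stream of data callbacks.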
[jira] [Commented] (SOLR-9056) Add ZkConnectionListener interface
[ https://issues.apache.org/jira/browse/SOLR-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585493#comment-15585493 ] Keith Laban commented on SOLR-9056: --- It would be great if we could use this for any generic zookeeper state changes. There's still a big pain point in trying to get notified when REST-managed resources change; it requires building a zk watcher strategy from scratch. > Add ZkConnectionListener interface > -- > > Key: SOLR-9056 > URL: https://issues.apache.org/jira/browse/SOLR-9056 > Project: Solr > Issue Type: New Feature >Affects Versions: 6.0, 6.1 >Reporter: Alan Woodward >Assignee: Alan Woodward > Attachments: SOLR-9056.patch > > > Zk connection management is currently split among a few classes in > not-very-helpful ways. There's SolrZkClient, which manages general > interaction with zookeeper; ZkClientConnectionStrategy, which is a sort-of > connection factory, but one that's heavily intertwined with SolrZkClient; and > ConnectionManager, which doesn't actually manage connections at all, but > instead is a ZK watcher that calls back into SolrZkClient and > ZkClientConnectionStrategy. > We also have a number of classes that need to be notified about ZK session > changes - ZkStateReader sets up a bunch of watches for cluster state updates, > Overseer and ZkController use ephemeral nodes for elections and service > registry, CoreContainer needs to register cores and deal with recoveries, and > so on. At the moment, these are mostly handled via ZkController, which > therefore needs to know how about the internals of all these different > classes. There are a few other places where this co-ordination is > duplicated, though, for example in CloudSolrClient. And, as is always the > case with duplicated code, things are slightly different in each location. > I'd like to try and rationalize this, by refactoring the connection > management and adding a ZkConnectionListener interface. 
Any class that needs > to be notified when a zk session has expired or a new session has been > established can register itself with the SolrZkClient. And we can remove a > whole bunch of abstraction leakage out of ZkController, and back into the > classes that actually need to deal with session changes. Plus, it makes > things a lot easier to test. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
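The proposed decoupling — components subscribing to session events on SolrZkClient instead of ZkController reaching into each component's internals — is a plain observer pattern. A hedged sketch (class and method names here are illustrative, not the patch's actual interface):

```python
class ZkConnectionListenerRegistry:
    """Sketch of the proposed pattern: interested components register
    themselves, and the client fans session events out to all of them."""

    def __init__(self):
        self._listeners = []

    def add_connection_listener(self, listener):
        self._listeners.append(listener)

    def on_session_expired(self):
        # Called when the ZK session is lost; listeners drop ephemeral state.
        for l in self._listeners:
            l.session_expired()

    def on_session_established(self):
        # Called on a new session; listeners re-register watches/ephemerals.
        for l in self._listeners:
            l.session_established()

class RecordingListener:
    """A component that just records the events it was notified about."""
    def __init__(self):
        self.events = []
    def session_expired(self):
        self.events.append("expired")
    def session_established(self):
        self.events.append("established")

registry = ZkConnectionListenerRegistry()
listener = RecordingListener()
registry.add_connection_listener(listener)
registry.on_session_expired()
registry.on_session_established()
```

The testing benefit mentioned in the issue follows directly: each listener can be exercised by firing events at it, with no live ZooKeeper required.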
[jira] [Commented] (SOLR-9651) Consider tracking modification time of external file fields for faster reloading
[ https://issues.apache.org/jira/browse/SOLR-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583091#comment-15583091 ] Keith Laban commented on SOLR-9651: --- I wrote an extension of EFF called RemoteFileField (SOLR-9617). The idea is that you can drop your external file field in an s3 bucket or some remote hosted place and then tell solr to suck it down and update the EFF. We could probably use the same approach to have it just do atomic updates to the documents instead of writing an external file. Maybe the title/description of this ticket should be updated to be a discussion around finding a better approach for EFF. > Consider tracking modification time of external file fields for faster > reloading > > > Key: SOLR-9651 > URL: https://issues.apache.org/jira/browse/SOLR-9651 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 4.10.4 > Environment: Linux >Reporter: Mike > > I have an index of about 4M legal documents that has pagerank boosting > configured as an external file field. The external file is about 100MB in > size and has one row per document in the index. Each row indicates the > pagerank score of a document. When we open new searchers, this file has to > get reloaded, and it creates a noticeable delay for our users -- takes > several seconds to reload. > An idea to fix this came up in [a recent discussion in the Solr mailing > list|https://www.mail-archive.com/solr-user@lucene.apache.org/msg125521.html]: > Could the file only be reloaded if it has changed on disk? In other words, > when new searchers are opened, could they check the modtime of the file, and > avoid reloading it if the file hasn't changed? > In our configuration, this would be a big improvement. We only change the > pagerank file once/week because computing it is intensive and new documents > don't tend to have a big impact. 
At the same time, because we're regularly > adding new documents, we do hundreds of commits per day, all of which have a > delay as the (largish) external file field is reloaded. > Is this a reasonable improvement to request? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
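The optimization requested above is ordinary mtime-based cache invalidation: remember the file's modification time when it is loaded, and on each new searcher skip the reload if it hasn't moved. A small sketch with invented names (not the actual ExternalFileField code):

```python
import os
import tempfile

class MtimeCachedFile:
    """Reload a file's contents only when its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._contents = None
        self.reloads = 0

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:           # first load, or file changed
            with open(self.path) as f:
                self._contents = f.read()
            self._mtime = mtime
            self.reloads += 1
        return self._contents

# Tiny stand-in for an external file field's data file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("doc1=0.7\n")
    path = f.name

eff = MtimeCachedFile(path)
eff.get()
eff.get()   # mtime unchanged: served from memory, no second reload
```

One caveat worth flagging: mtime granularity can be coarse on some filesystems, so a file replaced twice within one tick would be missed; comparing (mtime, size) or a content hash is the usual hardening.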
[jira] [Commented] (SOLR-9651) Consider tracking modification time of external file fields for faster reloading
[ https://issues.apache.org/jira/browse/SOLR-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583026#comment-15583026 ] Keith Laban commented on SOLR-9651: --- I wonder if we can do something even more hacky (and efficient) like write the external file field as a doc value segment and pretend it's an unstored doc value field. I'm with Mike in thinking that dropping a file is much easier than updating all your documents. > Consider tracking modification time of external file fields for faster > reloading > > > Key: SOLR-9651 > URL: https://issues.apache.org/jira/browse/SOLR-9651 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 4.10.4 > Environment: Linux >Reporter: Mike > > I have an index of about 4M legal documents that has pagerank boosting > configured as an external file field. The external file is about 100MB in > size and has one row per document in the index. Each row indicates the > pagerank score of a document. When we open new searchers, this file has to > get reloaded, and it creates a noticeable delay for our users -- takes > several seconds to reload. > An idea to fix this came up in [a recent discussion in the Solr mailing > list|https://www.mail-archive.com/solr-user@lucene.apache.org/msg125521.html]: > Could the file only be reloaded if it has changed on disk? In other words, > when new searchers are opened, could they check the modtime of the file, and > avoid reloading it if the file hasn't changed? > In our configuration, this would be a big improvement. We only change the > pagerank file once/week because computing it is intensive and new documents > don't tend to have a big impact. At the same time, because we're regularly > adding new documents, we do hundreds of commits per day, all of which have a > delay as the (largish) external file field is reloaded. > Is this a reasonable improvement to request? 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-9648) Wrap all solr merge policies with SolrMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579083#comment-15579083 ] Keith Laban edited comment on SOLR-9648 at 10/16/16 1:39 AM: - Adding a naive implementation that will do the upgrade of segments on startup (no tests). As of now this doesn't allow any configuration options to be passed, but can be easily added. Initial patch is intended as a POC to start the dialogue. was (Author: k317h): Adding a naive implementation that will do the upgrade of segments on startup (no tests). As of now this doesn't allow any configuration options to be passed, but can be easily added. > Wrap all solr merge policies with SolrMergePolicy > - > > Key: SOLR-9648 > URL: https://issues.apache.org/jira/browse/SOLR-9648 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > Attachments: SOLR-9648-WIP.patch > > > Wrap the entry point for all merge policies with a single entry point merge > policy for more fine grained control over merging with minimal configuration. > The main benefit will be to allow upgrading of segments on startup when > lucene version changes. Ideally we can use the same approach for adding and > removing of doc values when the schema changes and hopefully other index type > changes such as Trie -> Point types, or even analyzer changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9648) Wrap all solr merge policies with SolrMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9648: -- Attachment: SOLR-9648-WIP.patch Adding a naive implementation that will upgrade segments on startup (no tests). As of now this doesn't allow any configuration options to be passed, but can be easily added. > Wrap all solr merge policies with SolrMergePolicy > - > > Key: SOLR-9648 > URL: https://issues.apache.org/jira/browse/SOLR-9648 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > Attachments: SOLR-9648-WIP.patch > > > Wrap the entry point for all merge policies with a single entry point merge > policy for more fine grained control over merging with minimal configuration. > The main benefit will be to allow upgrading of segments on startup when > lucene version changes. Ideally we can use the same approach for adding and > removing of doc values when the schema changes and hopefully other index type > changes such as Trie -> Point types, or even analyzer changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-9648) Wrap all solr merge policies with SolrMergePolicy
[ https://issues.apache.org/jira/browse/SOLR-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579083#comment-15579083 ] Keith Laban edited comment on SOLR-9648 at 10/16/16 1:37 AM: - Adding a naive implementation that will do the upgrade of segments on startup (no tests). As of now this doesn't allow any configuration options to be passed, but can be easily added. was (Author: k317h): Adding a naive implementation that will upgrade of segments on startup (no tests). As of now this doesn't allow any configuration options to be passed, but can be easily added. > Wrap all solr merge policies with SolrMergePolicy > - > > Key: SOLR-9648 > URL: https://issues.apache.org/jira/browse/SOLR-9648 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > Attachments: SOLR-9648-WIP.patch > > > Wrap the entry point for all merge policies with a single entry point merge > policy for more fine grained control over merging with minimal configuration. > The main benefit will be to allow upgrading of segments on startup when > lucene version changes. Ideally we can use the same approach for adding and > removing of doc values when the schema changes and hopefully other index type > changes such as Trie -> Point types, or even analyzer changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9648) Wrap all solr merge policies with SolrMergePolicy
Keith Laban created SOLR-9648: - Summary: Wrap all solr merge policies with SolrMergePolicy Key: SOLR-9648 URL: https://issues.apache.org/jira/browse/SOLR-9648 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban Wrap the entry point for all merge policies with a single entry point merge policy for more fine grained control over merging with minimal configuration. The main benefit will be to allow upgrading of segments on startup when lucene version changes. Ideally we can use the same approach for adding and removing of doc values when the schema changes and hopefully other index type changes such as Trie -> Point types, or even analyzer changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
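The "single entry point" idea is a classic delegating wrapper: pass ordinary merge decisions through to the configured policy, and add extra merges of your own — here, singleton rewrites of segments written under an older version. A toy sketch under invented names and a much-simplified model (not Lucene's actual MergePolicy API):

```python
CURRENT_VERSION = 7   # illustrative stand-in for the current index format version

class Segment:
    def __init__(self, name, version):
        self.name = name          # e.g. "_0"
        self.version = version    # format version the segment was written with

class UpgradeMergePolicy:
    """Wrapping merge policy: delegate normal merge selection, then
    additionally schedule any stale segment for a singleton rewrite,
    which is how an on-startup segment upgrade can be expressed."""

    def __init__(self, delegate):
        self.delegate = delegate

    def find_merges(self, segments):
        merges = list(self.delegate.find_merges(segments))
        # Upgrade pass: rewrite each old-format segment by itself.
        merges.extend([seg] for seg in segments if seg.version < CURRENT_VERSION)
        return merges

class NoopPolicy:
    """Stand-in for the configured policy (e.g. a tiered policy)."""
    def find_merges(self, segments):
        return []

segments = [Segment("_0", 6), Segment("_1", 7)]
policy = UpgradeMergePolicy(NoopPolicy())
to_merge = policy.find_merges(segments)   # only the stale "_0" is selected
```

The same wrapper seam is where the issue's other ideas (adding/removing doc values, Trie -> Point rewrites) would slot in: any predicate over a segment's metadata can nominate it for rewriting.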
[jira] [Commented] (SOLR-9617) Add Field Type RemoteFileField
[ https://issues.apache.org/jira/browse/SOLR-9617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563051#comment-15563051 ] Keith Laban commented on SOLR-9617: --- Are you sure that is the right ticket? I don't see the relevance > Add Field Type RemoteFileField > -- > > Key: SOLR-9617 > URL: https://issues.apache.org/jira/browse/SOLR-9617 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > > RemoteFileField extends from ExternalFileField. The purpose of this field > type extension is to download an external file from a remote location (e.g. > S3 or artifactory) to a local location to be used as an external file field. > URLs are maintained as a ManagedResource and can be PUT as a fieldName -> url > mapping. Additionally there is a RequestHandler that will redownload all > RemoteFileFields. This request handler also distributes the request to all > live nodes in the cluster. The RequestHandler also implements SolrCoreAware > and will redownload all files when callad (i.e. whenever a core is loaded). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9617) Add Field Type RemoteFileField
Keith Laban created SOLR-9617: - Summary: Add Field Type RemoteFileField Key: SOLR-9617 URL: https://issues.apache.org/jira/browse/SOLR-9617 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban RemoteFileField extends from ExternalFileField. The purpose of this field type extension is to download an external file from a remote location (e.g. S3 or artifactory) to a local location to be used as an external file field. URLs are maintained as a ManagedResource and can be PUT as a fieldName -> url mapping. Additionally there is a RequestHandler that will redownload all RemoteFileFields. This request handler also distributes the request to all live nodes in the cluster. The RequestHandler also implements SolrCoreAware and will redownload all files when called (i.e. whenever a core is loaded). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
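The flow described — a managed fieldName -> url mapping plus a "redownload everything" hook fired on core load or fanned out via a request handler — can be sketched as below. The fetch function is injected so the example needs no network; every name here is illustrative rather than the patch's actual API:

```python
class RemoteFileFieldDownloader:
    """Sketch of the described flow: URLs are a managed mapping, and one
    entry point re-downloads them all (invoked when a core loads, or by
    a request handler distributed to every live node)."""

    def __init__(self, fetch):
        self.urls = {}      # fieldName -> url (the managed resource)
        self.local = {}     # fieldName -> downloaded bytes (the local copy)
        self.fetch = fetch  # injected; a real version would fetch over HTTP(S)

    def put_url(self, field_name, url):
        # Corresponds to PUTting a fieldName -> url entry on the managed resource.
        self.urls[field_name] = url

    def redownload_all(self):
        for field, url in self.urls.items():
            self.local[field] = self.fetch(url)

# Fake remote store so the sketch is self-contained.
fake_store = {"s3://bucket/pagerank.txt": b"doc1=0.7\n"}
dl = RemoteFileFieldDownloader(fetch=fake_store.__getitem__)
dl.put_url("pagerank", "s3://bucket/pagerank.txt")
dl.redownload_all()
```

Injecting the fetcher also hints at how such a component could be unit-tested without S3 or a live cluster.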
[jira] [Commented] (SOLR-8316) Allow a field to be stored=false indexed=false docValues=true
[ https://issues.apache.org/jira/browse/SOLR-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558284#comment-15558284 ] Keith Laban commented on SOLR-8316: --- Yes this can be closed > Allow a field to be stored=false indexed=false docValues=true > - > > Key: SOLR-8316 > URL: https://issues.apache.org/jira/browse/SOLR-8316 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > > Right now if you try to index a field which is not stored or indexed you will > get an exception, however sometimes it makes sense to have a field which only > has docValues on for example see [SOLR-8220] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9310) PeerSync fails on a node restart due to IndexFingerPrint mismatch
[ https://issues.apache.org/jira/browse/SOLR-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430898#comment-15430898 ] Keith Laban commented on SOLR-9310: --- What is the need for the cache? It seems like there would only ever be a cache hit if there is no active indexing, and the added complexity is not worth the potential small performance boost. > PeerSync fails on a node restart due to IndexFingerPrint mismatch > - > > Key: SOLR-9310 > URL: https://issues.apache.org/jira/browse/SOLR-9310 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Pushkar Raste >Assignee: Noble Paul > Fix For: trunk, 6.3 > > Attachments: PeerSync_3Node_Setup.jpg, PeerSync_Experiment.patch, > SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, SOLR-9310.patch, > SOLR-9310.patch, SOLR-9310_3ReplicaTest.patch, SOLR-9310_final.patch > > > I found that Peer Sync fails if a node restarts and documents were indexed > while node was down. IndexFingerPrint check fails after recovering node > applies updates. > This happens only when node restarts and not if node just misses updates due > reason other than it being down. > Please check attached patch for the test. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9281) locate cores to host in zk based on nodeName
[ https://issues.apache.org/jira/browse/SOLR-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364866#comment-15364866 ] Keith Laban commented on SOLR-9281: --- Relates to [~markrmil...@gmail.com] comment in https://issues.apache.org/jira/browse/SOLR-7248?focusedCommentId=14363441=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14363441 > locate cores to host in zk based on nodeName > > > Key: SOLR-9281 > URL: https://issues.apache.org/jira/browse/SOLR-9281 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > > when starting up an instance of solr in addition to discovering cores on the > local filesystem should discover its cores in zk based on its node name -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9280) make nodeName a configurable parameter in solr.xml
[ https://issues.apache.org/jira/browse/SOLR-9280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363601#comment-15363601 ] Keith Laban commented on SOLR-9280: --- I started working on this. The approach I am looking at is this:
1) Add nodeName as a solr.xml property
2) In ZkController, when starting, if the nodeName property is set use it; otherwise fall back to the genericNodeName :_ convention
3) Change live nodes to have data in each ephemeral node such that the path name is nodeName and the data is genericNodeName, essentially building a nodeName -> genericNodeName mapping
4) In ZkStateReader, change liveNodes from a Set to a Map. This can be done without having to change any public interfaces or anything external to the class
5) The major problem here is the method {{getBaseUrlForNodeName}}, which takes a nodeName and returns a baseUrl. The method assumes that the input nodeName is going to be in the genericNodeName format, and there are tests which validate this. What I'm working on is creating a new method called {{getBaseUrlForGenericNodeName}} which basically does what {{getBaseUrlForNodeName}} did; the updated version of {{getBaseUrlForNodeName}} will look up the genericNodeName in the liveNodes map and then do the generic -> baseUrl conversion.
Past that I'm just working on tracking down what uses the original method to see if there will be any issues anywhere. So far it doesn't look like it, but I haven't run the full test suite yet. Doing some manual testing with the changes I'm able to start two nodes with custom node names and use the admin API for creating/modifying/deleting collections and replicas. Overall this approach is pretty self-contained without a whole lot of code modification. Also, without additional code, if you try to start a second node with the same name it will block the instance from starting and eventually time out and die, which is the desired behavior. 
I hope to have a patch up sometime this week > make nodeName a configurable parameter in solr.xml > -- > > Key: SOLR-9280 > URL: https://issues.apache.org/jira/browse/SOLR-9280 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > > Originally node name is automatically generated based on > {{:_}}. Instead it should be configurable in solr.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
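Steps 3-5 of the approach above are essentially data-shape changes: liveNodes becomes a map from the configured nodeName (the ephemeral znode's path) to the generic name stored as its data, and the base-URL helper splits in two. Solr's generic node-name convention is host:port_context (e.g. 127.0.0.1:8983_solr); the function names below are illustrative, not the patch's:

```python
def base_url_for_generic_node_name(generic_name, scheme="http"):
    """Derive a base URL from the generic host:port_context convention,
    e.g. '127.0.0.1:8983_solr' -> 'http://127.0.0.1:8983/solr'."""
    host_port, _, context = generic_name.partition("_")
    return f"{scheme}://{host_port}/{context.replace('_', '/')}"

# liveNodes as a Map rather than a Set: custom nodeName (znode path)
# -> generic name (znode data), per the proposed mapping.
live_nodes = {"search-a": "10.0.0.5:8983_solr"}

def base_url_for_node_name(node_name, live_nodes):
    # Resolve a custom nodeName through the map first; fall back to
    # treating the input as an already-generic name, preserving the
    # behavior existing callers rely on.
    generic = live_nodes.get(node_name, node_name)
    return base_url_for_generic_node_name(generic)
```

The fallback branch is what keeps the change self-contained: callers that still pass generic names get the old behavior unchanged.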
[jira] [Commented] (SOLR-9265) Add configurable node_name aliases instead of host:post_context
[ https://issues.apache.org/jira/browse/SOLR-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363587#comment-15363587 ] Keith Laban commented on SOLR-9265: --- created SOLR-9280 and SOLR-9281 to track each issue independently. I'll keep this one open to discuss the overall approach if needed. > Add configurable node_name aliases instead of host:post_context > --- > > Key: SOLR-9265 > URL: https://issues.apache.org/jira/browse/SOLR-9265 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > > Make it possible to give an alias name to node_name of an instance. As far as > I can tell you can’t do this, it's always going to be > :_. The goals of this change are the following: > 1) Address the node by alias in the core admin/collection apis > 2) Be able to start a new node with the same alias and have it update > clusterstate with the new base_url and suck down all the cores that the old > alias was hosting. This is already (kind of) possible if you create > core.properties for all the cores that you want the new node to host. However > I think this bleeds a little too much of the ananotmy of the cloud into the > directory structure of the solr instance. The other approach is more in the > paradigm of zookeeper is truth. > For #2 the desired behavior should be such that. > 1) If there is already a live node with the same node_name this current node > should block until that node is gone > 2) Once there is no node with the same node name and if there are any cores > assigned to that node alias they should now be hosted on the newly started > node > 3) If the old node comes back with the same alias and there is now a node in > live nodes with this alias go back to #1 > Configuration should be in solr.xml such that: > {code} > > > ${solrNodeName:} > > > {code} > where the default would be ":_" style. 
> An example for requirement #1: > {{/admin/collections?action=ADDREPLICA=collection=shard=solrNodeNameAlias}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
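The takeover behavior in requirements 1-3 above amounts to a claim-with-timeout loop over the live-nodes registry. A simulation with a plain dict standing in for the ephemeral znodes (names, timeout, and polling are all illustrative):

```python
import time

def claim_node_alias(alias, live_nodes, generic_name, timeout=5.0, poll=0.01):
    """Block while another live node holds `alias`, then claim it by
    publishing alias -> generic_name; give up after `timeout` seconds,
    mirroring the 'block until the old node is gone, else die' behavior."""
    deadline = time.monotonic() + timeout
    while alias in live_nodes:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"alias {alias!r} still held by another node")
        time.sleep(poll)
    live_nodes[alias] = generic_name   # stands in for the ephemeral znode
    return True

live = {}
claim_node_alias("search-a", live, "10.0.0.5:8983_solr")
```

In real ZooKeeper the ephemeral node gives this for free: a second create of the same path fails while the first session is alive, and requirement 3 (the old node coming back) is just the same loop running on the displaced node.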
[jira] [Created] (SOLR-9281) locate cores to host in zk based on nodeName
Keith Laban created SOLR-9281: - Summary: locate cores to host in zk based on nodeName Key: SOLR-9281 URL: https://issues.apache.org/jira/browse/SOLR-9281 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban when starting up an instance of solr in addition to discovering cores on the local filesystem should discover its cores in zk based on its node name -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9280) make nodeName a configurable parameter in solr.xml
Keith Laban created SOLR-9280: - Summary: make nodeName a configurable parameter in solr.xml Key: SOLR-9280 URL: https://issues.apache.org/jira/browse/SOLR-9280 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban Originally node name is automatically generated based on {{:_}}. Instead it should be configurable in solr.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357284#comment-15357284 ] Keith Laban commented on SOLR-9216: --- Never mind, I didn't realize you made a change to use it in some of the other classes > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban >Assignee: Noble Paul > Fix For: 6.2 > > Attachments: SOLR-9216.patch, SOLR-9216.patch, SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357280#comment-15357280 ] Keith Laban commented on SOLR-9216: --- [~noble.paul] did you mean to commit that change to SolrParams? > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban >Assignee: Noble Paul > Fix For: 6.2 > > Attachments: SOLR-9216.patch, SOLR-9216.patch, SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9265) Add configurable node_name aliases instead of host:post_context
Keith Laban created SOLR-9265: - Summary: Add configurable node_name aliases instead of host:post_context Key: SOLR-9265 URL: https://issues.apache.org/jira/browse/SOLR-9265 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Keith Laban Make it possible to give an alias name to node_name of an instance. As far as I can tell you can’t do this, it's always going to be :_. The goals of this change are the following: 1) Address the node by alias in the core admin/collection apis 2) Be able to start a new node with the same alias and have it update clusterstate with the new base_url and suck down all the cores that the old alias was hosting. This is already (kind of) possible if you create core.properties for all the cores that you want the new node to host. However I think this bleeds a little too much of the anatomy of the cloud into the directory structure of the solr instance. The other approach is more in the paradigm of zookeeper is truth. For #2 the desired behavior should be such that: 1) If there is already a live node with the same node_name this current node should block until that node is gone 2) Once there is no node with the same node name and if there are any cores assigned to that node alias they should now be hosted on the newly started node 3) If the old node comes back with the same alias and there is now a node in live nodes with this alias go back to #1 Configuration should be in solr.xml such that: {code} ${solrNodeName:} {code} where the default would be ":_" style. An example for requirement #1: {{/admin/collections?action=ADDREPLICA=collection=shard=solrNodeNameAlias}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9265) Add configurable node_name aliases instead of host:post_context
[ https://issues.apache.org/jira/browse/SOLR-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355831#comment-15355831 ] Keith Laban commented on SOLR-9265: --- I briefly spoke to [~shalinmangar] offline about this idea > Add configurable node_name aliases instead of host:post_context > --- > > Key: SOLR-9265 > URL: https://issues.apache.org/jira/browse/SOLR-9265 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Keith Laban > > Make it possible to give an alias name to node_name of an instance. As far as > I can tell you can’t do this, it's always going to be > :_. The goals of this change are the following: > 1) Address the node by alias in the core admin/collection apis > 2) Be able to start a new node with the same alias and have it update > clusterstate with the new base_url and suck down all the cores that the old > alias was hosting. This is already (kind of) possible if you create > core.properties for all the cores that you want the new node to host. However > I think this bleeds a little too much of the ananotmy of the cloud into the > directory structure of the solr instance. The other approach is more in the > paradigm of zookeeper is truth. > For #2 the desired behavior should be such that. > 1) If there is already a live node with the same node_name this current node > should block until that node is gone > 2) Once there is no node with the same node name and if there are any cores > assigned to that node alias they should now be hosted on the newly started > node > 3) If the old node comes back with the same alias and there is now a node in > live nodes with this alias go back to #1 > Configuration should be in solr.xml such that: > {code} > > > ${solrNodeName:} > > > {code} > where the default would be ":_" style. 
> An example for requirement #1: > {{/admin/collections?action=ADDREPLICA=collection=shard=solrNodeNameAlias}}
[jira] [Commented] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355784#comment-15355784 ] Keith Laban commented on SOLR-9216: --- removed changes from RulesTest and added an OverseerModifyCollectionTest with this testcase > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban >Assignee: Noble Paul > Attachments: SOLR-9216.patch, SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9216: -- Attachment: SOLR-9216.patch > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban >Assignee: Noble Paul > Attachments: SOLR-9216.patch, SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9216: -- Attachment: (was: SOLR-9216.patch) > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9216: -- Attachment: SOLR-9216.patch > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336861#comment-15336861 ] Keith Laban commented on SOLR-9216: --- [~noble.paul] looks like you did the original work on MODIFYCOLLECTION. Would you mind taking a look? I put the test into RulesTest, although I'm not sure it's the best place for it; it was the only spot I could find where this request was being tested. If you would like, I can move it to a separate class. > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper
[jira] [Updated] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
[ https://issues.apache.org/jira/browse/SOLR-9216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-9216: -- Attachment: SOLR-9216.patch > Support collection.configName in MODIFYCOLLECTION request > - > > Key: SOLR-9216 > URL: https://issues.apache.org/jira/browse/SOLR-9216 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-9216.patch > > > MODIFYCOLLECTION should support updating the > {{/collections/}} value of "configName" in zookeeper -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9216) Support collection.configName in MODIFYCOLLECTION request
Keith Laban created SOLR-9216: - Summary: Support collection.configName in MODIFYCOLLECTION request Key: SOLR-9216 URL: https://issues.apache.org/jira/browse/SOLR-9216 Project: Solr Issue Type: Improvement Reporter: Keith Laban MODIFYCOLLECTION should support updating the {{/collections/}} value of "configName" in zookeeper
[jira] [Commented] (SOLR-9183) Test ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing is failing
[ https://issues.apache.org/jira/browse/SOLR-9183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314207#comment-15314207 ] Keith Laban commented on SOLR-9183: --- Stepping through the code: https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/rule/ImplicitSnitch.java#L168 is returning an internal {{10.xx.xx.xxx}} ip address, which seems to be because DNS is resolving the malformed host to some internal ip address. The following produced the same ip: {code} $ dig 192.168.1.2.1 +short 10.xx.xx.xxx {code} Perhaps a different kind of validation should be done if the ip address format itself needs to be checked. > Test ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing > is failing > > > Key: SOLR-9183 > URL: https://issues.apache.org/jira/browse/SOLR-9183 > Project: Solr > Issue Type: Bug >Reporter: Keith Laban > > This is causing tests to fail for me on 6x and master > The test introduced in SOLR-8522 > {{ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing}} > is failing for me.
> {code} > java.lang.AssertionError: > Expected: is <0> > got: <1> > at org.junit.Assert.assertThat(Assert.java:780) > at org.junit.Assert.assertThat(Assert.java:738) > at > org.apache.solr.cloud.rule.ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing(ImplicitSnitchTest.java:101) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) > at org.junit.runners.ParentRunner.run(ParentRunner.java:300) > at > org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50) > at > org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459) > at > 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192) > {code} > I suspect that > https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/rule/ImplicitSnitch.java#L130 > should be returning null causing the failures. > I'll run the test at home to see if it has something to do with the corporate > network I'm running on.
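The comment above suggests the test failure comes from DNS resolving a malformed "IP" like {{192.168.1.2.1}} to a real internal address. A format-only check avoids the resolver entirely. The sketch below is an illustrative stand-in, not the actual ImplicitSnitch code:

```java
// Format-only IPv4 validation: never touches DNS, so a resolver that maps
// unknown hostnames to an internal address cannot affect the result.
// Illustrative sketch only, not real Solr code.
public class Ipv4FormatCheck {

    /** True only for a well-formed dotted-quad IPv4 address: four octets, each 0-255. */
    public static boolean isDottedQuad(String s) {
        String[] parts = s.split("\\.", -1); // -1 keeps empty trailing fields like "1.2.3."
        if (parts.length != 4) {
            return false;
        }
        for (String p : parts) {
            if (p.isEmpty() || p.length() > 3) {
                return false;
            }
            for (int i = 0; i < p.length(); i++) {
                char c = p.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(p) > 255) {
                return false;
            }
        }
        return true;
    }
}
```

Under a check like this, the test's malformed input {{192.168.1.2.1}} is rejected (five octets) without any DNS lookup, regardless of the network the test runs on.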
[jira] [Commented] (SOLR-8522) ImplicitSnitch to support IPv4 fragment tags
[ https://issues.apache.org/jira/browse/SOLR-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313156#comment-15313156 ] Keith Laban commented on SOLR-8522: --- I'm getting test failures due to some of the tests introduced in this ticket. I opened an issue at SOLR-9183 > ImplicitSnitch to support IPv4 fragment tags > > > Key: SOLR-8522 > URL: https://issues.apache.org/jira/browse/SOLR-8522 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 5.4 >Reporter: Arcadius Ahouansou >Assignee: Noble Paul >Priority: Minor > Fix For: 6.0 > > Attachments: SOLR-8522.patch, SOLR-8522.patch, SOLR-8522.patch, > SOLR-8522.patch, SOLR-8522.patch > > > This is a description from [~noble.paul]'s comment on SOLR-8146 > h3. IPv4 fragment tags > Lets assume a Solr node IPv4 address is {{192.93.255.255}} . > This is about enhancing the current {{ImplicitSnitch}} to support IP based > tags like: > - {{hostfrag_1 = 255}} > - {{hostfrag_2 = 255}} > - {{hostfrag_3 = 93}} > - {{hostfrag_4 = 192}} > Note that IPv6 support will be implemented by a separate ticket > h3. Host name fragment tags > Lets assume a Solr node host name {{serv1.dc1.country1.apache.org}} . > This is about enhancing the current {{ImplicitSnitch}} to support tags like: > - {{hostfrag_1 = org}} > - {{hostfrag_2 = apache}} > - {{hostfrag_3 = country1}} > - {{hostfrag_4 = dc1}} > - {{hostfrag_5 = serv1}} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-9183) Test ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing is failing
Keith Laban created SOLR-9183: - Summary: Test ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing is failing Key: SOLR-9183 URL: https://issues.apache.org/jira/browse/SOLR-9183 Project: Solr Issue Type: Bug Reporter: Keith Laban This is causing tests to fail for me on 6x and master The test introduced in SOLR-8522 {{ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing}} is failing for me. {code} java.lang.AssertionError: Expected: is <0> got: <1> at org.junit.Assert.assertThat(Assert.java:780) at org.junit.Assert.assertThat(Assert.java:738) at org.apache.solr.cloud.rule.ImplicitSnitchTest.testGetTags_with_wrong_ipv4_format_ip_returns_nothing(ImplicitSnitchTest.java:101) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50) at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192) {code} I suspect that https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/rule/ImplicitSnitch.java#L130 should be returning null causing the failures. I'll run the test at home to see if it has something to do with the corporate network I'm running on. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311183#comment-15311183 ] Keith Laban commented on SOLR-7887: --- If you notice the first note here https://logging.apache.org/log4j/2.0/manual/migration.html bq. They must not access methods and classes internal to the Log4j 1.x implementation such as Appenders, LoggerRepository or Category's callAppenders method. Unfortunately, hdfs extends {{AppenderSkeleton}} which violates this rule, causing the bridge solution not to work. As far as I can tell there are only two solutions: # continue to bring in log4j 1.2 -- I think all of the packages have changed in the upgrade so that should hopefully work # hdfs/hadoop needs to upgrade to log4j2 Not sure what other options there are > Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > Attachments: SOLR-7887-WIP.patch > > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research.
[jira] [Comment Edited] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311074#comment-15311074 ] Keith Laban edited comment on SOLR-7887 at 6/1/16 8:46 PM: --- I took a stab at upgrading from log4j to log4j2, which includes porting over the work that [~thelabdude] did over at https://github.com/lucidworks/solr-log4j2 .. Although I'm running into some issues. I'll do my best to document where I'm at so far. *Changes so far:* # Removed dependencies for log4j and slf4j-log4j12 # Added log4j2 dependencies: log4j-api, log4j-core, log4j-slf4j-impl (new slf4j bridge) # Deleted {{org.apache.solr.logging.log4j.\*}} and added {{org.apache.solr.logging.log4j2.\*}} from Tim's impl # Added Tim's Log4j2Watcher Test # Updated {{RequestLoggingTest}} and {{LoggingHandlerTest}} for log4j2 # Updated {{SolrLogLayout}} for log4j2 (no tests?) # Updated log4j.properties in example/resources to log4j2.xml # Updated {{bin/solr}} to use the new log4j2.xml and system property {{-Dlog4j.configurationFile}} # This build will compile and run and LogWatcher works in admin UI as expected *The problems I'm facing are*: # Bringing in log4j2 dependencies is for some reason causing javac doclint to fail with errors like "Invalid use of @throws" in random places throughout solr. In the meantime I've changed lucene/common-build.xml to use {{}} until I can figure out what's going on # What finally railroaded me here is that HDFS uses log4j directly and tests which use HDFS fail during runtime with class-not-found exceptions. I tried including the log4j-1.2-api bridge as a stop-gap but the nature of how hdfs uses log4j does not allow this as a fix. If someone wants to jump in here and help out that would be great. I think this would be a big win for the next major release of solr. was (Author: k317h): I took a stab at upgrading from log4j to log4j2, which includes porting over the work that [~thelabdude] did over at https://github.com/lucidworks/solr-log4j2 ..
Although I'm running into some issues. I'll do my best to document where I'm at so far. *Changes so far:* # Removed dependencies for log4j and slf4j-log4j12 # Added log4j2 dependencies: log4j-api, log4j-core, log4j-slf4j-impl (new slf4j bridge) # Deleted {{org.apache.solr.logging.log4j.\*}} and added {{org.apache.solr.logging.log4j2.\*}} from Tim's impl # Added Tim's Log4j2Watcher Test # Updated {{RequestLoggingTest}} and {{LoggingHandlerTest}} for log4j2 # Updated {{SolrLogLayout}} for log4j2 (no tests?) # Updated log4j.properties in example/resources to log4j2.xml # Update {{bin/solr}} to use new log4j2.xml and system property {{-Dlog4j.configurationFile}} # This build will compile and run and LogWatcher works in admin UI as expected *The problems i'm facing are*: # Bringing in log4j2 dependencies for some reason is causing javac doclint to fail with errors like "Invalid use of @throws" in random places throughout solr. For the mean time i've changed lucene/common-build.xml to use {{}} until I can figure out whats going on # What finally railroaded me here is that HDFS uses log4j directly and tests which use HDFS fail during runtime with no class found exceptions. I tried including the log4j2-1.2 api as a stop-gap but the the nature of how hdfs uses log4j2 does not allow this as a fix. If someone wants to jump in here and help out that would be great. I think this would be an big win for the next major release of solr. 
> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > Attachments: SOLR-7887-WIP.patch > > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
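For reference, the "log4j.properties to log4j2.xml" step in the migration list above implies a configuration file roughly like the following. This is a hedged minimal sketch, not the actual file from the patch; the appender name and pattern are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal log4j2.xml sketch: one console appender, info-level root logger.
     Selected at startup via -Dlog4j.configurationFile=/path/to/log4j2.xml -->
<Configuration status="WARN">
  <Appenders>
    <Console name="STDOUT" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %-5p (%t) [%c{1}] %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="STDOUT"/>
    </Root>
  </Loggers>
</Configuration>
```

A real deployment would add rolling-file appenders for the server and request logs; this only shows the shape of the log4j2 XML syntax the migration switches to.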
[jira] [Updated] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-7887: -- Attachment: SOLR-7887-WIP.patch I took a stab at upgrading from log4j to log4j2, which includes porting over the work that [~thelabdude] did over at https://github.com/lucidworks/solr-log4j2 .. Although I'm running into some issues. I'll do my best to document where I'm at so far. *Changes so far:* # Removed dependencies for log4j and slf4j-log4j12 # Added log4j2 dependencies: log4j-api, log4j-core, log4j-slf4j-impl (new slf4j bridge) # Deleted {{org.apache.solr.logging.log4j.\*}} and added {{org.apache.solr.logging.log4j2.\*}} from Tim's impl # Added Tim's Log4j2Watcher Test # Updated {{RequestLoggingTest}} and {{LoggingHandlerTest}} for log4j2 # Updated {{SolrLogLayout}} for log4j2 (no tests?) # Updated log4j.properties in example/resources to log4j2.xml # Updated {{bin/solr}} to use the new log4j2.xml and system property {{-Dlog4j.configurationFile}} # This build will compile and run and LogWatcher works in admin UI as expected *The problems I'm facing are*: # Bringing in log4j2 dependencies is for some reason causing javac doclint to fail with errors like "Invalid use of @throws" in random places throughout solr. In the meantime I've changed lucene/common-build.xml to use {{}} until I can figure out what's going on # What finally railroaded me here is that HDFS uses log4j directly and tests which use HDFS fail during runtime with class-not-found exceptions. I tried including the log4j-1.2-api bridge as a stop-gap but the nature of how hdfs uses log4j does not allow this as a fix. If someone wants to jump in here and help out that would be great. I think this would be a big win for the next major release of solr.
> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > Attachments: SOLR-7887-WIP.patch > > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302432#comment-15302432 ] Keith Laban commented on SOLR-7887: --- Is there any reason the upgrade and patch [~thelabdude] mentions above haven't made it into master? > Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research.
[jira] [Commented] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302430#comment-15302430 ] Keith Laban commented on SOLR-8988: --- That's right. This affects all queries where {{isDistrib}} is true for any reason. > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Affects Versions: 5.5, 6.0 >Reporter: Keith Laban >Assignee: Dennis Gove > Fix For: 6.1 > > Attachments: SOLR-8988.patch, SOLR-8988.patch, SOLR-8988.patch, > SOLR-8988.patch, Screen Shot 2016-04-25 at 2.54.47 PM.png, Screen Shot > 2016-04-25 at 2.55.00 PM.png > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes. After trying to make sense of all the refinement logic, > I can't see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch.
[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
[ https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300420#comment-15300420 ] Keith Laban commented on SOLR-7887: --- Has anyone been looking at this recently? I have a requirement to upgrade to log4j2 in our build of solr. Other arguments I can think of for doing this, in addition to async appenders, are the ability to regex-filter different request paths into separate log files instead of everything showing up in either the server log or the request log > Upgrade Solr to use log4j2 -- log4j 1 now officially end of life > > > Key: SOLR-7887 > URL: https://issues.apache.org/jira/browse/SOLR-7887 > Project: Solr > Issue Type: Task >Affects Versions: 5.2.1 >Reporter: Shawn Heisey > > The logging services project has officially announced the EOL of log4j 1: > https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces > In the official binary jetty deployment, we use log4j 1.2 as our final > logging destination, so the admin UI has a log watcher that actually uses > log4j and java.util.logging classes. That will need to be extended to add > log4j2. I think that might be the largest pain point to this upgrade. > There is some crossover between log4j2 and slf4j. Figuring out exactly which > jars need to be in the lib/ext directory will take some research.
[jira] [Commented] (SOLR-9152) Change the default of facet.distrib.mco from false to true
[ https://issues.apache.org/jira/browse/SOLR-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297007#comment-15297007 ] Keith Laban commented on SOLR-9152: --- The original concern in SOLR-8988 was that it would affect refinement. I can't see a reason why it would; additionally, in any of the testing I've done I've seen only improvements. > Change the default of facet.distrib.mco from false to true > -- > > Key: SOLR-9152 > URL: https://issues.apache.org/jira/browse/SOLR-9152 > Project: Solr > Issue Type: Improvement >Reporter: Dennis Gove >Priority: Minor > > SOLR-8988 added a new query option facet.distrib.mco which when set to true > would allow the use of facet.mincount=1 in cloud mode. The previous behavior, > and current default, is that facet.mincount=0 when in cloud mode. > h3. What exactly would be changed? > The default of facet.distrib.mco=false would be changed to > facet.distrib.mco=true. > h3. When is this option effective? > From the documentation, > {code} > /** > * If we are returning facet field counts, are sorting those facets by their > count, and the minimum count to return is 0, > * then allow the use of facet.mincount = 1 in cloud mode. To enable this use > facet.distrib.mco=true. > * > * i.e. If the following three conditions are met in cloud mode: > facet.sort=count, facet.limit > 0, facet.mincount == 0. > * Then use facet.mincount=1. > * > * Previously and by default facet.mincount will be explicitly set to 0 when > in cloud mode for this condition. > * In SOLR-8599 and SOLR-8988, significant performance increase has been seen > when enabling this optimization. > * > * Note: enabling this flag has no effect when the conditions above are not > met. For those other cases the default behavior is sufficient. > */ > {code} > h3. What is the result of turning this option on?
> When facet.distrib.mco=true is used, and the conditions above are met, then > when Solr is sending requests off to the various shards it will include > facet.mincount=1. The result of this is that only terms with a count > 0 will > be considered when processing the request for that shard. This can result in > a significant performance gain when the field has high cardinality and the > matching docset is relatively small because terms with 0 matches will not be > considered. > As shown in SOLR-8988, the runtime of a single query was reduced from 20 > seconds to less than 1 second. > h3. Can this change result in worse performance? > The current thinking is no, worse performance won't be experienced even under > non-optimal scenarios. From the comments in SOLR-8988, > {quote} > Consider you asked for up to 10 terms from shardA with mincount=1 but you > received only 5 terms back. In this case you know, definitively, that a term > seen in the response from shardB but not in the response from shardA could > have at most a count of 0 in shardA. If it had any other count in shardA then > it would have been returned in the response from shardA. > Also, if you asked for up to 10 terms from shardA with mincount=1 and you get > back a response with 10 terms having a count >= 1 then the response is > identical to the one you'd have received if mincount=0. > Because of this, there isn't a scenario where the response would result in > more work than would have been required if mincount=0. For this reason, the > decrease in required work when mincount=1 is *always* either a moot point or > a net win. > {quote} > The belief here is that it is safe to change the default of facet.distrib.mco > such that facet.mincount=1 will be used when appropriate. The overall > performance gain can be significant and there is no seen performance cost. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
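The refinement argument quoted in the SOLR-9152 description can be illustrated with a toy simulation. The counts and the {{topN}} helper below are hypothetical, not Solr code: a shard's top-N response with mincount=1 either comes back full (identical to the mincount=0 response restricted to nonzero terms) or comes back short, in which case any absent term is known to have count 0 on that shard.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical simulation: a short mincount=1 response proves that every
// absent term had count 0 on that shard, so no refinement information is lost.
public class McoSketch {
    public static List<String> topN(Map<String, Integer> counts, int n, int mincount) {
        return counts.entrySet().stream()
            .filter(e -> e.getValue() >= mincount)          // drop terms below mincount
            .sorted((a, b) -> b.getValue() - a.getValue())  // facet.sort=count
            .limit(n)                                       // facet.limit
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> shardA = new HashMap<>();
        shardA.put("foo", 0);
        shardA.put("bar", 3);
        shardA.put("baz", 1);

        // Ask for up to 10 terms with mincount=1; only 2 come back.
        List<String> resp = topN(shardA, 10, 1);
        // The response is short (2 < 10), so "foo", absent from it, must have
        // had count 0 on this shard -- exactly what mincount=0 would have said.
        System.out.println(resp);
    }
}
```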
[jira] [Updated] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8988: -- Attachment: SOLR-8988.patch Fixed docs > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch, SOLR-8988.patch, SOLR-8988.patch, > SOLR-8988.patch, Screen Shot 2016-04-25 at 2.54.47 PM.png, Screen Shot > 2016-04-25 at 2.55.00 PM.png > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15266916#comment-15266916 ] Keith Laban commented on SOLR-8988: --- [~hossman] how does the updated patch look? > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch, SOLR-8988.patch, Screen Shot 2016-04-25 > at 2.54.47 PM.png, Screen Shot 2016-04-25 at 2.55.00 PM.png > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15256797#comment-15256797 ] Keith Laban edited comment on SOLR-8988 at 4/25/16 7:04 PM: Added second version of patch which has this feature disabled by default but can be enabled with {{facet.distrib.mco=true}}. I also did some benchmarking and under all scenarios tested the new way is either the same or way faster. The test was with 12 shards everything evenly distributed. Two things to note about this test: - All terms have the same count which would be the worst case for refinement which is evident in the shape of each graph. Overrequesting is far more efficient. - All segments are evenly distributed however in the real world, the best performance gains for this patch would be seen when there are many segments which contain no relevant terms for the query. More details about the test. - 2 node cloud running locally each with 4g - 12 shards without replication (only 12 total cores) - terms were integers with doc values enabled - instances were restarted after each test to avoid lingering GC issues, however each test had some warmup queries before running the test - The Y-axis is average QTime(ms) over 100 test runs was (Author: k317h): Added second version of patch which has this feature disabled by default but can be enabled with {{facet.distrib.mco=true}}. I also did some benchmarking and under all scenarios tested the new way is either the same or way faster. The test was with 12 shards everything evenly distributed. Two things to note about this test: - All terms have the same count which would be the worst case for refinement which is evident in the shape of each graph. Overrequesting is far more efficient. - All segments are evenly distributed however in the real world, the best performance gains for this patch would be seen when there are many segments which contain no relevant terms for the query. 
> Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch, SOLR-8988.patch, Screen Shot 2016-04-25 > at 2.54.47 PM.png, Screen Shot 2016-04-25 at 2.55.00 PM.png > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8988: -- Attachment: Screen Shot 2016-04-25 at 2.55.00 PM.png Screen Shot 2016-04-25 at 2.54.47 PM.png SOLR-8988.patch Added second version of patch which has this feature disabled by default but can be enabled with {{facet.distrib.mco=true}}. I also did some benchmarking and under all scenarios tested the new way is either the same or way faster. The test was with 12 shards everything evenly distributed. Two things to note about this test: - All terms have the same count which would be the worst case for refinement which is evident in the shape of each graph. Overrequesting is far more efficient. - All segments are evenly distributed however in the real world, the best performance gains for this patch would be seen when there are many segments which contain no relevant terms for the query. > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch, SOLR-8988.patch, Screen Shot 2016-04-25 > at 2.54.47 PM.png, Screen Shot 2016-04-25 at 2.55.00 PM.png > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. 
> *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252720#comment-15252720 ] Keith Laban commented on SOLR-8599: --- [~anshumg] There were two separate commits for this ticket; both are on master, but they didn't both land on branch_6x or 6.0:
{code}
commit e3b785a906d6f93e04f2cb45c436516158af0425
Author: Dennis Gove
Date: Sun Mar 20 11:13:56 2016 -0400

    SOLR-8599: Improved the tests for this issue to avoid changing a variable to non-final

commit 2c0a5e30364d83dc82383075a5f7c65200022494
Author: Dennis Gove
Date: Wed Feb 10 15:02:18 2016 -0500

    SOLR-8599: After a failed connection during construction of SolrZkClient attempt to retry until a connection can be made
{code}
Only the first commit found its way to 6x and 6.0, so please port the second commit, and remember to port both for 5.5.1. Thanks. > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban >Assignee: Dennis Gove > Fix For: master, 6.0, 5.5.1 > > Attachments: SOLR-8599.patch, SOLR-8599.patch, SOLR-8599.patch, > SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > Although any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. 
> Two things should be addressed here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.(SolrZooKeeper.java:41) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15248953#comment-15248953 ] Keith Laban commented on SOLR-8988: --- [~hossman] can I convince you that this should be the default behavior? > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243103#comment-15243103 ] Keith Laban commented on SOLR-8988: --- For clarity of this test:
bq. num shards - 12
bq. num docs per shard - ~70 million
bq. num terms in field - ~15 million
bq. num terms with non-zero facet counts for docs matching query on a per shard basis - ~90k
bq. how much variance there is in the num terms with non-zero facet counts for docs matching query on a per shard basis - evenly distributed
bq. ...is that if you get back a count of foo=0 from shardA, and if foo winds up being a candidate term for the final topN list because of its count on other shards, then you know definitively that you don't have to ask shardA to provide a refinement value for "foo" - you already know its count.
This is the part that I would argue doesn't matter. Consider you asked for 10 terms from shardA with mincount=1 and you received only 5 terms back. Then you know that if foo was in shardB but not in shardA, the maximum count it could have had in shardA was 0; otherwise it would have been returned in the initial request. On the other hand, if you ask for 10 terms with mincount=1 and you get back 10 terms with a count >= 1, the response would have been identical with mincount=0. Logic that aids refinement, pulled from {{FacetComponent.DistributedFieldFacet}}:
{code}
void add(int shardNum, NamedList shardCounts, int numRequested) {
  // shardCounts could be null if there was an exception
  int sz = shardCounts == null ? 0 : shardCounts.size();
  int numReceived = sz;

  FixedBitSet terms = new FixedBitSet(termNum + sz);

  long last = 0;
  for (int i = 0; i < sz; i++) {
    String name = shardCounts.getName(i);
    long count = ((Number) shardCounts.getVal(i)).longValue();
    if (name == null) {
      missingCount += count;
      numReceived--;
    } else {
      ShardFacetCount sfc = counts.get(name);
      if (sfc == null) {
        sfc = new ShardFacetCount();
        sfc.name = name;
        sfc.indexed = ftype == null ? sfc.name : ftype.toInternal(sfc.name);
        sfc.termNum = termNum++;
        counts.put(name, sfc);
      }
      sfc.count += count;
      terms.set(sfc.termNum);
      last = count;
    }
  }

  // the largest possible missing term is initialMincount if we received
  // less than the number requested.
  if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) {
    last = initialMincount;
  }

  missingMaxPossible += last;
  missingMax[shardNum] = last;
  counted[shardNum] = terms;
}
{code}
However, I think this block should also be changed:
{code}
if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) {
  last = Math.max(initialMincount - 1, 0);
}
{code}
> Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes. After trying to make sense of all the refinement logic, > I can't see how the difference between _no value_ and _value=0_ would have a > negative effect. 
> *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
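The proposed change in the comment above (capping the largest possible missing count at {{initialMincount - 1}} instead of {{initialMincount}} when a shard's response comes back short) can be checked with a standalone sketch. The method below is a hypothetical extraction mirroring the shape of that block, not Solr's actual code:

```java
// Hypothetical extraction of the missing-count bound with the proposed change
// applied: a short shard response with mincount=1 caps the missing count at
// initialMincount - 1 = 0, since any missing term was provably below mincount.
public class MissingMaxSketch {
    public static long missingMax(int numRequested, int numReceived, long lastCount, long initialMincount) {
        long last = lastCount; // smallest count actually returned by the shard
        if (numRequested < 0 || numRequested != 0 && numReceived < numRequested) {
            last = Math.max(initialMincount - 1, 0); // proposed change
        }
        return last;
    }

    public static void main(String[] args) {
        // Short response (5 of 10 requested) with mincount=1: a term missing
        // from the response can have at most count 0 on this shard.
        System.out.println(missingMax(10, 5, 7, 1));
        // Full response: the bound stays the last (smallest) returned count.
        System.out.println(missingMax(10, 10, 7, 1));
    }
}
```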
[jira] [Commented] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241767#comment-15241767 ] Keith Laban commented on SOLR-8988: --- I'm not sure who would be best to look at this. Maybe [~yo...@apache.org] or [~erike4...@yahoo.com] would be more familiar with this code path. Is there any reason this wouldn't work? > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-8988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8988: -- Attachment: SOLR-8988.patch > Improve facet.method=fcs performance in SolrCloud > - > > Key: SOLR-8988 > URL: https://issues.apache.org/jira/browse/SOLR-8988 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8988.patch > > > This relates to SOLR-8559 -- which improves the algorithm used by fcs > faceting when {{facet.mincount=1}} > This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. > As far as I can tell there is no reason to set {{facet.mincount=0}} for > refinement purposes . After trying to make sense of all the refinement logic, > I cant see how the difference between _no value_ and _value=0_ would have a > negative effect. > *Test perf:* > - ~15million unique terms > - query matches ~3million documents > *Params:* > {code} > facet.mincount=1 > facet.limit=500 > facet.method=fcs > facet.sort=count > {code} > *Average Time Per Request:* > - Before patch: ~20seconds > - After patch: <1 second > *Note*: all tests pass and in my test, the output was identical before and > after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-8988) Improve facet.method=fcs performance in SolrCloud
Keith Laban created SOLR-8988: - Summary: Improve facet.method=fcs performance in SolrCloud Key: SOLR-8988 URL: https://issues.apache.org/jira/browse/SOLR-8988 Project: Solr Issue Type: Improvement Reporter: Keith Laban This relates to SOLR-8559 -- which improves the algorithm used by fcs faceting when {{facet.mincount=1}} This patch allows {{facet.mincount}} to be sent as 1 for distributed queries. As far as I can tell there is no reason to set {{facet.mincount=0}} for refinement purposes. After trying to make sense of all the refinement logic, I can't see how the difference between _no value_ and _value=0_ would have a negative effect. *Test perf:* - ~15 million unique terms - query matches ~3 million documents *Params:* {code} facet.mincount=1 facet.limit=500 facet.method=fcs facet.sort=count {code} *Average Time Per Request:* - Before patch: ~20 seconds - After patch: <1 second *Note*: all tests pass and in my test, the output was identical before and after patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-8922) DocSetCollector can allocate massive garbage on large indexes
[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220252#comment-15220252 ] Keith Laban commented on SOLR-8922: --- Just out of curiosity, have you run tests with G1? Was there a performance difference? > DocSetCollector can allocate massive garbage on large indexes > - > > Key: SOLR-8922 > URL: https://issues.apache.org/jira/browse/SOLR-8922 > Project: Solr > Issue Type: Improvement >Reporter: Jeff Wartes > Attachments: SOLR-8922.patch > > > After reaching a point of diminishing returns tuning the GC collector, I > decided to take a look at where the garbage was coming from. To my surprise, > it turned out that for my index and query set, almost 60% of the garbage was > coming from this single line: > https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49 > This is due to the simple fact that I have 86M documents in my shards. > Allocating a scratch array big enough to track a result set 1/64th of my > index (1.3M) is also almost certainly excessive, considering my 99.9th > percentile hit count is less than 56k. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
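A back-of-the-envelope check of the allocation described in SOLR-8922 above, assuming the scratch array holds one int per 64 documents (the "1/64th of my index" figure quoted; the exact sizing expression in DocSetCollector may differ):

```java
// Arithmetic behind the SOLR-8922 complaint: with 86M docs per shard, a
// scratch array sized at 1/64th of maxDoc is ~1.34M ints, allocated per query.
public class ScratchSize {
    public static void main(String[] args) {
        long maxDoc = 86_000_000L;       // docs in the shard
        long scratchInts = maxDoc >> 6;  // 1/64th of the index
        long bytes = scratchInts * 4;    // 4 bytes per int
        System.out.println(scratchInts + " ints");            // 1343750 ints
        System.out.println(bytes / (1024 * 1024) + " MB per collector");
    }
}
```

At ~5 MB of garbage per collector, a modest query rate explains why this single line dominated the GC profile, and why a 99.9th-percentile hit count of 56k makes the allocation look so oversized.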
[jira] [Comment Edited] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189585#comment-15189585 ] Keith Laban edited comment on SOLR-8599 at 3/10/16 6:45 PM: Uploaded a new patch against master addressing Uwe's concerns. I see that this was already merged in master, but that seems to be the only branch it's currently on, so hopefully it's ok to put this out there. Instead of scheduling a thread to change the address, I was able to mock the DefaultConnectionStrategy to throw an exception in the reconnect call the first time it's called, to verify that it is being retried. With this change I also changed the server address to be final again and removed the public setter method, since they were no longer needed. Note: the new patch doesn't have the changes which introduced the retry logic in {{ConnectionManager}}, since that part is already merged in on the master branch and branch_6x/branch_6_0. I can put in a separate patch containing both for backport to branch_5x if needed was (Author: k317h): uploaded a new patch against master addressing Uwes concerns. I see that this was already merged in master, but that seems to be the only branch its currently on so hopefully its ok to put this out there. Instead of scheduling a thread to change the address I was able to mock the DefaultConnectionStrategy to throw an exception in the reconnect call the first time its called to verify that it is being retried. With this change I also changed the server address to be final again and removed the public setter method seeing how they were no longer needed Note: The new patch doesn't have the changes which introduced the retry logic in {{ConnectionManager}} since that part is already merged in on the master branch. 
I can put in a separate patch containing both for backport to branch_6x and branch_5x > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban >Assignee: Dennis Gove > Attachments: SOLR-8599.patch, SOLR-8599.patch, SOLR-8599.patch, > SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > Although any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. > Two things should be address here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.(SolrZooKeeper.java:41) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
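The retry behavior discussed for {{ConnectionManager}} above, and the test strategy of making the first reconnect throw, can be sketched generically. Everything below is illustrative, not Solr's actual API:

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the fix's shape: keep attempting to construct the
// client instead of giving up after one failed constructor call, so a
// transient DNS failure doesn't leave the process permanently disconnected.
public class RetrySketch {
    public static <T> T retry(Callable<T> connect, int attempts, long backoffMs) throws Exception {
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;               // e.g. UnknownHostException from DNS
                Thread.sleep(backoffMs);
            }
        }
        throw last; // surface the failure once retries are exhausted
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails once, then succeeds -- mirrors mocking the connection strategy
        // to throw on the first reconnect and verifying a retry happens.
        String client = retry(() -> {
            if (calls[0]++ == 0) throw new RuntimeException("dns failure");
            return "connected";
        }, 3, 10);
        System.out.println(client + " after " + calls[0] + " attempts");
    }
}
```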
[jira] [Comment Edited] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189585#comment-15189585 ] Keith Laban edited comment on SOLR-8599 at 3/10/16 5:27 PM: uploaded a new patch against master addressing Uwes concerns. I see that this was already merged in master, but that seems to be the only branch its currently on so hopefully its ok to put this out there. Instead of scheduling a thread to change the address I was able to mock the DefaultConnectionStrategy to throw an exception in the reconnect call the first time its called to verify that it is being retried. With this change I also changed the server address to be final again and removed the public setter method seeing how they were no longer needed Note: The new patch doesn't have the changes which introduced the retry logic in {{ConnectionManager}} since that part is already merged in on the master branch. I can put in a separate patch containing both for backport to branch_6x and branch_5x was (Author: k317h): uploaded a new patch against master addressing Uwes concerns. I see that this was already merged in master, but that seems to be the only branch its currently on so hopefully its ok to put this out there. Instead of scheduling a thread to change the address I was able to mock the DefaultConnectionStrategy to throw an exception in the reconnect call the first time its called to verify that it is being retried. 
With this change I also changed the server address to be final again and removed the public setter method seeing how they were no longer needed > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban >Assignee: Dennis Gove > Attachments: SOLR-8599.patch, SOLR-8599.patch, SOLR-8599.patch, > SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > Although any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. > Two things should be address here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.(SolrZooKeeper.java:41) > at > 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
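The test technique described in the comment above — forcing the first reconnect to fail and verifying that a retry follows — can be sketched in isolation. This is a hypothetical illustration, not Solr's actual API: all class and method names here are invented stand-ins for the mocked DefaultConnectionStrategy and the retry loop in ConnectionManager.

```java
// Hypothetical sketch: a connection strategy whose reconnect() fails
// exactly once, plus a caller that retries. Not Solr's real classes.
import java.util.concurrent.atomic.AtomicInteger;

class FlakyReconnectStrategy {
    private final AtomicInteger attempts = new AtomicInteger();

    // Throws on the first invocation, succeeds on every later one.
    boolean reconnect() {
        if (attempts.incrementAndGet() == 1) {
            throw new RuntimeException("simulated reconnect failure");
        }
        return true;
    }

    int attemptCount() {
        return attempts.get();
    }
}

class RetryingCaller {
    // Stand-in for the retry loop the patch adds: swallow the failure
    // and try again, up to maxAttempts times.
    static boolean connectWithRetries(FlakyReconnectStrategy strategy, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return strategy.reconnect();
            } catch (RuntimeException e) {
                // attempt failed; loop around and retry
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FlakyReconnectStrategy strategy = new FlakyReconnectStrategy();
        boolean connected = connectWithRetries(strategy, 3);
        System.out.println("connected=" + connected
                + " after " + strategy.attemptCount() + " attempts");
    }
}
```

A test asserting on `attemptCount()` after a successful connect proves the caller retried rather than giving up, which is exactly what the mocked-exception approach verifies without scheduling a second thread.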
[jira] [Updated] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8599: -- Attachment: SOLR-8599.patch Uploaded a new patch against master addressing Uwe's concerns. I see that this was already merged in master, but that seems to be the only branch it's currently on, so hopefully it's OK to put this out there. Instead of scheduling a thread to change the address, I was able to mock the DefaultConnectionStrategy to throw an exception in the reconnect call the first time it's called, to verify that it is being retried. With this change I also made the server address final again and removed the public setter method, since they were no longer needed. > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban >Assignee: Dennis Gove > Attachments: SOLR-8599.patch, SOLR-8599.patch, SOLR-8599.patch, > SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > However, any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. 
> Two things should be addressed here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this, stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3990) index size unavailable in gui/mbeans unless replication handler configured
[ https://issues.apache.org/jira/browse/SOLR-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-3990: -- Attachment: SOLR-3990.patch Bumping this ticket. Attaching a new patch for current master (should also apply to 5x) > index size unavailable in gui/mbeans unless replication handler configured > -- > > Key: SOLR-3990 > URL: https://issues.apache.org/jira/browse/SOLR-3990 > Project: Solr > Issue Type: Improvement > Components: web gui >Affects Versions: 4.0 >Reporter: Shawn Heisey >Assignee: Shawn Heisey >Priority: Minor > Fix For: 4.10, master > > Attachments: SOLR-3990.patch, SOLR-3990.patch > > > Unless you configure the replication handler, the on-disk size of each core's > index seems to be unavailable in the gui or from the mbeans handler. If you > are not doing replication, you should still be able to get the size of each > index without configuring things that won't be used. > Also, I would like to get the size of the index in a consistent unit of > measurement, probably MB. I understand the desire to give people a human > readable unit next to a number that's not enormous, but it's difficult to do > programmatic comparisons between values such as 787.33 MB and 23.56 GB. That > may mean that the number needs to be available twice, one format to be shown > in the admin GUI and both formats available from the mbeans handler, for > scripting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
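The "consistent unit of measurement" point above boils down to exposing a raw numeric size for scripting alongside the pretty-printed string for the GUI. A rough sketch of that idea follows; this is illustrative only and not the patch's actual code — the class and method names are made up.

```java
// Illustrative sketch: report a raw byte count (scriptable) and derive
// the human-readable string from it, rather than exposing only strings
// like "787.33 MB" and "23.56 GB" that are hard to compare.
import java.io.File;
import java.util.Locale;

class IndexSizeSketch {
    // Recursively sum file sizes under an index directory.
    static long sizeInBytes(File dir) {
        long total = 0;
        File[] files = dir.listFiles();
        if (files == null) return 0;
        for (File f : files) {
            total += f.isDirectory() ? sizeInBytes(f) : f.length();
        }
        return total;
    }

    // Pretty-print for the admin GUI; the raw number stays available
    // alongside it so scripts never have to parse the unit suffix.
    static String humanReadable(long bytes) {
        if (bytes < 1024) return bytes + " bytes";
        String[] units = {"KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = -1;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, units[unit]);
    }
}
```

With both forms exported (say, as `sizeInBytes` and `size`), programmatic comparison uses the number and the GUI uses the string, which is the "available twice" arrangement the description asks for.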
[jira] [Updated] (SOLR-5616) Make grouping code use response builder needDocList
[ https://issues.apache.org/jira/browse/SOLR-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-5616: -- Attachment: SOLR-5616.patch Attached a new patch, addressing [~erickerickson]'s comments from way back when. The test suite passes when run locally. > Make grouping code use response builder needDocList > --- > > Key: SOLR-5616 > URL: https://issues.apache.org/jira/browse/SOLR-5616 > Project: Solr > Issue Type: Bug >Reporter: Steven Bower >Assignee: Erick Erickson > Attachments: SOLR-5616.patch, SOLR-5616.patch, SOLR-5616.patch > > > Right now the grouping code does this to check if it needs to generate a > docList for grouped results: > {code} > if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, > false) ){ > ... > } > {code} > this is ugly because any new component that needs a docList from grouped > results will need to modify QueryComponent to add a check to this if. > Ideally this should just use the rb.isNeedDocList() flag... > Coincidentally, this boolean is really never used; for non-grouped results > the docList always gets generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
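The refactor the issue asks for is small: collapse the hard-coded condition list into the single needDocList flag. A toy sketch of the shape of that change — `ResponseBuilderSketch` stands in for Solr's real ResponseBuilder, and the field names are illustrative:

```java
// Toy version of the consolidation: components declare their need once
// instead of QueryComponent enumerating every consumer in an if-statement.
class ResponseBuilderSketch {
    boolean doHighlights;       // highlighting needs the docList
    boolean debug;              // debug output needs it
    boolean mlt;                // MoreLikeThis needs it
    private boolean needDocList;

    // Any new component that needs a docList from grouped results calls
    // this, with no change to QueryComponent required.
    void setNeedDocList(boolean need) {
        this.needDocList = need;
    }

    boolean isNeedDocList() {
        return needDocList || doHighlights || debug || mlt;
    }
}
```

The grouping code would then test only `rb.isNeedDocList()`, and the per-component conditions become implementation details behind the flag.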
[jira] [Commented] (SOLR-8371) Try and prevent too many recovery requests from stacking up and clean up some faulty logic.
[ https://issues.apache.org/jira/browse/SOLR-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145596#comment-15145596 ] Keith Laban commented on SOLR-8371: --- I see that this part of the code is being ported from the previous implementation, but what are the effects of canceling recovery and then throttling the next recovery? Would it be more efficient to let the original recovery finish and have the next pending recovery fire once the original one is done? It seems counterintuitive to cancel the current progress and then wait to start it again if the throttling strategy says so. With these changes there should always be one pending recovery as long as there has been a recovery request since the currently running one started. My depth of knowledge here is pretty limited, so I'm not sure whether finishing the current recovery and then picking up the missing pieces would be better than stopping and restarting; just throwing in some thoughts. > Try and prevent too many recovery requests from stacking up and clean up some > faulty logic. > --- > > Key: SOLR-8371 > URL: https://issues.apache.org/jira/browse/SOLR-8371 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 5.5, master > > Attachments: SOLR-8371-2.patch, SOLR-8371.patch, SOLR-8371.patch, > SOLR-8371.patch, SOLR-8371.patch, SOLR-8371.patch, SOLR-8371.patch, > SOLR-8371.patch, SOLR-8371.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
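The alternative floated in the comment above — never cancel an in-flight recovery, and coalesce any requests that arrive meanwhile into a single pending one that fires on completion — can be sketched as a small state machine. This is purely illustrative; Solr's real recovery machinery is far more involved, and none of these names are Solr's.

```java
// Sketch of "finish the current recovery, then pick up the pieces":
// at most one pending request is queued behind the running recovery,
// so repeated requests coalesce instead of stacking up.
class RecoveryCoalescer {
    private boolean running;
    private boolean pending;
    int started;  // how many recoveries have actually begun

    // Called whenever someone asks for a recovery.
    synchronized void requestRecovery() {
        if (running) {
            pending = true;  // coalesce: remember that one more is needed
            return;
        }
        running = true;
        started++;
        // ... kick off the actual recovery work here ...
    }

    // Called when the in-flight recovery completes.
    synchronized void recoveryFinished() {
        running = false;
        if (pending) {
            pending = false;
            requestRecovery();  // pick up whatever arrived mid-recovery
        }
    }
}
```

Three requests during one recovery thus trigger exactly one follow-up recovery rather than canceling and restarting the work in progress, which is the trade-off the comment is weighing against the cancel-and-throttle approach.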
[jira] [Updated] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8599: -- Attachment: SOLR-8599.patch Added usage of DefaultSolrThreadFactory to the test. > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban > Attachments: SOLR-8599.patch, SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > However, any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. > Two things should be addressed here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this, stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41) > at > 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-8666) Add header to SearchHandler to indicate whether solr is connected to zk
Keith Laban created SOLR-8666: - Summary: Add header to SearchHandler to indicate whether solr is connected to zk Key: SOLR-8666 URL: https://issues.apache.org/jira/browse/SOLR-8666 Project: Solr Issue Type: Improvement Reporter: Keith Laban Currently solr update requests error out if a zookeeper check fails; however, SearchHandler does not do these checks. As a result, if a request is sent to a node which should be part of a SolrCloud but is not connected to zookeeper and thinks that it's Active, it's possible the response is composed of stale data. The purpose of this header is to allow the client to decide whether or not the result data should be considered valid. This patch also returns the {{zkConnected}} header in the ping handler to allow external health checks to use this information. See [SOLR-8599] for an example of when this situation can arise. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
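On the consuming side, the proposed {{zkConnected}} header lets a client discount possibly-stale results rather than treating every successful response as authoritative. A hypothetical consumer-side check follows; only the header name comes from the issue, and the wrapper class is invented for illustration.

```java
// Hypothetical client-side wrapper: flags results as possibly stale
// when the serving node reports zkConnected=false. Not Solr's API.
class ZkAwareResponse {
    final Object results;
    final Boolean zkConnected;  // null when the node doesn't send the header

    ZkAwareResponse(Object results, Boolean zkConnected) {
        this.results = results;
        this.zkConnected = zkConnected;
    }

    // True only when the node explicitly said it has lost its ZooKeeper
    // connection -- the case where the response may be built from stale
    // cluster state.
    boolean maybeStale() {
        return zkConnected != null && !zkConnected;
    }
}
```

An external health check can apply the same test to the ping handler's response and take a node out of rotation while it is disconnected from ZooKeeper.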
[jira] [Updated] (SOLR-8666) Add header to SearchHandler to indicate whether solr is connected to zk
[ https://issues.apache.org/jira/browse/SOLR-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8666: -- Attachment: SOLR-8666.patch Patch added. > Add header to SearchHandler to indicate whether solr is connected to zk > > > Key: SOLR-8666 > URL: https://issues.apache.org/jira/browse/SOLR-8666 > Project: Solr > Issue Type: Improvement >Reporter: Keith Laban > Attachments: SOLR-8666.patch > > > Currently solr update requests error out if a zookeeper check fails; however, > SearchHandler does not do these checks. As a result, if a request is sent to > a node which should be part of a SolrCloud but is not connected to zookeeper > and thinks that it's Active, it's possible the response is composed of stale > data. > The purpose of this header is to allow the client to decide whether or not > the result data should be considered valid. > This patch also returns the {{zkConnected}} header in the ping handler to > allow external health checks to use this information. > See [SOLR-8599] for an example of when this situation can arise. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8599: -- Attachment: SOLR-8599.patch Added a patch to address issue #1. In this patch I had to remove the {{final}} modifier from {{ZkServerAddress}} and add a public setter method in order to test the issue. > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban > Attachments: SOLR-8599.patch > > > We originally saw this happen due to a DNS exception (see stack trace below). > However, any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. 
> Two things should be addressed here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this, stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-8599) Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state
[ https://issues.apache.org/jira/browse/SOLR-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Laban updated SOLR-8599: -- Summary: Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent state (was: Errors in construction of SolrZooKeeper cause Solr to into inconsistent state) > Errors in construction of SolrZooKeeper cause Solr to go into an inconsistent > state > --- > > Key: SOLR-8599 > URL: https://issues.apache.org/jira/browse/SOLR-8599 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Reporter: Keith Laban > > We originally saw this happen due to a DNS exception (see stack trace below). > However, any exception thrown in the constructor of SolrZooKeeper or the > parent class, ZooKeeper, will cause DefaultConnectionStrategy to fail to > update the zookeeper client. Once it gets into this state, it will not try to > connect again until the process is restarted. The node itself will also > respond successfully to query requests, but not to update requests. 
> Two things should be addressed here: > 1) Fix the error handling and issue some number of retries > 2) If we are stuck in a state like this, stop responding to all requests > {code} > 2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - > :java.net.UnknownHostException: HOSTNAME: unknown error > at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) > at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) > at java.net.InetAddress.getAllByName0(InetAddress.java:1276) > at java.net.InetAddress.getAllByName(InetAddress.java:1192) > at java.net.InetAddress.getAllByName(InetAddress.java:1126) > at > org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) > at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) > at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41) > at > org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53) > at > org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - > Connected:false > 2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut > down > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org