[jira] [Created] (SOLR-15106) Thread in OverseerTaskProcessor should not "return"
Mathieu Marie created SOLR-15106: Summary: Thread in OverseerTaskProcessor should not "return" Key: SOLR-15106 URL: https://issues.apache.org/jira/browse/SOLR-15106 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Affects Versions: 8.6, master (9.0) Reporter: Mathieu Marie I encountered a scenario where ZK was not accessible for a long time (due to a _jute.maxbuffer_ issue that is unrelated to the rest of this report). During that time, the ClusterStateUpdater and OC queues of the Overseer filled up with 1200+ messages. Once we restored ZK availability, the ClusterStateUpdater queue was emptied, but the OC one was not: the Overseer stopped dequeuing from the OC queue. After some digging in the code, it seems that a *return* from the overseer thread that starts the runners could be the issue. Code in OverseerTaskProcessor.java (https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java#L357) The lines of code that immediately follow should also be reviewed carefully, as they also return from or interrupt the thread that is responsible for executing the runners. Anyhow, if anybody hits the same issue, the quick workaround is to bump the overseer instance so that a new overseer is elected on another node. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
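The suspected failure mode in SOLR-15106 can be illustrated with a simplified sketch. This is not the OverseerTaskProcessor code; the class `DispatcherSketch`, the method `drain`, and the flag `returnOnError` are hypothetical names invented here. It only shows why a `return` inside a long-lived dispatch loop leaves the rest of the queue untouched, while staying in the loop lets the work resume after a transient error.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical stand-in for a long-lived dispatcher loop; not Solr code.
final class DispatcherSketch {

    /**
     * Drains the queue and returns how many items were processed.
     * The first attempt on failingItem simulates a transient error.
     */
    static int drain(Queue<String> queue, String failingItem, boolean returnOnError) {
        int processed = 0;
        boolean alreadyFailed = false;
        while (!queue.isEmpty()) {
            String head = queue.peek();
            if (head.equals(failingItem) && !alreadyFailed) {
                alreadyFailed = true;
                if (returnOnError) {
                    // The loop ends here: everything still queued is never dequeued.
                    return processed;
                }
                // Stay in the loop and retry the same item on the next iteration.
                continue;
            }
            queue.poll();
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Queue<String> stuck = new ArrayDeque<>(Arrays.asList("m1", "m2", "m3"));
        int n1 = drain(stuck, "m2", true);
        System.out.println("return:   processed=" + n1 + ", left=" + stuck.size());

        Queue<String> drained = new ArrayDeque<>(Arrays.asList("m1", "m2", "m3"));
        int n2 = drain(drained, "m2", false);
        System.out.println("continue: processed=" + n2 + ", left=" + drained.size());
    }
}
```

With `returnOnError` set, the two messages behind the failing one stay queued, which matches the symptom described above (a queue that is no longer drained until a new overseer is elected).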
[jira] [Created] (SOLR-15083) prometheus-exporter metric solr_metrics_jvm_os_cpu_time_seconds is misnamed
Mathieu Marie created SOLR-15083: Summary: prometheus-exporter metric solr_metrics_jvm_os_cpu_time_seconds is misnamed Key: SOLR-15083 URL: https://issues.apache.org/jira/browse/SOLR-15083 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - prometheus-exporter Affects Versions: 8.6, master (9.0) Reporter: Mathieu Marie The *solr_metrics_jvm_os_cpu_time_seconds* metric exported by prometheus-exporter has "seconds" in its name; however, its value appears to be in microseconds. This name can create confusion when one wants to report it in a dashboard. That metric is defined in [https://github.com/apache/lucene-solr/blob/branch_8_5/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml#L247] {code} .metrics["solr.jvm"] | to_entries | .[] | select(.key == "os.processCpuTime") as $object | ($object.value / 1000.0) as $value | { name : "solr_metrics_jvm_os_cpu_time_seconds", type : "COUNTER", help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html", label_names : ["item"], label_values : ["processCpuTime"], value: $value } {code} In the above config we see that the metric comes from *os.processCpuTime*, which itself comes from the JMX call [getProcessCpuTime()|https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getProcessCpuTime()]. That javadoc says {code} long getProcessCpuTime() Returns the CPU time used by the process on which the Java virtual machine is running in nanoseconds. The returned value is of nanoseconds precision but not necessarily nanoseconds accuracy. This method returns -1 if the the platform does not support this operation. Returns: the CPU time used by the process in nanoseconds, or -1 if this operation is not supported. {code} Nanoseconds / 1000 is microseconds. Either the name or the computation should be updated.
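The unit mismatch can be checked with a small arithmetic sketch. The class `CpuTimeUnits` and its method names are hypothetical and exist only for this illustration; the sample input is an arbitrary value, not a measurement. The first method mirrors the `/ 1000.0` from the quoted config, the second shows the divisor that would be needed for the value to match a `_seconds` name.

```java
// Hypothetical helper for illustration; not part of the exporter.
final class CpuTimeUnits {

    // Mirrors the quoted exporter expression: nanoseconds / 1000.0 gives microseconds.
    static double exportedValue(long cpuTimeNanos) {
        return cpuTimeNanos / 1000.0;
    }

    // What a metric named "..._seconds" would be expected to carry.
    static double toSeconds(long cpuTimeNanos) {
        return cpuTimeNanos / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        long sampleNanos = 2_500_000_000L; // arbitrary sample: 2.5 s of CPU time
        System.out.println("exported (microseconds): " + exportedValue(sampleNanos));
        System.out.println("expected (seconds):      " + toSeconds(sampleNanos));
    }
}
```

For 2.5 seconds of CPU time the current expression yields 2,500,000, a factor of one million away from the value the metric name suggests.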
[jira] [Updated] (SOLR-15074) ClassCastException when repeating the same query param twice in Prometheus exporter config file
[ https://issues.apache.org/jira/browse/SOLR-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mathieu Marie updated SOLR-15074: - Description: I am using the prometheus-exporter to monitor my service, and I wish to scale down the amount of data exchanged between Solr and Prometheus. I can do that by updating the queries in the solr-exporter.xml file. I wanted to use a query that looks like this: {noformat} http://localhost:8983/solr/admin/metrics?group=solr.node=.*metrics.*requestTimes=p95_ms=count { "responseHeader":{ "status":0, "QTime":1}, "metrics":{ "solr.node":{ "ADMIN./admin/metrics.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "ADMIN./admin/metrics.local.requestTimes":{ "count":238, "p95_ms":1.984835}, "ADMIN./admin/metrics.requestTimes":{ "count":238, "p95_ms":1.995053}, "QUERY./admin/metrics/collector.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.requestTimes":{ "count":0, "p95_ms":0.0 {noformat} In that query I repeated the property param to provide two values: {{property=p95_ms&property=count}} My config file looks like this: {noformat} /admin/metrics solr.core .*metrics.*requestTimes count p95_ms ... {noformat} Then, when the prometheus-exporter starts, it throws a ClassCastException. 
{noformat} Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap') at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:231) at org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:213) Caused by: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap') at org.apache.solr.prometheus.exporter.MetricsQuery.from(MetricsQuery.java:109) at org.apache.solr.prometheus.exporter.MetricsConfiguration.toMetricQueries(MetricsConfiguration.java:91) at org.apache.solr.prometheus.exporter.MetricsConfiguration.from(MetricsConfiguration.java:80) at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:228) ... 1 more {noformat} This comes from a bad cast in this code, where every parameter value is assumed to be a String: {noformat} NamedList query = (NamedList) request.get("query"); NamedList queryParameters = (NamedList) query.get("params"); String path = (String) query.get("path"); String core = (String) query.get("core"); String collection = (String) query.get("collection"); List jsonQueries = (ArrayList) request.get("jsonQueries"); ModifiableSolrParams params = new ModifiableSolrParams(); if (queryParameters != null) { for (Map.Entry<String, String> entrySet : (Set<Map.Entry<String, String>>) queryParameters.asShallowMap().entrySet()) { params.add(entrySet.getKey(), entrySet.getValue()); } } {noformat} was: I am using the prometheus-exporter to monitor my service, and I wish to scale down the amount of data exchanged between solr and prometheus. I can do that by updating the queries in the solr-exporter.xml file. 
I wanted to use a query that looks like: http://localhost:8983/solr/admin/metrics?group=solr.node=.*metrics.*requestTimes=p95_ms=count { "responseHeader":{ "status":0, "QTime":1}, "metrics":{ "solr.node":{ "ADMIN./admin/metrics.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "ADMIN./admin/metrics.local.requestTimes":{ "count":238, "p95_ms":1.984835}, "ADMIN./admin/metrics.requestTimes":{ "count":238, "p95_ms":1.995053}, "QUERY./admin/metrics/collector.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.requestTimes":{ "count":0,
[jira] [Created] (SOLR-15074) ClassCastException when repeating the same query param twice in Prometheus exporter config file
Mathieu Marie created SOLR-15074: Summary: ClassCastException when repeating the same query param twice in Prometheus exporter config file Key: SOLR-15074 URL: https://issues.apache.org/jira/browse/SOLR-15074 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - prometheus-exporter Affects Versions: 8.6.2 Reporter: Mathieu Marie I am using the prometheus-exporter to monitor my service, and I wish to scale down the amount of data exchanged between Solr and Prometheus. I can do that by updating the queries in the solr-exporter.xml file. I wanted to use a query that looks like this: {noformat} http://localhost:8983/solr/admin/metrics?group=solr.node=.*metrics.*requestTimes=p95_ms=count { "responseHeader":{ "status":0, "QTime":1}, "metrics":{ "solr.node":{ "ADMIN./admin/metrics.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "ADMIN./admin/metrics.local.requestTimes":{ "count":238, "p95_ms":1.984835}, "ADMIN./admin/metrics.requestTimes":{ "count":238, "p95_ms":1.995053}, "QUERY./admin/metrics/collector.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/collector.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.distrib.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.local.requestTimes":{ "count":0, "p95_ms":0.0}, "QUERY./admin/metrics/history.requestTimes":{ "count":0, "p95_ms":0.0 {noformat} In that query I repeated the property param to provide two values: {{property=p95_ms&property=count}} My config file looks like this: {noformat} /admin/metrics solr.core .*metrics.*requestTimes count p95_ms ... {noformat} Then, when the prometheus-exporter starts, it throws a ClassCastException. {noformat} Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap') at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:231) at org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:213) Caused by: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap') at org.apache.solr.prometheus.exporter.MetricsQuery.from(MetricsQuery.java:109) at org.apache.solr.prometheus.exporter.MetricsConfiguration.toMetricQueries(MetricsConfiguration.java:91) at org.apache.solr.prometheus.exporter.MetricsConfiguration.from(MetricsConfiguration.java:80) at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:228) ... 1 more {noformat} This comes from a bad cast in this code, where every parameter value is assumed to be a String: {noformat} NamedList query = (NamedList) request.get("query"); NamedList queryParameters = (NamedList) query.get("params"); String path = (String) query.get("path"); String core = (String) query.get("core"); String collection = (String) query.get("collection"); List jsonQueries = (ArrayList) request.get("jsonQueries"); ModifiableSolrParams params = new ModifiableSolrParams(); if (queryParameters != null) { for (Map.Entry<String, String> entrySet : (Set<Map.Entry<String, String>>) queryParameters.asShallowMap().entrySet()) { params.add(entrySet.getKey(), entrySet.getValue()); } } {noformat}
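One way to avoid the cast failure reported in SOLR-15074 is to accept both single and repeated values when copying the parameters. The sketch below is not a patch against MetricsQuery: it uses plain Java collections as stand-ins for NamedList and ModifiableSolrParams, and it assumes (based on the stack trace) that a repeated name shows up as a List in the shallow map. `RepeatedParamsSketch` and `toMultiValued` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-collections stand-in for the parameter copy loop; not Solr code.
final class RepeatedParamsSketch {

    /** Copies entries whose value is either a single value or a List of values. */
    static Map<String, List<String>> toMultiValued(Map<String, Object> shallow) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : shallow.entrySet()) {
            List<String> values = params.computeIfAbsent(entry.getKey(), k -> new ArrayList<>());
            Object value = entry.getValue();
            if (value instanceof List) {
                // Assumption: a repeated param name arrives as a List of its values.
                for (Object item : (List<?>) value) {
                    values.add(String.valueOf(item));
                }
            } else if (value != null) {
                values.add(value.toString());
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> shallow = new LinkedHashMap<>();
        shallow.put("group", "solr.core");
        shallow.put("property", Arrays.asList("count", "p95_ms"));
        System.out.println(toMultiValued(shallow));
    }
}
```

Checking the runtime type before casting keeps the single-value case unchanged while letting a repeated parameter contribute every one of its values.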
[jira] [Created] (SOLR-15007) Aggregate core handler=/select and /update metrics at the node level metric too
Mathieu Marie created SOLR-15007: Summary: Aggregate core handler=/select and /update metrics at the node level metric too Key: SOLR-15007 URL: https://issues.apache.org/jira/browse/SOLR-15007 Project: Solr Issue Type: Wish Security Level: Public (Default Security Level. Issues are Public) Components: metrics Affects Versions: master (9.0) Reporter: Mathieu Marie At my company, we anticipate a huge number of cores and would like to report an aggregated view at the node level instead of at the core level, which will grow exponentially. Right now, we're aggregating all of the solr.cores metrics to compute per-cluster dashboards. But given that there are many admin handlers already reporting metrics at the node level, I wonder if we could aggregate the _/update_, _/select_ and all the other handler counters in Solr and expose them at the solr.node level too. It would require (a lot) less data to transport, store and aggregate later, while still giving access to per-core metrics.
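The aggregation wished for in SOLR-15007 amounts to summing each handler's counter over all cores on a node. The sketch below shows only that arithmetic on plain maps; it does not use Solr's metrics API, and `NodeLevelAggregationSketch`, `sumByHandler`, the core names, and the counts are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustration of per-handler summation across cores; not Solr code.
final class NodeLevelAggregationSketch {

    /** Sums per-core counters into one total per handler. */
    static Map<String, Long> sumByHandler(Map<String, Map<String, Long>> countsPerCore) {
        Map<String, Long> nodeTotals = new TreeMap<>();
        for (Map<String, Long> coreCounts : countsPerCore.values()) {
            for (Map.Entry<String, Long> entry : coreCounts.entrySet()) {
                nodeTotals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return nodeTotals;
    }

    public static void main(String[] args) {
        // Made-up core names and request counts.
        Map<String, Long> coreA = new LinkedHashMap<>();
        coreA.put("/select", 10L);
        coreA.put("/update", 4L);
        Map<String, Long> coreB = new LinkedHashMap<>();
        coreB.put("/select", 5L);

        Map<String, Map<String, Long>> perCore = new LinkedHashMap<>();
        perCore.put("coreA", coreA);
        perCore.put("coreB", coreB);
        System.out.println(sumByHandler(perCore));
    }
}
```

The node-level result has one entry per handler regardless of how many cores exist, which is the reduction in transported data the request is after.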
[jira] [Commented] (SOLR-14699) Solr request logs should escape names, values (SolrQueryResponse.getToLogAsString)
[ https://issues.apache.org/jira/browse/SOLR-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17170050#comment-17170050 ] Mathieu Marie commented on SOLR-14699: -- We should not encode depending on the content. Otherwise it becomes too difficult to tell whether some values were already encoded or have been encoded because they contained a special character. It looks like a global parameter would be preferred in that case. > Solr request logs should escape names, values > (SolrQueryResponse.getToLogAsString) > -- > > Key: SOLR-14699 > URL: https://issues.apache.org/jira/browse/SOLR-14699 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Reporter: David Smiley >Priority: Minor > > {{SolrQueryResponse.getToLogAsString}} encodes the NamedList into a String > with simple space-separated pairs with name=value. However, it does no > escaping/encoding, and as-such a value might itself contain spaces and > equals. This is a problem if these logs are being parsed, and we'd like to > ensure we do so correctly. Note that SolrLogPostTool (aka "postlogs") parses > these logs.
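The comment's suggestion (one global switch rather than content-dependent encoding) can be sketched as follows. This is not `SolrQueryResponse.getToLogAsString`; `LogEscapingSketch`, `toLogString`, and the `escapeAll` flag are hypothetical, and URL encoding via `java.net.URLEncoder` is just one possible escaping scheme chosen for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of "encode everything or nothing"; not Solr code.
final class LogEscapingSketch {

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Builds space-separated name=value pairs; escapeAll is the global switch. */
    static String toLogString(Map<String, String> pairs, boolean escapeAll) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : pairs.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Every name and value is treated the same way, whatever it contains.
            String name = escapeAll ? encode(entry.getKey()) : entry.getKey();
            String value = escapeAll ? encode(entry.getValue()) : entry.getValue();
            sb.append(name).append('=').append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("q", "title:a b=c");
        pairs.put("rows", "10");
        System.out.println(toLogString(pairs, true));
        System.out.println(toLogString(pairs, false));
    }
}
```

Because the switch applies to every pair, a log parser only needs to know the setting to decode a line; it never has to guess whether a particular value was encoded.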