[
https://issues.apache.org/jira/browse/NIFI-10666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
René Zeidler updated NIFI-10666:
--------------------------------
Affects Version/s: 2.0.0-M2
1.25.0
Description:
We have created a default PrometheusReportingTask for our NiFi instance and
tried to consume the metrics with Prometheus. However, Prometheus threw the
following error:
{code}
ts=2022-10-19T12:25:18.110Z caller=scrape.go:1332 level=debug component="scrape
manager" scrape_pool=nifi-cluster target=http://***nifi***:9092/metrics
msg="Append failed" err="invalid UTF-8 label value" {code}
Upon further inspection, we noticed that the /metrics/ endpoint exposed by the
reporting task does not use UTF-8 encoding, which is required by Prometheus (as
documented here: [Exposition
formats|https://prometheus.io/docs/instrumenting/exposition_formats/]).
Our flow uses non-ASCII characters (in our case, German umlauts such as "ü").
As a workaround, removing those characters fixes the Prometheus error, but this
is not practical for a large flow authored in German.
Opening the /metrics/ endpoint in a browser confirms that the encoding used is
not UTF-8:
{code}
> document.characterSet
'windows-1252' {code}
----
The responsible code might be here:
[https://github.com/apache/nifi/blob/2be5c26f287469f4f19f0fa759d6c1b56dc0e348/nifi-nar-bundles/nifi-prometheus-bundle/nifi-prometheus-reporting-task/src/main/java/org/apache/nifi/reporting/prometheus/PrometheusServer.java#L67]
The PrometheusServer used by the reporting task uses an OutputStreamWriter with
the default encoding, instead of explicitly using UTF-8. The Content-Type
header set in that function also does not get passed along (see screenshot).
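As a sketch of the fix suggested above, the writer can be constructed with an explicit charset so the output no longer depends on the JVM's {{file.encoding}}. This is illustrative code only; the class and method names are hypothetical, not NiFi's actual implementation:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8MetricsWriter {

    // Hypothetical stand-in for writing the metrics response body.
    public static byte[] render(String metricsText) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Passing StandardCharsets.UTF_8 explicitly (instead of the
        // one-argument constructor, which uses the platform default)
        // produces the same bytes on every platform.
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(metricsText);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = render("nifi_example{component=\"Pr\u00fcfung\"} 1.0\n");
        // "ü" is always encoded as the two-byte sequence 0xC3 0xBC.
        System.out.println(bytes.length + " bytes");
    }
}
```

The HTTP response should also advertise the charset, e.g. {{text/plain; version=0.0.4; charset=utf-8}}, which is the standard Prometheus text exposition content type.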
Environment: JVM with non-UTF-8 default encoding (e.g. default
Windows installation) (was: Windows Server 2019 Version 1809)
Labels: encoding prometheus utf-8 (was: )
The issue still persists in the current versions 1.25.0 and 2.0.0-M2.
I have confirmed that the issue is indeed caused by a non-UTF-8 default
encoding in the JVM, as in issues NIFI-12669 and NIFI-12670. This affects all
standard Windows installations, which do not use UTF-8 as the default
encoding.
Prometheus requires UTF-8 encoding (as documented here: [Exposition
formats|https://prometheus.io/docs/instrumenting/exposition_formats/]), so the
encoding used for this endpoint should not depend on the system default
encoding.
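To illustrate the failure mode (this is standalone demonstration code, not NiFi's): encoding a label value with windows-1252 yields bytes that a strict UTF-8 decoder rejects, which matches the "invalid UTF-8 label value" error above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Returns true if the bytes form a valid UTF-8 sequence. A fresh
    // CharsetDecoder reports (throws on) malformed input by default.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String label = "Pr\u00fcfung";
        // On a default Windows JVM, new OutputStreamWriter(out) uses
        // windows-1252, where "ü" becomes the single byte 0xFC.
        byte[] cp1252 = label.getBytes(Charset.forName("windows-1252"));
        byte[] utf8 = label.getBytes(StandardCharsets.UTF_8);

        // 0xFC can never appear in valid UTF-8, so a strict consumer
        // such as Prometheus rejects the scraped label value.
        System.out.println("windows-1252 bytes valid UTF-8? " + isValidUtf8(cp1252));
        System.out.println("UTF-8 bytes valid UTF-8? " + isValidUtf8(utf8));
    }
}
```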
> PrometheusReportingTask does not use UTF-8 encoding on /metrics/ endpoint
> -------------------------------------------------------------------------
>
> Key: NIFI-10666
> URL: https://issues.apache.org/jira/browse/NIFI-10666
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.17.0, 1.16.3, 1.18.0, 1.23.2, 1.25.0, 2.0.0-M2
> Environment: JVM with non-UTF-8 default encoding (e.g. default
> Windows installation)
> Reporter: René Zeidler
> Priority: Minor
> Labels: encoding, prometheus, utf-8
> Attachments: missing-header.png
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)