Hi, I have been facing an issue with our Prometheus (installed from Helm). After we enabled remoteWrite towards our Metricbeat, we receive only about 5 minutes of metrics, and after that the connection stops. I can see that prometheus_remote_storage_shards reaches the maximum number of shards, and in the logs of the operator I see:

2021-03-04T17:56:05.977Z caller=dedupe.go:111 component=remote level=info remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Remote storage resharding" from=3 to=7
2021-03-04T17:56:05.977Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Flushing samples to remote storage..." count=8577
2021-03-04T17:56:05.977Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Flushing samples to remote storage..." count=8580
2021-03-04T17:56:05.988Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Flushing samples to remote storage..." count=8585
2021-03-04T17:56:15.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880565
2021-03-04T17:56:25.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880575
2021-03-04T17:56:35.977Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880585
2021-03-04T17:56:45.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880595
2021-03-04T17:56:55.976Z caller=dedupe.go:111 component=remote level=warn remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Skipping resharding, last successful send was beyond threshold" lastSendTimestamp=1614880564 minSendTimestamp=1614880605
2021-03-04T17:57:02.039Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Done flushing."
2021-03-04T17:57:05.976Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg=QueueManager.calculateDesiredShards samplesInRate=1684.132190739004 samplesOutRate=261.78113035956926 samplesKeptRatio=0.444250380874013 samplesPendingRate=486.3952368184191 samplesPending=44890.5820306793 samplesOutDuration=1.4510985561508776 timePerSample=0.005543174766484209 desiredShards=11.527856913093473 highestSent=1.614880565e+09 highestRecv=1.614880625e+09 integralAccumulator=395.5166384472137
2021-03-04T17:57:05.977Z caller=dedupe.go:111 component=remote level=debug remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg=QueueManager.updateShardsLoop lowerBound=4.8999999999999995 desiredShards=11.527856913093473 upperBound=9.1
2021-03-04T17:57:05.977Z caller=dedupe.go:111 component=remote level=info remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Currently resharding, skipping."
2021-03-04T17:57:05.977Z caller=dedupe.go:111 component=remote level=error remote_name=8a9071 url=http://prometheusmetricbeataks:9201/write msg="Failed to flush all samples on shutdown"

We use the Helm chart "prometheus-operator", version "8.12.3", which installs Prometheus version 2.15.2. The remote write config is:

remoteWrite:
  - url: "http://prometheusmetricbeataks:9201/write"
    writeRelabelConfigs:
      - sourceLabels: [observability]
        regex: 'true'
        action: keep
    queueConfig:
      capacity: 3000
      maxSamplesPerSend: 1000
      maxShards: 2000

I have been reading a bit, and there are different views on the queueConfig parameters, but none of them has worked for
our case. Did this happen to any of you?

Thanks in advance for your help,
Ruben
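
As a rough cross-check of the QueueManager.calculateDesiredShards debug line above, the following sketch recomputes the derived fields from the raw values in that line. The relationships used here are inferred from the logged numbers themselves, not copied from the Prometheus source, and the 0.3 tolerance is an assumption that happens to reproduce lowerBound and upperBound for 7 shards. The function name shard_estimate is hypothetical.

```python
# Raw values copied from the calculateDesiredShards debug line.
samples_in_rate = 1684.132190739004
samples_out_rate = 261.78113035956926
samples_kept_ratio = 0.444250380874013
samples_out_duration = 1.4510985561508776
highest_sent = 1.614880565e+09
highest_recv = 1.614880625e+09
integral_accumulator = 395.5166384472137

current_shards = 7  # from the "Remote storage resharding" from=3 to=7 line
tolerance = 0.3     # assumption: inferred from the logged lower/upper bounds


def shard_estimate():
    """Recompute the derived fields of the debug line (inferred relations)."""
    # Samples per second that survive writeRelabelConfigs.
    kept_in_rate = samples_in_rate * samples_kept_ratio
    # How fast the backlog grows: kept input minus what is actually sent.
    pending_rate = kept_in_rate - samples_out_rate
    # Seconds between the newest scraped sample and the newest sent sample.
    delay = highest_recv - highest_sent
    pending = delay * kept_in_rate
    # Average send time spent per sample.
    time_per_sample = samples_out_duration / samples_out_rate
    # Relation that reproduces the logged desiredShards value.
    desired = time_per_sample * (samples_in_rate + integral_accumulator)
    return {
        "kept_in_rate": kept_in_rate,
        "pending_rate": pending_rate,
        "delay": delay,
        "pending": pending,
        "time_per_sample": time_per_sample,
        "desired_shards": desired,
        "lower_bound": current_shards * (1 - tolerance),
        "upper_bound": current_shards * (1 + tolerance),
    }


result = shard_estimate()
for name, value in result.items():
    print(f"{name}: {value}")
```

Read this way, the numbers say that roughly 748 samples/s pass the relabel filter while only about 262 samples/s are being sent, the remote end is already 60 seconds behind, and each sample costs about 5.5 ms of send time. That pattern would be consistent with the receiving side (here Metricbeat) being the bottleneck rather than the queueConfig values alone, though the logs do not prove it.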

