The issue started again. 

629G    chunks_head
0       lock
4.0K    queries.active
9.3G    wal
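
To track these sizes over time rather than running du by hand, something like the following could be run from cron. This is only a minimal sketch: the helper names `dir_size_bytes` and `report` are made up for illustration, and it sums apparent file sizes, so the numbers can differ somewhat from what du reports as disk usage.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (apparent size, not disk blocks)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # Files can disappear while Prometheus is running; skip them.
                pass
    return total

def report(data_dir, names=("chunks_head", "wal")):
    """Return {subdir: size_in_bytes} for the named subdirectories of data_dir."""
    return {n: dir_size_bytes(os.path.join(data_dir, n)) for n in names}
```

Calling `report("/path/to/prometheus/data")` (the path is a placeholder) returns a dict of byte counts; a missing subdirectory simply counts as 0.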

There are numerous restarts of Prometheus:
Feb 17 09:02:02 kernel: Out of memory: Kill process 36580 (prometheus) score 844 or sacrifice child
Feb 17 09:08:36 kernel: Out of memory: Kill process 39001 (prometheus) score 846 or sacrifice child
Feb 17 09:16:02 kernel: Out of memory: Kill process 41074 (prometheus) score 845 or sacrifice child
Feb 17 09:22:17 kernel: Out of memory: Kill process 44665 (prometheus) score 844 or sacrifice child
Feb 17 09:29:25 kernel: Out of memory: Kill process 47234 (prometheus) score 844 or sacrifice child
Feb 17 09:36:06 kernel: Out of memory: Kill process 48970 (prometheus) score 846 or sacrifice child
Feb 17 09:43:21 kernel: Out of memory: Kill process 50661 (prometheus) score 844 or sacrifice child
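
The kills above are roughly six to seven and a half minutes apart. To pull the same information out of a longer log, a rough sketch like this could help. It assumes each kernel message is on a single line in the format shown above (an optional hostname before "kernel:" is tolerated); the function names are made up for illustration, and the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Feb 17 09:02:02 kernel: Out of memory: Kill process 36580 (prometheus) score 844 ..."
OOM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?:\S+ )?kernel: Out of memory: "
    r"Kill process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)

def oom_kills(lines, process="prometheus", year=2022):
    """Return [(timestamp, pid)] for OOM kills of the given process name.

    The year is an assumption supplied by the caller.
    """
    kills = []
    for line in lines:
        m = OOM_RE.match(line)
        if m and m.group("name") == process:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            kills.append((ts, int(m.group("pid"))))
    return kills

def gaps_minutes(kills):
    """Minutes between consecutive kills, in log order."""
    times = [t for t, _pid in kills]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Feeding it the lines of the kernel log (for example `oom_kills(open(path))`, with `path` being wherever your system keeps it) and passing the result to `gaps_minutes` gives the time between kills.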

But there is plenty of memory available on the servers.

              total        used        free      shared  buff/cache   available
Mem:             47           5          31           0          10          40
Swap:             5           1           3
Total:           52           7          35

On Tuesday, February 1, 2022 at 5:21:32 PM UTC-5 Brian Candler wrote:

> On Tuesday, 1 February 2022 at 21:52:30 UTC Senthil wrote:
>
>> I started on Jan 31, so it's a day.
>>
>> # du -sck chunks_head/*
>> 54140   chunks_head/024326
>> 4       chunks_head/024327
>> 54144   total
>>
>
> That's perfectly reasonable: it's only 54MB (which is a long way from 
> 689GB!)
>
> Here's what I see on a moderately busy system:
>
> root@ldex-prometheus:~# du -sck /var/lib/prometheus/data/chunks_head/*
> 81004        /var/lib/prometheus/data/chunks_head/006831
> 77824        /var/lib/prometheus/data/chunks_head/006832
> 158828        total
>
> That's comparable to yours.
>
> Therefore, I think you need to keep an eye on this periodically.  If only 
> you had a monitoring system which could do this for you :-)
>
> If it does start to rise, that's when you'll need to check prometheus log 
> output and find out what's happening.  But this is very strange, and it 
> does seem to be something specific to your system.
>

