potiuk commented on issue #14924: URL: https://github.com/apache/airflow/issues/14924#issuecomment-911812067
Ah right. The last line you wrote is GOLD. It probably explains the behaviour, and it's NOT AN ISSUE.

When you open many files, Linux will use as much memory as it can for file caches. Whenever you read or write a file, the disk blocks are also kept in memory in case any process needs to access the file again. Blocks that have changed are marked dirty and written back to disk; once they are clean, they can be evicted from memory. And when some process needs more memory than is available, the kernel evicts unused cache pages to free it.

So for any system that writes log files continuously and never modifies them later, the cache memory will grow CONTINUOUSLY, up to the limit set by the kernel configuration. Depending on your kernel configuration (that is, the kernel of the virtual machines under your Kubernetes cluster), you will see the metrics growing continuously up to that limit. You can limit the memory available to your Scheduler container to cap it "per container" (by giving it fewer memory resources), but however much memory you give the scheduler container, it will be used for cache after some time. It will not be explicitly freed, but that is not a problem because the memory is effectively "free": it is only used for cache and can be freed immediately when needed.

That would PERFECTLY explain why the memory drops immediately after the files are deleted: once the files are gone, the system should drop the cache for them immediately as well.

Instead of looking at total memory used, you should look at the **container_memory_working_set_bytes** metric. It reflects the "actively used" memory. You can read more here: https://blog.freshtracks.io/a-deep-dive-into-kubernetes-metrics-part-3-container-resource-metrics-361c5ee46e66

You can also test it by running (from https://linuxhint.com/clear_cache_linux/): `echo 1 > /proc/sys/vm/drop_caches` in the container.
This should drop your caches immediately without deleting the files.
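
To make the difference between "total memory used" and the working set more concrete, here is a minimal Python sketch. It assumes (as far as I understand how cAdvisor derives `container_memory_working_set_bytes`) that the working set is the cgroup memory usage minus the inactive file cache, floored at zero. The function names and the sample numbers are hypothetical and only for illustration; this is not Airflow or Kubernetes code.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse the "key value" lines of a cgroup memory.stat file into a dict."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple "<name> <non-negative integer>" line.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage_bytes: int, memory_stat_text: str) -> int:
    """Approximate the working set: usage minus inactive file cache, never below 0.

    Assumption: cgroup v1 reports the hierarchical "total_inactive_file" key,
    while cgroup v2 reports "inactive_file". If neither is present, nothing
    is subtracted.
    """
    stats = parse_memory_stat(memory_stat_text)
    inactive_file = stats.get("total_inactive_file", stats.get("inactive_file", 0))
    return max(usage_bytes - inactive_file, 0)


if __name__ == "__main__":
    # Made-up numbers: 900 MiB of usage, of which 700 MiB is inactive file cache
    # (for example, log files that were written once and never touched again).
    mib = 1024 * 1024
    sample_stat = f"anon {150 * mib}\nactive_file {50 * mib}\ninactive_file {700 * mib}\n"
    print(working_set_bytes(900 * mib, sample_stat) // mib, "MiB working set")
```

Inside a container you could feed it the real values, which I would expect to find in `memory.current` and `memory.stat` under `/sys/fs/cgroup` on cgroup v2 (or `memory.usage_in_bytes` and `memory.stat` under `/sys/fs/cgroup/memory` on cgroup v1), but check the paths on your own nodes.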
