Hi HAWQ Community,

The overcommit_memory setting in Linux controls the behaviour of memory allocation. In a cluster where HAWQ is deployed alongside Hadoop, the right overcommit_memory setting for the nodes is controversial: specifically, HAWQ recommends overcommit strategy 2, while Hadoop recommends 0 or 1.
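For reference, the knobs under discussion can be inspected and set as follows. This is a minimal sketch using standard Linux sysctl usage; 50 is the kernel's default overcommit_ratio, shown for illustration, not a recommendation from either project:

    # 0 = heuristic overcommit, 1 = always overcommit, 2 = strict accounting
    $ cat /proc/sys/vm/overcommit_memory

    # Switch to strict accounting at runtime; in mode 2 the commit limit
    # is swap + overcommit_ratio% of physical RAM.
    $ sysctl -w vm.overcommit_memory=2
    $ sysctl -w vm.overcommit_ratio=50    # 50 is the kernel default

    # Observe the limit and what is currently committed
    $ grep -E 'CommitLimit|Committed_AS' /proc/meminfo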
This thread is to start a discussion of the options, so that we can make a choice that works for both products.

*1. From HAWQ perspective*

HAWQ recommends vm.overcommit_memory = 2 (rather than 0 or 1) to prevent random kills of HAWQ processes by the OOM killer and the backend resets they cause. If the nodes of the cluster are set to overcommit_memory = 0 or 1, there is a risk that running queries get terminated by a backend reset. Even worse, with overcommit_memory = 1, there is a chance that data files and the transaction log get corrupted, because a process killed on OOM exits without sufficient cleanup. More details on the overcommit_memory setting in HAWQ can be found at: Linux-Overcommit-strategies-and-Pivotal-GPDB-HDB <https://discuss.pivotal.io/hc/en-us/articles/202703383-Linux-Overcommit-strategies-and-Pivotal-Greenplum-GPDB-Pivotal-HDB-HDB->

*2. From Hadoop perspective*

A datanode crash usually happens when there is not enough heap memory for the JVM; specifically, the JVM asks for more heap (via a malloc or mmap system call) after the available address space has been exhausted. With overcommit_memory = 2, when we run out of available address space the system call returns ENOMEM and the JVM crashes. This is because Java is very address-space greedy: it allocates large regions of address space that it isn't actually using. The overcommit_memory = 2 setting doesn't restrict physical memory use; it restricts address space use. Many applications (Java in particular) allocate sparse pages of memory and rely on the kernel to provide the physical memory only when a page fault occurs.
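To make the "address-space greedy" point concrete, below is a small standalone C sketch (illustrative only, not HAWQ or Hadoop code) that does what a JVM does at startup: reserve a large span of address space with mmap and then touch only a few pages. Under overcommit_memory = 1 the reservation always succeeds; under the heuristic mode 0 it succeeds unless a single request obviously exceeds RAM plus swap; under mode 2 the full size is charged against the commit limit up front, so the mmap can fail with ENOMEM even though almost no physical memory would ever be used. The 64 GB size is an arbitrary illustrative value chosen to exceed a typical commit limit:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Reserve a large span of address space up front, the way a
         * JVM reserves its maximum heap. 64 GB is an arbitrary
         * illustrative size meant to exceed a typical CommitLimit. */
        size_t len = 64UL * 1024 * 1024 * 1024;

        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            /* Under vm.overcommit_memory = 2 this is the ENOMEM that
             * the JVM turns into a crash. */
            fprintf(stderr, "mmap of %zu bytes failed: %s\n",
                    len, strerror(errno));
            return 1;
        }

        /* Touch only a few pages: physical memory is supplied lazily
         * by the kernel on page fault, so resident memory stays tiny
         * even though the whole region is reserved address space. */
        long page = sysconf(_SC_PAGESIZE);
        for (int i = 0; i < 16; i++)
            p[(size_t)i * page] = 1;

        printf("reserved %zu bytes, touched 16 pages\n", len);
        munmap(p, len);
        return 0;
    }

Running the sketch under each mode, and watching the process RSS in top or ps, makes the distinction concrete: the process holds tens of gigabytes of address space while using only a few pages of physical memory.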
Best regards,
Ruilong Huo