Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Daniel P . Berrangé
On Fri, Jul 17, 2020 at 03:55:15PM +0100, Alex Bennée wrote:
> 
> Daniel P. Berrangé  writes:
> 
> > On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote:
> >> It turns out there are some 64 bit systems that have relatively low
> >> amounts of physical memory available to them (typically CI systems).
> >> Even with swapping available a 1GB translation buffer that fills up
> >> can put the machine under increased memory pressure. Detect these low
> >> memory situations and reduce tb_size appropriately.
> >> 
> >> Fixes: 600e17b261
> >> Signed-off-by: Alex Bennée 
> >> Cc: BALATON Zoltan 
> >> Cc: Christian Ehrhardt 
> >> ---
> >>  accel/tcg/translate-all.c | 7 ++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> >> index 2afa46bd2b1..2ff0ba6d19b 100644
> >> --- a/accel/tcg/translate-all.c
> >> +++ b/accel/tcg/translate-all.c
> >> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
> >>  {
> >>      /* Size the buffer.  */
> >>      if (tb_size == 0) {
> >> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> >> +        size_t phys_mem = qemu_get_host_physmem();
> >> +        if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) {
> >> +            tb_size = phys_mem / 4;
> >> +        } else {
> >> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> >> +        }
> >
> > I'm not convinced this is going to work when running QEMU inside a
> > container environment with RAM cap, because the physmem level is
> > completely unrelated to the RAM the container is permitted to actually
> > use in practice. ie host has 32 GB of RAM, but the container QEMU is
> > in only has 1 GB permitted.
> 
> What will happen when the mmap happens? Will a capped container limit
> the attempted mmap? I would hope the container case at least gave
> different feedback than a "silent" OOM.

IIRC it should trigger the OOM killer on process(es) within the container.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Alex Bennée


Daniel P. Berrangé  writes:

> On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote:
>> It turns out there are some 64 bit systems that have relatively low
>> amounts of physical memory available to them (typically CI systems).
>> Even with swapping available a 1GB translation buffer that fills up
>> can put the machine under increased memory pressure. Detect these low
>> memory situations and reduce tb_size appropriately.
>> 
>> Fixes: 600e17b261
>> Signed-off-by: Alex Bennée 
>> Cc: BALATON Zoltan 
>> Cc: Christian Ehrhardt 
>> ---
>>  accel/tcg/translate-all.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
>> index 2afa46bd2b1..2ff0ba6d19b 100644
>> --- a/accel/tcg/translate-all.c
>> +++ b/accel/tcg/translate-all.c
>> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
>>  {
>>      /* Size the buffer.  */
>>      if (tb_size == 0) {
>> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
>> +        size_t phys_mem = qemu_get_host_physmem();
>> +        if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) {
>> +            tb_size = phys_mem / 4;
>> +        } else {
>> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
>> +        }
>
> I'm not convinced this is going to work when running QEMU inside a
> container environment with RAM cap, because the physmem level is
> completely unrelated to the RAM the container is permitted to actually
> use in practice. ie host has 32 GB of RAM, but the container QEMU is
> in only has 1 GB permitted.

What will happen when the mmap happens? Will a capped container limit
the attempted mmap? I would hope the container case at least gave
different feedback than a "silent" OOM.

> I don't have much of a better suggestion, as I don't think we want
> to get into reading the cgroups memory limits. It does feel like the
> assumption we can blindly use a 1GB cache though is invalid even
> with this patch applied.
>
> Regards,
> Daniel


-- 
Alex Bennée



Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Daniel P . Berrangé
On Fri, Jul 17, 2020 at 11:51:39AM +0100, Alex Bennée wrote:
> It turns out there are some 64 bit systems that have relatively low
> amounts of physical memory available to them (typically CI systems).
> Even with swapping available a 1GB translation buffer that fills up
> can put the machine under increased memory pressure. Detect these low
> memory situations and reduce tb_size appropriately.
> 
> Fixes: 600e17b261
> Signed-off-by: Alex Bennée 
> Cc: BALATON Zoltan 
> Cc: Christian Ehrhardt 
> ---
>  accel/tcg/translate-all.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 2afa46bd2b1..2ff0ba6d19b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
>  {
>      /* Size the buffer.  */
>      if (tb_size == 0) {
> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        size_t phys_mem = qemu_get_host_physmem();
> +        if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) {
> +            tb_size = phys_mem / 4;
> +        } else {
> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        }

I'm not convinced this is going to work when running QEMU inside a
container environment with RAM cap, because the physmem level is
completely unrelated to the RAM the container is permitted to actually
use in practice. ie host has 32 GB of RAM, but the container QEMU is
in only has 1 GB permitted.

I don't have much of a better suggestion, as I don't think we want
to get into reading the cgroups memory limits. It does feel like the
assumption we can blindly use a 1GB cache though is invalid even
with this patch applied.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
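
For readers wondering what "reading the cgroups memory limits" would involve, here is a minimal sketch in C. It is not something the thread proposes, and nothing like it is part of the patch. It assumes the cgroup v2 interface, where the `memory.max` file holds either the word `max` (no cap) or a decimal byte count. The function names `sketch_parse_memory_max` and `sketch_effective_memory` are hypothetical, and locating and reading the file for the current process is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the text of a cgroup v2 "memory.max" file (assumed format:
 * either "max" for no cap, or a decimal byte count, optionally
 * followed by a newline).
 * Returns 0 on success and stores the cap in *limit, using UINT64_MAX
 * to mean "no cap". Returns -1 on malformed input.
 */
static int sketch_parse_memory_max(const char *text, uint64_t *limit)
{
    assert(text != NULL && limit != NULL);

    if (strncmp(text, "max", 3) == 0 &&
        (text[3] == '\n' || text[3] == '\0')) {
        *limit = UINT64_MAX;
        return 0;
    }

    /* strtoull() would accept a sign or leading spaces; reject them. */
    if (text[0] < '0' || text[0] > '9') {
        return -1;
    }

    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || end == text || (*end != '\n' && *end != '\0')) {
        return -1;
    }

    *limit = value;
    return 0;
}

/* The memory a sizing heuristic could count on: the smaller of the two. */
static uint64_t sketch_effective_memory(uint64_t phys_mem, uint64_t cgroup_limit)
{
    return cgroup_limit < phys_mem ? cgroup_limit : phys_mem;
}
```

With the example from the message above (a 32 GB host and a 1 GB container cap), `sketch_effective_memory` would return the 1 GB cap, which is the figure the physmem check never sees.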




Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Christian Ehrhardt
On Fri, Jul 17, 2020 at 12:51 PM Alex Bennée  wrote:

> It turns out there are some 64 bit systems that have relatively low
> amounts of physical memory available to them (typically CI systems).
> Even with swapping available a 1GB translation buffer that fills up
> can put the machine under increased memory pressure. Detect these low
> memory situations and reduce tb_size appropriately.
>
> Fixes: 600e17b261
> Signed-off-by: Alex Bennée 
> Cc: BALATON Zoltan 
> Cc: Christian Ehrhardt 
> ---
>  accel/tcg/translate-all.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 2afa46bd2b1..2ff0ba6d19b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
>  {
>      /* Size the buffer.  */
>      if (tb_size == 0) {
> -        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        size_t phys_mem = qemu_get_host_physmem();
> +        if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) {
> +            tb_size = phys_mem / 4;
>

In my experiments I've found that /8 more closely matches the former
behavior on small hosts while at the same time not affecting common
large hosts.


> +        } else {
> +            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +        }
>      }
>      if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>          tb_size = MIN_CODE_GEN_BUFFER_SIZE;
> --
> 2.20.1
>
>

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


[PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Alex Bennée
It turns out there are some 64 bit systems that have relatively low
amounts of physical memory available to them (typically CI systems).
Even with swapping available a 1GB translation buffer that fills up
can put the machine under increased memory pressure. Detect these low
memory situations and reduce tb_size appropriately.

Fixes: 600e17b261
Signed-off-by: Alex Bennée 
Cc: BALATON Zoltan 
Cc: Christian Ehrhardt 
---
 accel/tcg/translate-all.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 2afa46bd2b1..2ff0ba6d19b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
 {
     /* Size the buffer.  */
     if (tb_size == 0) {
-        tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+        size_t phys_mem = qemu_get_host_physmem();
+        if (phys_mem > 0 && phys_mem < (2 * DEFAULT_CODE_GEN_BUFFER_SIZE)) {
+            tb_size = phys_mem / 4;
+        } else {
+            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+        }
     }
     if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
         tb_size = MIN_CODE_GEN_BUFFER_SIZE;
-- 
2.20.1
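
The sizing rule in this patch can be tried outside QEMU with a small standalone sketch. It assumes a 1 GiB default (the figure given in the commit message) and a 1 MiB floor (an assumption for illustration). The `SKETCH_*` macros and `sketch_size_code_gen_buffer` are hypothetical names, not QEMU's, and the divisor is made a parameter so the /4 from the patch can be compared with the /8 suggested in the replies.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins; the real constants live in accel/tcg/translate-all.c. */
#define SKETCH_DEFAULT_BUFFER_SIZE (1024ULL * 1024 * 1024) /* 1 GiB, per the commit message */
#define SKETCH_MIN_BUFFER_SIZE     (1024ULL * 1024)        /* assumed floor of 1 MiB */

/*
 * Choose a translation buffer size in bytes.
 *   requested: explicit size, where 0 means "pick a default"
 *   phys_mem:  host physical memory in bytes, where 0 means "unknown"
 *   divisor:   share of phys_mem used on small hosts (4 in the patch)
 */
static uint64_t sketch_size_code_gen_buffer(uint64_t requested,
                                            uint64_t phys_mem,
                                            unsigned divisor)
{
    assert(divisor != 0);

    uint64_t size = requested;
    if (size == 0) {
        if (phys_mem > 0 && phys_mem < 2 * SKETCH_DEFAULT_BUFFER_SIZE) {
            /* Small host: take a fraction of RAM instead of the full default. */
            size = phys_mem / divisor;
        } else {
            /* Large host, or physical memory could not be determined. */
            size = SKETCH_DEFAULT_BUFFER_SIZE;
        }
    }
    if (size < SKETCH_MIN_BUFFER_SIZE) {
        size = SKETCH_MIN_BUFFER_SIZE;
    }
    return size;
}
```

Under these assumptions, a host with 1 GiB of RAM gets a 256 MiB buffer with a divisor of 4 and a 128 MiB buffer with a divisor of 8, while any host with 2 GiB or more keeps the full 1 GiB default regardless of the divisor.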