Re: [osv-dev] Lazily allocating thread stacks WIP

2020-01-12 Thread Waldek Kozaczuk
Hi Matthew,

Have you made any progress on it?

Nadav,
If you have a bit of time would you mind giving your feedback on what we 
are proposing/discussing here? I put some sort of summary of it in the 
comments of the issue 
- https://github.com/cloudius-systems/osv/issues/143#issuecomment-567232417.

Thanks,
Waldek

On Friday, December 13, 2019 at 8:21:18 PM UTC-5, Waldek Kozaczuk wrote:
>
>
>
> On Friday, December 13, 2019 at 5:26:46 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> Some other thoughts:
>>
>> 1. All the increments and decrements of the *stack_page_read_counter* 
>> should be symmetrical. Decrementing it in wait_for_interrupts() does not 
>> seem to be, but if I remember correctly I had to put it there when I was 
>> trying to get it all to work, maybe because we start with 1. But I am no 
>> longer convinced it is necessary. We need to understand it better.
>>
>> 2. Ideally, we should not even be trying to read the next page on kernel 
>> threads or even better on threads with stack mapped with mmap_stack. 
>> That can be accomplished by initializing the *stack_page_read_counter* to 
>> some value higher than 1 (10 or 11) so it never reaches 0 and never 
>> triggers a page read. This could possibly be done as the first thing in 
>> the thread_main_c method (see arch/x64/arch-switch.hh). It receives a 
>> thread pointer, so we should somehow be able to determine how its stack 
>> was constructed. If not, we can add a boolean flag to the thread class. 
>>
>> I really meant: "we should not even be trying to read the next page on 
> kernel threads or even better on threads with stack NOT mapped with 
> mmap_stack"
>
>> 3. In theory, we could get by without *stack_page_read_counter* by just 
>> checking sched::preemptable() AND somehow reading the flags register 
>> (PUSHF?) to directly check whether the interrupt flag is set in the 
>> read_stack_page method. But I have a feeling it would be more expensive.
>>
>> 4. Using the counter may not be that expensive given we already use a 
>> similar mechanism to implement preemptable().
>>
>> It would be nice to have Nadav weigh in on this, as it all either relates 
>> to the scheduler or affects it in some way.
>>
>> On Wednesday, December 11, 2019 at 5:07:58 PM UTC-5, Matthew Pabst wrote:
>>>
>>> Great stuff Waldek! That solved a few of the issues I was running into. 
>>> However, tst-vfs.so is still failing occasionally with the error I 
>>> mentioned earlier (std::length_error), which apparently is usually the 
>>> result of an illegal memory access, probably caused by the changes to 
>>> thread::init_stack() or arch::read_next_stack_page(). I was trying to debug 
>>> the test case using GDB, but I couldn't figure out how to run tst-vfs.so in 
>>> debug mode like the wiki examples. What is the best way to do this?
>>>
>>
>> Are you saying that this test runs just fine without the lazy stack patch?
>>
>> The best way is to run the test in repeatable mode until it breaks like 
>> so:
>>
>> ./scripts/tests.py --name "/tests/tst-vfs.so" -r 
>>
>> Though by default it powers down automatically when the test fails or 
>> aborts. There is no way to prevent this by passing an option to the 
>> tests.py script, but you can manually edit scripts/tests/testing.py and 
>> remove "--power-off-on-abort" from the command line constructed by the 
>> run_command_in_guest method. This will keep qemu and OSv running after the 
>> crash and let you connect with gdb from another terminal. It is best to 
>> also limit the number of vcpus to 1 (add '-c 1' to that same line in 
>> run_command_in_guest). You can debug using the regular release version:
>>
>> gdb build/release/loader.elf
>> connect
>> osv syms
>> bt
>>
>> The interesting stack trace may be on another thread, so use 'osv thread 
>> ' to switch to whatever thread you need. See 
>> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb 
>> for more details.
>>
>>>
>>> Another question I had was if the calls to barrier() are necessary in 
>>> your patch. Do they ensure that the running thread gets the correct 
>>> stack_page_read_counter?
>>>
>> As I understand it, barrier() is a hint to the compiler not to perform 
>> certain optimizations that might skew what we want to achieve. I was 
>> roughly following what we do around preempt_counter, but I am not sure it 
>> is necessary for stack_page_read_counter. Nadav would probably be the best 
>> person to explain that.
>>
>>>
>>> On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk 
>>> wrote:
>>>
 It was pretty late when I was sending this email. So to recap more 
 clearly how this patch works:

 1. The stack_page_read_counter is a thread-local variable initialized to 1 
 and behaves similarly to the preempt_counter
 2. For every new thread (except the very first one) its thread_main_c 
 (see 
 https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
  
 calls irq_enable() and 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-13 Thread Waldek Kozaczuk


On Friday, December 13, 2019 at 5:26:46 PM UTC-5, Waldek Kozaczuk wrote:
>
> Some other thoughts:
>
> 1. All the increments and decrements of the *stack_page_read_counter* 
> should be symmetrical. Decrementing it in wait_for_interrupts() does not 
> seem to be, but if I remember correctly I had to put it there when I was 
> trying to get it all to work, maybe because we start with 1. But I am no 
> longer convinced it is necessary. We need to understand it better.
>
> 2. Ideally, we should not even be trying to read the next page on kernel 
> threads or even better on threads with stack mapped with mmap_stack. That 
> can be accomplished by initializing the *stack_page_read_counter* to some 
> value higher than 1 (10 or 11) so it never reaches 0 and never triggers a 
> page read. This could possibly be done as the first thing in the 
> thread_main_c method (see arch/x64/arch-switch.hh). It receives a thread 
> pointer, so we should somehow be able to determine how its stack was 
> constructed. If not, we can add a boolean flag to the thread class. 
>
> I really meant: "we should not even be trying to read the next page on 
kernel threads or even better on threads with stack NOT mapped with 
mmap_stack"

> 3. In theory, we could get by without *stack_page_read_counter* by just 
> checking sched::preemptable() AND somehow reading the flags register 
> (PUSHF?) to directly check whether the interrupt flag is set in the 
> read_stack_page method. But I have a feeling it would be more expensive.
>
> 4. Using the counter may not be that expensive given we already use a 
> similar mechanism to implement preemptable().
>
> It would be nice to have Nadav weigh in on this, as it all either relates 
> to the scheduler or affects it in some way.
>
> On Wednesday, December 11, 2019 at 5:07:58 PM UTC-5, Matthew Pabst wrote:
>>
>> Great stuff Waldek! That solved a few of the issues I was running into. 
>> However, tst-vfs.so is still failing occasionally with the error I 
>> mentioned earlier (std::length_error), which apparently is usually the 
>> result of an illegal memory access, probably caused by the changes to 
>> thread::init_stack() or arch::read_next_stack_page(). I was trying to debug 
>> the test case using GDB, but I couldn't figure out how to run tst-vfs.so in 
>> debug mode like the wiki examples. What is the best way to do this?
>>
>
> Are you saying that this test runs just fine without the lazy stack patch?
>
> The best way is to run the test in repeatable mode until it breaks like so:
>
> ./scripts/tests.py --name "/tests/tst-vfs.so" -r 
>
> Though by default it powers down automatically when the test fails or 
> aborts. There is no way to prevent this by passing an option to the 
> tests.py script, but you can manually edit scripts/tests/testing.py and 
> remove "--power-off-on-abort" from the command line constructed by the 
> run_command_in_guest method. This will keep qemu and OSv running after the 
> crash and let you connect with gdb from another terminal. It is best to 
> also limit the number of vcpus to 1 (add '-c 1' to that same line in 
> run_command_in_guest). You can debug using the regular release version:
>
> gdb build/release/loader.elf
> connect
> osv syms
> bt
>
> The interesting stack trace may be on another thread, so use 'osv thread ' 
> to switch to whatever thread you need. See 
> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb 
> for more details.
>
>>
>> Another question I had was if the calls to barrier() are necessary in 
>> your patch. Do they ensure that the running thread gets the correct 
>> stack_page_read_counter?
>>
> As I understand it, barrier() is a hint to the compiler not to perform 
> certain optimizations that might skew what we want to achieve. I was 
> roughly following what we do around preempt_counter, but I am not sure it 
> is necessary for stack_page_read_counter. Nadav would probably be the best 
> person to explain that.
>
>>
>> On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk 
>> wrote:
>>
>>> It was pretty late when I was sending this email. So to recap more 
>>> clearly how this patch works:
>>>
>>> 1. The stack_page_read_counter is a thread-local variable initialized to 
>>> 1 and behaves similarly to the preempt_counter
>>> 2. For every new thread (except the very first one) its thread_main_c 
>>> (see 
>>> https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
>>>  
>>> calls irq_enable() and preempt_enable() which, more importantly, 
>>> decrement and reset the stack_page_read_counter to 0.
>>> 3. From this point on, any time preempt_disable() or the lock() method on 
>>> irq_lock_type or irq_save_lock_type is called, read_next_stack_page is 
>>> called, which ALWAYS increments the stack_page_read_counter but ONLY 
>>> reads one byte from the page ahead on the stack if that counter is 1 (to 
>>> prevent a nesting problem). If nested preempt_disable() or lock() on those 
>>> irq 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-13 Thread Waldek Kozaczuk
Some other thoughts:

1. All the increments and decrements of the *stack_page_read_counter* 
should be symmetrical. Decrementing it in wait_for_interrupts() does not 
seem to be, but if I remember correctly I had to put it there when I was 
trying to get it all to work, maybe because we start with 1. But I am no 
longer convinced it is necessary. We need to understand it better.

2. Ideally, we should not even be trying to read the next page on kernel 
threads or even better on threads with stack mapped with mmap_stack. That 
can be accomplished by initializing the *stack_page_read_counter* to some 
value higher than 1 (10 or 11) so it never reaches 0 and never triggers a 
page read. This could possibly be done as the first thing in the 
thread_main_c method (see arch/x64/arch-switch.hh). It receives a thread 
pointer, so we should somehow be able to determine how its stack was 
constructed. If not, we can add a boolean flag to the thread class. 

3. In theory, we could get by without *stack_page_read_counter* by just 
checking sched::preemptable() AND somehow reading the flags register 
(PUSHF?) to directly check whether the interrupt flag is set in the 
read_stack_page method. But I have a feeling it would be more expensive.

4. Using the counter may not be that expensive given we already use a 
similar mechanism to implement preemptable().

It would be nice to have Nadav weigh in on this, as it all either relates 
to the scheduler or affects it in some way.

On Wednesday, December 11, 2019 at 5:07:58 PM UTC-5, Matthew Pabst wrote:
>
> Great stuff Waldek! That solved a few of the issues I was running into. 
> However, tst-vfs.so is still failing occasionally with the error I 
> mentioned earlier (std::length_error), which apparently is usually the 
> result of an illegal memory access, probably caused by the changes to 
> thread::init_stack() or arch::read_next_stack_page(). I was trying to debug 
> the test case using GDB, but I couldn't figure out how to run tst-vfs.so in 
> debug mode like the wiki examples. What is the best way to do this?
>

Are you saying that this test runs just fine without the lazy stack patch?

The best way is to run the test in repeatable mode until it breaks like so:

./scripts/tests.py --name "/tests/tst-vfs.so" -r 

Though by default it powers down automatically when the test fails or 
aborts. There is no way to prevent this by passing an option to the 
tests.py script, but you can manually edit scripts/tests/testing.py and 
remove "--power-off-on-abort" from the command line constructed by the 
run_command_in_guest method. This will keep qemu and OSv running after the 
crash and let you connect with gdb from another terminal. It is best to 
also limit the number of vcpus to 1 (add '-c 1' to that same line in 
run_command_in_guest). You can debug using the regular release version:

gdb build/release/loader.elf
connect
osv syms
bt

The interesting stack trace may be on another thread, so use 'osv thread ' 
to switch to whatever thread you need. See 
https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb 
for more details.

>
> Another question I had was if the calls to barrier() are necessary in your 
> patch. Do they ensure that the running thread gets the correct 
> stack_page_read_counter?
>
As I understand it, barrier() is a hint to the compiler not to perform 
certain optimizations that might skew what we want to achieve. I was 
roughly following what we do around preempt_counter, but I am not sure it 
is necessary for stack_page_read_counter. Nadav would probably be the best 
person to explain that.

>
> On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk wrote:
>
>> It was pretty late when I was sending this email. So to recap more 
>> clearly how this patch works:
>>
>> 1. The stack_page_read_counter is a thread-local variable initialized to 
>> 1 and behaves similarly to the preempt_counter
>> 2. For every new thread (except the very first one) its thread_main_c 
>> (see 
>> https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
>>  
>> calls irq_enable() and preempt_enable() which, more importantly, 
>> decrement and reset the stack_page_read_counter to 0.
>> 3. From this point on, any time preempt_disable() or the lock() method on 
>> irq_lock_type or irq_save_lock_type is called, read_next_stack_page is 
>> called, which ALWAYS increments the stack_page_read_counter but ONLY 
>> reads one byte from the page ahead on the stack if that counter is 1 (to 
>> prevent a nesting problem). If nested preempt_disable() or lock() on those 
>> irq locks is called, it will only increment the counter but not read from 
>> the stack.
>> 4. Any time preempt_enable() or unlock() on irq_lock_type or 
>> irq_save_lock_type is called, the stack_page_read_counter is 
>> correspondingly decremented (eventually to 0).
>> 5. Lastly, any time wait_for_interrupt is called (which re-enables 
>> interrupts) we also decrement the 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-11 Thread Matthew Pabst
Great stuff Waldek! That solved a few of the issues I was running into. 
However, tst-vfs.so is still failing occasionally with the error I 
mentioned earlier (std::length_error), which apparently is usually the 
result of an illegal memory access, probably caused by the changes to 
thread::init_stack() or arch::read_next_stack_page(). I was trying to debug 
the test case using GDB, but I couldn't figure out how to run tst-vfs.so in 
debug mode like the wiki examples. What is the best way to do this?

Another question I had was if the calls to barrier() are necessary in your 
patch. Do they ensure that the running thread gets the correct 
stack_page_read_counter?

On Wednesday, December 11, 2019 at 9:08:11 AM UTC-6, Waldek Kozaczuk wrote:

> It was pretty late when I was sending this email. So to recap more clearly 
> how this patch works:
>
> 1. The stack_page_read_counter is a thread-local variable initialized to 
> 1 and behaves similarly to the preempt_counter
> 2. For every new thread (except the very first one) its thread_main_c (see 
> https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
>  
> calls irq_enable() and preempt_enable() which, more importantly, decrement 
> and reset the stack_page_read_counter to 0.
> 3. From this point on, any time preempt_disable() or the lock() method on 
> irq_lock_type or irq_save_lock_type is called, read_next_stack_page is 
> called, which ALWAYS increments the stack_page_read_counter but ONLY 
> reads one byte from the page ahead on the stack if that counter is 1 (to 
> prevent a nesting problem). If nested preempt_disable() or lock() on those 
> irq locks is called, it will only increment the counter but not read from 
> the stack.
> 4. Any time preempt_enable() or unlock() on irq_lock_type or 
> irq_save_lock_type is called, the stack_page_read_counter is 
> correspondingly decremented (eventually to 0).
> 5. Lastly, any time wait_for_interrupt is called (which re-enables 
> interrupts) we also decrement the stack_page_read_counter
>
> On Wednesday, December 11, 2019 at 12:53:02 AM UTC-5, Waldek Kozaczuk 
> wrote:
>>
>> So here is a new very rough patch that handles the problem of nested 
>> calls to disable interrupts and/or preemption:
>>
>> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
>> index 6803498f..70224472 100644
>> --- a/arch/x64/arch-switch.hh
>> +++ b/arch/x64/arch-switch.hh
>> @@ -148,6 +148,11 @@ void thread::init_stack()
>>  _state.rip = reinterpret_cast(thread_main);
>>  _state.rsp = stacktop;
>>  _state.exception_stack = _arch.exception_stack + 
>> sizeof(_arch.exception_stack);
>> +
>> +// Since thread stacks are lazily allocated and the thread initially 
>> starts
>> +// running with preemption disabled, we need to pre-fault the first 
>> stack page.
>> +volatile char r = *((char*)(stacktop-1));
>> +(void) r; // trick the compiler into thinking that r is used
>>  }
>>  
>>  void thread::setup_tcb()
>> diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
>> index 17df5f5c..e4b4 100644
>> --- a/arch/x64/arch.hh
>> +++ b/arch/x64/arch.hh
>> @@ -11,15 +11,32 @@
>>  #include "processor.hh"
>>  #include "msr.hh"
>>  
>> +#include 
>> +
>>  // namespace arch - architecture independent interface for architecture
>>  //  dependent operations (e.g. irq_disable vs. cli)
>>  
>> +namespace mmu {
>> +extern unsigned __thread stack_page_read_counter;
>> +}
>> +
>>  namespace arch {
>>  
>>  #define CACHELINE_ALIGNED __attribute__((aligned(64)))
>>  #define INSTR_SIZE_MIN 1
>>  #define ELF_IMAGE_START OSV_KERNEL_BASE
>>  
>> +inline void read_next_stack_page() {
>> +barrier();
>> +++mmu::stack_page_read_counter;
>> +if (mmu::stack_page_read_counter != 1) {
>> +return;
>> +}
>> +
>> +char i;
>> +asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
>> +}
>> +
>>  inline void irq_disable()
>>  {
>>  processor::cli();
>> @@ -41,6 +58,8 @@ inline void irq_enable()
>>  inline void wait_for_interrupt()
>>  {
>>  processor::sti_hlt();
>> +barrier();
>> +--mmu::stack_page_read_counter;
>>  }
>>  
>>  inline void halt_no_interrupts()
>> diff --git a/core/mmu.cc b/core/mmu.cc
>> index e8a10fd8..f883479f 100644
>> --- a/core/mmu.cc
>> +++ b/core/mmu.cc
>> @@ -43,6 +43,8 @@ extern const char text_start[], text_end[];
>>  
>>  namespace mmu {
>>  
>> +unsigned __thread stack_page_read_counter = 1;
>> +
>>  namespace bi = boost::intrusive;
>>  
>>  class vma_compare {
>> diff --git a/include/osv/irqlock.hh b/include/osv/irqlock.hh
>> index 2d5372fc..3507e608 100644
>> --- a/include/osv/irqlock.hh
>> +++ b/include/osv/irqlock.hh
>> @@ -10,10 +10,25 @@
>>  
>>  #include "arch.hh"
>>  
>> +namespace mmu {
>> +extern unsigned __thread stack_page_read_counter;
>> +}
>> +
>> +namespace arch {
>> +inline void read_next_stack_page();
>> +}
>> +
>>  class irq_lock_type {
>>  public:
>> -static void lock() { arch::irq_disable(); }
>> -

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-11 Thread Waldek Kozaczuk
It was pretty late when I was sending this email. So to recap more clearly 
how this patch works:

1. The stack_page_read_counter is a thread-local variable initialized to 1 
and behaves similarly to the preempt_counter
2. For every new thread (except the very first one) its thread_main_c (see 
https://github.com/cloudius-systems/osv/blob/master/arch/x64/arch-switch.hh#L312-L323)
 
calls irq_enable() and preempt_enable() which, more importantly, decrement 
and reset the stack_page_read_counter to 0.
3. From this point on, any time preempt_disable() or the lock() method on 
irq_lock_type or irq_save_lock_type is called, read_next_stack_page is 
called, which ALWAYS increments the stack_page_read_counter but ONLY 
reads one byte from the page ahead on the stack if that counter is 1 (to 
prevent a nesting problem). If nested preempt_disable() or lock() on those 
irq locks is called, it will only increment the counter but not read from 
the stack.
4. Any time preempt_enable() or unlock() on irq_lock_type or 
irq_save_lock_type is called, the stack_page_read_counter is 
correspondingly decremented (eventually to 0).
5. Lastly, any time wait_for_interrupt is called (which re-enables 
interrupts) we also decrement the stack_page_read_counter

On Wednesday, December 11, 2019 at 12:53:02 AM UTC-5, Waldek Kozaczuk wrote:
>
> So here is a new very rough patch that handles the problem of nested 
> calls to disable interrupts and/or preemption:
>
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
> index 6803498f..70224472 100644
> --- a/arch/x64/arch-switch.hh
> +++ b/arch/x64/arch-switch.hh
> @@ -148,6 +148,11 @@ void thread::init_stack()
>  _state.rip = reinterpret_cast(thread_main);
>  _state.rsp = stacktop;
>  _state.exception_stack = _arch.exception_stack + 
> sizeof(_arch.exception_stack);
> +
> +// Since thread stacks are lazily allocated and the thread initially 
> starts
> +// running with preemption disabled, we need to pre-fault the first 
> stack page.
> +volatile char r = *((char*)(stacktop-1));
> +(void) r; // trick the compiler into thinking that r is used
>  }
>  
>  void thread::setup_tcb()
> diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
> index 17df5f5c..e4b4 100644
> --- a/arch/x64/arch.hh
> +++ b/arch/x64/arch.hh
> @@ -11,15 +11,32 @@
>  #include "processor.hh"
>  #include "msr.hh"
>  
> +#include 
> +
>  // namespace arch - architecture independent interface for architecture
>  //  dependent operations (e.g. irq_disable vs. cli)
>  
> +namespace mmu {
> +extern unsigned __thread stack_page_read_counter;
> +}
> +
>  namespace arch {
>  
>  #define CACHELINE_ALIGNED __attribute__((aligned(64)))
>  #define INSTR_SIZE_MIN 1
>  #define ELF_IMAGE_START OSV_KERNEL_BASE
>  
> +inline void read_next_stack_page() {
> +barrier();
> +++mmu::stack_page_read_counter;
> +if (mmu::stack_page_read_counter != 1) {
> +return;
> +}
> +
> +char i;
> +asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
> +}
> +
>  inline void irq_disable()
>  {
>  processor::cli();
> @@ -41,6 +58,8 @@ inline void irq_enable()
>  inline void wait_for_interrupt()
>  {
>  processor::sti_hlt();
> +barrier();
> +--mmu::stack_page_read_counter;
>  }
>  
>  inline void halt_no_interrupts()
> diff --git a/core/mmu.cc b/core/mmu.cc
> index e8a10fd8..f883479f 100644
> --- a/core/mmu.cc
> +++ b/core/mmu.cc
> @@ -43,6 +43,8 @@ extern const char text_start[], text_end[];
>  
>  namespace mmu {
>  
> +unsigned __thread stack_page_read_counter = 1;
> +
>  namespace bi = boost::intrusive;
>  
>  class vma_compare {
> diff --git a/include/osv/irqlock.hh b/include/osv/irqlock.hh
> index 2d5372fc..3507e608 100644
> --- a/include/osv/irqlock.hh
> +++ b/include/osv/irqlock.hh
> @@ -10,10 +10,25 @@
>  
>  #include "arch.hh"
>  
> +namespace mmu {
> +extern unsigned __thread stack_page_read_counter;
> +}
> +
> +namespace arch {
> +inline void read_next_stack_page();
> +}
> +
>  class irq_lock_type {
>  public:
> -static void lock() { arch::irq_disable(); }
> -static void unlock() { arch::irq_enable(); }
> +static void lock() { 
> +   arch::read_next_stack_page();
> +   arch::irq_disable(); 
> +}
> +static void unlock() { 
> +   arch::irq_enable(); 
> +   barrier();
> +   --mmu::stack_page_read_counter;
> +}
>  };
>  
>  class irq_save_lock_type {
> @@ -28,12 +43,15 @@ private:
>  inline void irq_save_lock_type::lock()
>  {
>  _flags.save();
> +arch::read_next_stack_page();
>  arch::irq_disable();
>  }
>  
>  inline void irq_save_lock_type::unlock()
>  {
>  _flags.restore();
> +barrier();
> +--mmu::stack_page_read_counter;
>  }
>  
>  extern irq_lock_type irq_lock;
> diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
> index 18edf441..694815f8 100644
> --- a/include/osv/mmu-defs.hh
> +++ b/include/osv/mmu-defs.hh
> @@ -84,6 +84,7 @@ enum {
>  mmap_small   = 1ul << 5,

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-10 Thread Waldek Kozaczuk
So here is a new very rough patch that handles the problem of nested calls 
to disable interrupts and/or preemption:

diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
index 6803498f..70224472 100644
--- a/arch/x64/arch-switch.hh
+++ b/arch/x64/arch-switch.hh
@@ -148,6 +148,11 @@ void thread::init_stack()
 _state.rip = reinterpret_cast(thread_main);
 _state.rsp = stacktop;
 _state.exception_stack = _arch.exception_stack + 
sizeof(_arch.exception_stack);
+
+// Since thread stacks are lazily allocated and the thread initially 
starts
+// running with preemption disabled, we need to pre-fault the first 
stack page.
+volatile char r = *((char*)(stacktop-1));
+(void) r; // trick the compiler into thinking that r is used
 }
 
 void thread::setup_tcb()
diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
index 17df5f5c..e4b4 100644
--- a/arch/x64/arch.hh
+++ b/arch/x64/arch.hh
@@ -11,15 +11,32 @@
 #include "processor.hh"
 #include "msr.hh"
 
+#include 
+
 // namespace arch - architecture independent interface for architecture
 //  dependent operations (e.g. irq_disable vs. cli)
 
+namespace mmu {
+extern unsigned __thread stack_page_read_counter;
+}
+
 namespace arch {
 
 #define CACHELINE_ALIGNED __attribute__((aligned(64)))
 #define INSTR_SIZE_MIN 1
 #define ELF_IMAGE_START OSV_KERNEL_BASE
 
+inline void read_next_stack_page() {
+barrier();
+++mmu::stack_page_read_counter;
+if (mmu::stack_page_read_counter != 1) {
+return;
+}
+
+char i;
+asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
+}
+
 inline void irq_disable()
 {
 processor::cli();
@@ -41,6 +58,8 @@ inline void irq_enable()
 inline void wait_for_interrupt()
 {
 processor::sti_hlt();
+barrier();
+--mmu::stack_page_read_counter;
 }
 
 inline void halt_no_interrupts()
diff --git a/core/mmu.cc b/core/mmu.cc
index e8a10fd8..f883479f 100644
--- a/core/mmu.cc
+++ b/core/mmu.cc
@@ -43,6 +43,8 @@ extern const char text_start[], text_end[];
 
 namespace mmu {
 
+unsigned __thread stack_page_read_counter = 1;
+
 namespace bi = boost::intrusive;
 
 class vma_compare {
diff --git a/include/osv/irqlock.hh b/include/osv/irqlock.hh
index 2d5372fc..3507e608 100644
--- a/include/osv/irqlock.hh
+++ b/include/osv/irqlock.hh
@@ -10,10 +10,25 @@
 
 #include "arch.hh"
 
+namespace mmu {
+extern unsigned __thread stack_page_read_counter;
+}
+
+namespace arch {
+inline void read_next_stack_page();
+}
+
 class irq_lock_type {
 public:
-static void lock() { arch::irq_disable(); }
-static void unlock() { arch::irq_enable(); }
+static void lock() { 
+   arch::read_next_stack_page();
+   arch::irq_disable(); 
+}
+static void unlock() { 
+   arch::irq_enable(); 
+   barrier();
+   --mmu::stack_page_read_counter;
+}
 };
 
 class irq_save_lock_type {
@@ -28,12 +43,15 @@ private:
 inline void irq_save_lock_type::lock()
 {
 _flags.save();
+arch::read_next_stack_page();
 arch::irq_disable();
 }
 
 inline void irq_save_lock_type::unlock()
 {
 _flags.restore();
+barrier();
+--mmu::stack_page_read_counter;
 }
 
 extern irq_lock_type irq_lock;
diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
index 18edf441..694815f8 100644
--- a/include/osv/mmu-defs.hh
+++ b/include/osv/mmu-defs.hh
@@ -84,6 +84,7 @@ enum {
 mmap_small   = 1ul << 5,
 mmap_jvm_balloon = 1ul << 6,
 mmap_file= 1ul << 7,
+mmap_stack   = 1ul << 8,
 };
 
 enum {
diff --git a/include/osv/sched.hh b/include/osv/sched.hh
index 66c0a44a..35903b3c 100644
--- a/include/osv/sched.hh
+++ b/include/osv/sched.hh
@@ -41,6 +41,9 @@ void cancel_this_thread_alarm();
 namespace elf {
 struct tls_data;
 }
+namespace mmu {
+extern unsigned __thread stack_page_read_counter;
+}
 namespace osv {
 
@@ -1000,6 +1003,7 @@ inline void preempt()
 
 inline void preempt_disable()
 {
+arch::read_next_stack_page();
 ++preempt_counter;
 barrier();
 }
@@ -1008,6 +1012,7 @@ inline void preempt_enable()
 {
 barrier();
 --preempt_counter;
+--mmu::stack_page_read_counter;
 if (preemptable() && need_reschedule && arch::irq_enabled()) {
 cpu::schedule();
 }
diff --git a/libc/mman.cc b/libc/mman.cc
index bb573a80..5b0436e2 100644
--- a/libc/mman.cc
+++ b/libc/mman.cc
@@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)
 // and did us the courtesy of telling this to ue (via MAP_STACK),
 // let's return the courtesy by returning pre-faulted memory.
 // FIXME: If issue #143 is fixed, this workaround should be 
removed.
-mmap_flags |= mmu::mmap_populate;
+mmap_flags |= mmu::mmap_stack;
 }
 if (flags & MAP_SHARED) {
 mmap_flags |= mmu::mmap_shared;
diff --git a/libc/pthread.cc b/libc/pthread.cc
index 44b93b83..ea909d5d 100644
--- a/libc/pthread.cc
+++ b/libc/pthread.cc
@@ -139,7 +139,8 @@ namespace pthread_private {
 return {attr.stack_begin, 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-10 Thread Waldek Kozaczuk
So I think thread::prepare_wait() is called in this context, which indeed 
disables preemption:

1080 void thread::prepare_wait()
1081 {
1082 // After setting the thread's status to "waiting", we must not 
preempt it,
1083 // as it is no longer in "running" state and therefore will not 
return.
1084 preempt_disable();
1085 assert(_detached_state->st.load() == status::running);
1086 _detached_state->st.store(status::waiting);
1087 }


On Tuesday, December 10, 2019 at 6:03:20 PM UTC-5, Waldek Kozaczuk wrote:
>
> So I have applied your patch (had to manually reformat it from the email 
> so it might be somewhat different):
>
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
> index 6803498f..70224472 100644
> --- a/arch/x64/arch-switch.hh
> +++ b/arch/x64/arch-switch.hh
> @@ -148,6 +148,11 @@ void thread::init_stack()
>  _state.rip = reinterpret_cast(thread_main);
>  _state.rsp = stacktop;
>  _state.exception_stack = _arch.exception_stack + 
> sizeof(_arch.exception_stack);
> +
> +// Since thread stacks are lazily allocated and the thread initially 
> starts
> +// running with preemption disabled, we need to pre-fault the first 
> stack page.
> +volatile char r = *((char*)(stacktop-1));
> +(void) r; // trick the compiler into thinking that r is used
>  }
>  
>  void thread::setup_tcb()
> diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
> index 17df5f5c..a0c18aa4 100644
> --- a/arch/x64/arch.hh
> +++ b/arch/x64/arch.hh
> @@ -20,8 +20,14 @@ namespace arch {
>  #define INSTR_SIZE_MIN 1
>  #define ELF_IMAGE_START OSV_KERNEL_BASE
>  
> +inline void read_next_stack_page() {
> +char i;
> +asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
> +}
> +
>  inline void irq_disable()
>  {
> +read_next_stack_page();
>  processor::cli();
>  }
>  
> @@ -30,6 +36,7 @@ inline void irq_disable_notrace();
>  
>  inline void irq_disable_notrace()
>  {
> +read_next_stack_page();
>  processor::cli_notrace();
>  }
>  
> diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
> index 18edf441..694815f8 100644
> --- a/include/osv/mmu-defs.hh
> +++ b/include/osv/mmu-defs.hh
> @@ -84,6 +84,7 @@ enum {
>  mmap_small   = 1ul << 5,
>  mmap_jvm_balloon = 1ul << 6,
>  mmap_file= 1ul << 7,
> +mmap_stack   = 1ul << 8,
>  };
>  
>  enum {
> diff --git a/include/osv/sched.hh b/include/osv/sched.hh
> index 66c0a44a..3ae1c2d0 100644
> --- a/include/osv/sched.hh
> +++ b/include/osv/sched.hh
> @@ -1000,6 +1000,7 @@ inline void preempt()
>  
>  inline void preempt_disable()
>  {
> +arch::read_next_stack_page();
>  ++preempt_counter;
>  barrier();
>  }
> diff --git a/libc/mman.cc b/libc/mman.cc
> index bb573a80..5b0436e2 100644
> --- a/libc/mman.cc
> +++ b/libc/mman.cc
> @@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)
>  // and did us the courtesy of telling this to ue (via MAP_STACK),
>  // let's return the courtesy by returning pre-faulted memory.
>  // FIXME: If issue #143 is fixed, this workaround should be 
> removed.
> -mmap_flags |= mmu::mmap_populate;
> +mmap_flags |= mmu::mmap_stack;
>  }
>  if (flags & MAP_SHARED) {
>  mmap_flags |= mmu::mmap_shared;
> diff --git a/libc/pthread.cc b/libc/pthread.cc
> index 44b93b83..ea909d5d 100644
> --- a/libc/pthread.cc
> +++ b/libc/pthread.cc
> @@ -139,7 +139,8 @@ namespace pthread_private {
>  return {attr.stack_begin, attr.stack_size};
>  }
>  size_t size = attr.stack_size;
> -void *addr = mmu::map_anon(nullptr, size, mmu::mmap_populate, 
> mmu::perm_rw);
> +printf("--> Created stack of size: %ld\n", size);
> +void *addr = mmu::map_anon(nullptr, size, mmu::mmap_stack, 
> mmu::perm_rw);
>  mmu::mprotect(addr, attr.guard_size, 0);
>  sched::thread::stack_info si{addr, size};
>  si.deleter = free_stack;
>
> I ran unit tests once and all passed (I use the bare metal machine so the 
> KVM is enabled) but I have not stressed them much. 
>
> I played a bit more with Java apps to see if we truly see the memory 
> savings. Indeed simple Java hello world uses around 10MB less with 12 Java 
> application threads. 
>
> I also played with Java httpserver app (./scripts/build -j4 
> image=openjdk8-zulu-full,java-httpserver) and stressed it a bit with apache 
> bench (ab -k -c 50 -n 1000 http://localhost:8000/). Overall it worked but 
> here I got a crash which indicates some shortcomings of our solution:
>
> (gdb) bt
> #0  abort (fmt=fmt@entry=0x40656db8 "Assertion failed: %s (%s: %s: %d)\n") 
> at runtime.cc:105
> #1  0x4023d4cf in __assert_fail (expr=expr@entry=0x4068f1c1 
> "sched::preemptable()", file=file@entry=0x40685125 "arch/x64/mmu.cc", 
> line=line@entry=37, func=func@entry=0x4068511a "page_fault") at 
> runtime.cc:139
> #2  0x403aee0c in page_fault (ef=0x80015048) at 
> arch/x64/arch-cpu.hh:107
> #3  
> #4  

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-10 Thread Waldek Kozaczuk
So I have applied your patch (had to manually reformat it from the email so 
it might be somewhat different):

diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
index 6803498f..70224472 100644
--- a/arch/x64/arch-switch.hh
+++ b/arch/x64/arch-switch.hh
@@ -148,6 +148,11 @@ void thread::init_stack()
 _state.rip = reinterpret_cast(thread_main);
 _state.rsp = stacktop;
 _state.exception_stack = _arch.exception_stack + 
sizeof(_arch.exception_stack);
+
+// Since thread stacks are lazily allocated and the thread initially 
starts
+// running with preemption disabled, we need to pre-fault the first 
stack page.
+volatile char r = *((char*)(stacktop-1));
+(void) r; // trick the compiler into thinking that r is used
 }
 
 void thread::setup_tcb()
diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
index 17df5f5c..a0c18aa4 100644
--- a/arch/x64/arch.hh
+++ b/arch/x64/arch.hh
@@ -20,8 +20,14 @@ namespace arch {
 #define INSTR_SIZE_MIN 1
 #define ELF_IMAGE_START OSV_KERNEL_BASE
 
+inline void read_next_stack_page() {
+char i;
+asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
+}
+
 inline void irq_disable()
 {
+read_next_stack_page();
 processor::cli();
 }
 
@@ -30,6 +36,7 @@ inline void irq_disable_notrace();
 
 inline void irq_disable_notrace()
 {
+read_next_stack_page();
 processor::cli_notrace();
 }
 
diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
index 18edf441..694815f8 100644
--- a/include/osv/mmu-defs.hh
+++ b/include/osv/mmu-defs.hh
@@ -84,6 +84,7 @@ enum {
 mmap_small   = 1ul << 5,
 mmap_jvm_balloon = 1ul << 6,
 mmap_file= 1ul << 7,
+mmap_stack   = 1ul << 8,
 };
 
 enum {
diff --git a/include/osv/sched.hh b/include/osv/sched.hh
index 66c0a44a..3ae1c2d0 100644
--- a/include/osv/sched.hh
+++ b/include/osv/sched.hh
@@ -1000,6 +1000,7 @@ inline void preempt()
 
 inline void preempt_disable()
 {
+arch::read_next_stack_page();
 ++preempt_counter;
 barrier();
 }
diff --git a/libc/mman.cc b/libc/mman.cc
index bb573a80..5b0436e2 100644
--- a/libc/mman.cc
+++ b/libc/mman.cc
@@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)
 // and did us the courtesy of telling this to ue (via MAP_STACK),
 // let's return the courtesy by returning pre-faulted memory.
 // FIXME: If issue #143 is fixed, this workaround should be 
removed.
-mmap_flags |= mmu::mmap_populate;
+mmap_flags |= mmu::mmap_stack;
 }
 if (flags & MAP_SHARED) {
 mmap_flags |= mmu::mmap_shared;
diff --git a/libc/pthread.cc b/libc/pthread.cc
index 44b93b83..ea909d5d 100644
--- a/libc/pthread.cc
+++ b/libc/pthread.cc
@@ -139,7 +139,8 @@ namespace pthread_private {
 return {attr.stack_begin, attr.stack_size};
 }
 size_t size = attr.stack_size;
-void *addr = mmu::map_anon(nullptr, size, mmu::mmap_populate, 
mmu::perm_rw);
+printf("--> Created stack of size: %ld\n", size);
+void *addr = mmu::map_anon(nullptr, size, mmu::mmap_stack, 
mmu::perm_rw);
 mmu::mprotect(addr, attr.guard_size, 0);
 sched::thread::stack_info si{addr, size};
 si.deleter = free_stack;

I ran the unit tests once and all passed (I use a bare-metal machine, so 
KVM is enabled), but I have not stressed them much. 

I played a bit more with Java apps to see if we truly get the memory 
savings. Indeed, a simple Java hello world uses around 10MB less with 12 
Java application threads. 

I also played with the Java httpserver app (./scripts/build -j4 
image=openjdk8-zulu-full,java-httpserver) and stressed it a bit with apache 
bench (ab -k -c 50 -n 1000 http://localhost:8000/). Overall it worked, but 
here I got a crash which indicates some shortcomings of our solution:

(gdb) bt
#0  abort (fmt=fmt@entry=0x40656db8 "Assertion failed: %s (%s: %s: %d)\n") 
at runtime.cc:105
#1  0x4023d4cf in __assert_fail (expr=expr@entry=0x4068f1c1 
"sched::preemptable()", file=file@entry=0x40685125 "arch/x64/mmu.cc", 
line=line@entry=37, func=func@entry=0x4068511a "page_fault") at 
runtime.cc:139
#2  0x403aee0c in page_fault (ef=0x80015048) at 
arch/x64/arch-cpu.hh:107
#3  
#4  0x403606c5 in elf::object::symtab_len (this=) at 
core/elf.cc:896
#5  0x40360780 in elf::object::lookup_addr 
(this=0xa14c9c00, addr=addr@entry=0x10f25eba 
) at core/elf.cc:928
#6  0x4036094f in elf::programoperator() (__closure=, 
__closure=, ml=...) at core/elf.cc:1464
#7  elf::program::with_modules > (f=..., 
this=0xa0091bb0) at include/osv/elf.hh:687
#8  elf::program::lookup_addr (this=0xa0091bb0, 
addr=addr@entry=0x10f25eba ) at 
core/elf.cc:1461
#9  0x4043e540 in osv::lookup_name_demangled 
(addr=addr@entry=0x10f25eba , 
buf=buf@entry=0x8247a750 "condvar::wait(lockfree::mutex*, 
sched::timer*)+526", len=len@entry=1024) at core/demangle.cc:47
#10 0x4023d2b0 in print_backtrace () at 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-10 Thread Waldek Kozaczuk
Hi,

On Tuesday, December 10, 2019 at 1:26:31 PM UTC-5, Matthew Pabst wrote:
>
> On Sunday, December 8, 2019 at 1:03:45 PM UTC-6, Waldek Kozaczuk wrote:
>>
>>
>> I think it is more than just "wasting the last page of a small stack". 
>> Without a check to validate if we are reading within the bounds of the 
>> stack, we could cause unintended faults that would actually crash the app, 
>> right? 
>> If we want to make the logic in `irq_disable`  as efficient as possible, 
>> we could place the bottom of the stack address in some well-known place 
>> (thread-local, cpu-local?) on each stack switch and check against it before 
>> trying to read. But some sort of extra 'if' seems to be unavoidable. 
>> Otherwise, we will crash the app if it operates on the last page of the 
>> stack, right?
>>
>
> This is what I was originally getting at, but after thinking about it for 
> a while I think that if the next page of the stack is not mapped, then the 
> code may cause a fault regardless. Of course there is the possibility that 
> the code uses exactly the amount of stack pages allocated and no more, in 
> which case the extra read will cause a segmentation fault.
>
> Here's a more recent diff I've been working on that passes most test cases 
> initially:
>
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
>
> @@ -148,6 +148,11 @@ void thread::init_stack()
>
>  _state.exception_stack = _arch.exception_stack + 
> sizeof(_arch.exception_stack);
>
> +
>
> +// Since thread stacks are lazily allocated and the thread initially 
> starts
>
> +// running with preemption disabled, we need to pre-fault the first 
> stack page.
>
> +volatile char r = *((char*)(stacktop-1));
>
> +(void) r; // trick the compiler into thinking that r is used
>
>  }
>
>  
>
>  void thread::setup_tcb()
>
> diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
>
> @@ -20,8 +20,15 @@ namespace arch {
>
>  
>
> +inline void read_next_stack_page() {
>
> +char i;
>
> +asm volatile("movb -4096(%%rsp), %0" : "=r"(i));
>
> +}
>
> +
>
>  inline void irq_disable()
>
>  {
>
> +read_next_stack_page();
>
>  processor::cli();
>
>  }
>
>  
>
> @@ -30,6 +37,7 @@ inline void irq_disable_notrace();
>
>  
>
>  inline void irq_disable_notrace()
>
>  {
>
> +read_next_stack_page();
>
>  processor::cli_notrace();
>
>  }
>
>
>  enum {
>
> diff --git a/include/osv/sched.hh b/include/osv/sched.hh=
>
> @@ -1000,6 +1000,7 @@ inline void preempt()
>
>  
>
>  inline void preempt_disable()
>
>  {
>
> +arch::read_next_stack_page();
>
>  ++preempt_counter;
>
>  barrier();
>
>  }
>
> diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
>
> @@ -84,6 +84,7 @@ enum {
>
>  mmap_small   = 1ul << 5,
>
>  mmap_jvm_balloon = 1ul << 6,
>
>  mmap_file= 1ul << 7,
>
> +mmap_stack   = 1ul << 8,
>
>  };
>
> diff --git a/libc/mman.cc b/libc/mman.cc
>
> @@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)
>
>  // and did us the courtesy of telling this to ue (via MAP_STACK),
>
>  // let's return the courtesy by returning pre-faulted memory.
>
>  // FIXME: If issue #143 is fixed, this workaround should be 
> removed.
>
> -mmap_flags |= mmu::mmap_populate;
>
> +mmap_flags |= mmu::mmap_stack;
>
>  }
>
>  if (flags & MAP_SHARED) {
>
>  mmap_flags |= mmu::mmap_shared;
>
> diff --git a/libc/pthread.cc b/libc/pthread.cc=
>
> @@ -139,7 +139,7 @@ namespace pthread_private {
>
>  return {attr.stack_begin, attr.stack_size};
>
>  }
>
>  size_t size = attr.stack_size;
>
> -void *addr = mmu::map_anon(nullptr, size, mmu::mmap_populate, 
> mmu::perm_rw);
>
> +void *addr = mmu::map_anon(nullptr, size, mmu::mmap_stack, 
> mmu::perm_rw);
>
>  mmu::mprotect(addr, attr.guard_size, 0);
>
>  sched::thread::stack_info si{addr, size};
>  si.deleter = free_stack;  
>
> However there's some race condition still, as some tests fail after 
> running multiple times.
>
> tst-yield.so occasionally fails with the following logs: 
>
> OSv v0.54.0-50-g2b62eac1
>
> eth0: 192.168.122.15
>
> Booted up in 0.00 ms
>
> Cmdline: /tests/tst-yield.so
>
> 562665310100 < 601295415960, now=562665362700
>
>
> [backtrace]
>
> 0x403f57d8 
>
> QEMU: Terminated
>
> Test tst-yield.so FAILED
>
>
This crash looks similar to the one reported here 
- https://github.com/cloudius-systems/osv/issues/382#issuecomment-363589062. 
To the best of our understanding, this issue seems to happen when the hpet 
clock is used, which seems to be your case: the "Booted up in 0.00 ms" 
message indicates that no paravirtual clock is enabled. I bet that when you 
run with KVM on, the issue disappears.  


tst-vfs.so occasionally fails with the following logs:
>
> OSv v0.54.0-50-g2b62eac1
>
> eth0: 192.168.122.15
>
> Booted up in 0.00 ms
>
> Cmdline: /tests/tst-vfs.so
>
> Running 3 test cases...
>
> Running dentry 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-10 Thread Matthew Pabst
On Sunday, December 8, 2019 at 1:03:45 PM UTC-6, Waldek Kozaczuk wrote:
>
>
> I think it is more than just "wasting the last page of a small stack". 
> Without a check to validate if we are reading within the bounds of the 
> stack, we could cause unintended faults that would actually crash the app, 
> right? 
> If we want to make the logic in `irq_disable`  as efficient as possible, 
> we could place the bottom of the stack address in some well-known place 
> (thread-local, cpu-local?) on each stack switch and check against it before 
> trying to read. But some sort of extra 'if' seems to be unavoidable. 
> Otherwise, we will crash the app if it operates on the last page of the 
> stack, right?
>

This is what I was originally getting at, but after thinking about it for a 
while I think that if the next page of the stack is not mapped, the code 
may cause a fault regardless. Of course, there is the possibility that the 
code uses exactly the number of stack pages allocated and no more, in which 
case the extra read will cause a segmentation fault.

Here's a more recent diff I've been working on that passes most test cases 
initially:

diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh

@@ -148,6 +148,11 @@ void thread::init_stack()

 _state.exception_stack = _arch.exception_stack + 
sizeof(_arch.exception_stack);

+

+// Since thread stacks are lazily allocated and the thread initially 
starts

+// running with preemption disabled, we need to pre-fault the first 
stack page.

+volatile char r = *((char*)(stacktop-1));

+(void) r; // trick the compiler into thinking that r is used

 }

 

 void thread::setup_tcb()

diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh

@@ -20,8 +20,15 @@ namespace arch {

 

+inline void read_next_stack_page() {

+char i;

+asm volatile("movb -4096(%%rsp), %0" : "=r"(i));

+}

+

 inline void irq_disable()

 {

+read_next_stack_page();

 processor::cli();

 }

 

@@ -30,6 +37,7 @@ inline void irq_disable_notrace();

 

 inline void irq_disable_notrace()

 {

+read_next_stack_page();

 processor::cli_notrace();

 }


 enum {

diff --git a/include/osv/sched.hh b/include/osv/sched.hh=

@@ -1000,6 +1000,7 @@ inline void preempt()

 

 inline void preempt_disable()

 {

+arch::read_next_stack_page();

 ++preempt_counter;

 barrier();

 }

diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh

@@ -84,6 +84,7 @@ enum {

 mmap_small   = 1ul << 5,

 mmap_jvm_balloon = 1ul << 6,

 mmap_file= 1ul << 7,

+mmap_stack   = 1ul << 8,

 };

diff --git a/libc/mman.cc b/libc/mman.cc

@@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)

 // and did us the courtesy of telling this to ue (via MAP_STACK),

 // let's return the courtesy by returning pre-faulted memory.

 // FIXME: If issue #143 is fixed, this workaround should be 
removed.

-mmap_flags |= mmu::mmap_populate;

+mmap_flags |= mmu::mmap_stack;

 }

 if (flags & MAP_SHARED) {

 mmap_flags |= mmu::mmap_shared;

diff --git a/libc/pthread.cc b/libc/pthread.cc=

@@ -139,7 +139,7 @@ namespace pthread_private {

 return {attr.stack_begin, attr.stack_size};

 }

 size_t size = attr.stack_size;

-void *addr = mmu::map_anon(nullptr, size, mmu::mmap_populate, 
mmu::perm_rw);

+void *addr = mmu::map_anon(nullptr, size, mmu::mmap_stack, 
mmu::perm_rw);

 mmu::mprotect(addr, attr.guard_size, 0);

 sched::thread::stack_info si{addr, size};
 si.deleter = free_stack;  

However, there is still some race condition, as some tests fail after running 
multiple times.

tst-yield.so occasionally fails with the following logs: 

OSv v0.54.0-50-g2b62eac1

eth0: 192.168.122.15

Booted up in 0.00 ms

Cmdline: /tests/tst-yield.so

562665310100 < 601295415960, now=562665362700


[backtrace]

0x403f57d8 

QEMU: Terminated

Test tst-yield.so FAILED

tst-vfs.so occasionally fails with the following logs:

OSv v0.54.0-50-g2b62eac1

eth0: 192.168.122.15

Booted up in 0.00 ms

Cmdline: /tests/tst-vfs.so

Running 3 test cases...

Running dentry hierarchy tests

dentry hierarchy tests succeeded

Running concurrent file operation tests

test1, with 10 threads

/git-repos/osv/tests/tst-vfs.cc(86): info:  passed





terminate called after throwing an instance of 'std::length_error'

  what():  basic_string::_S_create

Aborted


[backtrace]

0x404a2f6d 

0x408944ff 

Test tst-vfs.so FAILED

My initial thought was that I might need to read more than one stack page 
ahead; however, implementing that idea didn't resolve the failures. Neither 
of the errors above is a page fault, so I'm not sure where the problem 
could be.
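For reference, the "read more than one stack page ahead" variant mentioned above could look roughly like this (a hedged sketch under assumed names; the real change would go through read_next_stack_page() in arch/x64/arch.hh):

```cpp
#include <cassert>

// Hypothetical sketch: touch 'npages' pages below the given stack pointer
// so any lazy-allocation faults happen now, while they can still be handled.
// Returns the number of pages touched, purely for testability.
inline int read_next_stack_pages(char *sp, int npages)
{
    int touched = 0;
    for (int n = 1; n <= npages; ++n) {
        // A volatile read forces the access without needing inline asm.
        (void)*reinterpret_cast<volatile char*>(sp - n * 4096);
        ++touched;
    }
    return touched;
}
```

In the real kernel the loop would of course need the same bounds check discussed earlier in the thread, since touching several pages ahead runs off a small stack even sooner than one.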

Let me know if you have any idea what's going on,
Matthew



Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-08 Thread Waldek Kozaczuk


On Sunday, December 8, 2019 at 12:58:35 PM UTC-5, Nadav Har'El wrote:
>
> On Sun, Dec 8, 2019 at 4:10 PM Waldek Kozaczuk  > wrote:
>
>>
>>
>> On Sun, Dec 8, 2019 at 07:32 Nadav Har'El > > wrote:
>>
>>>
>>> On Sat, Dec 7, 2019 at 6:07 AM Waldek Kozaczuk >> > wrote:
>>>
 On second thought, let me tweak my suggestions and add some more.

 On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote:
>
>
>
> On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:
>>
>> Ah, I think I understand things better now.
>>
>> Here is my current idea for a solution:
>>
> > Every time we disable interrupts and preemption (in irq_disable() and 
>> preempt_disable()), we call a method like ensure_stack_mapped(), which 
>> will 
>> do the following: 
>>
>
>>- If the current stack pointer is in a region mapped as 
>>mmu::mmap_stack (i.e. an application stack), then we read the next 
>> few 
>>stack pages.
>>- Otherwise the stack pointer is in a kernel stack, which is 
>>already mapped so reading ahead may cause a segmentation fault.
>>
>> I do not think you need to differentiate between "kernel" vs "app" 
> stack. In this context (forget fault handling kernel code that uses 
> separate stack) both kernel and application code would run on the same 
> application stack. Unless you meant simply checking if the reading byte 
> on 
> the stack 1 or 2 pages ahead still fits within the stack mapped area. 
> Which 
> is indeed what we have to do (application stack is almost full it is full 
> and we cannot do anything about). But I would start without making that 
> check.
>
 I think eventually we want to differentiate between "kernel" and 
 "application" threads and only ensure stack for application threads.

>>>
>>> Why are you saying this?
>>>
>>> The only reason I can think to do that is code which already has a tiny 
>>> stack (like 16KB) and we don't want to waste the last page of the stack.
>>> Is this the reason you were thinking of?
>>>
>> Well, the kernel stacks are always fully pre-allocated (and will stay 
>> like so, right?) so why bother triggering page fault if there will be none 
>> ever? But maybe extra if is more expensive. 
>>
>
> Ok, so it's not a "kernel" vs "application" code issue, but rather a 
> question of how the current thread's stack was allocated.
> We can easily have a thread-local boolean, whether the current stack was 
> allocated by mmap() or malloc(), and check that. But it will be an extra 
> if().
>
> The beauty of the simplistic approach I hoped we could do - to just read 
> one byte - was that such an instruction would, at least in theory, be even 
> cheaper than a if() needed to decide not to do it in the first place. But 
> wasting the last page of a small stack (especially the exception stacks, 
> etc.) may be more of a problem.
>
I think it is more than just "wasting the last page of a small stack". 
Without a check to validate that we are reading within the bounds of the 
stack, we could cause unintended faults that would actually crash the app, 
right? 
If we want to make the logic in `irq_disable` as efficient as possible, we 
could place the bottom-of-stack address in some well-known place 
(thread-local, cpu-local?) on each stack switch and check against it before 
trying to read. But some sort of extra 'if' seems to be unavoidable. 
Otherwise, we will crash the app if it operates on the last page of the 
stack, right?
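The bottom-of-stack check suggested here could be sketched like this (a sketch only; the name stack_bottom is hypothetical, and in OSv it would be a thread-local or cpu-local value refreshed on every stack switch):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: lowest valid address of the current thread's stack,
// refreshed on every stack switch.
static thread_local uintptr_t stack_bottom = 0;

// Touch the page 4096 bytes below 'sp' only if it still lies within the
// stack. Returns whether a pre-fault read was attempted, for testability.
inline bool prefault_next_stack_page(uintptr_t sp)
{
    uintptr_t next_page = sp - 4096;
    if (next_page < stack_bottom) {
        // Would run past the bottom of the stack: reading here could
        // crash the app, so skip the pre-fault.
        return false;
    }
    (void)*reinterpret_cast<volatile char*>(next_page);
    return true;
}
```

This is exactly the unavoidable extra 'if' discussed above: one comparison against a cached address in the common path, in exchange for never faulting outside the mapped stack.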

>  
>
>>
>> But I am worried we might need some “if” just to make sure we do not 
>> cause some stack overflow and page fault into area that is not mapped when 
>> for example the code is executing very deep almost at the bottom of the 
>> stack. Especially if the app uses tiny stacks like Golang does?
>>
>>>
>>> Why not "mov" for reading? Why writing? 
>>>
>> Where would you “mov” to?
>>
>
> I think you can do something like this (completely untested):
>
> int i;
> asm volatile("mov -4096(%rsp), %0" : "=r"(i));
>
>
> To let the compiler choose the register to write to, hoping there will be 
> one free in the calling code. I don't know if there's a mov replacement 
> which doesn't store its result anywhere :-(
>
>  
>
>> Can we use any general register without affecting surrounding compiler 
>> generated code? We could always push and then pop the read value but that 
>> would be 2 instructions instead of single one;
>>
>> push -4096(%rsp)
>> pop 
>>
>
>>
>>>
>>>
 mov $0,-4096(%rsp)
 mov $0,-8192(%rsp) //For second page, etc


 This effectively would read single-byte 4096 bytes deep in the stack 
> and automatically trigger page fault right at this moment IF -4096(%
> rsp)has not been allocated yet. For starters, it might be as simple 
> as that. 
>
> I am not sure if cmp is the best instruction to use here or maybe 
> 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-08 Thread Nadav Har'El
On Sun, Dec 8, 2019 at 4:10 PM Waldek Kozaczuk  wrote:

>
>
> On Sun, Dec 8, 2019 at 07:32 Nadav Har'El  wrote:
>
>>
>> On Sat, Dec 7, 2019 at 6:07 AM Waldek Kozaczuk 
>> wrote:
>>
>>> On second thought, let me tweak my suggestions and add some more.
>>>
>>> On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote:



 On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:
>
> Ah, I think I understand things better now.
>
> Here is my current idea for a solution:
>
> Every time we disable interrupts and preemption (in irq_disable() and
> preempt_disable()), we call a method like ensure_stack_mapped(), which 
> will
> do the following:
>

>- If the current stack pointer is in a region mapped as
>mmu::mmap_stack (i.e. an application stack), then we read the next few
>stack pages.
>- Otherwise the stack pointer is in a kernel stack, which is
>already mapped so reading ahead may cause a segmentation fault.
>
> I do not think you need to differentiate between "kernel" vs "app"
 stack. In this context (forget fault handling kernel code that uses
 separate stack) both kernel and application code would run on the same
 application stack. Unless you meant simply checking if the reading byte on
 the stack 1 or 2 pages ahead still fits within the stack mapped area. Which
 is indeed what we have to do (application stack is almost full it is full
 and we cannot do anything about). But I would start without making that
 check.

>>> I think eventually we want to differentiate between "kernel" and
>>> "application" threads and only ensure stack for application threads.
>>>
>>
>> Why are you saying this?
>>
>> The only reason I can think to do that is code which already has a tiny
>> stack (like 16KB) and we don't want to waste the last page of the stack.
>> Is this the reason you were thinking of?
>>
> Well, the kernel stacks are always fully pre-allocated (and will stay like
> so, right?) so why bother triggering page fault if there will be none ever?
> But maybe extra if is more expensive.
>

Ok, so it's not a "kernel" vs "application" code issue, but rather a
question of how the current thread's stack was allocated.
We can easily have a thread-local boolean recording whether the current
stack was allocated by mmap() or malloc(), and check that. But it will be
an extra if().

The beauty of the simplistic approach I hoped we could do - to just read
one byte - was that such an instruction would, at least in theory, be even
cheaper than a if() needed to decide not to do it in the first place. But
wasting the last page of a small stack (especially the exception stacks,
etc.) may be more of a problem.
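The thread-local boolean described here might look like the following (a sketch under assumed names; current_stack_is_lazy is hypothetical and would be set wherever the stack is allocated, e.g. in init_stack() or pthread stack setup):

```cpp
#include <cassert>

// Hypothetical flag: true only for threads whose stack came from mmap()
// with mmap_stack (lazily populated); false for malloc'ed kernel stacks,
// which are always fully allocated.
static thread_local bool current_stack_is_lazy = false;

// Pre-fault the next stack page only when it can actually be missing.
// Returns whether the read was performed, for testability.
inline bool maybe_prefault(char *sp)
{
    if (!current_stack_is_lazy) {
        return false;  // fully pre-allocated stack: no fault possible
    }
    (void)*reinterpret_cast<volatile char*>(sp - 4096);
    return true;
}
```

This is the trade-off Nadav describes: the branch spares kernel threads a pointless read, but adds an if() to a path (preempt_disable) that is meant to be as cheap as possible.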


>
> But I am worried we might need some “if” just to make sure we do not cause
> some stack overflow and page fault into area that is not mapped when for
> example the code is executing very deep almost at the bottom of the stack.
> Especially if the app uses tiny stacks like Golang does?
>
>>
>> Why not "mov" for reading? Why writing?
>>
> Where would you “mov” to?
>

I think you can do something like this (completely untested):

int i;
asm volatile("mov -4096(%rsp), %0" : "=r"(i));


To let the compiler choose the register to write to, hoping there will be
one free in the calling code. I don't know if there's a mov replacement
which doesn't store its result anywhere :-(
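The compiler-chosen-register read can be tried in isolation like this (GCC/Clang on x86-64 assumed; the snippet reads through an ordinary pointer so it is safe to run anywhere, whereas in OSv the memory operand would be -4096(%%rsp)):

```cpp
#include <cassert>

// One-byte load into whatever register the compiler has free, mirroring
// 'movb -4096(%%rsp), %0' from the patch. With a char output operand GCC
// substitutes a byte register for %0.
static inline char probe_byte(const char *p)
{
    char i;
    asm volatile("movb (%1), %0" : "=r"(i) : "r"(p) : "memory");
    return i;
}
```

The "memory" clobber keeps the compiler from reordering the probe past surrounding accesses; whether the real pre-fault read needs it is a separate question.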



> Can we use any general register without affecting surrounding compiler
> generated code? We could always push and then pop the read value but that
> would be 2 instructions instead of single one;
>
> push -4096(%rsp)
> pop
>

>
>>
>>
>>> mov $0,-4096(%rsp)
>>> mov $0,-8192(%rsp) //For second page, etc
>>>
>>>
>>> This effectively would read single-byte 4096 bytes deep in the stack and
 automatically trigger page fault right at this moment IF -4096(%rsp)has
 not been allocated yet. For starters, it might be as simple as that.

 I am not sure if cmp is the best instruction to use here or maybe there
 is a better assembly instruction for that.

 Finally to see the effect of this logic you will need to revert this
>>> commit -
>>> https://github.com/cloudius-systems/osv/commit/41efdc1cb8b19216d98fc0baf38e5312fd3c27d9
>>> .
>>>
>>> Testing these changes will be fun :-) I would start with unit tests and
>>> especially use tst-pipe to run it 100 or more times to verify things do not
>>> break. Then it would nice to measure the improvements of memory
>>> utilizations - please see this email thread -
>>> https://groups.google.com/d/msg/osv-dev/LqUAWjzVpSc/sF-G4dYYBgAJ.
>>> Hopefully, with your changes, my Java example with 400 threads will need
>>> much less memory than 600MB.
>>>
>>> Good luck!
>>>

 Let me know what you think,
> Matthew
>
> On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:
>>
>> On Tue, Dec 3, 2019 at 5:56 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-08 Thread Waldek Kozaczuk
On Sun, Dec 8, 2019 at 07:32 Nadav Har'El  wrote:

>
> On Sat, Dec 7, 2019 at 6:07 AM Waldek Kozaczuk 
> wrote:
>
>> On second thought, let me tweak my suggestions and add some more.
>>
>> On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote:
>>>
>>>
>>>
>>> On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:

 Ah, I think I understand things better now.

 Here is my current idea for a solution:

 Every time we disable interrupts and preemption (in irq_disable() and
 preempt_disable()), we call a method like ensure_stack_mapped(), which will
 do the following:

>>>
- If the current stack pointer is in a region mapped as
mmu::mmap_stack (i.e. an application stack), then we read the next few
stack pages.
- Otherwise the stack pointer is in a kernel stack, which is
already mapped so reading ahead may cause a segmentation fault.

 I do not think you need to differentiate between "kernel" vs "app"
>>> stack. In this context (forget fault handling kernel code that uses
>>> separate stack) both kernel and application code would run on the same
>>> application stack. Unless you meant simply checking if the reading byte on
>>> the stack 1 or 2 pages ahead still fits within the stack mapped area. Which
>>> is indeed what we have to do (application stack is almost full it is full
>>> and we cannot do anything about). But I would start without making that
>>> check.
>>>
>> I think eventually we want to differentiate between "kernel" and
>> "application" threads and only ensure stack for application threads.
>>
>
> Why are you saying this?
>
> The only reason I can think to do that is code which already has a tiny
> stack (like 16KB) and we don't want to waste the last page of the stack.
> Is this the reason you were thinking of?
>
Well, the kernel stacks are always fully pre-allocated (and will stay like
that, right?), so why bother triggering a page fault if there will never be
one? But maybe the extra if is more expensive.

But I am worried we might need some “if” just to make sure we do not cause
some stack overflow and page fault into an area that is not mapped when, for
example, the code is executing very deep, almost at the bottom of the stack.
Especially if the app uses tiny stacks like Golang does?

>
>
>
>
>>> I also do not think you need ensure_stack_mapped(), I think we could
>>> directly change preempt_disable() and irq_disable() and prepend it with
>>> simple inline assembly and C like that:
>>>
>>> cmpb $0, -4096(%rsp)
>>>
>>> Given that the logic will be more than a single assembly instruction, I
>> take my suggestion back and agree with your ensure_stack_mapped() function
>> (or ensure_stack_allocated()). However, I still think it would be easier to
>> call it from preempt_disable() and irq_disable() instead of calling it in all
>> the places where these two are called.
>>
>
> I agree about the latter, but we need to be careful not to make this a
> complex function, because preempt_disable() is supposed to be very
> efficient.
> I really hoped this function won't even have an if(). The hope was to have
> something very simple in the common case (where enough stack is available)
> but a more complex logic in the case of page fault.
>
>
>> Also, instead of the "cmp" instruction, which is not side-effect-free, we
>> could actually write instead of reading (I do not think it makes much
>> difference):
>>
>
> Why not "mov" for reading? Why writing?
>
Where would you “mov” to? Can we use any general register without affecting
surrounding compiler-generated code? We could always push and then pop the
read value, but that would be 2 instructions instead of a single one:

push -4096(%rsp)
pop


>
>
>> mov $0,-4096(%rsp)
>> mov $0,-8192(%rsp) //For second page, etc
>>
>>
>> This effectively would read single-byte 4096 bytes deep in the stack and
>>> automatically trigger page fault right at this moment IF -4096(%rsp)has
>>> not been allocated yet. For starters, it might be as simple as that.
>>>
>>> I am not sure if cmp is the best instruction to use here or maybe there
>>> is a better assembly instruction for that.
>>>
>>> Finally to see the effect of this logic you will need to revert this
>> commit -
>> https://github.com/cloudius-systems/osv/commit/41efdc1cb8b19216d98fc0baf38e5312fd3c27d9
>> .
>>
>> Testing these changes will be fun :-) I would start with unit tests and
>> especially use tst-pipe to run it 100 or more times to verify things do not
>> break. Then it would nice to measure the improvements of memory
>> utilizations - please see this email thread -
>> https://groups.google.com/d/msg/osv-dev/LqUAWjzVpSc/sF-G4dYYBgAJ.
>> Hopefully, with your changes, my Java example with 400 threads will need
>> much less memory than 600MB.
>>
>> Good luck!
>>
>>>
>>> Let me know what you think,
 Matthew

 On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:
>
> On Tue, Dec 3, 2019 at 5:56 PM 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-08 Thread Nadav Har'El
On Sat, Dec 7, 2019 at 6:07 AM Waldek Kozaczuk  wrote:

> On second thought, let me tweak my suggestions and add some more.
>
> On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote:
>>
>>
>>
>> On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:
>>>
>>> Ah, I think I understand things better now.
>>>
>>> Here is my current idea for a solution:
>>>
>>> Every time we disable interrupts and preemption (in irq_disable() and
>>> preempt_disable()), we call a method like ensure_stack_mapped(), which will
>>> do the following:
>>>
>>
>>>- If the current stack pointer is in a region mapped as
>>>mmu::mmap_stack (i.e. an application stack), then we read the next few
>>>stack pages.
>>>- Otherwise the stack pointer is in a kernel stack, which is already
>>>mapped so reading ahead may cause a segmentation fault.
>>>
>> I do not think you need to differentiate between "kernel" vs "app"
>> stack. In this context (forget the fault-handling kernel code that uses a
>> separate stack) both kernel and application code would run on the same
>> application stack. Unless you meant simply checking whether the byte read
>> 1 or 2 pages ahead on the stack still fits within the mapped stack area. That
>> is indeed what we have to do (if the application stack is almost full, it is
>> full and we cannot do anything about it). But I would start without making
>> that check.
>>
> I think eventually we want to differentiate between "kernel" and
> "application" threads and only ensure stack for application threads.
>

Why are you saying this?

The only reason I can think to do that is code which already has a tiny
stack (like 16KB) and we don't want to waste the last page of the stack.
Is this the reason you were thinking of?



>> I also do not think you need ensure_stack_mapped(), I think we could
>> directly change preempt_disable() and irq_disable() and prepend it with
>> simple inline assembly and C like that:
>>
>> cmpb $0, -4096(%rsp)
>>
> Given that the logic will be more than a single assembly instruction, I
> take my suggestion back and agree with your ensure_stack_mapped() function
> (or ensure_stack_allocated()). However, I still think it would be easier to
> call it from preempt_disable() and irq_disable() instead of calling it in all
> the places where these two are called.
>

I agree about the latter, but we need to be careful not to make this a
complex function, because preempt_disable() is supposed to be very
efficient.
I really hoped this function wouldn't even have an if(). The hope was to have
something very simple in the common case (where enough stack is available)
but more complex logic in the case of a page fault.


> Also, instead of the "cmp" instruction, which is not side-effect-free, we could
> actually write instead of reading (I do not think it makes much
> difference):
>

Why not "mov" for reading? Why writing?


> mov $0,-4096(%rsp)
> mov $0,-8192(%rsp) //For second page, etc
>
>
>> This effectively would read a single byte 4096 bytes deep in the stack and
>> automatically trigger a page fault right at this moment IF -4096(%rsp) has
>> not been allocated yet. For starters, it might be as simple as that.
>>
>> I am not sure if cmp is the best instruction to use here or maybe there
>> is a better assembly instruction for that.
>>
>> Finally to see the effect of this logic you will need to revert this
> commit -
> https://github.com/cloudius-systems/osv/commit/41efdc1cb8b19216d98fc0baf38e5312fd3c27d9
> .
>
> Testing these changes will be fun :-) I would start with unit tests and
> especially use tst-pipe, running it 100 or more times to verify things do not
> break. Then it would be nice to measure the improvement in memory
> utilization - please see this email thread -
> https://groups.google.com/d/msg/osv-dev/LqUAWjzVpSc/sF-G4dYYBgAJ.
> Hopefully, with your changes, my Java example with 400 threads will need
> much less memory than 600MB.
>
> Good luck!
>
>>
>> Let me know what you think,
>>> Matthew
>>>
>>> On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:

 On Tue, Dec 3, 2019 at 5:56 PM Matthew Pabst 
 wrote:

> Thanks for the response.
>
> I've been trying to think through the logic for this change more, and
> I don't completely understand the proposed solution.
>
> For example, say the program faults on a stack page. Then before
> disabling interrupts we access the next stack page(s). If one of those
> pages is not yet allocated, it will cause a second page fault.
>

 No, that is not a problem. The page-fault handling code already knows
 it can be used without a working stack - so we have separate interrupt and
 exception stacks which we switch to while handling those, and they work
 fine even if the user's stack is empty.

 Rather, the problem is what are in Linux known as "system calls", and
 in OSv are ordinary function calls.
 Imagine you do call read() or sleep() or 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-06 Thread Waldek Kozaczuk
On second thought, let me tweak my suggestions and add some more.

On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote:
>
>
>
> On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:
>>
>> Ah, I think I understand things better now.
>>
>> Here is my current idea for a solution:
>>
>> Every time we disable interrupts and preemption (in irq_disable() and 
>> preempt_disable()), we call a method like ensure_stack_mapped(), which will 
>> do the following: 
>>
>
>>- If the current stack pointer is in a region mapped as 
>>mmu::mmap_stack (i.e. an application stack), then we read the next few 
>>stack pages.
>>- Otherwise the stack pointer is in a kernel stack, which is already 
>>mapped so reading ahead may cause a segmentation fault.
>>
> I do not think you need to differentiate between "kernel" vs "app" stack.
> In this context (forget the fault-handling kernel code that uses a separate
> stack) both kernel and application code would run on the same application
> stack. Unless you meant simply checking whether the byte read 1 or 2 pages
> ahead on the stack still fits within the mapped stack area. That is indeed
> what we have to do (if the application stack is almost full, it is full and
> we cannot do anything about it). But I would start without making that check.
>
I think eventually we want to differentiate between "kernel" and 
"application" threads and only ensure stack for application threads.

>
> I also do not think you need ensure_stack_mapped(), I think we could 
> directly change preempt_disable() and irq_disable() and prepend it with 
> simple inline assembly and C like that:
>
> cmpb $0, -4096(%rsp)
>
Given that the logic will be more than a single assembly instruction, I take
my suggestion back and agree with your ensure_stack_mapped() function (or
ensure_stack_allocated()). However, I still think it would be easier to
call it from preempt_disable() and irq_disable() instead of calling it in all
the places where these two are called.

Also, instead of the "cmp" instruction, which is not side-effect-free, we could
actually write instead of reading (I do not think it makes much
difference):

mov $0,-4096(%rsp)
mov $0,-8192(%rsp) //For second page, etc


> This effectively would read a single byte 4096 bytes deep in the stack and
> automatically trigger a page fault right at this moment IF -4096(%rsp) has
> not been allocated yet. For starters, it might be as simple as that.
>
> I am not sure if cmp is the best instruction to use here or maybe there is 
> a better assembly instruction for that.
>
> Finally to see the effect of this logic you will need to revert this 
commit - 
https://github.com/cloudius-systems/osv/commit/41efdc1cb8b19216d98fc0baf38e5312fd3c27d9
.

Testing these changes will be fun :-) I would start with unit tests and
especially use tst-pipe, running it 100 or more times to verify things do not
break. Then it would be nice to measure the improvement in memory
utilization - please see this email thread -
https://groups.google.com/d/msg/osv-dev/LqUAWjzVpSc/sF-G4dYYBgAJ.
Hopefully, with your changes, my Java example with 400 threads will need
much less memory than 600MB.

Good luck!

>
> Let me know what you think,
>> Matthew
>>
>> On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:
>>>
>>> On Tue, Dec 3, 2019 at 5:56 PM Matthew Pabst  
>>> wrote:
>>>
 Thanks for the response.

 I've been trying to think through the logic for this change more, and 
 I don't completely understand the proposed solution.

 For example, say the program faults on a stack page. Then before 
 disabling interrupts we access the next stack page(s). If one of those 
 pages is not yet allocated, it will cause a second page fault. 

>>>
>>> No, that is not a problem. The page-fault handling code already knows it 
>>> can be used without a working stack - so we have separate interrupt and 
>>> exception stacks which we switch to while handling those, and they work 
>>> fine even if the user's stack is empty.
>>>
>>> Rather, the problem is what are in Linux known as "system calls", and in 
>>> OSv are ordinary function calls.
>>> Imagine you do call read() or sleep() or whatever in your application. 
>>> Some of the code implementing these
>>> functions, which considers itself "kernel" code, disables interrupts or 
>>> preemption. But it also uses the stack,
>>> and if it suddenly needs a new page of the stack, the resulting page 
>>> fault will refuse to run.
>>>  
>>>
 This second page fault will invoke the page fault handler, which will 
 attempt to run some code, which may access the first stack page and run 
 into deadlock.

 Is this a possible problem, or am I misunderstanding how page faults 
 are handled in OSv? 

 If this is a problem, could we switch to a pre-populated per-CPU 
 exception stack before calling the page fault handler?

>>>
>>> We do exactly that 

Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-06 Thread Waldek Kozaczuk


On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote:
>
> Ah, I think I understand things better now.
>
> Here is my current idea for a solution:
>
> Every time we disable interrupts and preemption (in irq_disable() and 
> preempt_disable()), we call a method like ensure_stack_mapped(), which will 
> do the following: 
>

>- If the current stack pointer is in a region mapped as 
>mmu::mmap_stack (i.e. an application stack), then we read the next few 
>stack pages.
>- Otherwise the stack pointer is in a kernel stack, which is already 
>mapped so reading ahead may cause a segmentation fault.
>
I do not think you need to differentiate between "kernel" vs "app" stack.
In this context (forget the fault-handling kernel code that uses a separate
stack) both kernel and application code would run on the same application
stack. Unless you meant simply checking whether the byte read 1 or 2 pages
ahead on the stack still fits within the mapped stack area. That is indeed
what we have to do (if the application stack is almost full, it is full and
we cannot do anything about it). But I would start without making that check.

I also do not think you need ensure_stack_mapped(), I think we could 
directly change preempt_disable() and irq_disable() and prepend it with 
simple inline assembly and C like that:

cmpb $0, -4096(%rsp)

This effectively would read a single byte 4096 bytes deep in the stack and
automatically trigger a page fault right at this moment IF -4096(%rsp) has not
been allocated yet. For starters, it might be as simple as that.

I am not sure if cmp is the best instruction to use here or maybe there is 
a better assembly instruction for that.


Let me know what you think,
> Matthew
>
> On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:
>>
>> On Tue, Dec 3, 2019 at 5:56 PM Matthew Pabst  wrote:
>>
>>> Thanks for the response.
>>>
>>> I've been trying to think through the logic for this change more, and 
>>> I don't completely understand the proposed solution.
>>>
>>> For example, say the program faults on a stack page. Then before 
>>> disabling interrupts we access the next stack page(s). If one of those 
>>> pages is not yet allocated, it will cause a second page fault. 
>>>
>>
>> No, that is not a problem. The page-fault handling code already knows it 
>> can be used without a working stack - so we have separate interrupt and 
>> exception stacks which we switch to while handling those, and they work 
>> fine even if the user's stack is empty.
>>
>> Rather, the problem is what are in Linux known as "system calls", and in 
>> OSv are ordinary function calls.
>> Imagine you do call read() or sleep() or whatever in your application. 
>> Some of the code implementing these
>> functions, which considers itself "kernel" code, disables interrupts or 
>> preemption. But it also uses the stack,
>> and if it suddenly needs a new page of the stack, the resulting page 
>> fault will refuse to run.
>>  
>>
>>> This second page fault will invoke the page fault handler, which will 
>>> attempt to run some code, which may access the first stack page and run 
>>> into deadlock.
>>>
>>> Is this a possible problem, or am I misunderstanding how page faults are 
>>> handled in OSv? 
>>>
>>> If this is a problem, could we switch to a pre-populated per-CPU 
>>> exception stack before calling the page fault handler?
>>>
>>
>> We do exactly that already - see exception_stack and interrupt_stack in 
>> arch/x64/arch-cpu.hh
>>  
>>
>>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/99639cbd-e560-44cb-bef3-9285da96ef92%40googlegroups.com.


Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-06 Thread Matthew Pabst
Ah, I think I understand things better now.

Here is my current idea for a solution:

Every time we disable interrupts and preemption (in irq_disable() and 
preempt_disable()), we call a method like ensure_stack_mapped(), which will 
do the following:

   - If the current stack pointer is in a region mapped as mmu::mmap_stack 
   (i.e. an application stack), then we read the next few stack pages.
   - Otherwise the stack pointer is in a kernel stack, which is already 
   mapped so reading ahead may cause a segmentation fault.

Let me know what you think,
Matthew

On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote:
>
> On Tue, Dec 3, 2019 at 5:56 PM Matthew Pabst  > wrote:
>
>> Thanks for the response.
>>
>> I've been trying to think through the logic for this change more, and I
>> don't completely understand the proposed solution.
>>
>> For example, say the program faults on a stack page. Then before 
>> disabling interrupts we access the next stack page(s). If one of those 
>> pages is not yet allocated, it will cause a second page fault. 
>>
>
> No, that is not a problem. The page-fault handling code already knows it 
> can be used without a working stack - so we have separate interrupt and 
> exception stacks which we switch to while handling those, and they work 
> fine even if the user's stack is empty.
>
> Rather, the problem is what are in Linux known as "system calls", and in 
> OSv are ordinary function calls.
> Imagine you do call read() or sleep() or whatever in your application. 
> Some of the code implementing these
> functions, which considers itself "kernel" code, disables interrupts or 
> preemption. But it also uses the stack,
> and if it suddenly needs a new page of the stack, the resulting page fault 
> will refuse to run.
>  
>
>> This second page fault will invoke the page fault handler, which will 
>> attempt to run some code, which may access the first stack page and run 
>> into deadlock.
>>
>> Is this a possible problem, or am I misunderstanding how page faults are 
>> handled in OSv? 
>>
>> If this is a problem, could we switch to a pre-populated per-CPU 
>> exception stack before calling the page fault handler?
>>
>
> We do exactly that already - see exception_stack and interrupt_stack in 
> arch/x64/arch-cpu.hh
>  
>
>



Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-05 Thread Waldek Kozaczuk
Nadav would be the best authority to answer it. But I think that all
exception handlers, including the page fault one, use their own exception stack
(see
https://github.com/cloudius-systems/osv/blob/3357ae0798fc3abece7a068d0b48cea050281852/arch/x64/arch-switch.hh#L86)
which is 4 pages long. So we should be fine.

I think the main point of Nadav's idea is to trigger the page fault on the
regular stack just before we disable preemption, where we cannot handle page
faults (I hope I am explaining it right). So the idea is to try to access a
byte 4K (and possibly 8K, etc., to make sure we have enough stack for the
kernel code executing while preemption is disabled) ahead of where we are now
(per RSP) on the stack, to cause a page fault if that part of the stack is
not mapped to physical memory yet ("allocated").

On Tuesday, December 3, 2019 at 10:56:42 AM UTC-5, Matthew Pabst wrote:
>
> Thanks for the response.
>
> I've been trying to think through the logic for this change more, and I
> don't completely understand the proposed solution.
>
> For example, say the program faults on a stack page. Then before disabling 
> interrupts we access the next stack page(s). If one of those pages is not 
> yet allocated, it will cause a second page fault. This second page fault 
> will invoke the page fault handler, which will attempt to run some code, 
> which may access the first stack page and run into deadlock.
>
> Is this a possible problem, or am I misunderstanding how page faults are 
> handled in OSv? 
>
> If this is a problem, could we switch to a pre-populated per-CPU exception 
> stack before calling the page fault handler?
>



Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-03 Thread Matthew Pabst
Thanks for the response.

I've been trying to think through the logic for this change more, and I
don't completely understand the proposed solution.

For example, say the program faults on a stack page. Then before disabling 
interrupts we access the next stack page(s). If one of those pages is not 
yet allocated, it will cause a second page fault. This second page fault 
will invoke the page fault handler, which will attempt to run some code, 
which may access the first stack page and run into deadlock.

Is this a possible problem, or am I misunderstanding how page faults are 
handled in OSv? 

If this is a problem, could we switch to a pre-populated per-CPU exception 
stack before calling the page fault handler?



Re: [osv-dev] Lazily allocating thread stacks WIP

2019-12-02 Thread Nadav Har'El
On Tue, Dec 3, 2019 at 9:09 AM Matthew Pabst  wrote:

> I've been working on this issue
>  for the past couple
> days, and I have a couple questions as well as a WIP solution.
>
> I figured I would start by implementing a simple solution that accesses
> the next stack page in preempt_disable() and irq_disable() to ensure that
> the next stack page is mapped.
>
> Here's my WIP:
>
> diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh
>
> index 6803498f..11d8d3f8 100644
>
> --- a/arch/x64/arch-switch.hh
>
> +++ b/arch/x64/arch-switch.hh
>
> @@ -144,10 +144,17 @@ void thread::init_stack()
>
>  stack.deleter = stack.default_deleter;
>
>  }
>
>  void** stacktop = reinterpret_cast(stack.begin + stack.size);
>
> +
>
>  _state.rbp = this;
>
>  _state.rip = reinterpret_cast(thread_main);
>
>  _state.rsp = stacktop;
>
>  _state.exception_stack = _arch.exception_stack +
> sizeof(_arch.exception_stack);
>
> +
>
> +// We switch to a new thread with preemption disabled, so if the stack
>
> +// starts with a yet-unpopulated page, we will get a page fault when
> the
>
> +// thread starts. Read the first byte of the stack to get the fault
> now,
>
> +// when we're with preemption enabled.
>
> +asm volatile ("" : : "r" ( *((char*)stacktop - 1) ));
>
>  }
>
>
> This change is referenced in the original email thread (
> https://groups.google.com/d/msg/osv-dev/GCY8Q6vidPU/qsbaEU_gdsAJ) as a
> solution to the boot failure on the sched::preemptable() assertion in
> arch/x86/mmu.cc; however, it didn't resolve the issue for me. With this
> WIP, my code is failing on that assertion.
>

Does this code result in any instruction being run? I'm not sure I see how.
You need to actually use the read value (for example) in an actual instruction.

Also, reading the next byte is not enough... Imagine there are just a few
bytes left on the current page of the stack.
Reading one byte won't change anything, and then the kernel code gets
called, with preemption disabled,
with only a few bytes on the stack instead of a full page like we wanted.



> diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh
>
> index 17df5f5c..35e339cc 100644
>
> --- a/arch/x64/arch.hh
>
> +++ b/arch/x64/arch.hh
>
> @@ -16,12 +16,22 @@
>
>
>
>  namespace arch {
>
>
>
> +inline void access_next_stack_page() {
>
> +uint64_t rsp;
>
> +asm volatile("movq %%rsp, %[Var]" : [Var] "=r" (rsp));
>
> +char *next_stack_page = (char*) rsp - 4096;
>
> +*next_stack_page = 0;
>
> +}
>
> +
>
>  #define CACHELINE_ALIGNED __attribute__((aligned(64)))
>
>  #define INSTR_SIZE_MIN 1
>
>  #define ELF_IMAGE_START OSV_KERNEL_BASE
>
>
>
>  inline void irq_disable()
>
>  {
>
> +// To allow stack pages to be dynamically allocated,
>
> +// we must check that the next stack page is allocated before
> disabling interrupts
>
> +access_next_stack_page();
>
>  processor::cli();
>
>  }
>
>
>
>
> These changes reflect the need to access the next stack page before
> disabling interrupts. Would it be cleaner to implement
> access_next_stack_page in a one-line assembly function, or leave it more
> human-readable?
>

In the above code it seems you do the same thing twice in two ways, just do
it once.
Human-readable is less important than efficiency (you can always write a
comment for humans), but it is possible that C code will anyway translate
to efficient assembly language code - you can disassemble (e.g., with
objdump -d) this function to see what instructions were generated from C
code.



>
> diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh
>
> index 18edf441..694815f8 100644
>
> --- a/include/osv/mmu-defs.hh
>
> +++ b/include/osv/mmu-defs.hh
>
> @@ -84,6 +84,7 @@ enum {
>
>  mmap_small   = 1ul << 5,
>
>  mmap_jvm_balloon = 1ul << 6,
>
>  mmap_file= 1ul << 7,
>
> +mmap_stack   = 1ul << 8,
>
>  };
>
>
>
>  enum {
>
> diff --git a/include/osv/sched.hh b/include/osv/sched.hh
>
> index 66c0a44a..f24df875 100644
>
> --- a/include/osv/sched.hh
>
> +++ b/include/osv/sched.hh
>
> @@ -1001,6 +1001,10 @@ inline void preempt()
>
>  inline void preempt_disable()
>
>  {
>
>  ++preempt_counter;
>
> +// Since stack pages are dynamically allocated,
>
> +// we must ensure that the next stack page is allocated before
> disabling preemption
>
> +arch::access_next_stack_page();
>
> +
>
>  barrier();
>
>  }
>
>
>
> diff --git a/libc/mman.cc b/libc/mman.cc
>
> index bb573a80..5b0436e2 100644
>
> --- a/libc/mman.cc
>
> +++ b/libc/mman.cc
>
> @@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)
>
>  // and did us the courtesy of telling this to ue (via MAP_STACK),
>
>  // let's return the courtesy by returning pre-faulted memory.
>
>  // FIXME: If issue #143 is fixed, this workaround should be
> removed.
>
> -mmap_flags |= mmu::mmap_populate;
>
> +mmap_flags |= mmu::mmap_stack;
>
> 

[osv-dev] Lazily allocating thread stacks WIP

2019-12-02 Thread Matthew Pabst
I've been working on this issue 
 for the past couple 
days, and I have a couple questions as well as a WIP solution.

I figured I would start by implementing a simple solution that accesses the 
next stack page in preempt_disable() and irq_disable() to ensure that the 
next stack page is mapped.

Here's my WIP:

diff --git a/arch/x64/arch-switch.hh b/arch/x64/arch-switch.hh

index 6803498f..11d8d3f8 100644

--- a/arch/x64/arch-switch.hh

+++ b/arch/x64/arch-switch.hh

@@ -144,10 +144,17 @@ void thread::init_stack()

 stack.deleter = stack.default_deleter;

 }

 void** stacktop = reinterpret_cast(stack.begin + stack.size);

+

 _state.rbp = this;

 _state.rip = reinterpret_cast(thread_main);

 _state.rsp = stacktop;

 _state.exception_stack = _arch.exception_stack + 
sizeof(_arch.exception_stack);

+

+// We switch to a new thread with preemption disabled, so if the stack

+// starts with a yet-unpopulated page, we will get a page fault when 
the

+// thread starts. Read the first byte of the stack to get the fault 
now,

+// when we're with preemption enabled.

+asm volatile ("" : : "r" ( *((char*)stacktop - 1) ));

 }


This change is referenced in the original email thread 
(https://groups.google.com/d/msg/osv-dev/GCY8Q6vidPU/qsbaEU_gdsAJ) as a 
solution to the boot failure on the sched::preemptable() assertion in 
arch/x86/mmu.cc; however, it didn't resolve the issue for me. With this 
WIP, my code is failing on that assertion.



diff --git a/arch/x64/arch.hh b/arch/x64/arch.hh

index 17df5f5c..35e339cc 100644

--- a/arch/x64/arch.hh

+++ b/arch/x64/arch.hh

@@ -16,12 +16,22 @@

 

 namespace arch {

 

+inline void access_next_stack_page() {

+uint64_t rsp;

+asm volatile("movq %%rsp, %[Var]" : [Var] "=r" (rsp));

+char *next_stack_page = (char*) rsp - 4096;

+*next_stack_page = 0;

+}

+

 #define CACHELINE_ALIGNED __attribute__((aligned(64)))

 #define INSTR_SIZE_MIN 1

 #define ELF_IMAGE_START OSV_KERNEL_BASE

 

 inline void irq_disable()

 {

+// To allow stack pages to be dynamically allocated,

+// we must check that the next stack page is allocated before 
disabling interrupts

+access_next_stack_page();

 processor::cli();

 }




These changes reflect the need to access the next stack page before 
disabling interrupts. Would it be more clean to implement 
access_next_stack_page in a one-line assembly function, or leave it more 
human-readable?


diff --git a/include/osv/mmu-defs.hh b/include/osv/mmu-defs.hh

index 18edf441..694815f8 100644

--- a/include/osv/mmu-defs.hh

+++ b/include/osv/mmu-defs.hh

@@ -84,6 +84,7 @@ enum {

 mmap_small   = 1ul << 5,

 mmap_jvm_balloon = 1ul << 6,

 mmap_file= 1ul << 7,

+mmap_stack   = 1ul << 8,

 };

 

 enum {

diff --git a/include/osv/sched.hh b/include/osv/sched.hh

index 66c0a44a..f24df875 100644

--- a/include/osv/sched.hh

+++ b/include/osv/sched.hh

@@ -1001,6 +1001,10 @@ inline void preempt()

 inline void preempt_disable()

 {

 ++preempt_counter;

+// Since stack pages are dynamically allocated,

+// we must ensure that the next stack page is allocated before
disabling preemption

+arch::access_next_stack_page();

+

 barrier();

 }

 

diff --git a/libc/mman.cc b/libc/mman.cc

index bb573a80..5b0436e2 100644

--- a/libc/mman.cc

+++ b/libc/mman.cc

@@ -44,7 +44,7 @@ unsigned libc_flags_to_mmap(int flags)

 // and did us the courtesy of telling this to ue (via MAP_STACK),

 // let's return the courtesy by returning pre-faulted memory.

 // FIXME: If issue #143 is fixed, this workaround should be 
removed.

-mmap_flags |= mmu::mmap_populate;

+mmap_flags |= mmu::mmap_stack;

 }

 if (flags & MAP_SHARED) {

 mmap_flags |= mmu::mmap_shared;

diff --git a/libc/pthread.cc b/libc/pthread.cc

index 44b93b83..f063754b 100644

--- a/libc/pthread.cc

+++ b/libc/pthread.cc

@@ -139,7 +139,7 @@ namespace pthread_private {

 return {attr.stack_begin, attr.stack_size};

 }

 size_t size = attr.stack_size;

-void *addr = mmu::map_anon(nullptr, size, mmu::mmap_populate, 
mmu::perm_rw);

+void *addr = mmu::map_anon(nullptr, size, mmu::mmap_stack, 
mmu::perm_rw);

 mmu::mprotect(addr, attr.guard_size, 0);

 sched::thread::stack_info si{addr, size};

 si.deleter = free_stack;

These changes force dynamic allocation rather than the previous 
mmap_populate solution.

Interestingly, I couldn't find any test cases that called mmap in libc with 
the MAP_STACK flag - I'm assuming this is because it's a no-op in Linux. 
Instead all the stack allocations came from pthread::allocate_stack.

Let me know what you think of my WIP and my questions,
Matthew
