I tried the patches, but it's crashing almost instantly...

page fault outside application, addr: 0x0000000056c00000
[registers]
RIP: 0x00000000403edd23 <???+1077861667>
RFL: 0x0000000000010206  CS:  0x0000000000000008  SS:  0x0000000000000010
RAX: 0x0000000056c00000  RBX: 0x0000200056c00040  RCX: 0x00000000004c0000  RDX: 0x0000000000000008
RSI: 0x00000000004c0000  RDI: 0x0000200056c00040  RBP: 0x0000200041501740  R8:  0x0000000000000000
R9:  0x000000005e7a7333  R10: 0x0000000000000000  R11: 0x0000000000000000  R12: 0x00000000004c0000
R13: 0x000000005e7a7333  R14: 0x0000000000000000  R15: 0x0000000000000000  RSP: 0x00002000415016f8
Aborted

[backtrace]
0x0000000040343779 <???+1077163897>
0x000000004034534d <mmu::vm_fault(unsigned long, exception_frame*)+397>
0x00000000403a667b <page_fault+155>
0x00000000403a54c6 <???+1077564614>
0x000010000174c2b0 <???+24429232>
0x0000000000000000 <???+0>
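
In case it helps decode the ??? frames above: assuming the crash is in
the unstripped loader.elf from this same build, plain binutils
addr2line should resolve them, e.g.

  addr2line -e build/release/loader.elf -f -C 0x40343779 0x403edd23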

(gdb) osv heap
0xffff80000e9ad000 0x0000000022a53000
0xffff80000e9a2000 0x0000000000004000

Rick



On Mon, 2020-03-23 at 22:06 -0700, Waldek Kozaczuk wrote:
> I have sent a more complete patch that should also address the
> fragmentation issue with requests >= 4K and < 2MB.
> 
> On Monday, March 23, 2020 at 6:12:51 PM UTC-4, Waldek Kozaczuk wrote:
> > I have just sent a new patch to the mailing list. I am hoping it
> > will address the OOM crash if my theory of heavy memory
> > fragmentation is right. It would be nice if Nadav could review it.
> > 
> > Regardless, if you have another crash in production and are able to
> > connect with gdb, could you run 'osv heap'? It should show
> > free_page_ranges; if memory is heavily fragmented, we should see a
> > long list.
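> > 
> > For reference, a session might look roughly like this (a sketch
> > assuming the standard OSv gdb helpers from scripts/loader.py and a
> > default build path - adjust to your setup):
> > 
> >   $ gdb build/release/loader.elf
> >   (gdb) connect     # attach to the guest's gdb stub
> >   (gdb) osv syms    # load symbols
> >   (gdb) osv heap    # dump free_page_ranges, one line per free range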
> > 
> > It would be nice to recreate that load in a dev environment and
> > capture the memory trace data (BTW, you do not need to enable
> > backtraces to have enough useful information). It would help us
> > better understand how memory is allocated by the app. I saw you sent
> > me one trace, but it does not seem to reveal anything interesting.
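> > 
> > Roughly, the capture steps are (a sketch based on the
> > Trace-analysis-using-trace.py wiki page linked further down this
> > thread - check that page for the exact flags):
> > 
> >   $ scripts/run.py --trace=memory\* ...   # boot with memory tracepoints enabled
> >   $ scripts/trace.py extract              # pull the trace out of the guest
> >   $ scripts/trace.py memory-analyzer      # summarize the recorded allocations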
> > 
> > Waldek 
> > 
> > On Monday, March 23, 2020 at 1:19:18 AM UTC-4, rickp wrote:
> > > On Sun, 2020-03-22 at 22:08 -0700, Waldek Kozaczuk wrote: 
> > > > 
> > > > 
> > > > On Monday, March 23, 2020 at 12:36:52 AM UTC-4, rickp wrote: 
> > > > > Looks to me like it's trying to allocate 40MB but the
> > > > > available memory is 10GB, surely? 10933128KB is 10,933MB.
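> > > > > (10933128 KB / 1000^2 ≈ 10.9 GB, or / 1024^2 ≈ 10.4 GiB -
> > > > > about 10 GB either way.)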
> > > > > 
> > > > 
> > > > I misread the number - forgot about 1K. 
> > > > 
> > > > Any chance you could run the app outside of production with
> > > > memory tracing enabled -
> > > > https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py#tracing-memory-allocations
> > > > (without --trace-backtrace) - for a while? And then we can have
> > > > a better sense of what kind of allocations it makes. The output
> > > > of trace memory-analyzer would be really helpful.
> > > 
> > > I can certainly run that locally with locally generated
> > > workloads, which should be close enough - but we've never managed
> > > to trigger the oom condition that way (other than by really
> > > constraining the memory artificially). Still, it should be close
> > > enough - let me see what I can do.
> > > 
> > > Rick 
> > > 
> > > 
> 
