Re: [osv-dev] Re: [PATCH] stdio: fix setvbuf() to not set a weird buf size

2021-11-29 Thread Rick Payne
Thanks! That v2 patch solves the problem for me. Cheers Rick On Mon, 2021-11-29 at 14:33 +0200, Nadav Har'El wrote: > Oops, forgot a Makefile patch. I'll send a v2. > > -- > Nadav Har'El > n...@scylladb.com > > > On Mon, Nov 29, 2021 at 2:07 PM Nadav Har'El > wrote: > > When we upgraded

Re: [osv-dev] iso-read (cloud-init) issue

2021-11-28 Thread Rick Payne
he application was genuinely trying to > read a part of a block. OSv *could* implement this (reading more and > then dropping a part of it) but I'm not sure that it should - or why > this only started to happen recently - so I don't want to go in this > direction unless we know why. >
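The "reading more and then dropping a part of it" idea mentioned above can be sketched roughly as follows. This is only an illustration, not OSv code: `BSIZE` is assumed to be 512 here, `fake_disk` is an in-memory stand-in for the device, and `read_blocks` / `read_unaligned` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BSIZE 512  /* assumed block size, for illustration only */

/* Hypothetical stand-in for a block device: 8 blocks of memory. */
static unsigned char fake_disk[8 * BSIZE];

/* Whole-block read: offset and length must both be multiples of BSIZE,
 * similar in spirit to the check behind the assertion in this thread. */
static void read_blocks(size_t offset, void *dst, size_t len)
{
    assert(offset % BSIZE == 0 && len % BSIZE == 0);
    assert(offset + len <= sizeof fake_disk);
    memcpy(dst, fake_disk + offset, len);
}

/* Read an arbitrary byte range by widening it to block boundaries,
 * reading whole blocks into a bounce buffer, and copying out only the
 * requested part. Returns 0 on success, -1 on failure. */
int read_unaligned(size_t offset, void *dst, size_t len)
{
    if (len == 0)
        return 0;
    if (offset > sizeof fake_disk || len > sizeof fake_disk - offset)
        return -1;

    size_t start = offset - offset % BSIZE;                   /* round down */
    size_t end = (offset + len + BSIZE - 1) / BSIZE * BSIZE;  /* round up */

    unsigned char *bounce = malloc(end - start);
    if (!bounce)
        return -1;

    read_blocks(start, bounce, end - start);
    memcpy(dst, bounce + (offset - start), len);  /* drop the extra bytes */
    free(bounce);
    return 0;
}
```

The cost is an extra copy through the bounce buffer, which is one reason a kernel might prefer to reject partial-block requests outright.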

[osv-dev] iso-read (cloud-init) issue

2021-11-26 Thread Rick Payne
Hiya, Trying to get cloud-init working on a fairly recent OSv (0.55) image using an ISO to provide the YAML file. This used to work fine on my previous OSv image, but now I'm hitting an assert failure: Assertion failed: (uio->uio_resid % BSIZE) == 0 (fs/vfs/vfs_bdev.cc: bdev_read: 33)

Re: [osv-dev] shm_open / shm_unlink

2021-10-12 Thread Rick Payne
y step. Thanks for digging into the > missing pieces in OTP-24, if you want help or at least a reviewer I'm happy > to do what I can. > > -greg > >> On Mon, Oct 4, 2021 at 6:30 AM Rick Payne wrote: >> On Mon, 2021-10-04 at 10:37 +0300, Nadav Har'El wrote: >>

Re: [osv-dev] shm_open / shm_unlink

2021-10-04 Thread Rick Payne
On Mon, 2021-10-04 at 10:37 +0300, Nadav Har'El wrote: > You're welcome. If you have patches that might be useful to others as > well, please post them. Will do. Not there yet - but once I can verify things, I'll send patches.. Cheers Rick -- You received this message because you are

Re: [osv-dev] shm_open / shm_unlink

2021-10-04 Thread Rick Payne
Hi, On Mon, 2021-10-04 at 09:38 +0300, Nadav Har'El wrote: > > So the good (?) news is that the shm_* code didn't matter at all. > Maybe it isn't even getting used... The problem is something > completely different: Ah, I think the shm_open is still relevant. For now I'm including the musl file

Re: [osv-dev] shm_open / shm_unlink

2021-10-03 Thread Rick Payne
On Sun, 2021-10-03 at 11:04 +0300, Nadav Har'El wrote: > I'm curious where - > Maybe we have a bug in our /dev (fs/devfs/*) implementation? It > should generate good errors when trying to open /dev/shm/something - > not assertion failures and crashes. Well, this is one of those rabbit holes...

Re: [osv-dev] shm_open / shm_unlink

2021-10-03 Thread Rick Payne
On Sun, 2021-10-03 at 10:38 +0300, Nadav Har'El wrote: > > Shared memory support was never a big priority because its main use > case is multiple processes that want to share memory, and those > (multiple processes) were never a thing in OSv. However, you're right > that it's a shame to lose

[osv-dev] shm_open / shm_unlink

2021-10-03 Thread Rick Payne
I've been playing around with my rebar3_osv tool (which turns an erlang application into an OSv image). I'm trying to move everything to the latest erlang release (24.1). OTP-24 comes with asmjit which can give quite a performance boost in some cases. However, it requires the use of shm_open

Re: [osv-dev] ISR and schd::preemptable problems in resolve_pltgot

2020-08-06 Thread Rick Payne
> On 6 Aug 2020, at 21:00, Nadav Har'El wrote: > > By the way, if this problem really bothered us, it should be possible with > relatively small effort to > make symbol-resolution lock-free. We actually have the list of objects > protected by RCU, > not a mutex (see commit

Re: [osv-dev] ISR and schd::preemptable problems in resolve_pltgot

2020-08-06 Thread Rick Payne
On Thu, 2020-08-06 at 11:02 +0300, Nadav Har'El wrote: > We have a trick that will do both the on-load symbol resolution and > loading the entire object > and not page by page: If you add: > > asm(".pushsection .note.osv-mlock, \"a\"; .long 0, 0, 0; > .popsection"); > > To your source code,
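As a rough sketch of where the quoted line would go: the statement sits at file scope in one of the application's source files, outside any function. The directive itself is copied from the message above; it is spelled `__asm__` here only so that it also compiles in strict ISO C mode, and `app_entry` is a hypothetical placeholder for the rest of the application. This assumes an ELF target and a GCC-compatible compiler.

```c
/* Emits three zero words into a section named .note.osv-mlock.
 * Per the message above, OSv treats the presence of this note as a
 * request to resolve symbols at load time and to load the whole object
 * up front instead of page by page. */
__asm__(".pushsection .note.osv-mlock, \"a\"; .long 0, 0, 0; .popsection");

/* Hypothetical placeholder for the application's own code. */
int app_entry(void)
{
    return 0;
}
```

On a host with binutils, `readelf -S` on the resulting object should list a `.note.osv-mlock` section.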

[osv-dev] ISR and schd::preemptable problems in resolve_pltgot

2020-08-05 Thread Rick Payne
I've been noodling with the old 'assigned virtio' code, trying to make it work again so I can use this method to get raw packets. See discussion on the 'raw sockets' thread. I'm mostly there (though I will admit, I'm far from a C++ programmer, so the code is pretty hideous), but I ran into an

Re: [osv-dev] Re: raw socket support ?

2020-07-23 Thread Rick Payne
On Thu, 2020-07-23 at 12:38 +0300, Nadav Har'El wrote: > We also had until recently something more general, "assigned virtio", > where the application > gets access directly to the virtio rings (and needs to work with > them - the kernel doesn't > touch them any more). Waldek recently removed it

Re: [osv-dev] Re: raw socket support ?

2020-07-23 Thread Rick Payne
Another alternative (for me and maybe others) would be to have a standard way to hook packets direct from the virtio interface. Especially if we could ensure that we don't dhcp on that interface. For instance, setup 2 interfaces from the host - 1 for OSv-y stuff, and one just for an application

Re: [osv-dev] Problems with socat

2020-07-22 Thread Rick Payne
that code wasn't really exercised). Rick On Wed, 2020-07-22 at 18:24 -0700, Dor Laor wrote: > Best is to put a breakpoint and start single stepping and read > those variable values > > On Wed, Jul 22, 2020 at 5:57 PM Rick Payne > wrote: > > Trying to characterise some perform

[osv-dev] Problems with socat

2020-07-22 Thread Rick Payne
Trying to characterise some performance stuff, I thought I'd run socat under OSv however it panics: $ sudo scripts/run.py -n -e 'socat tcp4-listen:6971 open:/dev/null' OSv v0.55.0 eth0: 192.168.122.76 Booted up in 3245.70 ms Cmdline: socat tcp4-listen:6971 open:/dev/null Assertion failed: type

Re: [osv-dev] [PATCH 2/3] lzloader: fix memset() implementation

2020-05-26 Thread Rick Payne
On Mon, 2020-05-25 at 09:15 +0300, Nadav Har'El wrote: > > Do you mean > https://github.com/cloudius-systems/osv/commit/d52cb12546ff2acd5255a2ac8897891e421f07dc > which just turned off optimization? Yes. > At the time, I suggested the memset() fix for #913, because seemed to > me like an

Re: [osv-dev] [PATCH 2/3] lzloader: fix memset() implementation

2020-05-24 Thread Rick Payne
I think this is also related to 913. I'll try without my patch (which we've been having to use since then). Rick On Sat, 2020-05-23 at 23:47 +0300, Nadav Har'El wrote: > Some compilers apparently optimize code in fastlz/ to call memset(), > so > the uncompressing boot loader,

Re: [osv-dev] Re: NMI crash in memcpy() between memory areas allocated with mmu::map_anon()

2020-03-28 Thread Rick Payne
With your latest 2 patches, our production box which was having problems has run fine for the last 48hours. Thanks for working so hard on fixing it! It has been quite the pain point for us. Are bugs 784 and 1077 something we should worry about? Rick On Thu, 2020-03-26 at 12:50 -0700, Waldek

Re: [osv-dev] Re: OOM query

2020-03-24 Thread Rick Payne
Kozaczuk wrote: > Is it with the exact same code as on master with the latest 2 patches > I sent applied? Does ‘scripts/build check’ pass for you? > > On Tue, Mar 24, 2020 at 16:56 Rick Payne > wrote: > > I tried the patches, but its crashing almost instantly... >

Re: [osv-dev] Re: OOM query

2020-03-24 Thread Rick Payne
I tried the patches, but its crashing almost instantly... page fault outside application, addr: 0x56c0 [registers] RIP: 0x403edd23 RFL: 0x00010206 CS: 0x0008 SS: 0x0010 RAX: 0x56c0 RBX: 0x200056c00040 RCX:

Re: [osv-dev] Re: OOM query

2020-03-22 Thread Rick Payne
On Sun, 2020-03-22 at 22:08 -0700, Waldek Kozaczuk wrote: > > > On Monday, March 23, 2020 at 12:36:52 AM UTC-4, rickp wrote: > > Looks to me like its trying to allocate 40MB but the available > > memory > > is 10GB, surely? 10933128KB is 10,933MB > > > > I misread the number - forgot about

Re: [osv-dev] Re: OOM query

2020-03-22 Thread Rick Payne
there > > recently as part of wide upgrade to Python 3. > > > > On Sun, Mar 22, 2020 at 20:27 Rick Payne > > wrote: > > > Does that command work for you? For me I get: > > > > > > (gdb) osv heap > > > Python Exception %x format:

Re: [osv-dev] Re: OOM query

2020-03-22 Thread Rick Payne
run ‘osv heap’ from gdb at this point? > > On Sun, Mar 22, 2020 at 19:31 Rick Payne > wrote: > > Ok, so i applied the patch, and the printf and this is what I see: > > > > page_range_allocator: no ranges found for size 39849984 and exact > > order: 14 > > Wa
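The numbers in the allocator message above can be cross-checked with a little arithmetic. The sketch below assumes 4096-byte pages and assumes that "order" means the exponent of the smallest power-of-two page count that covers the request; `order_for_size` is a hypothetical helper, not the OSv function. Under those assumptions, 39849984 bytes is exactly 9729 pages, and the smallest power of two that covers 9729 is 2^14 = 16384 pages (64 MiB), which agrees with "exact order: 14".

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Smallest order such that (1 << order) pages cover `bytes`. */
unsigned order_for_size(size_t bytes)
{
    size_t pages = (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    unsigned order = 0;
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

So a request of roughly 38 MiB would, on this reading, need a free and suitably sized 64 MiB range, which is a stricter condition than simply having 38 MiB free somewhere.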

Re: [osv-dev] Re: OOM query

2020-03-22 Thread Rick Payne
further. Current memory: > > 10871576Kb, target: 0 Kb > > > > 'osv mem' in gdb reports loads of free memory: > > > > (gdb) osv mem > > Total Memory: 12884372480 Bytes > > Mmap Memory: 1610067968 Bytes (12.50%) > > Free Memory: 11132493824 Byt

Re: [osv-dev] Re: OOM query

2020-03-21 Thread Rick Payne
Memory: 1610067968 Bytes (12.50%) Free Memory: 11132493824 Bytes (86.40%) So why is it failing to allocate 5MB when it claims to have 11GB free? Any more debug I can provide? Rick On Sat, 2020-03-21 at 15:32 +1000, Rick Payne wrote: > And indeed, back on the 'release' image, and we hit an

Re: [osv-dev] Re: OOM query

2020-03-20 Thread Rick Payne
_LOCK > > instead of > > > > > comparing the original value in the beginning of the body of > > the > > > > > loop? The line before - _shrinker_loop(target, [this] { > > return > > > > > _oom_blocked.has_waiters(); }); - might have

Re: [osv-dev] Re: OOM query

2020-03-09 Thread Rick Payne
Kozaczuk wrote: > Does it happen with the very latest OSv code? Did it start happening > at some point more often? > > I wonder if we could add some helpful printouts in wake_waiters(). > > Btw that assert() failure in tcp_do_segment() rings a bell. > > On Mon, Mar 9,

Re: [osv-dev] Re: OOM query

2020-03-09 Thread Rick Payne
I can't add much other than I doubt it's fragmentation. Sometimes this happens within a few minutes of the system starting. At no point do I think we're using more than 2GB of ram (of the 12GB) either. I did compile up a debug version of OSv and built the system with that, but I've been unable

[osv-dev] OOM query

2020-03-02 Thread Rick Payne
Had a crash on a system that I don't understand. It's a VM with 12GB allocated, we were running with about 10.5GB free according to the API. Out of the blue, we had a panic: Out of memory: could not reclaim any further. Current memory: 10954988 Kb [backtrace] 0x403f6320

Re: [osv-dev] aarch64 resurrected

2020-01-01 Thread Rick Payne
On Wed, 2020-01-01 at 21:17 -0800, Waldek Kozaczuk wrote: > I had to manually add uush.so to usr_ramfs.manifest. For whatever > reason, the image built on Ubuntu would never boot at all. I wonder > if that has to do with some mixed up gcc path on Ubuntu and maybe we > end up compiling and/or

Re: [osv-dev] Re: [PATCH 14/16] cloud-init: Added support for Network v1 and ConfigDrive data source

2019-12-10 Thread Rick Payne
Hi, > If you are interested and have time to help me, I am willing to > create ipv6 branch and apply the remaining patches in this series to > it. That way we can independently test it before we apply them to the > master branch. But I would need some help. I do not have expertise in > networking

Re: [osv-dev] Re: [PATCH 14/16] cloud-init: Added support for Network v1 and ConfigDrive data source

2019-12-10 Thread Rick Payne
Hi, On Tue, 2019-12-10 at 05:43 -0800, Waldek Kozaczuk wrote: > Hi, > > I think that Nadav had some code review comments he hoped to get > resolved before he could apply the entire series. I think he did > apply couple of the very first and trivial ones though. I wonder if > some of the

[osv-dev] Re: [PATCH 14/16] cloud-init: Added support for Network v1 and ConfigDrive data source

2019-12-09 Thread Rick Payne
I don't see that this was applied? The reason I ask is that we're using cloud-init to set the IP addresses on multiple interfaces inside OSv (to separate database traffic from the protocols). I have a secondary patch to turn off dhcp that we also use. I have my own network-module.cc, as I

Re: [osv-dev] [PATCH] memory: enforce physical free memory ranges do not start at 0

2019-08-23 Thread Rick Payne
Great, thanks for that! This patch certainly solved my issue - its no longer crashing at that point and is making much better progress... Rick On Sat, 2019-08-24 at 00:14 -0400, Waldemar Kozaczuk wrote: > Most of the time the kernel code references memory using virtual > addresses. > However

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
> I understand we have found slew of possible other bugs. > > Sorry I am a bit confused, > Waldek > > On Thu, Aug 22, 2019 at 15:53 Rick Payne > wrote: > > On Thu, 2019-08-22 at 21:49 +1100, Rick Payne wrote: > > > On Thu, 2019-08-22 at 12:30 +0300, Nadav Har'El

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
On Thu, 2019-08-22 at 21:49 +1100, Rick Payne wrote: > On Thu, 2019-08-22 at 12:30 +0300, Nadav Har'El wrote: > > > Please run "osv syms" to allow gdb to find your application object > > files, and show lines there. Perhaps it's a segfault inside your > > appli

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
On Thu, 2019-08-22 at 12:30 +0300, Nadav Har'El wrote: > Please run "osv syms" to allow gdb to find your application object > files, and show lines there. Perhaps it's a segfault inside your > application, not the kernel? I had, but I had forgotten to add our stuff to the usr.manifest so the

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
On Thu, 2019-08-22 at 10:33 +0300, Nadav Har'El wrote: > You're right, it seems there should be a "return" in the recursive > case! > That being said, I think the spurious wakeup doesn't cause any harm, > because the wait code rwlock::writer_wait_lockable() loops, and if a > thread > is woken

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
On Thu, 2019-08-22 at 10:47 +0300, Nadav Har'El wrote: > Do you know how to get a backtrace from gdb? Yes, see below. It wasn't running out of memory: (gdb) osv mem Total Memory: 8589392896 Bytes Mmap Memory: 2231259136 Bytes (25.98%) Free Memory: 7464316928 Bytes (86.90%) This is the most

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-22 Thread Rick Payne
On Wed, 2019-08-21 at 13:21 +0300, Nadav Har'El wrote: > > This is often not the problem itself, but rather a result of an > earlier bug, which caused > us to want to print an error message and that generated another > error, and so on. Understood. Still working on testing 0.53, and I'm now

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-21 Thread Rick Payne
On Wed, 2019-08-21 at 12:22 +0300, Nadav Har'El wrote: > I am guessing (need to verify...) that our rwlock implementation is > not recursive - a thread already holding the write lock needs to wait > (forever) for the read lock. If this is true, this is an rwlock bug. So I was puzzled a bit by the

Re: [osv-dev] Resurrecting ARM support

2019-08-21 Thread Rick Payne
On Tue, 2019-07-30 at 05:22 -0700, claudio.font...@gmail.com wrote: > Another issue might be with relocations and thread local variables, > if I remember correctly there are a few things to fix there, will > impact applications that use Thread local variables a lot. Possibly > new relocations

Re: [osv-dev] README

2019-08-21 Thread Rick Payne
On Sun, 2019-08-18 at 21:00 -0700, Waldek Kozaczuk wrote: > Hi, > > I have just pushed a commit to update the main README page to make it > better reflect the current state of OSv. > > If you see anything you want to be changed/improved/added/removed or > if anything can be phrased better or is

Re: [osv-dev] Re: [PATCH] Fix bug in arch_setup_free_memory

2019-08-21 Thread Rick Payne
Thanks for this - I was hitting a weird page fault issue on our application as we've recently moved from 0.52 to the latest OSv. Something like this, which occurs early on in startup: Assertion failed: ef->rflags & processor::rflags_if (arch/x64/mmu.cc: page_fault: 34) [backtrace]

Re: [osv-dev] Modernizing and cleaning build system

2019-03-31 Thread Rick Payne
On Sun, 2019-03-31 at 11:54 -0700, Waldek Kozaczuk wrote: > The second group's need should be addressed by Capstan. So we should > avoid duplication between what Capstan does well (and hopefully will > do even better in future) and OSv build system. Now capstan packages > are often generated using

Re: [osv-dev] Re: Problems to pass arguments to a JAR in run.yaml

2019-03-31 Thread Rick Payne
> On 31 Mar 2019, at 19:56, roberto battistoni wrote: > > Sorry but the DHCP offers the IP both in the NAT and BRIDGE configuration. I > think that the "forward" does not work in the bridge configuration. Why would it be forwarding in ‘bridge’ mode? I think you’re slightly confused. In

Re: [osv-dev] Re: Problems to pass arguments to a JAR in run.yaml

2019-03-30 Thread Rick Payne
On Thu, 2019-03-28 at 14:20 +0100, roberto battistoni wrote: > [I/211 dhcp]: Received DHCPACK message from DHCP server: > 192.168.122.1 regarding offerred IP address: 192.168.122.168 This is the typical subnet used by libvirt/qemu, and is typically only available locally on the machine unless you

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-25 Thread Rick Payne
and > I quite likely the arm version would support it as well. Just > saying... > > Waldek > > On Monday, March 25, 2019 at 3:43:40 PM UTC-4, rickp wrote: > > On Mon, 2019-03-25 at 20:19 +0100, Rick Payne wrote: > > > > > > The amazon A1 instances has these de

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-25 Thread Rick Payne
On Mon, 2019-03-25 at 20:19 +0100, Rick Payne wrote: > > The amazon A1 instances has these devices. I guess we need ENA > support > for Amazon's new hypervisors anyway... (SR-IOV). Another thing we'd need to do - the A1 instances use a GICv3, whereas OSv supports GICv2. Whether

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-25 Thread Rick Payne
On Mon, 2019-03-25 at 12:05 +0100, Rick Payne wrote: > Also once I get my console working, we hit the DTB issue as we're not > specifying a device tree, which will be the next issue to work on. I built the dtb for the amazon instance (alpine), and get a bit further. Feels like the wrong

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-25 Thread Rick Payne
On Sun, 2019-03-24 at 18:39 +0100, Rick Payne wrote: > On Sun, 2019-03-24 at 18:08 +0200, Nadav Har'El wrote: > > > > $ qemu-system-aarch64 --version > > QEMU emulator version 3.0.0 (qemu-3.0.0-3.fc29) > > Hmm, I'm on: > > QEMU emulator version 2.11.1(De

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-24 Thread Rick Payne
On Sun, 2019-03-24 at 18:08 +0200, Nadav Har'El wrote: > > $ qemu-system-aarch64 --version > QEMU emulator version 3.0.0 (qemu-3.0.0-3.fc29) Hmm, I'm on: QEMU emulator version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.10) I wonder if thats the issue. I'll try updating (though no Ubuntu package for

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-24 Thread Rick Payne
On Sun, 2019-03-24 at 17:36 +0200, Nadav Har'El wrote: > $ aarch64-linux-gnu-gcc --version > aarch64-linux-gnu-gcc (GCC) 8.1.1 20180626 (Red Hat Cross 8.1.1-3) I tried this version: aarch64-linux-gnu-gcc (Ubuntu 8.2.0-1ubuntu2~18.04) 8.2.0 Same issue. What qemu are you using? Rick -- You

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-24 Thread Rick Payne
Hi Nadav, On Sun, 2019-03-24 at 12:11 +0200, Nadav Har'El wrote: > > Why the "-S"? It causes the machine not really to start. Oh sorry, I was using the debugger as it just crashes every time, so no point not using it. > OSv v0.53.0-3-g8cd7d8aa So that's interesting, you're getting past the

Re: [osv-dev] Re: AWS EC2 ARM instance support?

2019-03-23 Thread Rick Payne
On Fri, 2019-03-22 at 14:24 -0700, Waldek Kozaczuk wrote: > I would be interested except I have literally zero experience with > ARM (except I know how to spell it ;-)) :) > The only ARM machine I have access to is Raspberry PI 3 ( >

[osv-dev] Re: AWS EC2 ARM instance support?

2019-03-22 Thread Rick Payne
On Mon, 2018-12-03 at 10:14 +0200, Nadav Har'El wrote: > > Unfortunately, I haven't heard from anyone in the two groups who > previously contributed the ARM support to OSv, so it's not making any > progress. > If I understand correctly, OSv still builds correctly for ARM (make > arch=aarch64) but

Re: Hitting the assert in lockfree::mutex::unlock()

2019-01-24 Thread Rick Payne
On Thu, 2019-01-24 at 16:28 +0200, Nadav Har'El wrote: > Before I invest too much more time into trying to think if there's a > case you missed, can you please confirm or deny that this bug happens > only in the futex-with-timeout use case? Unfortunately, I can't at the moment. Its an erlang

Hitting the assert in lockfree::mutex::unlock()

2019-01-23 Thread Rick Payne
Anyone seen a crash like this? [backtrace] 0x0022926a <__assert_fail+26> 0x003d3884 0x0041e889 0x003e7aeb 0x003e8086 <__syscall+1254> The particular assert is this one: while(true) { wait_record *other = waitqueue.pop(); if (other)

Re: [PATCH] Handle wall clock MSR correctly

2018-11-20 Thread Rick Payne
On Tue, 2018-11-20 at 21:27 +0200, Nadav Har'El wrote: > On Wed, Nov 14, 2018 at 10:19 AM Nadav Har'El > wrote: > > On Tue, Oct 23, 2018 at 2:47 AM Rick Payne > > wrote: > > Glauber, I'd love to get your opinion about this patch. Clearly our > > original as

Re: Next (0.53.0 and onwards) OSv releases roadmap proposal

2018-11-13 Thread Rick Payne
On Tue, 2018-11-13 at 10:54 -0800, Waldek Kozaczuk wrote: > COMPLETE BUT NOT COMMITTED PATCHES Be nice if we could get the timing fix either committed or commented on (I sent a second patch, but heard nothing). This is our major pain point at the moment. Rick -- You received this message

Re: [PATCH] Handle wall clock MSR correctly

2018-10-22 Thread Rick Payne
On Mon, Oct 22, 2018 at 1:53 AM Rick Payne > wrote: > > We need to write the wall clock MSR every time we use it, to ensure > > we > > get the updated value. This allows the guest OSv to track the time > > of > > the host correctly. > > After this patch, the

Re: [PATCH] Handle wall clock MSR correctly

2018-10-22 Thread Rick Payne
On Mon, 2018-10-22 at 16:13 +0300, Nadav Har'El wrote: > > After this patch, the host NTP keeps the time accurate on the guest > as well? Not verified yet - should be able to do that soon though. I suspect it totally sorts the issue though. > Another thing we could do is to reread the wall

Re: ELF linker woes with GraalVM

2018-10-19 Thread Rick Payne
On Fri, 2018-10-19 at 10:58 -0700, Waldek Kozaczuk wrote: > Recently I have been playing with GraalVM ( > https://github.com/oracle/graal) to see if it is possible to run it > on OSv. To that extent I created new OSv app - > https://github.com/cloudius-systems/osv-apps/tree/master/graalvm-example

Re: OSv time drifting when running under KVM

2018-09-16 Thread Rick Payne
On Mon, 2018-09-17 at 01:41 +0300, Nadav Har'El wrote: > I have a wild guess, but I'm not a big clock expert, and I'm CCing > Glauber who might have better ideas. > > My guess is that you have ntp running in the *host*, but not in the > guest (we don't have an ntp client for OSv), and somehow

Re: OSv time drifting when running under KVM

2018-09-16 Thread Rick Payne
On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote: > Stupid question: are you sure the KVM acceleration is actually enabled > and kvm-clock actually used? If not OSv would fall back to hpet which > we know we have some problems with. The processor flags are: "sse3 cmpxchg16b x2apic clflush

Re: OSv time drifting when running under KVM

2018-09-14 Thread Rick Payne
On Fri, 2018-09-14 at 16:25 -0700, Dor Laor wrote: > Need to look at the guest log and the restapi The log says: /usr/bin/qemu-system-x86_64 -name xxx -S -machine pc-i440fx- xenial,accel=kvm,usb=off -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 ... Rick -- You received this

Re: OSv time drifting when running under KVM

2018-09-14 Thread Rick Payne
On Fri, 2018-09-14 at 12:12 -0700, Waldek Kozaczuk wrote: > Stupid question: are you sure the KVM acceleration is actually enabled > and kvm-clock actually used? If not OSv would fall back to hpet which > we know we have some problems with. My turn for a stupid question - how would I know? I do

OSv time drifting when running under KVM

2018-09-13 Thread Rick Payne
We have a problem with OSv's wall clock drifting away from the hypervisor's. For example, this VM has been running for under 24hrs, and when I compare the hypervisor time, with that retrieved from the httpserver-api, I get this: $ curl http://192.168.x.x/os/date && TZ=UTC date "Thu Sep 13

Re: Exodus: Lightweight relocation engine. Useful?

2018-02-02 Thread Rick Payne
On Fri, 2018-02-02 at 16:28 -0800, Dan Kaminsky wrote: > That's where I hit my wall, but it seems pretty much a bullseye on > your wheelhouse. Suggestions? I saw the exodus announcement too - very nice. I wondered if you could alter it to emit a C program that dlopen()s the binary, looks up

Re: Page fault outside of application

2018-01-30 Thread Rick Payne
On Tue, 2018-01-30 at 11:47 +0200, Nadav Har'El wrote: > I have a vague feeling that fix_permissions() cannot just work on the > whole object it needs to know which of the PT_LOAD segments (see > file::load_segment()) the RELRO falls in, but I'm hazy on the > details. Maybe even

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 11:43 +0200, Nadav Har'El wrote: > 1. Your compiler defaults to "full relro" (-Wl,-z,now -Wl,-z,relro) > but for some reason object::relocate_pltgot() doesn't recognize the > bind_now. FWIW, on both working and non-working builds, I see '-pie -z now -z relro' being passed to

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 12:27 +0200, Nadav Har'El wrote: > Both versions used "-pie", not "-shared"? Should be, yes. Its exactly the same build setup and the Makefile shows '-pie' for LDFLAGS. I don't think gcc7.2 contains any of the -mindirect-branch changes, so thats a red-herring. I'll continue

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 11:43 +0200, Nadav Har'El wrote: > > Hmm, I don't know, I wasn't aware anything like that changed. > We usually change parts of the object marked by PT_GNU_RELRO to read- > only in object::fix_permissions(), I'm guessing (but didn't check) > this what caused the read-only

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 10:54 +0200, Nadav Har'El wrote: > This all seems reasonable. > Maybe we somehow got the PLT becoming read-only, so we are getting a > pagefault trying to write to it? > Can you please try in gdb "osv mmap" and look at the mapping which > includes the faulting address

Re: Using stock PIE executables from standard distributions?

2018-01-28 Thread Rick Payne
> On 28 Jan 2018, at 22:23, Nadav Har'El wrote: > > However, sadly, we still do have bugs with PIE support that need to be fixed > before running PIEs on OSv becomes a hassle-free experience: > > A bug which relatively-recently became relevant (as gcc changed) is >

Re: Page fault outside of application

2018-01-24 Thread Rick Payne
On 24/01/18 17:09, Rick Payne wrote: Hi Geraldo, On 23/01/18 19:58, Geraldo Netto wrote: Hello Rick, Rick, could you please, provide the full output with the -V ? eg: scripts/run.py -V Its a custom build, I'm running it via qemu direct. Here it is: qemu-system-x86_64: -mon chardev=stdio

Re: Page fault outside of application

2018-01-24 Thread Rick Payne
Hi, On 23/01/18 20:16, Nadav Har'El wrote: I don't have any bright ideas, but just a few small comments below, hopefully (?) they will help something... Appreciated... This writes in "addr", which seems a reasonable address (doesn't seem like junk). In object::resolve_pltgot() you can see

Page fault outside of application

2018-01-23 Thread Rick Payne
A few moving parts, so not sure what is causing this - but trying to start an erlang application I'm seeing this: eth0: 192.168.122.61 page fault outside application, addr: 0x1a60fe28 [registers] RIP: 0x00492dd1

Re: Memory limit?

2017-11-07 Thread Rick Payne
Hi, > I am not sure how this will help, as the later malloc() can still fail when > it wants to allocate physically-contiguous memory. > > One hack you can try to fix > https://github.com/cloudius-systems/osv/issues/854 and hopefully your issue > is to change in core/mempool.cc, the function

Re: Memory limit?

2017-11-06 Thread Rick Payne (Offshore)
> Out of memory: could not reclaim any further. Current memory: 5122256 Kb > > This suggests there was 5GB free while the allocation failed. > This *can* be a fragmentation issue (e.g., you asked for a 1 GB allocation, > but we couldn't free a 1GB consecutive area), but can also be a malloc() of

Re: Memory limit?

2017-11-02 Thread Rick Payne
> This suggests there was 5GB free while the allocation failed. > This *can* be a fragmentation issue (e.g., you asked for a 1 GB allocation, > but we couldn't free a 1GB consecutive area), but can also be a malloc() of a > ridiculous amount. Since commit 7ea953ca7d6533c025e535be49ee5bd2567fc8f3

Memory limit?

2017-11-01 Thread Rick Payne
I’m stressing OSv a bit, and though I start the VM with 10G of memory, it seems to fail after just over 5GB. Is there a limit that I’m hitting, or perhaps my memory usage is fragmenting things too much? Out of memory: could not reclaim any further. Current memory: 5122256 Kb [backtrace]
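The fragmentation question raised here comes down to the difference between total free memory and the largest single contiguous free region. The sketch below is a toy model under stated assumptions, not the OSv allocator: `struct free_range` and the three helper functions are hypothetical, and sizes are in arbitrary units.

```c
#include <stddef.h>

/* Hypothetical description of one free region, in arbitrary units. */
struct free_range {
    size_t start;
    size_t len;
};

/* Sum of all free regions: the figure an "out of memory" report shows. */
size_t total_free(const struct free_range *r, size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += r[i].len;
    return sum;
}

/* Size of the biggest single region: the limit for one contiguous allocation. */
size_t largest_free(const struct free_range *r, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].len > best)
            best = r[i].len;
    return best;
}

/* A contiguous request can succeed only if a single region is big enough. */
int can_allocate_contiguous(const struct free_range *r, size_t n, size_t want)
{
    return largest_free(r, n) >= want;
}
```

With free regions of 2048, 2048 and 1024 units, 5120 units are free in total, yet a contiguous request for 3072 units cannot be satisfied.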

Re: OSv image under ProxMox VE5.1

2017-10-30 Thread Rick Payne
> On 31 Oct 2017, at 01:45, Player, Timmons wrote: > > I’ve run into the same issue under VMware on VM’s that lack a serial port. > I’ve used the following patch locally with success… Ah yes, that works too, thanks! Cheers, Rick -- You received this message

Re: OSv image under ProxMox VE5.1

2017-10-30 Thread Rick Payne
> On 30 Oct 2017, at 17:40, Rick Payne <ri...@rossfell.co.uk> wrote: > > > Anyone tried this? I took one of my apps, converted it to a raw image and dd > that into the ProxMox disk container. It boots, but seems to get horribly > confused by the console - continuo

OSv image under ProxMox VE5.1

2017-10-29 Thread Rick Payne
Anyone tried this? I took one of my apps, converted it to a raw image and dd that into the ProxMox disk container. It boots, but seems to get horribly confused by the console - continuously receiving something and taking 100% of the CPU. I tried with just a very simple image (cli,http-server)

lzloader compile issue

2017-09-30 Thread Rick Payne
Hi, I’m still having to apply this patch to get OSv to work on my Ubuntu 16.10 box. Without it, the produced image fails to start even for the initial cpio phase. Its clearly some optimisation issue with gcc 6.2 (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12)), so just in case anyone else

Re: SO_BINDTODEVICE

2017-08-28 Thread Rick Payne
Hi, > If you come across such differences, please open issues in the OSv bug > tracker, so even if we don't fix them immediately, we'll fix them eventually. > > The intention of OSv is to be compatible with Linux's ABI, not BSD, so such > differences should be - eventually - eliminated.

Re: SO_BINDTODEVICE

2017-08-26 Thread Rick Payne (Offshore)
> I saw that. I also noticed that multicast won’t work because the SIOCADDMULTI > call does nothing in the hypervisor drivers (so the multicast mac addresses > are never added to the filter tables for the interface concerned). I’m part > way through the code to use the controlq to handle that

Re: SO_BINDTODEVICE

2017-08-21 Thread Rick Payne (Offshore)
> On 22 Aug 2017, at 01:33, Nadav Har'El wrote: > > > I don't remember what libpcap uses on Linux. Please try :-) Haven’t had a chance yet. > If I remember correctly (I haven't touched this stuff in two decades ;-)), > libpcap was implemented on BSD using a "BPF" (Berkeley

Re: SO_BINDTODEVICE

2017-08-17 Thread Rick Payne (Offshore)
> On 17 Aug 2017, at 06:43, Rick Payne (Offshore) <ri...@rossfell.co.uk> wrote: > > Does carp run on Freebsd or is it just OpenBSD? That has to solve a similar > issue, namely sending announcements to a multicast address on a particular > interface. I’ll in

SO_BINDTODEVICE

2017-08-16 Thread Rick Payne (Offshore)
Is there an equivalent to SO_BINDTODEVICE? I’m trying to get an implementation of VRRP working and the first issue is that the sending socket requires to be bound to the appropriate interface. After that, I’ll probably run into issues relating to the virtual IP and mac address - but the

Re: Symbol lookup errors while building C++ source code

2017-04-23 Thread Rick Payne (Offshore)
> What I think is a better approach is to take separately the OSv kernel with > just cpiod.so and mkfs.so (i.e., take build/release/loader.img) and the > files, and compose the image with the size you want and uploading the files > you want to it. This is more-or-less what the Mikelangelo

Re: Symbol lookup errors while building C++ source code

2017-04-23 Thread Rick Payne (Offshore)
> On 23 Apr 2017, at 05:14, Nadav Har'El wrote: > > Pekka, another question we should ask ourselves is what is the plan with > Capstan, and more importantly, the binary OSv releases. The OSv v0.24 version > he used is two years old (!). If we're not going to produce any

Re: Example of using virtio network interfaces from application code

2017-04-18 Thread Rick Payne (Offshore)
> On 13 Apr 2017, at 05:46, Nadav Har'El wrote: > > It's not exactly what Rick asked for (virtio), but maybe it's good enough. The pfil_add_hook() functionality may be sufficient, thanks! I’ll give it a go. Cheers, Rick -- You received this message because you are

Example of using virtio network interfaces from application code

2017-04-09 Thread Rick Payne (Offshore)
Hi, Is there a good example that uses the virtio network interfaces direct? ie. an application that uses ‘assign-net’ and talks directly to the network interface code in OSv? I’m looking for a sample I can learn from. Cheers, Rick -- You received this message because you are subscribed to

Re: Race condition/bug in cloud-init/dhcp?

2017-01-07 Thread Rick Payne
> On 7 Jan 2017, at 08:03, Rick Payne <ri...@rossfell.co.uk> wrote: > I wondered if it was something to do with getting the hostname via > cloud-init, and re-doing the DHCP? I can’t say I’ve tried this before that > recent change went in - if I get a chance later I will do

Race condition/bug in cloud-init/dhcp?

2017-01-07 Thread Rick Payne
I’ve started playing with cloud-init, and I’m occasionally seeing things halt during startup. Not very frequently, but frequently enough that it looks like some race condition. The indicator is the DHCP message ‘Got packet with wrong transaction ID’. Anyone else seen this? BSD shrinker:

[PATCH] Add missing pthread_getname_np()

2016-12-21 Thread Rick Payne
Signed-off-by: Rick Payne <ri...@rossfell.co.uk> --- include/api/pthread.h | 1 + libc/pthread.cc | 7 +++ 2 files changed, 8 insertions(+) diff --git a/include/api/pthread.h b/include/api/pthread.h index 85743f7..105e8d4 100644 --- a/include/api/pthread.h +++ b/include/api/pth

Re: Problems building on Ubuntu 16.10

2016-12-19 Thread Rick Payne
> On 15 Dec 2016, at 17:34, Rick Payne <ri...@rossfell.co.uk> wrote: > > LZ loader-stripped.elf > OBJCOPY loader-stripped.elf.lz -> loader-stripped.elf.lz.o > LINK lzloader.elf > ALIGN lzloader.elf > DD loader.img boot.bin > DD loader.img lzloader
