Hi,
Sorry for the late reply.
I did manage to replicate your issue. It turns out that when building such
large images, one needs more memory to run the so-called ZFS builder. After
increasing the memory from 512M to 1G, I was able to build the image
successfully.
diff --git a/scripts/upload_manifest.py b/scripts/upload_manifest.py
index a3796f95..65e91a9c 100755
--- a/scripts/upload_manifest.py
+++ b/scripts/upload_manifest.py
@@ -164,7 +164,7 @@ def main():
         console = '--console=serial'
     zfs_builder_name = 'zfs_builder-stripped.elf'
-    osv = subprocess.Popen('cd ../..; scripts/run.py -k --kernel-path build/release/%s --arch=%s --vnc none -m 512 -c1 -i "%s" --block-device-cache unsafe -s -e "%s --norandom --nomount --noinit --preload-zfs-library /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set compression=off osv" --forward tcp:127.0.0.1:%s-:10000' % (zfs_builder_name,arch,image_path,console,upload_port), shell=True, stdout=subprocess.PIPE)
+    osv = subprocess.Popen('cd ../..; scripts/run.py -k --kernel-path build/release/%s --arch=%s --vnc none -m 1G -c1 -i "%s" --block-device-cache unsafe -s -e "%s --norandom --nomount --noinit --preload-zfs-library /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set compression=off osv" --forward tcp:127.0.0.1:%s-:10000' % (zfs_builder_name,arch,image_path,console,upload_port), shell=True, stdout=subprocess.PIPE)
     upload(osv, manifest, depends, upload_port)
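For context, upload_manifest.py drives the ZFS builder guest through the
subprocess module. Here is a minimal self-contained sketch of the same Popen
pattern; a harmless echo stands in for the real scripts/run.py command:

```python
import subprocess

# Same pattern as upload_manifest.py: run a shell command with stdout
# captured, so the parent process can read the guest's console output.
# 'echo' stands in for the real scripts/run.py invocation.
osv = subprocess.Popen('echo OSv booted', shell=True,
                       stdout=subprocess.PIPE)
out, _ = osv.communicate()
print(out.decode().strip())
```

In the real script, the captured stdout is how upload() knows the builder VM
has booted and is ready to accept files.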
In case you see a similar error when running the image, also try to bump up
the memory, e.g. `run.py -m 1G`.
We should probably detect this condition in core/mmu.cc and handle it better,
possibly by informing the user that there is not enough physical memory:
181
182 // For now, only allow non-mmaped areas. Later, we can either
183 // bounce such addresses, or lock them in memory and translate
184 assert(virt >= phys_mem);
185 return reinterpret_cast<uintptr_t>(virt) & (mem_area_size - 1);
186 }
On Wednesday, December 13, 2023 at 10:57:13 PM UTC-5 Darren L wrote:
> Hi Waldek,
>
> I am able to make the issue appear again by creating a big file (e.g. dd
> if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock), creating a
> new app within the apps folder called big_file, setting the usr.manifest to
> include /big_file/**: ${MODULE_DIR}/**, and running:
>
> ./scripts/build fs_size_mb=8192 image=python-from-host,java8,big_file
>
> I updated the buf.h file, and the issue remains.
>
> When I used fs=rofs instead, the issue still occurs, but it doesn't appear
> until I run the run.py script. For example, the last few lines of the build
> output state:
>
> First block: 5242316, blocks count: 5127
> Directory entries count 43078
> Symlinks count 10
> Inodes count 43079
>
> But when I run
>
> ./scripts/run.py -e "/python3.10"
>
> I get the following message:
>
> OSv v0.57.0-86-g873cb55a
>
> Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)
>
> [backtrace]
> 0x0000000040244a7c <__assert_fail+28>
> 0x00000000402b85fc <mmu::virt_to_phys(void*)+92>
> 0x00000000402ef3e2 <void mmu::virt_to_phys<virtio::vring::add_sg(void*,
> unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned
> long)#1}>(void*, unsigned long, virtio::vring::add_sg(void*, unsigned int,
> virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1})+34>
> 0x00000000402ef148 <virtio::blk::make_request(bio*)+312>
> 0x00000000403bfd75 <multiplex_strategy+197>
> 0x00000000403cacbb <rofs_read_blocks(device*, unsigned long, unsigned
> long, void*)+107>
> 0x00000000403c91fd <???+1077711357>
> 0x00000000403c1476 <sys_mount+694>
> 0x00000000403beadb <mount_rofs_rootfs+59>
> 0x0000000040239491 <do_main_thread(void*)+7809>
> 0x00000000403e4e69 <???+1077825129>
> 0x000000004037db2d <thread_main_c+45>
> 0x000000004030b361 <???+1076933473>
>
> When I use Virtio-FS, the error does not seem to appear, but it possibly
> causes different errors of its own. When trying to run Python in the
> unikernel, I get:
>
> sudo PATH=/usr/lib/qemu:$PATH ./scripts/run.py --virtio-fs-tag=myfs
> --virtio-fs-dir=$(pwd)/build/export -e "/python3.10"
> OSv v0.57.0-86-g873cb55a
> eth0: 192.168.122.15
> Booted up in 109.01 ms
> Cmdline: /python3.10
> Fatal Python error: _Py_HashRandomization_Init: failed to get random
> numbers to initialize Python
> Python runtime state: preinitialized
>
> Hope this is enough to replicate the issue. Thank you!
>
> On Monday, December 11, 2023 at 3:03:01 PM UTC-5 [email protected] wrote:
>
>> Hi,
>>
>> On Thursday, December 7, 2023 at 6:43:00 PM UTC-5 Darren L wrote:
>>
>> Hi Waldek,
>>
>> Thanks for the quick response. For more details, I am trying to run a
>> research prototype that requires both Python 3 and Java 8 to run correctly,
>> and I placed the prototype's executable files (which include large amounts
>> of static data required to run the program) as an app in the apps
>> directory to be linked into the image. The individual files are not huge (up
>> to a few hundred MB each), but there are a lot of files that need to be
>> included to run the prototype (totalling 2-3 GB altogether). I wasn't
>> sure if this was the best approach. It seems that I could also dynamically
>> link it into the .img file after the fact, since I only need the Python and
>> Java images to run the program correctly?
>>
>> The error I am getting occurs when I run the following command:
>> `./scripts/build fs_size_mb=8192 image=python-from-host,java8,prototype`,
>> where prototype contains my large executable files. The error occurs during
>> the build process. It states:
>>
>> ```
>> Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)
>>
>> [backtrace]
>> 0x0000000040244a7c <__assert_fail+28>
>> 0x00000000402b85fc <mmu::virt_to_phys(void*)+92>
>> 0x00000000402ef3e2 <void mmu::virt_to_phys<virtio::vring::add_sg(void*,
>> unsigned int, virtio::vring_desc::flags)::{lambda(unsigned long, unsigned
>> long)#1}>(void*, unsigned long, virtio::vring::add_sg(void*, unsigned int,
>> virtio::vring_desc::flags)::{lambda(unsigned long, unsigned long)#1})+34>
>> 0x00000000402ef1e6 <virtio::blk::make_request(bio*)+470>
>> 0x000010000006b1e5 <???+438757>
>> 0x000010000009620a <???+614922>
>> 0x000010000006e476 <???+451702>
>> 0x000010000009620a <???+614922>
>> 0x0000000040260e1e <???+1076235806>
>> 0x0000000040260ef2 <taskqueue_thread_loop+82>
>> 0x000000004037db2d <thread_main_c+45>
>> 0x000000004030b361 <???+1076933473>
>> ```
>>
>> I do not think this error has anything to do with the ELF layout or the
>> kernel shift. I think this issue is different and it happens when OSv runs
>> during the build process (ZFS builder) to create the ZFS disk and upload
>> all files.
>> I am still very interested in replicating and fixing it.
>>
>> My wild guess is that it may be caused by a bug somebody else
>> discovered and fixed in this pull request -
>> https://github.com/cloudius-systems/osv/pull/1284/files. Can you update
>> only the include/osv/buf.h file to see if your problem goes away?
>> Otherwise, I have to be able to replicate it somehow.
>>
>> Also, can you try to build a ROFS image (add fs=rofs to your build
>> command) and run it?
>>
>> There is also an option to use Virtio-FS (see
>> https://github.com/cloudius-systems/osv/wiki/virtio-fs).
>>
>> I'm using the most recent OSv pulled from GitHub. I'm not using node but
>> Python. To the best of my understanding, this is what I got using readelf
>> for Java and Python. I ran readelf on the files on my own machine, which
>> were then transferred into OSv. I wasn't sure how to use `readelf` within
>> the unikernel itself, so I might need guidance on using it within OSv if
>> this is not sufficient.
>>
>> openjdk-8-zulu-full's java binary:
>>
>> ```
>> Elf file type is EXEC (Executable file)
>> Entry point 0x400570
>> There are 9 program headers, starting at offset 64
>>
>>
>> Program Headers:
>> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
>> PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
>> INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
>> [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
>> LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0008c4 0x0008c4 R E 0x200000
>> LOAD 0x000d98 0x0000000000600d98 0x0000000000600d98 0x00027c 0x000290 RW 0x200000
>> DYNAMIC 0x000dd0 0x0000000000600dd0 0x0000000000600dd0 0x000210 0x000210 RW 0x8
>> NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4
>> GNU_EH_FRAME 0x000818 0x0000000000400818 0x0000000000400818 0x000024 0x000024 R 0x4
>> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x8
>> GNU_RELRO 0x000d98 0x0000000000600d98 0x0000000000600d98 0x000268 0x000268 R 0x1
>>
>> Section to Segment mapping:
>> Segment Sections...
>> 00
>> 01 .interp
>> 02 .interp .note.ABI-tag .note.gnu.build-id .hash .gnu.hash
>> .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn
>> .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
>> 03 .ctors .dtors .jcr .data.rel.ro .dynamic .got .got.plt .data
>> .bss
>> 04 .dynamic
>> 05 .note.ABI-tag .note.gnu.build-id
>> 06 .eh_frame_hdr
>> 07
>> 08 .ctors .dtors .jcr .data.rel.ro .dynamic .got
>> ```
>>
>> python3.10.12:
>>
>> ```
>> Elf file type is DYN (Position-Independent Executable file)
>> Entry point 0x22cb80
>>
>> There are 13 program headers, starting at offset 64
>>
>> Program Headers:
>> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
>> PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0002d8 0x0002d8 R 0x8
>> INTERP 0x000318 0x0000000000000318 0x0000000000000318 0x00001c 0x00001c R 0x1
>> [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
>> LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x06c228 0x06c228 R 0x1000
>> LOAD 0x06d000 0x000000000006d000 0x000000000006d000 0x2b0dad 0x2b0dad R E 0x1000
>> LOAD 0x31e000 0x000000000031e000 0x000000000031e000 0x23ee58 0x23ee58 R 0x1000
>> LOAD 0x55d810 0x000000000055e810 0x000000000055e810 0x045528 0x08b6c8 RW 0x1000
>> DYNAMIC 0x562bc8 0x0000000000563bc8 0x0000000000563bc8 0x000220 0x000220 RW 0x8
>> NOTE 0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R 0x8
>> NOTE 0x000368 0x0000000000000368 0x0000000000000368 0x000044 0x000044 R 0x4
>> GNU_PROPERTY 0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R 0x8
>> GNU_EH_FRAME 0x4e72a4 0x00000000004e72a4 0x00000000004e72a4 0x012e74 0x012e74 R 0x4
>> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
>> GNU_RELRO 0x55d810 0x000000000055e810 0x000000000055e810 0x0067f0 0x0067f0 R 0x1
>>
>> Section to Segment mapping:
>> Segment Sections...
>> 00
>> 01 .interp
>> 02 .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag
>> .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
>> 03 .init .plt .plt.got .plt.sec .text .fini
>> 04 .rodata .stapsdt.base .eh_frame_hdr .eh_frame
>> 05 .init_array .fini_array .data.rel.ro .dynamic .got .data
>> .PyRuntime .probes .bss
>> 06 .dynamic
>> 07 .note.gnu.property
>> 08 .note.gnu.build-id .note.ABI-tag
>> 09 .note.gnu.property
>> 10 .eh_frame_hdr
>> 11
>> 12 .init_array .fini_array .data.rel.ro .dynamic .got
>> ```
>>
>> By the way, you can use a stock JDK or Python from your Linux host (see
>> modules/openjdk9_1x-from-host -> image=...,openjdk9_1x-from-host) and the
>> same for Python (apps/python-from-host -> image=...,python-from-host).
>>
>> Let me know if you have any issues.
>>
>> Let me know if there is anything else you would like to see, or if you
>> would like more clarification on anything mentioned above. Thank you for
>> the help!
>>
>> Sincerely,
>> Darren
>>
>> On Thursday, December 7, 2023 at 9:56:34 AM UTC-5 [email protected]
>> wrote:
>>
>> Hi,
>>
>> The 2GB limit and the commit you are referring to should only limit the
>> size of position-dependent executables (these executables typically want
>> to be loaded in the place where the OSv kernel used to be before this
>> commit).
>>
>> Are your executables larger than 2GB in size? Can you run 'readelf -W -l'
>> against java and node like in this example:
>>
>> readelf -W -l /usr/lib/jvm/java-8-openjdk-amd64/bin/java
>>
>> Elf file type is DYN (Position-Independent Executable file)
>> Entry point 0x10b0
>> There are 13 program headers, starting at offset 64
>>
>> Program Headers:
>> Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
>> PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0002d8 0x0002d8 R 0x8
>> INTERP 0x000318 0x0000000000000318 0x0000000000000318 0x00001c 0x00001c R 0x1
>> [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
>> LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000750 0x000750 R 0x1000
>> LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x0001a9 0x0001a9 R E 0x1000
>> LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x00011c 0x00011c R 0x1000
>> LOAD 0x002d48 0x0000000000003d48 0x0000000000003d48 0x0002c8 0x0002d0 RW 0x1000
>> DYNAMIC 0x002d58 0x0000000000003d58 0x0000000000003d58 0x000260 0x000260 RW 0x8
>> NOTE 0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R 0x8
>> NOTE 0x000368 0x0000000000000368 0x0000000000000368 0x000044 0x000044 R 0x4
>> GNU_PROPERTY 0x000338 0x0000000000000338 0x0000000000000338 0x000030 0x000030 R 0x8
>> GNU_EH_FRAME 0x002038 0x0000000000002038 0x0000000000002038 0x000034 0x000034 R 0x4
>> GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
>> GNU_RELRO 0x002d48 0x0000000000003d48 0x0000000000003d48 0x0002b8 0x0002b8 R 0x1
>>
>> Can you give us more details about your use case and the error you are
>> getting with the existing limit of 2GB? (I am assuming there is a reason
>> you want to increase it.)
>>
>> Regards,
>> Waldek
>>
>> On Thu, Dec 7, 2023 at 3:52 AM Darren Lim <[email protected]> wrote:
>>
>> Hello,
>>
>> Per the patch here (
>> https://github.com/cloudius-systems/osv/commit/2a1795db8a22b0b963a64d068f5d8acc93e5785d),
>>
>> I was hoping to get help with making the changes to increase the kernel
>> limit from 2GB to a larger size.
>>
>> For context, I am trying to load a large project (~3GB) into the
>> unikernel, along with fairly large language runtimes (Java, Python), by
>> creating a custom image for the build script. It currently fails an
>> assertion in mmu.cc, stating:
>>
>> Assertion failed: virt >= phys_mem (core/mmu.cc: virt_to_phys: 184)
>>
>> Thank you!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/osv-dev/a5c2844a-e03e-4ac4-8e0a-de81f575889fn%40googlegroups.com.
>>
>>