Hey!

Interesting paper, Waldek!
I'm trying to push OSv to my friends here, let's hope
this sparks some contributions :)


Kind Regards,

Geraldo Netto
Sapere Aude => Non dvcor, dvco
http://exdev.sf.net/

On Thu, 7 Mar 2019 at 13:56, Waldek Kozaczuk <[email protected]> wrote:
>
> Hi,
>
> I am forwarding here my exchange with the author of the paper about OSv on
> Xen beating Docker in a single-vCPU setup. Adding the link to the article here:
> https://biblio.ugent.be/publication/8582433/file/8582438.pdf
>
> Enjoy,
> Waldek
> ---------- Forwarded message ---------
> From: Waldek Kozaczuk <[email protected]>
> Date: Sat, Mar 2, 2019 at 5:17 PM
> Subject: Re: Unikernel performance review paper inquiry
> To: Tom Goethals <[email protected]>
>
>
> Tom,
>
> Thanks for replying to my email.
>
> Do you mind if I forward our email exchange to the OSv development group
> [email protected]? I think that others would also be very interested in
> your article and findings. You might also get some good insight from
> OSv users and developers more experienced than me.
>
> Please see my replies to some of your comments below.
>
> Regards,
> Waldek
>
> On Tue, Feb 26, 2019 at 5:19 AM Tom Goethals <[email protected]> 
> wrote:
>>
>> Hello Waldek,
>>
>>
>> Thank you for the thorough read of my paper. It's interesting to see it from 
>> the perspective of an OSv contributor. I have to admit I was very new to 
>> unikernels (I had barely started my PhD) when I wrote the paper, so I may 
>> have missed some things. Considering the number of bullet points, I inserted
>> my answers/comments below for each point.
>>
>>
>> Generally, I did choose OSv as a focus because it seemed (by far) the most 
>> mature and stable platform to create and run unikernels with, so it's good 
>> to see new features and compatibility for languages being added. In the 
>> future, I would really like to do some research on mixed container-unikernel 
>> deployments with a single orchestration platform, but someone else may very 
>> well beat me to it :)
>>
>>
>> During the tests I did not find any large problems or lack of features in 
>> OSv, so I have little to add there. In fact, once I got the hang of it, OSv 
>> unikernels were quite easy to build and run on a lot of hypervisors. Most 
>> problems were about getting Python3 to work, but that has apparently been 
>> fixed. Note that the tests were mostly centered on network performance and 
>> simple workloads, so more in-depth future work could still result in 
>> suggestions.
>>
>>
>> Regards,
>>
>> Tom
>>
>> ________________________________
>> From: Waldek Kozaczuk <[email protected]>
>> Sent: Tuesday, 26 February 2019 0:15
>> To: Tom Goethals
>> Subject: Unikernel performance review paper inquiry
>>
>> Hi,
>>
>> Congratulations on writing this interesting and thorough paper -
>> https://biblio.ugent.be/publication/8582433/file/8582438.pdf !!! Indeed, it
>> is probably the first paper trying to compare the performance of OSv and
>> Docker containers.
>>
>> I hope this email finds the authors of this paper as I would like to ask 
>> some questions regarding it as well as clarify/explain some observations 
>> about OSv you touched on.
>>
>> First I wanted to introduce myself. My name is Waldemar Kozaczuk and I am 
>> one of the OSv committers. Though I am not one of the original authors of 
>> OSv, I have been playing with it and contributing to it since 2015. My major
>> contributions have been:
>>
>> - Implementing ROFS (Read-Only FS)
>> - Adding support for Golang and Python 3
>> - Enhancing OSv to make it run on AWS Firecracker
>>
>> I must say I have been very pleased to learn how well OSv did in the
>> single-vCPU performance tests, but also a little disappointed that OSv did
>> not fare that well in the multi-vCPU tests ;-) On the other hand, I was not
>> surprised by the results of the workload tests. I would love to be able to
>> reproduce them myself at some point. I noticed you refer to this project,
>> https://github.com/togoetha/unikernels-v-containers, where I have found the
>> source code and build scripts for the OSv and Docker images. However, I
>> could not locate any scripts or JMeter setups that would let me run those
>> tests. Could you possibly point me to those?
>> => I did not actually make any JMeter scripts. The JMeter GUI was used to 
>> find the breaking point for each service/language, but admittedly it was 
>> time-consuming work that could have been handled better. I can however tell 
>> you the basic settings: 40 threads, started simultaneously (no buildup), 
>> each sending 50000 requests ASAP. Results reflect the number of responses 
>> per second and latency. Some fiddling was done with the settings (fewer
>> threads, more threads, varying buildup, ...), but the best results were
>> actually gained by simply unleashing concurrency hell on the services and 
>> seeing how fast they could handle it. It's somewhat possible the real limits 
>> were slightly higher, but ironically the machine used to generate the load 
>> with JMeter couldn't go any higher.
>>
>> Let me structure the remaining part of this email as a list of bullet points
>> referring to your article:
>>
>> You mention in the introduction that unikernels are hard to debug and lack 
>> good debugging tools. I would agree with it but also point out that OSv 
>> shines in this aspect:
>>
>> - can be easily debugged with gdb -
>>   https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb
>> - provides a management and monitoring REST API -
>>   http://osv.io/api/swagger-ui/dist/index.html and an HTML5 terminal app -
>>   https://github.com/wkozaczuk/osv-html5-terminal
>> - can be profiled -
>>   https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py
>>
>> => Debugging was mostly a general remark about the state of the art in 
>> unikernels. Indeed, I had an easier time troubleshooting and debugging OSv 
>> unikernels than other unikernel platforms. Being POSIX compatible (as 
>> opposed to "clean slate" alternatives such as IncludeOS) really helps there.
>>
>> Which version of Capstan did you use? Were you using the latest Mikelangelo
>> Capstan - https://github.com/mikelangelo-project/capstan - which supports
>> packages (similar to Docker Compose)?
>>
>> => The original Capstan from Cloudius Systems was used 
>> (https://github.com/cloudius-systems/capstan). I see much has changed in the 
>> newer version, time to catch up.
>>
>> You mention you had a problem running the Vert.x Java microservice on
>> Java > 8. Could you please clarify what the exact problem was? We have a
>> number of example apps (like
>> https://github.com/cloudius-systems/osv-apps/tree/master/openjdk11-zulu-java-base)
>> that demonstrate running a simple "hello world" even on the latest Java 11.
>> I wonder if you hit an issue related to the non-isolated vs isolated mode
>> of running Java. The isolated mode was the default before this commit
>> https://github.com/cloudius-systems/osv/commit/99dd1c5b521a0ab4642e79a2e992c50ad719f7c6
>> (made after the release of 0.51.0) and, unlike the non-isolated one, is not
>> supported on Java > 8. I am also aware that our tiny run-java wrapper does
>> not support new options added in Java 9 and beyond - like "--add-exports" -
>> which might be necessary when running a Vert.x app, which uses Netty.
>>
>> => Actually, I did get Java to work properly with the included JDKs up to
>> Java 10 or so. As I remember, my problem was in not getting minimal JREs to
>> run correctly on OSv, resulting from the fact that those were cross-compiled
>> from the JDK on my machine in a rather hackish way in an attempt to get them
>> to run on OSv. However, that was not a problem for the tests, it was just a
>> curiosity. I'm guessing this could work if the entire JDK on the local
>> machine were first recompiled to suit OSv, and minimal JREs were then built
>> from it on the local machine to run on OSv. However, time was a bit short
>> and I just dropped it.
>
> In general OSv should be able to run any unmodified (without need to
> recompile) Linux JDK distribution. The best I have found are from Azul
> (Zulu). But Amazon recently started releasing their own OpenJDK distribution,
> Corretto (https://aws.amazon.com/corretto/), which I have not tried yet.
>>
>>
>> Indeed, Python 3 was not supported as of 0.51.0 but is supported now as of
>> 0.52.0 - https://github.com/cloudius-systems/osv/releases/tag/v0.52.0
>>
>> => Nice! A lot of improvement in resource use too, and ffmpeg looks 
>> interesting for handling camera streams.
>
> BTW I worked with some people from the EU Mikelangelo project to get ffmpeg
> on OSv to do some video transcoding.
> Also, hopefully within the next week or two I will be trying to cut a new
> 0.53.0 release of OSv. The most exciting feature will be support for AWS
> Firecracker (https://firecracker-microvm.github.io/), which allows OSv to
> boot in 7ms. Stay tuned!
>>
>>
>> As you have noticed, the Go wrapper does not affect performance. It was
>> merely added to provide a workaround for TLS-related issues with Golang
>> apps built as a shared library; please see the description of this commit -
>> https://github.com/cloudius-systems/osv/commit/438008362a8ef74666b4e44af4b3205b86a52d06
>> for details.
>>
>> => From the code, it was pretty clear the wrapper did not have any adverse 
>> effects, but I sort of had to confirm it for the paper. I did find the post 
>> you supplied, but didn't realize it was for TLS issues. Thanks.
>>
>> You have not mentioned it specifically, but I am guessing all the OSv images
>> you built used the ZFS filesystem. Please note that even with 0.51.0 we had
>> support for a simple Read-Only Filesystem (ROFS). I think ROFS is an even
>> better fit for microservice apps on OSv. You can build OSv images with ROFS
>> using the latest Mikelangelo Capstan -
>> https://github.com/mikelangelo-project/capstan/blob/master/Documentation/OsvFilesystem.md
>>
>> => Yes, all images were using ZFS. Considering how small the services are, I 
>> would guess they're kept entirely in memory so I'm not sure how much of an 
>> improvement ROFS may be (not very familiar with that sort of thing), but 
>> it's definitely interesting for real-life stateless services.
>>
>> I must say I was a little astonished you were able to successfully test
>> Golang apps built with --buildmode=pie. As you can see in
>> https://github.com/cloudius-systems/osv/issues/352, OSv currently does not
>> support TLS (Thread Local Storage) in local-exec mode (you probably saw this
>> warning printed by OSv - "WARNING: XYZ.so is a PIE using TLS..."). But
>> apparently Golang apps built as PIE do not use TLS, or we are just lucky. I
>> was myself surprised I could run PIE Golang apps like httpserver without any
>> issues.
>>
>> => Yes, that warning popped up on my screen plenty of times. But since it
>> worked, it didn't seem very important.
>>
>> I was surprised to hear about OSv scaling poorly under multi-CPU tests. I
>> have not really tested OSv much on Xen except for running it on AWS EC2
>> instances, so I do not know what the reasons for that might be. I will ask on
>> our mailing list. On the other hand, I must say that during casual tests on
>> my MacBook Pro with 4 hyper-threading i7 cores, running Ubuntu 18.10, I was
>> able to see OSv scale pretty well with QEMU/KVM (a type 2 hypervisor, I
>> believe):
>>
>> - for example, I was able to see a 50-60% performance increase when going
>>   from 1 to 2 vCPUs
>> - also, with 2 vCPUs I was able to see 10-15% better performance on OSv
>>   compared to the same app running directly on the Linux host, which shocked
>>   me a little, I must admit
>> - for my tests I was using this app -
>>   https://github.com/cloudius-systems/osv-apps/tree/master/golang-httpserver -
>>   and the ab tool (Apache Bench) to simulate load
>> - lastly, I saw even better performance from a microservice written in Rust
>>   (https://github.com/cloudius-systems/osv-apps/tree/master/rust-httpserver)
>>
>> => Actually, I ran some of the unikernels on VirtualBox first as proof of 
>> concept. While VirtualBox gave comparatively disastrous numbers (I guess 
>> it's not really the best hypervisor), the effect of multithreading was the 
>> same there. At first, I thought my coding was to blame, but the problem 
>> seems to persist across programming languages (with varying impact). The 
>> only real lead I managed to discover was that most of the requests under Go 
>> are in fact handled way faster in a multi-threaded application, but a tiny 
>> percentage gets held up for up to a second, which seems to block general 
>> throughput a bit. The tests used a rather large connection pool (up to 40), 
>> so my best guess at the time was that the sheer number of concurrent
>> requests coming in caused a few too many locks on something in the
>> scheduler, but I don't really have experience with that level of
>> programming. Since I didn't do any tests with 2 vCPUs, maybe there's an
>> issue with larger vCPU pools? It does seem to be very language dependent, and I'm
>> still not clear on Go's behavior in that case. Of course, OSv unikernels 
>> have excellent single core performance, so if the scaling problem is fixed 
>> or was somehow due to the way I tested I'm pretty sure they would demolish 
>> containers or standard Linux in any comparison.
>>
>> I saw you mentioned that occasionally OSv would stop responding during
>> multi-CPU tests. Could you please elaborate on this? I myself occasionally
>> see a "lingering connection closed" issue where a couple of requests never
>> come back when OSv gets hit with a long batch (>100K) of requests by ab.
>> However, in my case, if I simply restart the test with ab, OSv will continue
>> responding. Is this similar to what you saw?
>>
>> => This sometimes happened when running Go unikernels without the wrapper, 
>> and in all languages when they were multi-thread enabled. I don't seem to 
>> have saved any screenshots or output, but as I remember all requests got 
>> locked up (and eventually gave a timeout) and I had to restart the entire 
>> test anyway because the results got invalidated by the gap. I will try to 
>> get hold of the original test setup again to see if OSv still responds if 
>> new requests are started. Note that it was pretty rare, since it sometimes 
>> only happened after up to 10 million requests.
>>
>> I would also like to point out that most container deployments, at least in
>> public clouds (even the AWS ECS offering), use virtual machines like EC2
>> instances, NOT bare-metal computers. So the difference in performance between
>> containers and OSv, which can run directly as a guest OS on an EC2 VM, might
>> be even more profound. Am I wrong? But that might not be fair to containers ;-)
>>
>> => Indeed, this was also mentioned a few times by reviewers and attendees at 
>> the conference. However, I was trying to get as close to bare metal for both 
>> unikernels and containers to make the comparison more "fair". I guess in 
>> real life it's even easier for unikernels to win...
>>
>> Have you considered testing Node.js microservices or ones written in Rust? 
>> We also have a working example of running GraalVM Java native images on OSv.
>>
>> => This may happen in the future. The focus was put on Go, Java and Python 
>> because Java and Python are commonly used in my research team, and we're 
>> trying to switch some things to Go as well.
>>
>> Lastly, besides improving the performance of multithreaded apps, what else
>> would you like to see enhanced or improved in OSv? I am interested in your
>> thoughts.
>>
>> Looking forward to a reply and sorry for my long email. Thanks in advance 
>> for your reply.
>>
>> My regards,
>> Waldek
>>
>> PS. Please follow us on Google groups 
>> (https://groups.google.com/forum/#!forum/osv-dev) and on Twitter at 
>> #OSv_unikernel
>>
>>
> --
> You received this message because you are subscribed to the Google Groups 
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
