Disk/file IO speed Re: OT: Upload and Download to/from an OpenBSD host
> On Mon, Oct 30, 2017 at 09:23:51PM +0200, Mihai Popescu wrote:
..
> Back in a former life, I often had to transfer terabytes of information
> between hosts in my network. The goal was to reduce the overhead of the
> transfer so that more of the data would get transferred. I had been
> using netcat/nc and pushing data through the connection as fast as I
> could. This worked great, provided I could verify that things were being
> transferred without errors. Any connection problems and I'd be screwed.
..
> Essentially, he's compressing files and then sending them over to fit
> more data down the pipe. I also learned of this other program called
..
> Check those out, and I'm very curious if anyone else on the list has
> more OpenBSD-centric techniques!

For file transfers today, I presume speed is limited not by the networking
subsystem or even by compression speed, but by the block device/file IO
subsystem, which last time I checked caps out at around 120 MB/s system-wide.
(Retrieval from the buffer cache is faster, but that boost drops off very
quickly.) A SATA SSD gives you ~450 MB/s, and an M.2 NVMe SSD ~900 MB/s,
both sequential and random, with multiples of that in a RAID.

You can get some of this performance in OpenBSD today by making your accesses
to /dev/rsd* only, in 16KB-aligned multiples of 16KB; that makes OpenBSD
yield ~600 MB/s or so. However, this access mode has plenty of limitations;
for instance, /dev/rsd* doesn't support mmap().

My best understanding of OpenBSD's block device/file IO subsystem is that it
has a "biglock" and that all underlying disk/file access is serialized and
totally synchronous, with no support for or use of multiqueuing or any other
parallelization. Implementing asynchronous logic is tedious, and from a
security perspective I would understand the charm of keeping an
implementation that is cemented to be synchronous only. Is this what's going
on?
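The aligned raw-device access pattern described above can be sketched roughly
as follows (the /dev/rsd0c device name is only an example; the demonstration
below uses a regular file instead of a raw disk, so it is safe to run):

```shell
# On a real system you would read the raw character device with a
# 16KB block size, e.g. (example device name, do not run blindly):
#   dd if=/dev/rsd0c of=/dev/null bs=16k
# The same bs=16k pattern demonstrated against an ordinary file:
dd if=/dev/zero of=blob.bin bs=16k count=4 2>/dev/null  # create 64KB test file
dd if=blob.bin of=/dev/null bs=16k 2>/dev/null          # read it back in 16KB blocks
wc -c blob.bin
```

dd(1) issues one read/write per 16KB block here, which is what keeps the
transfers aligned to 16KB multiples.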
All this leads me to the perception that of all areas in OpenBSD, disk/file
IO is the weakest, compared to the many other areas where OpenBSD is
obviously the very best. (Much lower-priority, though related, topics are the
unified buffer cache https://marc.info/?t=14322826671&r=1&w=2 and the buffer
cache 4GB limit https://marc.info/?t=14553871052&r=1&w=2 ,
https://marc.info/?l=openbsd-tech&m=147905745617957&w=2 ,
https://marc.info/?t=14682443664&r=1&w=2 ,
https://unix.stackexchange.com/questions/61459/does-sysctl-kern-bufcachepercent-not-work-in-openbsd-5-2-above-1-7gb .)

Is there any interest among developers in speeding up disk/file IO? What
would it actually involve? What would be needed from the community:
donations? Would directed donations be welcome?

Thank you very much, and sorry for the buzz.
Re: OT: Upload and Download to/from an OpenBSD host
Hi Mihai

On Mon, 30 Oct 2017 21:23:51 +0200 Mihai Popescu wrote:
> I am trying to set up a solution on an OpenBSD computer, where I want
> to upload and then download a large volume of data. I was using the
> ftpd daemon to do this, but I wonder if there is another way to do
> this, regarding speed of transfer.

If on a trustworthy private network or via a crossover network cable,
netcat can be quite fast, e.g.:

# I started netcat listening on a host with spare space:
$ umask 077; nc -l 5 | dd of=/mnt/kingswood/_home.dump

# On the cramped host, I unmounted & disk dumped to netcat:
$ mktemp /tmp/operator/tmp.UZEOHQyzDH
$ dump -0anu -f - /dev/rwd1f 2>/tmp/operator/tmp.UZEOHQyzDH | nc -N -w 15 torana.internal 5
  DUMP: Date of this level 0 dump: Fri Aug 21 12:56:36 2015
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rwd1f (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 190840212 tape blocks.
  DUMP: Volume 1 started at: Fri Aug 21 12:56:48 2015
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 0.80% done, finished in 10:21
  DUMP: 1.62% done, finished in 10:06
  DUMP: 2.44% done, finished in 9:59
  DUMP: 3.26% done, finished in 9:52
  DUMP: 4.08% done, finished in 9:47
  DUMP: 4.91% done, finished in 9:40
  ...
  DUMP: 97.54% done, finished in 0:15
  DUMP: 98.35% done, finished in 0:10
  DUMP: 99.17% done, finished in 0:05
  DUMP: 99.99% done, finished in 0:00
  DUMP: 190837578 tape blocks
  DUMP: Date of this level 0 dump: Fri Aug 21 12:56:36 2015
  DUMP: Volume 1 completed at: Fri Aug 21 23:06:51 2015
  DUMP: Volume 1 took 10:10:03
  DUMP: Volume 1 transfer rate: 5213 KB/s
  DUMP: Date this dump completed: Fri Aug 21 23:06:51 2015
  DUMP: Average transfer rate: 5213 KB/s
  DUMP: level 0 dump on Fri Aug 21 12:56:36 2015
  DUMP: DUMP IS DONE

# Netcat to dd on the spacious host logged:
314140569+87238623 records in
381675140+0 records out
195417671680 bytes transferred in 37251.937 secs (5245839 bytes/sec)

$ df -h /mnt/kingswood
Filesystem     Size    Used    Avail   Capacity  Mounted on
/dev/sd1g      210G    182G    17.5G     91%     /mnt/kingswood

$ ls -lh /mnt/kingswood/
total 381722816
-rw-------  1 operator  operator   182G Aug 21 23:06 _home.dump
...

# After rejigging the disks on the cramped host, newfs, etc, I restored:
# nc -l 5 | restore -ryvf - > restore.output.$RANDOM 2>&1

# Transfer the dump back to the previously cramped host, via netcat:
$ dd if=/mnt/kingswood/_home.dump | nc -v -N -w 15 kingswood.internal 5
Connection to kingswood.internal 5 port [tcp/*] succeeded!
381675140+0 records in
381675140+0 records out
195417671680 bytes transferred in 29107.667 secs (6713615 bytes/sec)

# less /home/restore.output.569
Level 0 dump of /home on kingswood.internal:/dev/wd1f
Label: none
Verify tape and initialize maps
Dump date: Fri Aug 21 12:56:36 2015
Dumped from: the epoch
Begin level 0 restore
Initialize symbol table.
Extract directories from tape
Calculate extraction list.
Make node ..

$ df -h /home
Filesystem     Size    Used    Avail   Capacity  Mounted on
/dev/wd1d      299G    182G    102G      64%     /home

182G was restored on the newly formatted and enlarged partition (now 'd'
instead of 'f'), via netcat, from another host. As well as disk partitions,
dump(8) works on files & directories too. Everything needed is in base
OpenBSD. Ace!
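One thing the netcat pipeline above does not give you is an integrity check
on the transferred dump. A common approach (my addition, not from the
thread) is to checksum the data on both ends and compare; sha256sum is shown
below, while OpenBSD base ships sha256(1) instead. The local dd is a
stand-in for `dd if=... | nc host port` on the sender and
`nc -l port | dd of=...` on the receiver:

```shell
# Create a small stand-in dump file:
echo "example dump payload" > sample.dump
# Stand-in for the transfer over netcat:
dd if=sample.dump of=received.dump bs=64k 2>/dev/null
# Checksums on both "ends" should match:
sha256sum sample.dump received.dump
```

If the two hashes differ, the transfer was corrupted and should be repeated.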
-- Craig Skinner | http://linkd.in/yGqkv7
Re: OT: Upload and Download to/from an OpenBSD host
On 2017-10-30 20:23, Mihai Popescu wrote:
> Hi,
>
> I am trying to set up a solution on an OpenBSD computer, where I want
> to upload and then download a large volume of data. I was using the
> ftpd daemon to do this, but I wonder if there is another way to do
> this, regarding speed of transfer.
>
> Sometimes I was in situations where I had to upload and then download
> data to/from an OpenBSD computer. This happened whenever I got Windows
> or Linux machines hooked in to retrieve a large volume of data from
> their internal disks. The machines are both plugged into a switch, or
> they can be directly linked with an Ethernet cable.
>
> If you think there is another way to do it, other than moving the
> disks between machines, please put some ideas here.
>
> Thanks.

Hello,

If you trust your LAN (are the computers on the same LAN?), I would
recommend using rsync and rsyncd. Without encryption it's really fast and
efficient.
Re: OT: Upload and Download to/from an OpenBSD host
On Mon, Oct 30, 2017 at 09:23:51PM +0200, Mihai Popescu wrote:
> Hi,
>
> I am trying to set up a solution on an OpenBSD computer, where I want
> to upload and then download a large volume of data. I was using the
> ftpd daemon to do this, but I wonder if there is another way to do
> this, regarding speed of transfer.

Back in a former life, I often had to transfer terabytes of information
between hosts in my network. The goal was to reduce the overhead of the
transfer so that more of the data would get transferred. I had been using
netcat/nc and pushing data through the connection as fast as I could. This
worked great, provided I could verify that things were being transferred
without errors. Any connection problems and I'd be screwed.

Eventually, I found this web page, which details another guy's research into
the same problem:

http://intermediatesql.com/linux/scrap-the-scp-how-to-copy-data-fast-using-pigz-and-nc/

Essentially, he's compressing files and then sending them over to fit more
data down the pipe. I also learned of another program called bbcp, which was
developed specifically for our purposes. It's mentioned in the above
article, though that author wasn't big on it. I like it, though; it's what I
ended up using, because to me it is a more elegant solution and it worked
well for my needs.

http://www.slac.stanford.edu/~abh/bbcp/

Check those out, and I'm very curious if anyone else on the list has more
OpenBSD-centric techniques!

--
Put your Nose to the Grindstone!
 -- Amalgamated Plastic Surgeons and Toolmakers, Ltd.
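The pigz+nc technique from the linked article amounts to a compress-send,
receive-decompress pipeline. A sketch, with example host name and port
(receiver.example, 7000), followed by the same pipeline demonstrated locally
with plain gzip (pigz is a parallel drop-in replacement, available as a
package):

```shell
# Receiver (example):  nc -l 7000 | pigz -d | tar -xf - -C /restore
# Sender (example):    tar -cf - /data | pigz | nc -N receiver.example 7000
#
# The compress/decompress round trip demonstrated locally:
mkdir -p demo && echo "payload" > demo/file.txt
tar -cf - demo | gzip | gzip -d | tar -tf -
```

Compression trades CPU for bandwidth, so it only wins when the link, not the
CPU, is the bottleneck and the data compresses reasonably well.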
OT: Upload and Download to/from an OpenBSD host
Hi,

I am trying to set up a solution on an OpenBSD computer, where I want to
upload and then download a large volume of data. I was using the ftpd daemon
to do this, but I wonder if there is another way to do this, regarding speed
of transfer.

Sometimes I was in situations where I had to upload and then download data
to/from an OpenBSD computer. This happened whenever I got Windows or Linux
machines hooked in to retrieve a large volume of data from their internal
disks. The machines are both plugged into a switch, or they can be directly
linked with an Ethernet cable.

If you think there is another way to do it, other than moving the disks
between machines, please put some ideas here.

Thanks.