Disk/file IO speed Re: OT: Upload and Download to/from an OpenBSD host

2017-10-31 Thread tinkr
> On Mon, Oct 30, 2017 at 09:23:51PM +0200, Mihai Popescu wrote:
..
> Back in a former life, I often had to transfer terabytes of information
> between hosts in my network. The goal was to reduce the overhead of the
> transfer so that more of the data would get transferred. I had been
> using netcat/nc and pushing data through the connection as fast as I
> could. This worked great, provided I could verify that things were being
> transferred without errors. Any connection problems and I'd be screwed.
..
> Essentially, he's compressing files and then sending them over to fit
> more data down the pipe. I also learned of this other program called
..
> Check those out, and I'm very curious if anyone else on the list has
> more OpenBSD-centric techniques!

For file transfers today, I presume speed is limited not by the networking 
subsystem or even by compression speed, but by the block device/file IO 
subsystem, which last time I checked had a cap of around 120 MB/s 
system-wide.

(Retrieval from the buffer cache is faster, but that boost drops off very 
quickly.)
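For what it's worth, a plain dd run is a rough way to sanity-check what a given box actually delivers (paths and sizes below are only illustrative):

```shell
# Write 256 MB of zeroes; dd reports throughput when it finishes.
# (bs=1048576 is 1 MB; OpenBSD dd also accepts the shorthand bs=1m.)
dd if=/dev/zero of=/tmp/ddtest bs=1048576 count=256 && sync

# Read it back. If the file is still in the buffer cache this measures
# RAM, not the disk, so unmount/remount between runs for honest numbers.
dd if=/tmp/ddtest of=/dev/null bs=1048576
rm /tmp/ddtest
```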

A SATA SSD gives you ~450 MB/s, and an M.2 NVMe SSD gives you ~900 MB/s, 
sequential and random. You get multiples of that in a RAID.


You can get some of this performance in OpenBSD today by making all your 
accesses through /dev/rsd* only, in 16 KB-aligned multiples of 16 KB; that 
makes OpenBSD yield ~600 MB/s or so. However, this access mode has plenty of 
limitations, for instance /dev/rsd* doesn't support mmap().
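A sketch of that access pattern (sd0c and the count are placeholders for your own disk; the raw device needs root, and each read must start on and span a 16 KB boundary):

```shell
# Sequential 16 KB-aligned reads straight from the raw character device,
# bypassing the buffer cache entirely. bs=16k keeps every read a 16 KB
# multiple starting at a 16 KB offset.
dd if=/dev/rsd0c of=/dev/null bs=16k count=65536    # ~1 GB total
```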


My best understanding of OpenBSD's block device/file IO subsystem is that it 
has a "biglock" and that all underlying disk/file access is serialized and 
fully synchronous, with no support for or use of multiqueuing or any other 
form of parallelization.

Implementing asynchronous logic is tedious, and from a security perspective I 
can see the appeal of keeping an implementation that is cemented as 
synchronous-only. Is this what's going on?


All this leads me to the perception that of all areas in OpenBSD, disk/file IO 
is the weakest, compared to the many other areas where OpenBSD is obviously 
the very best.

(Way lower-priority, though related, topics are the unified buffer cache 
https://marc.info/?t=14322826671&r=1&w=2 and the buffer cache 4GB limit 
https://marc.info/?t=14553871052&r=1&w=2 , 
https://marc.info/?l=openbsd-tech&m=147905745617957&w=2 , 
https://marc.info/?t=14682443664&r=1&w=2 , 
https://unix.stackexchange.com/questions/61459/does-sysctl-kern-bufcachepercent-not-work-in-openbsd-5-2-above-1-7gb .)


Is there any interest among developers to speed up the disk/file IO?

What would it actually involve?

What's needed from the community? Donations? Would directed donations be 
welcome?

Thank you very much, and sorry for the noise.

Re: OT: Upload and Download to/from an OpenBSD host

2017-10-31 Thread Craig Skinner
Hi Mihai

On Mon, 30 Oct 2017 21:23:51 +0200 Mihai Popescu wrote:
> I am trying to set up a solution on an OpenBSD computer, where I want
> to upload and then download large volumes of data. I was using the ftpd
> daemon to do this, but I wonder if there is another way to do it,
> regarding speed of transfer.
>

If on a trustworthy private network or via a crossover network cable,
netcat can be quite fast, e.g.:


# I started netcat listening on a host with spare space:

$ umask 077; nc -l 5 | dd of=/mnt/kingswood/_home.dump


# On the cramped host, I unmounted & disk dumped to netcat:

$ mktemp
/tmp/operator/tmp.UZEOHQyzDH
$ dump -0anu -f - /dev/rwd1f 2>/tmp/operator/tmp.UZEOHQyzDH |
  nc -N -w 15 torana.internal 5

  DUMP: Date of this level 0 dump: Fri Aug 21 12:56:36 2015
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rwd1f (/home) to standard output
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 190840212 tape blocks.
  DUMP: Volume 1 started at: Fri Aug 21 12:56:48 2015
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 0.80% done, finished in 10:21
  DUMP: 1.62% done, finished in 10:06
  DUMP: 2.44% done, finished in 9:59
  DUMP: 3.26% done, finished in 9:52
  DUMP: 4.08% done, finished in 9:47
  DUMP: 4.91% done, finished in 9:40
  ...
  DUMP: 97.54% done, finished in 0:15
  DUMP: 98.35% done, finished in 0:10
  DUMP: 99.17% done, finished in 0:05
  DUMP: 99.99% done, finished in 0:00
  DUMP: 190837578 tape blocks
  DUMP: Date of this level 0 dump: Fri Aug 21 12:56:36 2015
  DUMP: Volume 1 completed at: Fri Aug 21 23:06:51 2015
  DUMP: Volume 1 took 10:10:03
  DUMP: Volume 1 transfer rate: 5213 KB/s
  DUMP: Date this dump completed:  Fri Aug 21 23:06:51 2015
  DUMP: Average transfer rate: 5213 KB/s
  DUMP: level 0 dump on Fri Aug 21 12:56:36 2015
  DUMP: DUMP IS DONE





# Netcat to dd on the spacious host logged:
314140569+87238623 records in
381675140+0 records out
195417671680 bytes transferred in 37251.937 secs (5245839 bytes/sec)


$ df -h /mnt/kingswood
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1g      210G    182G   17.5G     91%   /mnt/kingswood
$ ls -lh /mnt/kingswood/
total 381722816
-rw---  1 operator  operator   182G Aug 21 23:06 _home.dump

...


# After rejigging the disks on the cramped host, newfs, etc, I restored:

# nc -l 5 | restore -ryvf - > restore.output.$RANDOM 2>&1



# Transfer the dump back to the previously cramped host, via netcat:

$ dd if=/mnt/kingswood/_home.dump |
  nc -v -N -w 15 kingswood.internal 5
Connection to kingswood.internal 5 port [tcp/*] succeeded!
381675140+0 records in
381675140+0 records out
195417671680 bytes transferred in 29107.667 secs (6713615 bytes/sec)


# less /home/restore.output.569
Level 0 dump of /home on kingswood.internal:/dev/wd1f
Label: none
Verify tape and initialize maps
Dump   date: Fri Aug 21 12:56:36 2015
Dumped from: the epoch
Begin level 0 restore
Initialize symbol table.
Extract directories from tape
Calculate extraction list.
Make node ..



$ df -h /home
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/wd1d      299G    182G    102G     64%   /home


182G was restored on the newly formatted and enlarged partition
(now 'd' instead of 'f'), via netcat, from another host.


As well as disk partitions, dump(8) works on files and directories too.


Everything needed is in base OpenBSD.

Ace!
-- 
Craig Skinner | http://linkd.in/yGqkv7



Re: OT: Upload and Download to/from an OpenBSD host

2017-10-30 Thread Solène Rapenne

On 2017-10-30 20:23, Mihai Popescu wrote:

Hi,

I am trying to set up a solution on an OpenBSD computer, where I want
to upload and then download large volumes of data. I was using the ftpd
daemon to do this, but I wonder if there is another way to do it,
regarding speed of transfer.

Sometimes I was in situations where I had to upload and then download
data to/from an OpenBSD computer.
This happened whenever I got Windows or Linux machines hooked up
to retrieve large volumes of data from their internal disks. The
machines are either both plugged into a switch or directly
linked with an Ethernet cable.

If you think there is another way to do it, other than moving the
disks between machines, please share some ideas here.

Thanks.


Hello,

If you trust your LAN (are the computers on the same LAN?),
I would recommend using rsync and rsyncd. Without encryption it's really
fast and efficient.



Re: OT: Upload and Download to/from an OpenBSD host

2017-10-30 Thread Mike Coddington
On Mon, Oct 30, 2017 at 09:23:51PM +0200, Mihai Popescu wrote:
> Hi,
> 
> I am trying to set up a solution on an OpenBSD computer, where I want
> to upload and then download large volumes of data. I was using the ftpd
> daemon to do this, but I wonder if there is another way to do it,
> regarding speed of transfer.
> 

Back in a former life, I often had to transfer terabytes of information
between hosts in my network. The goal was to reduce the overhead of the
transfer so that more of the data would get transferred. I had been
using netcat/nc and pushing data through the connection as fast as I
could. This worked great, provided I could verify that things were being
transferred without errors. Any connection problems and I'd be screwed.
Eventually, I found this web page which details another guy's research
into the same problem:

http://intermediatesql.com/linux/scrap-the-scp-how-to-copy-data-fast-using-pigz-and-nc/

Essentially, he's compressing files and then sending them over to fit
more data down the pipe. I also learned of this other program called
bbcp which was developed specifically for our purposes. It's mentioned
in the above article, and that author wasn't big on it. I like it
though; it's what I ended up using because to me it is a more elegant
solution and worked well for my needs.

http://www.slac.stanford.edu/~abh/bbcp/
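For reference, the pigz-over-netcat pattern from the first article looks roughly like this (the hostname, port, and paths are placeholders; pigz is parallel gzip from ports, and plain gzip is a slower drop-in replacement):

```shell
# Receiving host: listen, decompress on the fly, unpack.
nc -l 5000 | pigz -d | tar xf -

# Sending host: pack the tree, compress across all cores, ship raw TCP.
tar cf - /data | pigz | nc -N 192.168.1.10 5000
```

Since nothing in this pipeline verifies the stream, it's worth comparing checksums (e.g. sha256 of the tarball on both ends) afterwards.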

Check those out, and I'm very curious if anyone else on the list has
more OpenBSD-centric techniques!


-- 
Put your Nose to the Grindstone!
-- Amalgamated Plastic Surgeons and Toolmakers, Ltd.



OT: Upload and Download to/from an OpenBSD host

2017-10-30 Thread Mihai Popescu
Hi,

I am trying to set up a solution on an OpenBSD computer, where I want
to upload and then download large volumes of data. I was using the ftpd
daemon to do this, but I wonder if there is another way to do it,
regarding speed of transfer.

Sometimes I was in situations where I had to upload and then download
data to/from an OpenBSD computer.
This happened whenever I got Windows or Linux machines hooked up
to retrieve large volumes of data from their internal disks. The
machines are either both plugged into a switch or directly
linked with an Ethernet cable.

If you think there is another way to do it, other than moving the
disks between machines, please share some ideas here.

Thanks.