re: scsipi: physio split the request
> Of course larger transfers would also mitigate the overhead for each I/O
> operation, but we already do several Gigabyte/s with 64k transfers and
> filesystem I/O tends to be even smaller.

yes - the benefits will be in the 0-10% range for most things.
it will help, but only a fairly small amount, most of us won't notice.

i've seen peaks of 1.4GB/s with an nvme(4) device with ffs on top.

.mrg.
Re: scsipi: physio split the request
jnem...@cue.bc.ca (John Nemeth) writes:

>On Dec 27, 6:49pm, Michael van Elst wrote:
>} So far that's mostly a problem with software raid and modern tape I/O.
>
> Wouldn't hardware RAID also benefit from bigger buffers?
>Although, I suppose a battery backed cache could be used to work around
>small transfer sizes.

The transfer size currently limits I/O of stripes because it is split
over all stripe units (drives). A hardware controller does this
internally and isn't affected by MAXPHYS.

Of course larger transfers would also mitigate the overhead for each I/O
operation, but we already do several Gigabyte/s with 64k transfers and
filesystem I/O tends to be even smaller.

--
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."
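To illustrate the stripe point above, here is a minimal sketch of the arithmetic, assuming a request is divided evenly across the data drives. The helper name per_drive_bytes and the drive counts are hypothetical, not taken from any NetBSD driver.

```c
#include <assert.h>
#include <stddef.h>

/* MAXPHYS as discussed in the thread: 64 KiB per transfer. */
#define MAXPHYS_BYTES ((size_t)64 * 1024)

/*
 * Hypothetical helper (not a NetBSD function): bytes that reach each
 * data drive when one request is split evenly across the stripe units.
 */
static size_t
per_drive_bytes(size_t request, size_t ndrives)
{
	assert(ndrives > 0);
	return request / ndrives;
}
```

Under these assumptions, a 64 KiB request over 4 data drives leaves 16 KiB per drive and over 8 drives only 8 KiB, which is why a software RAID stripe is the case that feels the limit first.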
Re: scsipi: physio split the request
On Dec 27, 6:49pm, Michael van Elst wrote:
} m...@netbsd.org (Emmanuel Dreyfus) writes:
}
} >Is there a reason other than historical for NetBSD's 64kB limit?
}
} It's a compromise. Some buffers are statically sized for MAXPHYS
} and some ancient hardware cannot exceed 64k (or even less) DMA transfers.
} The buffer size is mostly a problem because we don't support
} scatter-gather transfers, so the buffers need to be contiguous in
} physical RAM (and some hardware doesn't support s-g either).
}
} So far that's mostly a problem with software raid and modern tape I/O.

Wouldn't hardware RAID also benefit from bigger buffers?
Although, I suppose a battery backed cache could be used to work around
small transfer sizes.

}-- End of excerpt from Michael van Elst
Re: scsipi: physio split the request
thor...@me.com (Jason Thorpe) writes:

>> You need a really huge amount of RAM for that, and also a huge
>> KVA space.

>...but it doesn't have to be that way.

>The fundamental problem is that for physio, we currently have to map the
>buffer into kernel space at all.

Mapping into KVA is another problem.

> We really should have a more abstract way to describe memory that is passed
> down to device drivers that currently take struct buf *s, call it an I/O
> memory descriptor ("iomd"). This iomd would have, say, an array of vm_page
> *'s, or perhaps an array of paddr_t's, but would also have a pointer to the
> buffer as mapped into kernel address space.

The problem is that currently we and also some hardware cannot handle
such a construct.

>Then a new bus_dmamap_load_iomd() call could take an iomd as an argument, and
>skip doing a bunch of work (calling into the pmap later to get the physical
>address), and just build the bus_dma_segment_t's directly.

There is hardware that can only handle a single bus_dma_segment.

So that's:

- support some more abstract MAXPHYS (i.e. not a global constant).
- make buffers based on scatter-gather lists instead of a single linear
  piece of memory.
- make drivers use these scatter-gather buffers
- try to emulate this behaviour when hardware is too limited.
- make other users of buffers compatible with scatter-gather lists

That's a long way to go and still not related to mapping buffers into KVA.

--
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."
Re: scsipi: physio split the request
On Dec 27, 12:29pm, buh...@nfbcal.org (Brian Buhrow) wrote:
-- Subject: Re: scsipi: physio split the request

| hello. Just out of curiosity, why did the tls-maxphys branch never
| get merged with head once the work was done or mostly done?

mostly done...

christos
Re: scsipi: physio split the request
> On Dec 27, 2018, at 10:51 AM, Michael van Elst wrote:
>
> m...@netbsd.org (Emmanuel Dreyfus) writes:
>
>> What happens if I just #define MAXPHYS (1024*1204*1024) ?
>
> You need a really huge amount of RAM for that, and also a huge
> KVA space.

...but it doesn't have to be that way.

The fundamental problem is that for physio, we currently have to map the
buffer into kernel space at all.

We really should have a more abstract way to describe memory that is passed
down to device drivers that currently take struct buf *s, call it an I/O
memory descriptor ("iomd"). This iomd would have, say, an array of vm_page
*'s, or perhaps an array of paddr_t's, but would also have a pointer to the
buffer as mapped into kernel address space. The necessary part is having the
page array filled in, along with an offset, and a length. If not sufficient,
then callers could map the buffer ONLY if needed, e.g. if you have to do PIO
to your device.

Then a new bus_dmamap_load_iomd() call could take an iomd as an argument, and
skip doing a bunch of work (calling into the pmap later to get the physical
address), and just build the bus_dma_segment_t's directly. If it needs to
bounce-buffer, then the back-end takes care of calling iomd_map() or whatever.

This isn't a fully fleshed-out proposal, or anything, but I know it's been
brought up off and on for years... we really ought to just get around to
doing it. Unfortunately, it's going to mean modifying a lot of drivers before
the upper layers can assume "I can pass iomds down everywhere for buf I/O".

-- thorpej
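A rough userland sketch of the iomd idea described above, to make the shape of the proposal concrete. Everything here is hypothetical: struct iomd_sketch, struct seg_sketch, iomd_sketch_to_segs() and the 4 KiB page size are illustrative stand-ins, not existing NetBSD types or functions, and the real bus_dma_segment_t and paddr_t are only imitated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u		/* assumed page size */

typedef uint64_t sketch_paddr_t;	/* stand-in for paddr_t */

/* Hypothetical I/O memory descriptor: page array plus offset and length. */
struct iomd_sketch {
	const sketch_paddr_t *pages;	/* physical address of each page */
	size_t npages;
	size_t offset;			/* byte offset into the first page */
	size_t length;			/* total bytes to transfer */
	void *kva;			/* kernel mapping, NULL unless needed (e.g. PIO) */
};

/* Stand-in for bus_dma_segment_t. */
struct seg_sketch {
	sketch_paddr_t addr;
	size_t len;
};

/*
 * Build segments straight from the page array, merging physically
 * adjacent pages. Returns the segment count, or -1 when more than
 * maxsegs segments would be needed or the page array is too short.
 */
static int
iomd_sketch_to_segs(const struct iomd_sketch *io, struct seg_sketch *segs,
    int maxsegs)
{
	size_t remaining = io->length;
	size_t off = io->offset;
	int nsegs = 0;

	assert(off < SKETCH_PAGE_SIZE);
	for (size_t i = 0; i < io->npages && remaining > 0; i++) {
		size_t chunk = SKETCH_PAGE_SIZE - off;
		sketch_paddr_t pa = io->pages[i] + off;

		if (chunk > remaining)
			chunk = remaining;
		if (nsegs > 0 &&
		    segs[nsegs - 1].addr + segs[nsegs - 1].len == pa) {
			segs[nsegs - 1].len += chunk;	/* contiguous: extend */
		} else {
			if (nsegs == maxsegs)
				return -1;		/* too many segments */
			segs[nsegs].addr = pa;
			segs[nsegs].len = chunk;
			nsegs++;
		}
		remaining -= chunk;
		off = 0;	/* only the first page starts mid-page */
	}
	return remaining == 0 ? nsegs : -1;
}
```

Note that no kernel virtual address is touched while building the segments; the kva field would only be filled in by a caller that actually needs a mapping. A device limited to a single segment corresponds to maxsegs = 1, where any non-contiguous buffer fails and would need bouncing.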
Re: scsipi: physio split the request
hello. Just out of curiosity, why did the tls-maxphys branch never get merged with head once the work was done or mostly done? -thanks -Brian
Re: scsipi: physio split the request
In article <20181227153028.gr4...@homeworld.netbsd.org>,
Emmanuel Dreyfus wrote:
>On Thu, Dec 27, 2018 at 09:47:03AM -0500, Christos Zoulas wrote:
>> | What happens if I just #define MAXPHYS (1024*1204*1024) ?
>> I don't think that's a good idea. My guess is that things are going to
>blow up.
>
>At least if I try to be on par with Linux limit and build with
>-DMAXPHYS=1048576 the system goes to multiuser without a hitch.
>
>Running mkltfs raises a few errors on the console, though:
>mpii0: error 27 loading dmamap
>st0(mpii0:0:2:0): passthrough: adapter inconsistency
>mpii0: error 27 loading dmamap
>st0(mpii0:0:2:0): passthrough: adapter inconsistency

Told you: EFBIG :-) Why don't you try tls-maxphys?

christos
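For readers wondering how "error 27" maps to EFBIG: errno value 27 is EFBIG on NetBSD (and on Linux as well). A small sketch that looks the number up through the C library; the helper name errno_text is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: return the C library's message for an errno
 * value such as the "error 27" printed by mpii0.
 */
static const char *
errno_text(int error)
{
	assert(error > 0);
	return strerror(error);
}
```

Calling errno_text(27) returns the library's description of EFBIG. In the bus_dma context this is commonly the error for a transfer that does not fit the DMA map, though the exact condition inside mpii(4) is not shown in the thread.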
Re: scsipi: physio split the request
m...@netbsd.org (Emmanuel Dreyfus) writes:

>What happens if I just #define MAXPHYS (1024*1204*1024) ?

You need a really huge amount of RAM for that, and also a huge
KVA space.

Try MAXPHYS (1024*1024) for a start.

--
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."
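To put numbers on the two values mentioned above, a small sketch of the arithmetic. The macro names and the static_buffer_bytes() helper are hypothetical, and reading 1204 as a typo for 1024 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* The value exactly as typed in the thread (note 1204, not 1024). */
#define MAXPHYS_AS_TYPED ((uint64_t)1024 * 1204 * 1024)
/* Presumably intended value, 1 GiB. This reading is an assumption. */
#define MAXPHYS_1GIB ((uint64_t)1024 * 1024 * 1024)
/* The starting point suggested in the reply: 1 MiB. */
#define MAXPHYS_1MIB ((uint64_t)1024 * 1024)

/*
 * Hypothetical helper: memory consumed by nbufs buffers that are each
 * statically sized for MAXPHYS.
 */
static uint64_t
static_buffer_bytes(uint64_t maxphys, uint64_t nbufs)
{
	assert(nbufs == 0 || maxphys <= UINT64_MAX / nbufs);
	return maxphys * nbufs;
}
```

As typed, the value is 1262485504 bytes (about 1.18 GiB) per buffer, so even a handful of statically sized buffers would need several gigabytes of contiguous memory and kernel address space. At 1 MiB, sixteen such buffers cost 16 MiB.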
Re: scsipi: physio split the request
m...@netbsd.org (Emmanuel Dreyfus) writes:

>Is there a reason other than historical for NetBSD's 64kB limit?

It's a compromise. Some buffers are statically sized for MAXPHYS
and some ancient hardware cannot exceed 64k (or even less) DMA transfers.
The buffer size is mostly a problem because we don't support
scatter-gather transfers, so the buffers need to be contiguous in
physical RAM (and some hardware doesn't support s-g either).

So far that's mostly a problem with software raid and modern tape I/O.

--
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."
Re: scsipi: physio split the request
m...@netbsd.org (Emmanuel Dreyfus) writes:

>On Thu, Dec 27, 2018 at 10:44:46AM +0100, Manuel Bouyer wrote:
>> tape block sizes are usually larger than 512 (I use 64k here).
>> What block size did mkltfs use ? Actually we can't do larger than 64k.

>It seems to attempt transfers of 256kB

We are limited to MAXPHYS which is currently 64k.

--
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."
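A minimal sketch of why a 256kB request runs into the 64k limit, assuming a request larger than MAXPHYS would have to be cut into MAXPHYS-sized pieces and that a variable-block tape write has to go out as a single transfer. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of maxphys-sized pieces a request
 * would be cut into (rounded up).
 */
static size_t
chunks_needed(size_t request, size_t maxphys)
{
	assert(maxphys > 0);
	return (request + maxphys - 1) / maxphys;
}

/*
 * Assumption: in variable-block mode one transfer is one tape block,
 * so a request is only usable if it is not split at all.
 */
static int
fits_in_one_transfer(size_t request, size_t maxphys)
{
	return chunks_needed(request, maxphys) <= 1;
}
```

With the numbers from the thread, chunks_needed(262144, 65536) is 4, so the request cannot go through unsplit, which matches the "physio split the request.. cannot proceed" message. With MAXPHYS at 1 MiB the same request would fit in one transfer.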
Re: scsipi: physio split the request
On Thu, Dec 27, 2018 at 09:47:03AM -0500, Christos Zoulas wrote:
> | What happens if I just #define MAXPHYS (1024*1204*1024) ?
> I don't think that's a good idea. My guess is that things are going to blow
> up.

At least if I try to be on par with Linux limit and build with
-DMAXPHYS=1048576 the system goes to multiuser without a hitch.

Running mkltfs raises a few errors on the console, though:

    mpii0: error 27 loading dmamap
    st0(mpii0:0:2:0): passthrough: adapter inconsistency
    mpii0: error 27 loading dmamap
    st0(mpii0:0:2:0): passthrough: adapter inconsistency

--
Emmanuel Dreyfus
m...@netbsd.org
Re: svr4, again
On 21/12/2018 at 12:19, Maxime Villard wrote:
> On 21/12/2018 at 10:25, Anders Magnusson wrote:
>> On 2018-12-20 at 21:29, Maxime Villard wrote:
>>> On 20/12/2018 at 18:11, Kamil Rytarowski wrote:
>>>> https://github.com/krytarowski/franz-lisp-netbsd-0.9-i386
>>>>
>>>> On the other hand unless we need it for bootloaders, drivers or
>>>> something needed to run NetBSD, I'm for removal of srv3, sunos etc
>>>> compat.
>>>
>>> Yes. So, first things first, and to come back to my email about ibcs2:
>>> what are the reasons for keeping it? As I said previously, this is not
>>> for x86 but for Vax. As was also said, FreeBSD removed it just a few
>>> days ago.
>>>
>>> I'm bringing up compat_ibcs2 because I did start a thread on port-vax@
>>> about it last year (as quoted earlier), and back then it seemed that no
>>> one knew what was the use case on Vax.
>>
>> It was something that Matt Thomas used for a customer running some
>> commercial program, but it was a long time ago (15 years?).
>> I've never heard of any other use, so from my perspective IBCS2 is not
>> relevant (anymore).
>>
>> -- ragge
>
> Alright, so I propose that we retire it. After a quick scroll-reread of
> the thread it seems to me we all agree on that.
>
> Anyone objecting etc?

So, no one? I will remove it soon...
Re: scsipi: physio split the request
On Dec 27, 2:41pm, m...@netbsd.org (Emmanuel Dreyfus) wrote:
-- Subject: Re: scsipi: physio split the request

| On Thu, Dec 27, 2018 at 02:33:28PM +, Christos Zoulas wrote:
| > I think you need to resurrect the tls-maxphys branch... It was close to working
| > IIRC.
|
| What happens if I just #define MAXPHYS (1024*1204*1024) ?

I don't think that's a good idea. My guess is that things are going to blow
up.

christos
Re: scsipi: physio split the request
On Thu, Dec 27, 2018 at 02:33:28PM +, Christos Zoulas wrote:
> I think you need to resurrect the tls-maxphys branch... It was close to working
> IIRC.

What happens if I just #define MAXPHYS (1024*1204*1024) ?

--
Emmanuel Dreyfus
m...@netbsd.org
Re: scsipi: physio split the request
In article <20181227123711.go4...@homeworld.netbsd.org>,
Emmanuel Dreyfus wrote:
>On Thu, Dec 27, 2018 at 10:44:46AM +0100, Manuel Bouyer wrote:
>> tape block sizes are usually larger than 512 (I use 64k here).
>> What block size did mkltfs use ? Actually we can't do larger than 64k.
>
>It seems to attempt transfers of 256kB
>
>LTFS20010D SCSI request: [ A3 1F 08 00 00 00 04 00 00 00 00 00 ]
>Requested length=262144
>LTFS20089D Driver detail:errno = 0x5
>LTFS20089D Driver detail: host_status = 0x0
>LTFS20089D Driver detail:driver_status = 0x0
>LTFS20089D Driver detail: status = 0x0
>LTFS20011D SCSI outcome: Driver status=0xFF SCSI status=0xFF Actual length=0

I think you need to resurrect the tls-maxphys branch... It was close to
working IIRC.

christos
Re: scsipi: physio split the request
On Thu, Dec 27, 2018 at 10:44:46AM +0100, Manuel Bouyer wrote:
> tape block sizes are usually larger than 512 (I use 64k here).

I patched ltfs so that all the max sizes (256kB and 512kB for Linux)
are set to 64kB for NetBSD. I can now format and mount the LTFS
filesystem, but I need to limit the block size to under 64kB.

This will work:

    dump -0f - / | dd obs=63k of=/ltfs/dump20181227

This hangs the filesystem:

    dump -0f - / | dd obs=64k of=/ltfs/dump20181227

I tested on glusterfs that our FUSE implementation does not limit
writes to 64k chunks, hence I assume I introduced a bug in ltfs
with the 64kB limit everywhere.

Is there a reason other than historical for NetBSD's 64kB limit?

--
Emmanuel Dreyfus
m...@netbsd.org
Re: scsipi: physio split the request
On Thu, Dec 27, 2018 at 10:44:46AM +0100, Manuel Bouyer wrote:
> tape block sizes are usually larger than 512 (I use 64k here).
> What block size did mkltfs use ? Actually we can't do larger than 64k.

It seems to attempt transfers of 256kB

    LTFS20010D SCSI request: [ A3 1F 08 00 00 00 04 00 00 00 00 00 ] Requested length=262144
    LTFS20089D Driver detail:errno = 0x5
    LTFS20089D Driver detail: host_status = 0x0
    LTFS20089D Driver detail:driver_status = 0x0
    LTFS20089D Driver detail: status = 0x0
    LTFS20011D SCSI outcome: Driver status=0xFF SCSI status=0xFF Actual length=0

--
Emmanuel Dreyfus
m...@netbsd.org
Re: scsipi: physio split the request
On Thu, Dec 27, 2018 at 09:07:41AM +, Emmanuel Dreyfus wrote:
> Hello
>
> A few years ago I made a failed attempt at running LTFS on an LTO 6 drive.
> I resumed the effort, and once I got the LTFS code ported, running
> a command like mkltfs fails with kernel console saying:
> st0(mpii0:0:2:0): physio split the request.. cannot proceed
>
> This is netbsd-current from yesterday.
>
> I understand this is about tape block size larger than usual 512.

tape block sizes are usually larger than 512 (I use 64k here).
What block size did mkltfs use ? Actually we can't do larger than 64k.

--
Manuel Bouyer
NetBSD: 26 years of experience will always make the difference
--
scsipi: physio split the request
Hello

A few years ago I made a failed attempt at running LTFS on an LTO 6 drive.
I resumed the effort, and once I got the LTFS code ported, running
a command like mkltfs fails with kernel console saying:

    st0(mpii0:0:2:0): physio split the request.. cannot proceed

This is netbsd-current from yesterday.

I understand this is about tape block size larger than usual 512.
src/sys/dev/st.c seems to have provision for that, with drive
specific quirks like below. Do I read it correctly?

    {{T_SEQUENTIAL, T_REMOV, "TANDBERG", " TDC 3600 ", ""}, {0, 12, {
        {0, 0, 0},                          /* minor 0-3 */
        {ST_Q_FORCE_BLKSIZE, 0, QIC_525},   /* minor 4-7 */
        {0, 0, QIC_150},                    /* minor 8-11 */
        {0, 0, QIC_120}                     /* minor 12-15 */
    }}},
    {{T_SEQUENTIAL, T_REMOV, "TANDBERG", " TDC 3800 ", ""}, {0, 0, {
        {ST_Q_FORCE_BLKSIZE, 512, 0},       /* minor 0-3 */
        {0, 0, QIC_525},                    /* minor 4-7 */
        {0, 0, QIC_150},                    /* minor 8-11 */
        {0, 0, QIC_120}                     /* minor 12-15 */
    }}},

Mine is detected as:

    st0 at scsibus0 target 2 lun 0: tape removable
    st0: density code 90, variable blocks, write-enabled
    st0: tagged queueing

Is it just that I need quirks for this specific drive? If this is the
case, where can the appropriate information be found? Looking in Linux
kernel code, I can only find stuff about TANDBERG TDC 3600, with nothing
about block size.

--
Emmanuel Dreyfus
m...@netbsd.org