led too.
>
> I have CCed Jens and the io_uring mailing list to clarify:
> 1. Are short IORING_OP_READV reads possible on files/block devices?
> 2. Are short IORING_OP_WRITEV writes possible on files/block devices?
In general we try very hard to avoid them, but if e.g. we get a short read
or write from blocking context (e.g. io-wq), then io_uring does return
that. There's really not much we can do here, it seems futile to retry
IO which was issued just like it would've been from a normal blocking
syscall yet it is still short.
--
Jens Axboe
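Since a short result can still be returned, the caller has to cope with it just as it would for a plain blocking syscall. A minimal sketch of that caller-side loop follows, assuming ordinary pread(2) rather than io_uring itself; the helper name read_full is hypothetical and not part of any library. With io_uring the same idea would apply to the byte count reported in the completion: resubmit for the remainder.

```c
#define _XOPEN_SOURCE 700 /* for pread() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read `count` bytes starting at `offset`,
 * continuing after short reads. Returns the number of bytes read
 * (smaller than `count` only if EOF was reached) or -1 on error.
 */
ssize_t read_full(int fd, void *buf, size_t count, off_t offset)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < count) {
        ssize_t n = pread(fd, p + done, count - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, try again */
            return -1;      /* real error, errno is set */
        }
        if (n == 0)
            break;          /* EOF: nothing more to read */
        done += (size_t)n;  /* short read: ask for the rest */
    }
    return (ssize_t)done;
}
```

A caller would treat a return value smaller than the requested count as end of file, not as an error.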
On 10/17/20 8:29 AM, Ju Hyung Park wrote:
> Hi Jens.
>
> On Sat, Oct 17, 2020 at 3:07 AM Jens Axboe wrote:
>>
>> Would be great if you could try 5.4.71 and see if that helps for your
>> issue.
>>
>
> Oh wow, yeah it did fix the issue.
>
> I'm a
Would be great if you could try 5.4.71 and see if that helps for your
issue.
--
Jens Axboe
On 11/26/2014 02:51 PM, Mike Snitzer wrote:
> On Wed, Nov 26 2014 at 3:54pm -0500,
> Jens Axboe wrote:
>
>> On 11/26/2014 01:51 PM, Mike Snitzer wrote:
>>> On Wed, Nov 26 2014 at 2:48pm -0500,
>>> Jens Axboe wrote:
>>>
>>>>
>>
On 11/26/2014 01:51 PM, Mike Snitzer wrote:
> On Wed, Nov 26 2014 at 2:48pm -0500,
> Jens Axboe wrote:
>
>> On 11/21/2014 08:49 AM, Mike Snitzer wrote:
>>> On Fri, Nov 21 2014 at 4:54am -0500,
>>> Christoph Hellwig wrote:
>>>
>>>> On Thu
0xa0
> [] ? do_vfs_ioctl+0x84/0x580
> [] ? security_file_permission+0x16/0x20
> [] vfs_read+0xb5/0x1a0
> [] sys_read+0x51/0x90
> [] ? __audit_syscall_exit+0x25e/0x290
> [] system_call_fastpath+0x16/0x1b
> Code: fe ff ff c7 85 fc fe ff ff 00 00 00 00 48 89 95 10 ff ff ff 8b 95 34 ff
> ff ff e8 46 ac ff ff 3b 85 34 ff ff ff 0f 84 fc 02 00 00 <0f> 0b eb fe 8b 9d
> 34 ff ff ff 8b 85 30 ff ff ff 01 d8 85 c0 0f
> RIP [] __blockdev_direct_IO_newtrunc+0x986/0x1270
> RSP
> ---[ end trace 73be5dcaf8939399 ]---
> Kernel panic - not syncing: Fatal exception
That code isn't even in mainline, as far as I can tell...
--
Jens Axboe
ds very reasonable. Let me know if there's anything you
need help or advice with.
> Jens: when experimenting with multiqueue virtio-blk, how far did you
> modify QEMU to eliminate global request processing state from block.c?
I did very little scaling testing on virtio-blk, it was more a demo case
for conversion than anything else. So probably not of much use to what
you are looking for...
--
Jens Axboe
lts when they are in.
I'm very interested as well, I have been hoping for some more adoption
of this. I have mptsas and mpt2sas patches pending as well.
I have not done a sufficiently exhaustive weight analysis, so note me
down for wanting such an analysis on virtio_blk as well.
--
Jens Axboe
ch an API years ago, so CC'ing the
> usual I/O suspects...
It would be nice to have a fuller API for this, but the reality is
that only the flush approach is really workable. Even just strict
ordering of requests could only be supported on SCSI, and even there the
kernel still lacks proper guarantees on error handling to prevent
reordering there.
--
Jens Axboe
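The flush approach mentioned above can be sketched from the application side as follows. This is only an illustration under stated assumptions: it uses POSIX pwrite(2) and fsync(2) on a regular file, and the names write_all and commit_ordered are hypothetical. Ordering between the two writes comes purely from the flush in between, not from any request-ordering guarantee.

```c
#define _XOPEN_SOURCE 700 /* for pwrite() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer, continuing after short writes. */
static int write_all(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, offset + (off_t)done);
        if (n < 0 && errno == EINTR)
            continue;       /* interrupted, try again */
        if (n <= 0)
            return -1;      /* error, or no progress at all */
        done += (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: make sure `data` is stable before `commit`
 * is written, using flushes as the only ordering primitive.
 */
int commit_ordered(int fd,
                   const void *data, size_t data_len, off_t data_off,
                   const void *commit, size_t commit_len, off_t commit_off)
{
    if (write_all(fd, data, data_len, data_off) < 0)
        return -1;
    if (fsync(fd) < 0)      /* barrier: data reaches stable storage first */
        return -1;
    if (write_all(fd, commit, commit_len, commit_off) < 0)
        return -1;
    return fsync(fd);       /* make the commit record itself durable */
}
```

The cost of this scheme is two full flushes per commit, which is the trade-off for not depending on ordered requests.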
ium_type == MST_NO_DISC) ||
> (cdp->cap.medium_type == MST_DOOR_OPEN) ||
> (cdp->cap.medium_type == MST_FMT_ERROR))
> return EIO;
> else
> break;
> }
> pause("acdld", hz / 2);
> }
> [...]
>
> There have been reports of this also being broken on real hw tho,
> like,
> http://lists.freebsd.org/pipermail/freebsd-current/2007-November/079760.html
> so I'm not sure what to make of this...
Well if you ask me (I used to maintain the linux atapi driver), the
freebsd driver suffers from a classic case of 'but the spec says so!'
syndrome. In this case it's even ancient documentation. Drivers should
never try to be 100% spec oriented, they also need a bit of real life
sensibility. The code you quote right above this text is clearly too
anal.
--
Jens Axboe
&buf[0], 28 + 6);
-buf[2] = 0x70;
+ if (!bdrv_is_inserted(s->bs))
+ buf[2] = 0x70;
+ else
+ buf[2] = 0;
buf[3] = 0;
buf[4] = 0;
buf[5] = 0;
--
Jens Axboe
On Sun, Jun 24 2007, Rob Landley wrote:
> On Saturday 23 June 2007 07:00:03 Jens Axboe wrote:
> > > I realize releases are a bit out of fashion, but is there any way to go
> > > through cvs to track down which checkin broke this stuff? I can do it in
> > > git, merc
ize releases are a bit out of fashion, but is there any way to go
> through cvs to track down which checkin broke this stuff? I can do it in
> git, mercurial, or subversion. But cvs isn't really set up for this sort of
> thing...
git clone git://git.kernel.dk/data/git/qemu.git
and bisect on that then. It's a continued git import of the cvs repo,
gets updated every night.
--
Jens Axboe
> 16bits, or 64k, that is.
Yeah, it's for larger requests. It would be nice to track elsewhere,
though. I'll take a look at it.
--
Jens Axboe
___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
On Mon, Feb 19 2007, Thiemo Seufer wrote:
> Jens Axboe wrote:
> > On Mon, Feb 19 2007, Thiemo Seufer wrote:
> > > CVSROOT: /sources/qemu
> > > Module name: qemu
> > > Changes by: Thiemo Seufer 07/02/19 00:59:34
> > >
> > > >
e.
>
> CVSWeb URLs:
> http://cvs.savannah.gnu.org/viewcvs/qemu/hw/ide.c?cvsroot=qemu&r1=1.53&r2=1.54
Why is nsector uint32_t to begin with?
--
Jens Axboe
terprets the x86 code when the interrupts are
> disabled, so it is very slow in this case.
>
> A potential solution I am investigating is to use the new
> paravirtualization API of the kernel versions >= 2.6.20.
That'd be a great idea!
--
Jens Axboe
On Tue, Aug 01 2006, Jamie Lokier wrote:
> Jens Axboe wrote:
> > > > > If you just want to evict all data from the drive's cache, and don't
> > > > > actually have other data to write, there is a CACHEFLUSH command you
> > > > > can send
On Tue, Aug 01 2006, Jamie Lokier wrote:
> Jens Axboe wrote:
> > On Tue, Aug 01 2006, Jamie Lokier wrote:
> > > > Of course, guessing the disk drive write buffer size and trying not to
> > > > kill
> > > > system I/O performance with all these
data has been written (to cache). At least
reiserfs w/barriers on Linux does this.
Random write tricks are worthless, as you cannot make any assumptions
about what the drive firmware will do.
--
Jens Axboe
akes IO unreliable in QEMU is that IO errors on the
> host are not reported to the guest by the IDE emulation and there's an
> exact place in hw/ide.c where they are arrogantly ignored.
Send a patch, I'm pretty sure nobody would disagree :-)
--
Jens Axboe
On Mon, Jul 31 2006, Jonas Maebe wrote:
>
> On 31 jul 2006, at 09:08, Jens Axboe wrote:
>
> >>Applications running on the host can count on fsync doing the
> >>right thing, meaning that if they call fsync, the data *will*
> >>have made it to disk. Application
the QEMU hard drive should get
notified. If the guest OS isn't doing what it's supposed to, QEMU can't
help you. And, in fact, running your app on the same host OS with write
back caching would screw you as well. The timing window will probably
it just means the guest can do something else while it's waiting.
Depends on the app, if the io workload is parallel then you should see a
nice speedup as well (as QEMU is then no longer the serializing
bottleneck).
--
Jens Axboe
led disk and it supports disk flushes as well. So
essentially it's the OS on top of QEMU that needs to take care for
flushing data out, like using barriers on the file system and
propagating fsync() properly down.
--
Jens Axboe
ces,
though. You should be able to use the bull stuff with qemu, it would
most likely be overloading the glibc functions for posix aio.
> Which other OS do also support the POSIX AIO API?
No idea really, but I would guess any "unixy" OS out there.
--
Jens Axboe
On Wed, Jul 26 2006, Paul Brook wrote:
> On Wednesday 26 July 2006 13:23, Jens Axboe wrote:
> > On Wed, Jul 26 2006, Paul Brook wrote:
> > > > Sounds good, so at least it's on its way :-)
> > > > It's one of those big items left on the TODO, so will be
> Or use the scsi emulation :-)
Ah, did not know that queueing was fully implemented there yet!
--
Jens Axboe
On Tue, Jul 25 2006, Fabrice Bellard wrote:
> Jens Axboe wrote:
> >On Tue, Jul 25 2006, Sven Köhler wrote:
> >
> >>>>>So the current thread-based async dma patch is really just the
> >>>>>wrong long term solution. A more long term solution is l
ing on this, or is it just speculation? I'd
greatly prefer (and might do, if no one is working on it and Fabrice
would take it) to do a libaio version, since that'll for sure perform the
best on Linux. But a posixaio version might be saner, as that should
er feature that appears
> >in my host CPUID, which the booting linux image tries to make use
> >of, but which qemu does not emulate.
Until that gets fixed up, you can boot with idle=halt.
--
Jens Axboe
On Tue, Jun 20 2006, Jens Axboe wrote:
> On Tue, Jun 20 2006, malc wrote:
> > On Tue, 20 Jun 2006, Sylvain Petreolle wrote:
> >
> > >--- Julian Seward <[EMAIL PROTECTED]> wrote:
> > >>
> > >>The SSE2 instructions cvttps2dq, movdq2q,
dq2q_1 ... not ok
result0.uq[0] = 240518168588 (expected 5124095577148911)
movq2dq_1 ... not ok
result0.uq[0] = 0 (expected 5124095577148911)
result0.uq[1] = 0 (expected 0)
[EMAIL PROTECTED]:/home/axboe $ ./a
Segmentation fault
Varies between the two. Compiling without -O2 makes the last two
succeed, the others still do not. This CPU has sse2.
--
Jens Axboe
+#else
MODULE_PARM(major,"i");
+#endif
/* Lock the page at virtual address 'user_addr' and return its
physical address (page index). Return a host OS private user page
--
Jens Axboe
tainted VLI
> EFLAGS: 00010246 (2.6.14-1.1656_FC4)
> EIP is at mwait_idle+0x2f/0x41
I don't think qemu supports PNI, which includes the monitor/mwait
additions. I wonder why Linux detects that. You can probably get around
it for now by either passing idle=poll as a boot parameter, or c
On Mon, Mar 13 2006, Mario Goppold wrote:
> On Saturday, 11 March 2006 13:31, Jens Axboe wrote:
> > On Fri, Mar 10 2006, Mario Goppold wrote:
> > > Hi,
> > >
> > > I try to install SuSE92-64 on an 400G HD but it fails:
> > >
> > > hda: max r
"-smp 2" and see what I want "unknown partition
> table ..."
>
> So my question is: Is lba48 not SMP safe or is SMP support broken (or
> incomplete)?
lba48 support is not committed yet, read the linux message - it says it
cannot use lba48, beca
ot to put 256
unconditionally in ->nsector if it is written as zero. It's really a
special case for only the read/write commands, not a general fixup.
I'd suggest adding a nsector_internal to fix this up internally in the
read/write path so all registers correctly reflect what w
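The idea of keeping the register as written and applying the zero-means-maximum rule only in the read/write path could look roughly like the sketch below. The struct and function names (toy_ide_state, toy_rw_sector_count) are hypothetical stand-ins, not the actual hw/ide.c code. The 48-bit branch is an assumption added for illustration: ATA defines a zero count as 256 sectors for 28-bit commands and 65536 sectors for 48-bit commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for the IDE device state. */
struct toy_ide_state {
    uint8_t nsector;      /* sector count register, exactly as written */
    uint8_t hob_nsector;  /* high-order byte, used by 48-bit commands */
    int lba48;            /* non-zero while executing a 48-bit command */
};

/*
 * Effective sector count for read/write commands only.
 * The registers are never modified, so reading them back still
 * returns what the guest wrote.
 */
uint32_t toy_rw_sector_count(const struct toy_ide_state *s)
{
    uint32_t n;

    assert(s != NULL);
    if (s->lba48) {
        n = ((uint32_t)s->hob_nsector << 8) | s->nsector;
        return n ? n : 65536;  /* 0 means 65536 sectors in 48-bit mode */
    }
    n = s->nsector;
    return n ? n : 256;        /* 0 means 256 sectors in 28-bit mode */
}
```

Commands other than read/write would keep using the raw register value, which is why the fixup sits in a helper and not in the register write handler.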
On Wed, Feb 01 2006, Fabrice Bellard wrote:
> Jens Axboe wrote:
> >Subject: [PATCH] Add lba48 support to ide
> >From: Jens Axboe <[EMAIL PROTECTED]>
> >Date: 1136376117 +0100
> >
> >Add lba48 support for the ide code. Read back of hob registers isn't
>
On Thu, Jan 05 2006, Johannes Schindelin wrote:
> Hi,
>
> On Thu, 5 Jan 2006, Jens Axboe wrote:
>
> > On Thu, Jan 05 2006, Jens Axboe wrote:
> > > Are you using a persistent git repo for qemu (ie continually importing
> > > new changes)? I've considered
On Thu, Jan 05 2006, Jens Axboe wrote:
> Are you using a persistent git repo for qemu (ie continually importing
> new changes)? I've considered setting one up :-)
I set up such a gateway, should be updated every night from Fabrice's cvs
repository. The web interface is
On Wed, Jan 04 2006, Johannes Schindelin wrote:
> Hi,
>
> On Wed, 4 Jan 2006, Jens Axboe wrote:
>
> > 1.0.GIT
>
> Using git for QEmu development? Welcome to the club. ;-)
Yes I just imported the repo into git, cvs isn't really my cup of tea
and it isn't very
Hi,
Subject: [PATCH] ide id updates
From: Jens Axboe <[EMAIL PROTECTED]>
Date: 1136375788 +0100
Some changes to the ata/atapi identify code and default values:
- Store the drive id in the IDEState, so we can reliably set and query
new values. Right now doing things like:
doesn't w
Subject: [PATCH] Add lba48 support to ide
From: Jens Axboe <[EMAIL PROTECTED]>
Date: 1136376117 +0100
Add lba48 support for the ide code. Read back of hob registers isn't
there yet, though.
---
hw/ide.c | 148 ++
1 fi
Subject: [PATCH] Properly support the ide flush cache commands
From: Jens Axboe <[EMAIL PROTECTED]>
Date: 1136376567 +0100
Add a ->bdrv_sync() hook to the BlockDriver, as it should know how to
sync the cached state with what is on disk. I updated the raw and dmg
drivers, they just need
Hi,
Here's the set of 3 patches I currently have for the qemu ide/block
code.
1/3: The ide id updates
2/3: lba48 support
3/3: Proper support of the flush cache command
--
Jens Axboe
6(s->identify_data + 63,0x07);
+ put_le16(s->identify_data + 88,0x3f | (1 << (val + 8)));
+ break;
+ default:
+ goto abort_cmd;
+ }
+s->status = READY_STAT | SEEK_STAT;
+
On Fri, Dec 30 2005, Fabrice Bellard wrote:
> Jens Axboe wrote:
> >Saw the posts on this the other day and had a few spare hours to play
> >with this. Works for me, with and without DMA (didn't test mult mode,
> >but that should work fine too).
> >
> >Test
+ case WIN_WRITEDMA_EXT:
+ lba48_cmd = 1;
case WIN_WRITEDMA:
case WIN_WRITEDMA_ONCE:
if (!s->bs)
goto abort_cmd;
ide_sector_write_dma(s);
break;
+case WIN_READ_NATIVE_MAX_EXT:
+ lba48_cmd = 1;
case WIN_READ_NATIVE_MAX:
ide_set_sector(s, s->nb_sectors - 1);
s->status = READY_STAT;
@@ -1615,6 +1713,7 @@
case WIN_STANDBYNOW1:
case WIN_IDLEIMMEDIATE:
case WIN_FLUSH_CACHE:
+case WIN_FLUSH_CACHE_EXT:
s->status = READY_STAT;
ide_set_irq(s);
break;
--
Jens Axboe
eded for recording some months
ago, but never really wrapped it up and submitted it. If there's any
interest in this, I'll dust it off when I have some spare time.
--
Jens Axboe
g from the file system layer. You would also need to
put some effort into the page cache to allow non-power-of-2 block sizes
for this to work. So it's not trivial :-)
For reading audio tracks, you can use either some pass through command
mechanism like CDROM_SEND_PACKET or SG_IO. Or the CDROM
the compile.
If you did the kernel compile with a hot disk cache, I'm not surprised
you're not seeing a performance benefit of the non-blocking io patch.
Even for a cold cache compile it will generally be cpu bound.
--
Jens Axboe
-b alignment, on 2.4 you may have to ensure 1k/4k alignment.
--
Jens Axboe
On Mon, Oct 03 2005, John Coiner wrote:
>
>
> Jens Axboe wrote:
> > Why not use aio for this instead, seems like a better fit than spawning
> >a thread per block device? That would still require a thread for
> >handling completions, but you could easily just use a
r fit than spawning
a thread per block device? That would still require a thread for
handling completions, but you could easily just use a single completion
thread for all devices for this as it would not need to do any real
work.
--
Jens Axboe