Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-07 Thread Corey Bryant



On 08/06/2012 10:15 AM, Corey Bryant wrote:



On 08/06/2012 09:51 AM, Kevin Wolf wrote:

Am 06.08.2012 15:32, schrieb Corey Bryant:

On 08/06/2012 05:15 AM, Kevin Wolf wrote:

Am 03.08.2012 00:21, schrieb Corey Bryant:

@@ -84,6 +158,36 @@ int qemu_open(const char *name, int flags, ...)
int ret;
int mode = 0;

+#ifndef _WIN32
+const char *fdset_id_str;
+
+/* Attempt dup of fd from fd set */
+if (strstart(name, /dev/fdset/, fdset_id_str)) {
+int64_t fdset_id;
+int fd, dupfd;
+
+fdset_id = qemu_parse_fdset(fdset_id_str);
+if (fdset_id == -1) {
+errno = EINVAL;
+return -1;
+}
+
+fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);


I know that use of default_mon in this patch is not correct, but I
wanted to get these patches out for review. I used default_mon for
testing because cur_mon wasn't pointing to the monitor I'd added
fd sets
to.  I need to figure out why.



Does it make sense to use default_mon here?  After digging into this
some more, I'm thinking it makes sense, and I'll explain why.

It looks like cur_mon can't be used.  cur_mon will point to the
monitor
object for the duration of a command, and be reset to old_mon (NULL in
my case) after the command completes.

qemu_open() and qemu_close() are frequently called long after a
monitor
command has come and gone, so cur_mon won't work.  For example,
drive_add will cause qemu_open() to be called, but after the
command has
completed, the file will keep getting opened/closed during normal QEMU
operation.  I'm not sure why, I've just noticed this behavior.

Does anyone have any thoughts on this?  It would require fd sets to be
added to the default monitor only.


I think we have two design options that would make sense:

1. Make the file descriptors global instead of per-monitor. Is there a
 reason why each monitor has its own set of fds? (Also I'm
wondering
 if they survive a monitor disconnect this way?)


I'd prefer to have them associated with a monitor object so that we can
more easily keep track of whether or not they're in use by a monitor
connection.


Hm, I see.


2. Save a monitor reference with the fdset information.



Are you saying that each monitor would have the same copy of fdset
information?


This one doesn't really make sense indeed...




Allowing to send file descriptors on every monitor, but making only
those of the default monitor actually usable, sounds like a bad choice
to me.


What if we also allow them to be added only to the default monitor?


Would get you some kind of consistency at least, even though I don't
like that secondary monitors can't use the functionality.

Can't we make the fdset information global, so that a qemu_open/close()
searches all of them, but let it have a Monitor* owner for keeping track
whether it's in use?


I think global fdsets might work (sorry I didn't think it through enough
on my first reply).  I think I'll need to drop the in_use flag and tie
monitor references into the refcount.  (I know I know, you suggested
that a while back.. :).  I'll give it a shot and see how it goes.


I just submitted the v7 patch series which makes fd sets global, rather 
than each Monitor object having an fd set.  This allows the list of fd 
sets to be shared among all monitor connections.


--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-06 Thread Kevin Wolf
Am 03.08.2012 00:21, schrieb Corey Bryant:
 @@ -84,6 +158,36 @@ int qemu_open(const char *name, int flags, ...)
   int ret;
   int mode = 0;

 +#ifndef _WIN32
 +const char *fdset_id_str;
 +
 +/* Attempt dup of fd from fd set */
 +if (strstart(name, /dev/fdset/, fdset_id_str)) {
 +int64_t fdset_id;
 +int fd, dupfd;
 +
 +fdset_id = qemu_parse_fdset(fdset_id_str);
 +if (fdset_id == -1) {
 +errno = EINVAL;
 +return -1;
 +}
 +
 +fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);

 I know that use of default_mon in this patch is not correct, but I
 wanted to get these patches out for review. I used default_mon for
 testing because cur_mon wasn't pointing to the monitor I'd added fd sets
 to.  I need to figure out why.

 
 Does it make sense to use default_mon here?  After digging into this 
 some more, I'm thinking it makes sense, and I'll explain why.
 
 It looks like cur_mon can't be used.  cur_mon will point to the monitor 
 object for the duration of a command, and be reset to old_mon (NULL in 
 my case) after the command completes.
 
 qemu_open() and qemu_close() are frequently called long after a monitor 
 command has come and gone, so cur_mon won't work.  For example, 
 drive_add will cause qemu_open() to be called, but after the command has 
 completed, the file will keep getting opened/closed during normal QEMU 
 operation.  I'm not sure why, I've just noticed this behavior.
 
 Does anyone have any thoughts on this?  It would require fd sets to be 
 added to the default monitor only.

I think we have two design options that would make sense:

1. Make the file descriptors global instead of per-monitor. Is there a
   reason why each monitor has its own set of fds? (Also I'm wondering
   if they survive a monitor disconnect this way?)

2. Save a monitor reference with the fdset information.

Allowing to send file descriptors on every monitor, but making only
those of the default monitor actually usable, sounds like a bad choice
to me.

Kevin



Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-06 Thread Corey Bryant



On 08/06/2012 05:15 AM, Kevin Wolf wrote:

Am 03.08.2012 00:21, schrieb Corey Bryant:

@@ -84,6 +158,36 @@ int qemu_open(const char *name, int flags, ...)
   int ret;
   int mode = 0;

+#ifndef _WIN32
+const char *fdset_id_str;
+
+/* Attempt dup of fd from fd set */
+if (strstart(name, /dev/fdset/, fdset_id_str)) {
+int64_t fdset_id;
+int fd, dupfd;
+
+fdset_id = qemu_parse_fdset(fdset_id_str);
+if (fdset_id == -1) {
+errno = EINVAL;
+return -1;
+}
+
+fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);


I know that use of default_mon in this patch is not correct, but I
wanted to get these patches out for review. I used default_mon for
testing because cur_mon wasn't pointing to the monitor I'd added fd sets
to.  I need to figure out why.



Does it make sense to use default_mon here?  After digging into this
some more, I'm thinking it makes sense, and I'll explain why.

It looks like cur_mon can't be used.  cur_mon will point to the monitor
object for the duration of a command, and be reset to old_mon (NULL in
my case) after the command completes.

qemu_open() and qemu_close() are frequently called long after a monitor
command has come and gone, so cur_mon won't work.  For example,
drive_add will cause qemu_open() to be called, but after the command has
completed, the file will keep getting opened/closed during normal QEMU
operation.  I'm not sure why, I've just noticed this behavior.

Does anyone have any thoughts on this?  It would require fd sets to be
added to the default monitor only.


I think we have two design options that would make sense:

1. Make the file descriptors global instead of per-monitor. Is there a
reason why each monitor has its own set of fds? (Also I'm wondering
if they survive a monitor disconnect this way?)


I'd prefer to have them associated with a monitor object so that we can 
more easily keep track of whether or not they're in use by a monitor 
connection.




2. Save a monitor reference with the fdset information.



Are you saying that each monitor would have the same copy of fdset 
information?



Allowing to send file descriptors on every monitor, but making only
those of the default monitor actually usable, sounds like a bad choice
to me.


What if we also allow them to be added only to the default monitor?

--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-06 Thread Kevin Wolf
Am 06.08.2012 15:32, schrieb Corey Bryant:
 On 08/06/2012 05:15 AM, Kevin Wolf wrote:
 Am 03.08.2012 00:21, schrieb Corey Bryant:
 @@ -84,6 +158,36 @@ int qemu_open(const char *name, int flags, ...)
int ret;
int mode = 0;

 +#ifndef _WIN32
 +const char *fdset_id_str;
 +
 +/* Attempt dup of fd from fd set */
 +if (strstart(name, /dev/fdset/, fdset_id_str)) {
 +int64_t fdset_id;
 +int fd, dupfd;
 +
 +fdset_id = qemu_parse_fdset(fdset_id_str);
 +if (fdset_id == -1) {
 +errno = EINVAL;
 +return -1;
 +}
 +
 +fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);

 I know that use of default_mon in this patch is not correct, but I
 wanted to get these patches out for review. I used default_mon for
 testing because cur_mon wasn't pointing to the monitor I'd added fd sets
 to.  I need to figure out why.


 Does it make sense to use default_mon here?  After digging into this
 some more, I'm thinking it makes sense, and I'll explain why.

 It looks like cur_mon can't be used.  cur_mon will point to the monitor
 object for the duration of a command, and be reset to old_mon (NULL in
 my case) after the command completes.

 qemu_open() and qemu_close() are frequently called long after a monitor
 command has come and gone, so cur_mon won't work.  For example,
 drive_add will cause qemu_open() to be called, but after the command has
 completed, the file will keep getting opened/closed during normal QEMU
 operation.  I'm not sure why, I've just noticed this behavior.

 Does anyone have any thoughts on this?  It would require fd sets to be
 added to the default monitor only.

 I think we have two design options that would make sense:

 1. Make the file descriptors global instead of per-monitor. Is there a
 reason why each monitor has its own set of fds? (Also I'm wondering
 if they survive a monitor disconnect this way?)
 
 I'd prefer to have them associated with a monitor object so that we can 
 more easily keep track of whether or not they're in use by a monitor 
 connection.

Hm, I see.

 2. Save a monitor reference with the fdset information.

 
 Are you saying that each monitor would have the same copy of fdset 
 information?

This one doesn't really make sense indeed...

 
 Allowing to send file descriptors on every monitor, but making only
 those of the default monitor actually usable, sounds like a bad choice
 to me.
 
 What if we also allow them to be added only to the default monitor?

Would get you some kind of consistency at least, even though I don't
like that secondary monitors can't use the functionality.

Can't we make the fdset information global, so that a qemu_open/close()
searches all of them, but let it have a Monitor* owner for keeping track
whether it's in use?

Kevin



Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-06 Thread Corey Bryant



On 08/06/2012 09:51 AM, Kevin Wolf wrote:

Am 06.08.2012 15:32, schrieb Corey Bryant:

On 08/06/2012 05:15 AM, Kevin Wolf wrote:

Am 03.08.2012 00:21, schrieb Corey Bryant:

@@ -84,6 +158,36 @@ int qemu_open(const char *name, int flags, ...)
int ret;
int mode = 0;

+#ifndef _WIN32
+const char *fdset_id_str;
+
+/* Attempt dup of fd from fd set */
+if (strstart(name, /dev/fdset/, fdset_id_str)) {
+int64_t fdset_id;
+int fd, dupfd;
+
+fdset_id = qemu_parse_fdset(fdset_id_str);
+if (fdset_id == -1) {
+errno = EINVAL;
+return -1;
+}
+
+fd = monitor_fdset_get_fd(default_mon, fdset_id, flags);


I know that use of default_mon in this patch is not correct, but I
wanted to get these patches out for review. I used default_mon for
testing because cur_mon wasn't pointing to the monitor I'd added fd sets
to.  I need to figure out why.



Does it make sense to use default_mon here?  After digging into this
some more, I'm thinking it makes sense, and I'll explain why.

It looks like cur_mon can't be used.  cur_mon will point to the monitor
object for the duration of a command, and be reset to old_mon (NULL in
my case) after the command completes.

qemu_open() and qemu_close() are frequently called long after a monitor
command has come and gone, so cur_mon won't work.  For example,
drive_add will cause qemu_open() to be called, but after the command has
completed, the file will keep getting opened/closed during normal QEMU
operation.  I'm not sure why, I've just noticed this behavior.

Does anyone have any thoughts on this?  It would require fd sets to be
added to the default monitor only.


I think we have two design options that would make sense:

1. Make the file descriptors global instead of per-monitor. Is there a
 reason why each monitor has its own set of fds? (Also I'm wondering
 if they survive a monitor disconnect this way?)


I'd prefer to have them associated with a monitor object so that we can
more easily keep track of whether or not they're in use by a monitor
connection.


Hm, I see.


2. Save a monitor reference with the fdset information.



Are you saying that each monitor would have the same copy of fdset
information?


This one doesn't really make sense indeed...




Allowing to send file descriptors on every monitor, but making only
those of the default monitor actually usable, sounds like a bad choice
to me.


What if we also allow them to be added only to the default monitor?


Would get you some kind of consistency at least, even though I don't
like that secondary monitors can't use the functionality.

Can't we make the fdset information global, so that a qemu_open/close()
searches all of them, but let it have a Monitor* owner for keeping track
whether it's in use?


I think global fdsets might work (sorry I didn't think it through enough 
on my first reply).  I think I'll need to drop the in_use flag and tie 
monitor references into the refcount.  (I know I know, you suggested 
that a while back.. :).  I'll give it a shot and see how it goes.

--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-02 Thread Corey Bryant



On 07/25/2012 11:57 PM, Corey Bryant wrote:



On 07/25/2012 03:43 PM, Eric Blake wrote:

On 07/23/2012 07:08 AM, Corey Bryant wrote:

When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.




+++ b/monitor.c
@@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor
*mon, bool in_use)
  }
  }

+void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
+{
+mon_fdset_t *mon_fdset;
+
+if (!mon) {
+return;
+}


Am I reading this code right by stating that 'if there is no monitor, we
don't increment the refcount'?  How does a monitor reattach affect
things?  Or am I missing something fundamental about the cases when
'mon==NULL' will exist?



Yes you're reading this correctly.

I'm pretty sure that mon will only be NULL if QEMU is started without a
monitor.

If QEMU has a monitor, and libvirt disconnects it's connection to the
qemu monitor, then I believe mon will remain non-NULL.


I've verified this to be true and everything is working as expected.

If libvirt's connection to the monitor fd is closed, mon will remain 
non-NULL and the refcount will still be incremented/decremented on 
qemu_open()/qemu_close().  When libvirt reconnects, any fdsets that 
haven't been cleaned up will still be available.  query-fdsets can be 
used to determine what's available.


--
Regards,
Corey




Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-08-02 Thread Corey Bryant



On 07/23/2012 09:14 AM, Corey Bryant wrote:



On 07/23/2012 09:08 AM, Corey Bryant wrote:

When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com

v2:
  -Get rid of file_open and move dup code to qemu_open
   (kw...@redhat.com)
  -Use strtol wrapper instead of atoi (kw...@redhat.com)

v3:
  -Add note about fd leakage (ebl...@redhat.com)

v4
  -Moved patch to be later in series (lcapitul...@redhat.com)
  -Update qemu_open to check access mode flags and set flags that
   can be set (ebl...@redhat.com, kw...@redhat.com)

v5:
  -This patch was overhauled quite a bit in this version, with
   the addition of fd set and refcount support.
  -Use qemu_set_cloexec() on dup'd fd (ebl...@redhat.com)
  -Modify flags set by fcntl on dup'd fd (ebl...@redhat.com)
  -Reduce syscalls when setting flags for dup'd fd (ebl...@redhat.com)
  -Fix O_RDWR, O_RDONLY, O_WRONLY checks (ebl...@redhat.com)
---
  block/raw-posix.c |   24 +-
  block/raw-win32.c |2 +-
  block/vmdk.c  |4 +-
  block/vpc.c   |2 +-
  block/vvfat.c |   12 ++---
  cutils.c  |5 ++
  monitor.c |   85 +
  monitor.h |4 ++
  osdep.c   |  138
-
  qemu-common.h |3 +-
  qemu-tool.c   |   12 +
  11 files changed, 267 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a172de3..5d0a801 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs,
const char *filename,
  out_free_buf:
  qemu_vfree(s-aligned_buf);
  out_close:
-qemu_close(fd);
+qemu_close(fd, filename);
  return -errno;
  }

@@ -376,7 +376,7 @@ static void raw_close(BlockDriverState *bs)
  {
  BDRVRawState *s = bs-opaque;
  if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  if (s-aligned_buf != NULL)
  qemu_vfree(s-aligned_buf);
@@ -580,7 +580,7 @@ static int raw_create(const char *filename,
QEMUOptionParameter *options)
  if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
  result = -errno;
  }
-if (qemu_close(fd) != 0) {
+if (qemu_close(fd, filename) != 0) {
  result = -errno;
  }
  }
@@ -850,7 +850,7 @@ static int hdev_open(BlockDriverState *bs, const
char *filename, int flags)
  if (fd  0) {
  bsdPath[strlen(bsdPath)-1] = '1';
  } else {
-qemu_close(fd);
+qemu_close(fd, bsdPath);
  }
  filename = bsdPath;
  }
@@ -889,7 +889,7 @@ static int fd_open(BlockDriverState *bs)
  last_media_present = (s-fd = 0);
  if (s-fd = 0 
  (get_clock() - s-fd_open_time) = FD_OPEN_TIMEOUT) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  #ifdef DEBUG_FLOPPY
  printf(Floppy closed\n);
@@ -988,7 +988,7 @@ static int hdev_create(const char *filename,
QEMUOptionParameter *options)
  else if (lseek(fd, 0, SEEK_END)  total_size * BDRV_SECTOR_SIZE)
  ret = -ENOSPC;

-qemu_close(fd);
+qemu_close(fd, filename);
  return ret;
  }

@@ -1038,7 +1038,7 @@ static int floppy_open(BlockDriverState *bs,
const char *filename, int flags)
  return ret;

  /* close fd so that we can reopen it as needed */
-qemu_close(s-fd);
+qemu_close(s-fd, filename);
  s-fd = -1;
  s-fd_media_changed = 1;

@@ -1070,7 +1070,7 @@ static int floppy_probe_device(const char
*filename)
  prio = 100;

  outc:
-qemu_close(fd);
+qemu_close(fd, filename);
  out:
  return prio;
  }
@@ -1105,14 +1105,14 @@ static void floppy_eject(BlockDriverState *bs,
bool eject_flag)
  int fd;

  if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  }
  fd = qemu_open(bs-filename, s-open_flags | O_NONBLOCK);
  if (fd = 0) {
  if (ioctl(fd, FDEJECT, 0)  0)
  perror(FDEJECT);
-qemu_close(fd);
+qemu_close(fd, bs-filename);
  }
  }

@@ 

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Kevin Wolf
Am 26.07.2012 05:57, schrieb Corey Bryant:
 On 07/25/2012 03:43 PM, Eric Blake wrote:
 On 07/23/2012 07:08 AM, Corey Bryant wrote:
 +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
 +{
 +mon_fdset_t *mon_fdset;
 +mon_fdset_fd_t *mon_fdset_fd;
 +int mon_fd_flags;
 +
 +if (!mon) {
 +errno = ENOENT;
 +return -1;
 +}
 +
 +QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
 +if (mon_fdset-id != fdset_id) {
 +continue;
 +}
 +QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
 +if (mon_fdset_fd-removed) {
 +continue;
 +}
 +
 +mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
 +if (mon_fd_flags == -1) {
 +return -1;

 This says we fail on the first fcntl() failure, instead of trying other
 fds in the set.  Granted, an fcntl() failure is probably the sign of a
 bigger bug (such as closing an fd at the wrong point in time), so I
 guess trying to go on doesn't make much sense once we already know we
 are hosed.

 
 I think I'll stick with it the way it is.  If fcntl() fails we might 
 have a tainted fd set so I think we should fail.

The alternative would be s/return 1/continue/, right? I think either way
is acceptable.

 +}
 +
 +switch (flags  O_ACCMODE) {
 +case O_RDWR:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +case O_RDONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
 +return mon_fdset_fd-fd;
 +}
 +break;

 Do we want to allow the case where the caller asked for O_RDONLY, but
 the set only has O_RDWR?  After all, the caller is getting a compatible
 subset of what the set offers.
 
 I don't see a problem with it.

I would require exact matches like you implemented, in order to prevent
damage if we ever had a bug that writes to a read-only file. I believe
it also makes the semantics clearer and the code simpler, while it
shouldn't make much of a difference for clients.

Kevin



Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Eric Blake
On 07/25/2012 09:21 PM, Corey Bryant wrote:
 
 
 On 07/25/2012 03:25 PM, Eric Blake wrote:
 On 07/25/2012 02:22 AM, Kevin Wolf wrote:
 Hm, not a nice interface where qemu_close() needs the filename and
 (worse) could be given a wrong filename. Maybe it would be better to
 maintain a list of fd - fdset mappings in qemu_open/close?


 I agree, I don't really like it either.

 We already have a list of fd - fdset mappings (mon_fdset_fd_t -
 mon_fdset_t).  Would it be too costly to loop through all the
 fdsets/fds
 at the beginning of every qemu_close()?

 I don't think so. qemu_close() is not a fast path and happens almost
 never, and the list is short enough that searching it isn't a problem
 anyway.

 I agree - just do the loop to do the reverse lookup yourself, rather
 than making qemu_close() have a different signature than close().

 
 Great, I'll do this then.

You may want an optimization of using a bitset for tracking which fds
are tracked by fdset in the first place, so that the fast path of
qemu_close() will be a check against the bitset to see if you even have
to waste time on the reverse lookup in the first place.  The bitset will
typically be small (bounded not only by the maximum possible fd, but
further by the fact that we don't usually open that many fds in the
first place), but I'm not sure if you can get away with static sizing.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Kevin Wolf
Am 26.07.2012 15:13, schrieb Eric Blake:
 On 07/25/2012 09:21 PM, Corey Bryant wrote:


 On 07/25/2012 03:25 PM, Eric Blake wrote:
 On 07/25/2012 02:22 AM, Kevin Wolf wrote:
 Hm, not a nice interface where qemu_close() needs the filename and
 (worse) could be given a wrong filename. Maybe it would be better to
 maintain a list of fd - fdset mappings in qemu_open/close?


 I agree, I don't really like it either.

 We already have a list of fd - fdset mappings (mon_fdset_fd_t -
 mon_fdset_t).  Would it be too costly to loop through all the
 fdsets/fds
 at the beginning of every qemu_close()?

 I don't think so. qemu_close() is not a fast path and happens almost
 never, and the list is short enough that searching it isn't a problem
 anyway.

 I agree - just do the loop to do the reverse lookup yourself, rather
 than making qemu_close() have a different signature than close().


 Great, I'll do this then.
 
 You may want an optimization of using a bitset for tracking which fds
 are tracked by fdset in the first place, so that the fast path of
 qemu_close() will be a check against the bitset to see if you even have
 to waste time on the reverse lookup in the first place.  The bitset will
 typically be small (bounded not only by the maximum possible fd, but
 further by the fact that we don't usually open that many fds in the
 first place), but I'm not sure if you can get away with static sizing.

Premature optimisation, in my opinion. The list is really small.

Kevin



Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Corey Bryant



On 07/26/2012 05:07 AM, Kevin Wolf wrote:

Am 26.07.2012 05:57, schrieb Corey Bryant:

On 07/25/2012 03:43 PM, Eric Blake wrote:

On 07/23/2012 07:08 AM, Corey Bryant wrote:

+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+mon_fdset_t *mon_fdset;
+mon_fdset_fd_t *mon_fdset_fd;
+int mon_fd_flags;
+
+if (!mon) {
+errno = ENOENT;
+return -1;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id != fdset_id) {
+continue;
+}
+QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
+if (mon_fdset_fd-removed) {
+continue;
+}
+
+mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
+if (mon_fd_flags == -1) {
+return -1;


This says we fail on the first fcntl() failure, instead of trying other
fds in the set.  Granted, an fcntl() failure is probably the sign of a
bigger bug (such as closing an fd at the wrong point in time), so I
guess trying to go on doesn't make much sense once we already know we
are hosed.



I think I'll stick with it the way it is.  If fcntl() fails we might
have a tainted fd set so I think we should fail.


The alternative would be s/return 1/continue/, right? I think either way
is acceptable.


+}
+
+switch (flags  O_ACCMODE) {
+case O_RDWR:
+if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
+return mon_fdset_fd-fd;
+}
+break;
+case O_RDONLY:
+if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
+return mon_fdset_fd-fd;
+}
+break;


Do we want to allow the case where the caller asked for O_RDONLY, but
the set only has O_RDWR?  After all, the caller is getting a compatible
subset of what the set offers.


I don't see a problem with it.


I would require exact matches like you implemented, in order to prevent
damage if we ever had a bug that writes to a read-only file. I believe
it also makes the semantics clearer and the code simpler, while it
shouldn't make much of a difference for clients.

Kevin



Alright, then I'll plan on requiring exact matches of access mode flags.

--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Corey Bryant



On 07/26/2012 05:07 AM, Kevin Wolf wrote:

Am 26.07.2012 05:57, schrieb Corey Bryant:

On 07/25/2012 03:43 PM, Eric Blake wrote:

On 07/23/2012 07:08 AM, Corey Bryant wrote:

+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+mon_fdset_t *mon_fdset;
+mon_fdset_fd_t *mon_fdset_fd;
+int mon_fd_flags;
+
+if (!mon) {
+errno = ENOENT;
+return -1;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id != fdset_id) {





+continue;

+}
+QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
+if (mon_fdset_fd-removed) {
+continue;
+}
+
+mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
+if (mon_fd_flags == -1) {
+return -1;


This says we fail on the first fcntl() failure, instead of trying other
fds in the set.  Granted, an fcntl() failure is probably the sign of a
bigger bug (such as closing an fd at the wrong point in time), so I
guess trying to go on doesn't make much sense once we already know we
are hosed.



I think I'll stick with it the way it is.  If fcntl() fails we might
have a tainted fd set so I think we should fail.


The alternative would be s/return 1/continue/, right? I think either way
is acceptable.



Yes, we'd continue the loop instead of returning -1.  I prefer to return 
on the first failure, but if anyone feels strongly about continuing the 
loop, please let me know.


--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-26 Thread Corey Bryant



On 07/26/2012 09:16 AM, Kevin Wolf wrote:

Am 26.07.2012 15:13, schrieb Eric Blake:

On 07/25/2012 09:21 PM, Corey Bryant wrote:



On 07/25/2012 03:25 PM, Eric Blake wrote:

On 07/25/2012 02:22 AM, Kevin Wolf wrote:

Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd - fdset mappings in qemu_open/close?



I agree, I don't really like it either.

We already have a list of fd - fdset mappings (mon_fdset_fd_t -
mon_fdset_t).  Would it be too costly to loop through all the
fdsets/fds
at the beginning of every qemu_close()?


I don't think so. qemu_close() is not a fast path and happens almost
never, and the list is short enough that searching it isn't a problem
anyway.


I agree - just do the loop to do the reverse lookup yourself, rather
than making qemu_close() have a different signature than close().



Great, I'll do this then.


You may want an optimization of using a bitset for tracking which fds
are tracked by fdset in the first place, so that the fast path of
qemu_close() will be a check against the bitset to see if you even have
to waste time on the reverse lookup in the first place.  The bitset will
typically be small (bounded not only by the maximum possible fd, but
further by the fact that we don't usually open that many fds in the
first place), but I'm not sure if you can get away with static sizing.


Premature optimisation, in my opinion. The list is really small.

Kevin



I'll probably hold off on any optimisation at this point, but I can 
revisit it in the future if it's needed.


--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-25 Thread Kevin Wolf
Am 25.07.2012 05:41, schrieb Corey Bryant:
 diff --git a/block/raw-posix.c b/block/raw-posix.c
 index a172de3..5d0a801 100644
 --- a/block/raw-posix.c
 +++ b/block/raw-posix.c
 @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const 
 char *filename,
   out_free_buf:
   qemu_vfree(s-aligned_buf);
   out_close:
 -qemu_close(fd);
 +qemu_close(fd, filename);
   return -errno;
   }

 Hm, not a nice interface where qemu_close() needs the filename and
 (worse) could be given a wrong filename. Maybe it would be better to
 maintain a list of fd - fdset mappings in qemu_open/close?

 
 I agree, I don't really like it either.
 
 We already have a list of fd - fdset mappings (mon_fdset_fd_t - 
 mon_fdset_t).  Would it be too costly to loop through all the fdsets/fds 
 at the beginning of every qemu_close()?

I don't think so. qemu_close() is not a fast path and happens almost
never, and the list is short enough that searching it isn't a problem
anyway.

 +switch (flags  O_ACCMODE) {
 +case O_RDWR:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +case O_RDONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +case O_WRONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_WRONLY) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +}

 I think you mean:

if ((flags  O_ACCMODE) == (mon_fd_flags  O_ACCMODE)) {
return mon_fdset_fd-fd;
}
 
 Yes, that would be a bit simpler wouldn't it. :)
 

 What about other flags that cannot be set with fcntl(), like O_SYNC on
 older kernels or possibly non-Linux? (The block layer doesn't use it any
 more, but I think we want to keep the function generally useful)

 
 I see what you're getting at here.  Basically you could have 2 fds in an 
 fdset with the same access mode flags, but one has O_SYNC on and the 
 other has O_SYNC off.  That seems like it would make sense to implement. 
   As a work-around, I think a user could just create a separate fdset 
 for the same file with different O_SYNC value.  But from a client 
 perspective, it would be nicer to have this taken care of for you.

Now that the block layer doesn't use O_SYNC any more, it's more of a
theoretical point. I don't think there's any other place, where we'd
need to switch O_SYNC during runtime.

Taking it into consideration is complicated by the fact that some
kernels allow to fcntl() O_SYNC and others don't, so enforcing a match
here wouldn't feel completely right either.

Maybe just leave it as it is. :-)

 
 +}
 +errno = EACCES;
 +return -1;
 +}
 +errno = ENOENT;
 +return -1;
 +}
 +
   /* mon_cmds and info_cmds would be sorted at runtime */
   static mon_cmd_t mon_cmds[] = {
   #include hmp-commands.h

 @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice)
   #endif
   }

 +/*
 + * Dups an fd and sets the flags
 + */
 +static int qemu_dup(int fd, int flags)
 +{
 +int i;
 +int ret;
 +int serrno;
 +int dup_flags;
 +int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
 +  O_NONBLOCK, 0 };
 +
 +if (flags  O_CLOEXEC) {
 +ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
 +if (ret == -1  errno == EINVAL) {
 +ret = dup(fd);
 +if (ret != -1  fcntl_setfl(ret, O_CLOEXEC) == -1) {
 +goto fail;
 +}
 +}
 +} else {
 +ret = dup(fd);
 +}
 +
 +if (ret == -1) {
 +goto fail;
 +}
 +
 +dup_flags = fcntl(ret, F_GETFL);
 +if (dup_flags == -1) {
 +goto fail;
 +}
 +
 +if ((flags  O_SYNC) != (dup_flags  O_SYNC)) {
 +errno = EINVAL;
 +goto fail;
 +}

 It's worth trying to set it before failing, newer kernels can do it. But
 as I said above, if you can fail here, it makes sense to consider O_SYNC
 when selecting the right file descriptor from the fdset.

 
 I'm on a 3.4.4 Fedora kernel that doesn't appear to support 
 fcntl(O_SYNC), but perhaps I'm doing something wrong.  Here's my test 
 code (shortened for simplicty): [...]

Hm, true. So it seems that patch has never made it into the kernel, in
fact...

 +
 +qemu_set_cloexec(ret);

 Wait... If O_CLOEXEC is set, you set the flag immediately and if it
 isn't you set it at the end of the function? What's the intended use
 case for not using O_CLOEXEC then?

 
 This is a mistake.  I think I just need to be using qemu_set_cloexec() 
 instead of fcntl_setfl() earlier in this function and get rid of this 
 latter call to qemu_set_cloexec().

Yes, probably. And in fact, I think this shouldn't even be conditional
on flags  O_CLOEXEC. The whole reason qemu_open() was introduced
originally 

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-25 Thread Eric Blake
On 07/23/2012 07:08 AM, Corey Bryant wrote:
 When qemu_open is passed a filename of the /dev/fdset/nnn
 format (where nnn is the fdset ID), an fd with matching access
 mode flags will be searched for within the specified monitor
 fd set.  If the fd is found, a dup of the fd will be returned
 from qemu_open.
 
 Each fd set has a reference count.  The purpose of the reference
 count is to determine if an fd set contains file descriptors that
 have open dup() references that have not yet been closed.  It is
 incremented on qemu_open and decremented on qemu_close.  It is
 not until the refcount is zero that file desriptors in an fd set
 can be closed.  If an fd set has dup() references open, then we
 must keep the other fds in the fd set open in case a reopen
 of the file occurs that requires an fd with a different access
 mode.
 

 +++ b/monitor.c
 @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, 
 bool in_use)
  }
  }
  
 +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
 +{
 +mon_fdset_t *mon_fdset;
 +
 +if (!mon) {
 +return;
 +}

Am I reading this code right by stating that 'if there is no monitor, we
don't increment the refcount'?  How does a monitor reattach affect
things?  Or am I missing something fundamental about the cases when
'mon==NULL' will exist?

 +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
 +{
 +mon_fdset_t *mon_fdset;
 +mon_fdset_fd_t *mon_fdset_fd;
 +int mon_fd_flags;
 +
 +if (!mon) {
 +errno = ENOENT;
 +return -1;
 +}
 +
 +QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
 +if (mon_fdset-id != fdset_id) {
 +continue;
 +}
 +QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
 +if (mon_fdset_fd-removed) {
 +continue;
 +}
 +
 +mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
 +if (mon_fd_flags == -1) {
 +return -1;

This says we fail on the first fcntl() failure, instead of trying other
fds in the set.  Granted, an fcntl() failure is probably the sign of a
bigger bug (such as closing an fd at the wrong point in time), so I
guess trying to go on doesn't make much sense once we already know we
are hosed.

 +}
 +
 +switch (flags  O_ACCMODE) {
 +case O_RDWR:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +case O_RDONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
 +return mon_fdset_fd-fd;
 +}
 +break;

Do we want to allow the case where the caller asked for O_RDONLY, but
the set only has O_RDWR?  After all, the caller is getting a compatible
subset of what the set offers.

 +case O_WRONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_WRONLY) {
 +return mon_fdset_fd-fd;
 +}
 +break;

Likewise, should we allow a caller asking for O_WRONLY when the set
provides only O_RDWR?

  
 +/*
 + * Dups an fd and sets the flags
 + */
 +static int qemu_dup(int fd, int flags)
 +{
 +int i;
 +int ret;
 +int serrno;
 +int dup_flags;
 +int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
 +  O_NONBLOCK, 0 };
 +
 +if (flags  O_CLOEXEC) {
 +ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);

F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
OS yet; you probably need some #ifdef and/or configure guards.

 +if (ret == -1  errno == EINVAL) {
 +ret = dup(fd);
 +if (ret != -1  fcntl_setfl(ret, O_CLOEXEC) == -1) {

You _can't_ call F_SETFL with O_CLOEXEC.  O_CLOEXEC only causes open()
to set FD_CLOEXEC; thereafter, including in the case of this dup, what
you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
qemu_set_cloexec, not fcntl_setfl).

 +goto fail;
 +}
 +}
 +} else {
 +ret = dup(fd);
 +}
 +
 +if (ret == -1) {
 +goto fail;
 +}
 +
 +dup_flags = fcntl(ret, F_GETFL);
 +if (dup_flags == -1) {
 +goto fail;
 +}
 +
 +if ((flags  O_SYNC) != (dup_flags  O_SYNC)) {
 +errno = EINVAL;
 +goto fail;
 +}
 +
 +/* Set/unset flags that we can with fcntl */
 +i = 0;
 +while (setfl_flags[i] != 0) {
 +if (flags  setfl_flags[i]) {
 +dup_flags = (dup_flags | setfl_flags[i]);
 +} else {
 +dup_flags = (dup_flags  ~setfl_flags[i]);
 +}
 +i++;
 +}

Rather than looping one bit at a time, you should do this as a mask
operation.

 +
 +if (fcntl(ret, F_SETFL, dup_flags) == -1) {
 +goto fail;
 +}
 +
 +/* Truncate the file in the cases that open() would truncate it */
 +if (flags  O_TRUNC ||
 +  

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-25 Thread Corey Bryant



On 07/25/2012 03:25 PM, Eric Blake wrote:

On 07/25/2012 02:22 AM, Kevin Wolf wrote:

Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd - fdset mappings in qemu_open/close?



I agree, I don't really like it either.

We already have a list of fd - fdset mappings (mon_fdset_fd_t -
mon_fdset_t).  Would it be too costly to loop through all the fdsets/fds
at the beginning of every qemu_close()?


I don't think so. qemu_close() is not a fast path and happens almost
never, and the list is short enough that searching it isn't a problem
anyway.


I agree - just do the loop to do the reverse lookup yourself, rather
than making qemu_close() have a different signature than close().



Great, I'll do this then.

--
Regards,
Corey





Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-25 Thread Corey Bryant



On 07/25/2012 03:43 PM, Eric Blake wrote:

On 07/23/2012 07:08 AM, Corey Bryant wrote:

When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.




+++ b/monitor.c
@@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool 
in_use)
  }
  }

+void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
+{
+mon_fdset_t *mon_fdset;
+
+if (!mon) {
+return;
+}


Am I reading this code right by stating that 'if there is no monitor, we
don't increment the refcount'?  How does a monitor reattach affect
things?  Or am I missing something fundamental about the cases when
'mon==NULL' will exist?



Yes you're reading this correctly.

I'm pretty sure that mon will only be NULL if QEMU is started without a 
monitor.


If QEMU has a monitor, and libvirt disconnects it's connection to the 
qemu monitor, then I believe mon will remain non-NULL.


I'll plan on testing this out to verify though.  (I'm out most of this 
week and will be back full time starting next Tues.)



+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+mon_fdset_t *mon_fdset;
+mon_fdset_fd_t *mon_fdset_fd;
+int mon_fd_flags;
+
+if (!mon) {
+errno = ENOENT;
+return -1;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id != fdset_id) {
+continue;
+}
+QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
+if (mon_fdset_fd-removed) {
+continue;
+}
+
+mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
+if (mon_fd_flags == -1) {
+return -1;


This says we fail on the first fcntl() failure, instead of trying other
fds in the set.  Granted, an fcntl() failure is probably the sign of a
bigger bug (such as closing an fd at the wrong point in time), so I
guess trying to go on doesn't make much sense once we already know we
are hosed.



I think I'll stick with it the way it is.  If fcntl() fails we might 
have a tainted fd set so I think we should fail.



+}
+
+switch (flags  O_ACCMODE) {
+case O_RDWR:
+if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
+return mon_fdset_fd-fd;
+}
+break;
+case O_RDONLY:
+if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
+return mon_fdset_fd-fd;
+}
+break;


Do we want to allow the case where the caller asked for O_RDONLY, but
the set only has O_RDWR?  After all, the caller is getting a compatible
subset of what the set offers.



I don't see a problem with it.


+case O_WRONLY:
+if ((mon_fd_flags  O_ACCMODE) == O_WRONLY) {
+return mon_fdset_fd-fd;
+}
+break;


Likewise, should we allow a caller asking for O_WRONLY when the set
provides only O_RDWR?



I don't see a problem with it.



+/*
+ * Dups an fd and sets the flags
+ */
+static int qemu_dup(int fd, int flags)
+{
+int i;
+int ret;
+int serrno;
+int dup_flags;
+int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
+  O_NONBLOCK, 0 };
+
+if (flags  O_CLOEXEC) {
+ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);


F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
OS yet; you probably need some #ifdef and/or configure guards.



Ok


+if (ret == -1  errno == EINVAL) {
+ret = dup(fd);
+if (ret != -1  fcntl_setfl(ret, O_CLOEXEC) == -1) {


You _can't_ call F_SETFL with O_CLOEXEC.  O_CLOEXEC only causes open()
to set FD_CLOEXEC; thereafter, including in the case of this dup, what
you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
qemu_set_cloexec, not fcntl_setfl).



I know, this is a mistake.  I'm planning to replace fcntl_setfl() here 
with qemu_set_cloexec().



+goto fail;
+}
+}
+} else {
+ret = dup(fd);
+}
+
+if (ret == -1) {
+goto fail;
+}
+
+dup_flags = fcntl(ret, F_GETFL);
+if (dup_flags == -1) {
+goto fail;
+}
+
+if ((flags  O_SYNC) != (dup_flags  O_SYNC)) {
+

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-24 Thread Corey Bryant



On 07/24/2012 08:07 AM, Kevin Wolf wrote:

Am 23.07.2012 15:08, schrieb Corey Bryant:

When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com

v2:
  -Get rid of file_open and move dup code to qemu_open
   (kw...@redhat.com)
  -Use strtol wrapper instead of atoi (kw...@redhat.com)

v3:
  -Add note about fd leakage (ebl...@redhat.com)

v4
  -Moved patch to be later in series (lcapitul...@redhat.com)
  -Update qemu_open to check access mode flags and set flags that
   can be set (ebl...@redhat.com, kw...@redhat.com)

v5:
  -This patch was overhauled quite a bit in this version, with
   the addition of fd set and refcount support.
  -Use qemu_set_cloexec() on dup'd fd (ebl...@redhat.com)
  -Modify flags set by fcntl on dup'd fd (ebl...@redhat.com)
  -Reduce syscalls when setting flags for dup'd fd (ebl...@redhat.com)
  -Fix O_RDWR, O_RDONLY, O_WRONLY checks (ebl...@redhat.com)
---
  block/raw-posix.c |   24 +-
  block/raw-win32.c |2 +-
  block/vmdk.c  |4 +-
  block/vpc.c   |2 +-
  block/vvfat.c |   12 ++---
  cutils.c  |5 ++
  monitor.c |   85 +
  monitor.h |4 ++
  osdep.c   |  138 -
  qemu-common.h |3 +-
  qemu-tool.c   |   12 +
  11 files changed, 267 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a172de3..5d0a801 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
  out_free_buf:
  qemu_vfree(s-aligned_buf);
  out_close:
-qemu_close(fd);
+qemu_close(fd, filename);
  return -errno;
  }


Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd - fdset mappings in qemu_open/close?



I agree, I don't really like it either.

We already have a list of fd - fdset mappings (mon_fdset_fd_t - 
mon_fdset_t).  Would it be too costly to loop through all the fdsets/fds 
at the beginning of every qemu_close()?



But if we decided to keep it like this, please use the right interface
from the beginning in patch 5 instead of updating it here.



Ok


@@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool 
in_use)
  }
  }

+void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
+{
+mon_fdset_t *mon_fdset;
+
+if (!mon) {
+return;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id == fdset_id) {
+mon_fdset-refcount++;
+break;
+}
+}
+}
+
+void monitor_fdset_decrement_refcount(Monitor *mon, int64_t fdset_id)
+{
+mon_fdset_t *mon_fdset;
+
+if (!mon) {
+return;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id == fdset_id) {
+mon_fdset-refcount--;
+if (mon_fdset-refcount == 0) {
+monitor_fdset_cleanup(mon_fdset);
+}
+break;
+}
+}
+}


These two functions are almost the same. Would a
monitor_fdset_update_refcount(mon, fdset_id, value) make sense? These
functions could then be kept as thin wrappers around it, or they could
even be dropped completely.



This makes sense and I'll try one of these approaches.  I actually 
started to do something along these lines in v5 but reverted back to the 
two independent functions because it was easier to read the code.



+
+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+mon_fdset_t *mon_fdset;
+mon_fdset_fd_t *mon_fdset_fd;
+int mon_fd_flags;
+
+if (!mon) {
+errno = ENOENT;
+return -1;
+}
+
+QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
+if (mon_fdset-id != fdset_id) {
+continue;
+}
+QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
+if (mon_fdset_fd-removed) {
+continue;
+}
+
+mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
+if (mon_fd_flags == -1) {
+return -1;
+}

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-24 Thread Kevin Wolf
Am 23.07.2012 15:08, schrieb Corey Bryant:
 When qemu_open is passed a filename of the /dev/fdset/nnn
 format (where nnn is the fdset ID), an fd with matching access
 mode flags will be searched for within the specified monitor
 fd set.  If the fd is found, a dup of the fd will be returned
 from qemu_open.
 
 Each fd set has a reference count.  The purpose of the reference
 count is to determine if an fd set contains file descriptors that
 have open dup() references that have not yet been closed.  It is
 incremented on qemu_open and decremented on qemu_close.  It is
 not until the refcount is zero that file desriptors in an fd set
 can be closed.  If an fd set has dup() references open, then we
 must keep the other fds in the fd set open in case a reopen
 of the file occurs that requires an fd with a different access
 mode.
 
 Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
 
 v2:
  -Get rid of file_open and move dup code to qemu_open
   (kw...@redhat.com)
  -Use strtol wrapper instead of atoi (kw...@redhat.com)
 
 v3:
  -Add note about fd leakage (ebl...@redhat.com)
 
 v4
  -Moved patch to be later in series (lcapitul...@redhat.com)
  -Update qemu_open to check access mode flags and set flags that
   can be set (ebl...@redhat.com, kw...@redhat.com)
 
 v5:
  -This patch was overhauled quite a bit in this version, with
   the addition of fd set and refcount support.
  -Use qemu_set_cloexec() on dup'd fd (ebl...@redhat.com)
  -Modify flags set by fcntl on dup'd fd (ebl...@redhat.com)
  -Reduce syscalls when setting flags for dup'd fd (ebl...@redhat.com)
  -Fix O_RDWR, O_RDONLY, O_WRONLY checks (ebl...@redhat.com)
 ---
  block/raw-posix.c |   24 +-
  block/raw-win32.c |2 +-
  block/vmdk.c  |4 +-
  block/vpc.c   |2 +-
  block/vvfat.c |   12 ++---
  cutils.c  |5 ++
  monitor.c |   85 +
  monitor.h |4 ++
  osdep.c   |  138 
 -
  qemu-common.h |3 +-
  qemu-tool.c   |   12 +
  11 files changed, 267 insertions(+), 24 deletions(-)
 
 diff --git a/block/raw-posix.c b/block/raw-posix.c
 index a172de3..5d0a801 100644
 --- a/block/raw-posix.c
 +++ b/block/raw-posix.c
 @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const 
 char *filename,
  out_free_buf:
  qemu_vfree(s-aligned_buf);
  out_close:
 -qemu_close(fd);
 +qemu_close(fd, filename);
  return -errno;
  }

Hm, not a nice interface where qemu_close() needs the filename and
(worse) could be given a wrong filename. Maybe it would be better to
maintain a list of fd - fdset mappings in qemu_open/close?

But if we decided to keep it like this, please use the right interface
from the beginning in patch 5 instead of updating it here.

 @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, 
 bool in_use)
  }
  }
  
 +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
 +{
 +mon_fdset_t *mon_fdset;
 +
 +if (!mon) {
 +return;
 +}
 +
 +QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
 +if (mon_fdset-id == fdset_id) {
 +mon_fdset-refcount++;
 +break;
 +}
 +}
 +}
 +
 +void monitor_fdset_decrement_refcount(Monitor *mon, int64_t fdset_id)
 +{
 +mon_fdset_t *mon_fdset;
 +
 +if (!mon) {
 +return;
 +}
 +
 +QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
 +if (mon_fdset-id == fdset_id) {
 +mon_fdset-refcount--;
 +if (mon_fdset-refcount == 0) {
 +monitor_fdset_cleanup(mon_fdset);
 +}
 +break;
 +}
 +}
 +}

These two functions are almost the same. Would a
monitor_fdset_update_refcount(mon, fdset_id, value) make sense? These
functions could then be kept as thin wrappers around it, or they could
even be dropped completely.

 +
 +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
 +{
 +mon_fdset_t *mon_fdset;
 +mon_fdset_fd_t *mon_fdset_fd;
 +int mon_fd_flags;
 +
 +if (!mon) {
 +errno = ENOENT;
 +return -1;
 +}
 +
 +QLIST_FOREACH(mon_fdset, mon-fdsets, next) {
 +if (mon_fdset-id != fdset_id) {
 +continue;
 +}
 +QLIST_FOREACH(mon_fdset_fd, mon_fdset-fds, next) {
 +if (mon_fdset_fd-removed) {
 +continue;
 +}
 +
 +mon_fd_flags = fcntl(mon_fdset_fd-fd, F_GETFL);
 +if (mon_fd_flags == -1) {
 +return -1;
 +}
 +
 +switch (flags  O_ACCMODE) {
 +case O_RDWR:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDWR) {
 +return mon_fdset_fd-fd;
 +}
 +break;
 +case O_RDONLY:
 +if ((mon_fd_flags  O_ACCMODE) == O_RDONLY) {
 +return mon_fdset_fd-fd;
 +}
 +  

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-23 Thread Corey Bryant



On 07/23/2012 09:08 AM, Corey Bryant wrote:

When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com

v2:
  -Get rid of file_open and move dup code to qemu_open
   (kw...@redhat.com)
  -Use strtol wrapper instead of atoi (kw...@redhat.com)

v3:
  -Add note about fd leakage (ebl...@redhat.com)

v4
  -Moved patch to be later in series (lcapitul...@redhat.com)
  -Update qemu_open to check access mode flags and set flags that
   can be set (ebl...@redhat.com, kw...@redhat.com)

v5:
  -This patch was overhauled quite a bit in this version, with
   the addition of fd set and refcount support.
  -Use qemu_set_cloexec() on dup'd fd (ebl...@redhat.com)
  -Modify flags set by fcntl on dup'd fd (ebl...@redhat.com)
  -Reduce syscalls when setting flags for dup'd fd (ebl...@redhat.com)
  -Fix O_RDWR, O_RDONLY, O_WRONLY checks (ebl...@redhat.com)
---
  block/raw-posix.c |   24 +-
  block/raw-win32.c |2 +-
  block/vmdk.c  |4 +-
  block/vpc.c   |2 +-
  block/vvfat.c |   12 ++---
  cutils.c  |5 ++
  monitor.c |   85 +
  monitor.h |4 ++
  osdep.c   |  138 -
  qemu-common.h |3 +-
  qemu-tool.c   |   12 +
  11 files changed, 267 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a172de3..5d0a801 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
  out_free_buf:
  qemu_vfree(s-aligned_buf);
  out_close:
-qemu_close(fd);
+qemu_close(fd, filename);
  return -errno;
  }

@@ -376,7 +376,7 @@ static void raw_close(BlockDriverState *bs)
  {
  BDRVRawState *s = bs-opaque;
  if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  if (s-aligned_buf != NULL)
  qemu_vfree(s-aligned_buf);
@@ -580,7 +580,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
  if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
  result = -errno;
  }
-if (qemu_close(fd) != 0) {
+if (qemu_close(fd, filename) != 0) {
  result = -errno;
  }
  }
@@ -850,7 +850,7 @@ static int hdev_open(BlockDriverState *bs, const char 
*filename, int flags)
  if (fd  0) {
  bsdPath[strlen(bsdPath)-1] = '1';
  } else {
-qemu_close(fd);
+qemu_close(fd, bsdPath);
  }
  filename = bsdPath;
  }
@@ -889,7 +889,7 @@ static int fd_open(BlockDriverState *bs)
  last_media_present = (s-fd = 0);
  if (s-fd = 0 
  (get_clock() - s-fd_open_time) = FD_OPEN_TIMEOUT) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  #ifdef DEBUG_FLOPPY
  printf(Floppy closed\n);
@@ -988,7 +988,7 @@ static int hdev_create(const char *filename, 
QEMUOptionParameter *options)
  else if (lseek(fd, 0, SEEK_END)  total_size * BDRV_SECTOR_SIZE)
  ret = -ENOSPC;

-qemu_close(fd);
+qemu_close(fd, filename);
  return ret;
  }

@@ -1038,7 +1038,7 @@ static int floppy_open(BlockDriverState *bs, const char 
*filename, int flags)
  return ret;

  /* close fd so that we can reopen it as needed */
-qemu_close(s-fd);
+qemu_close(s-fd, filename);
  s-fd = -1;
  s-fd_media_changed = 1;

@@ -1070,7 +1070,7 @@ static int floppy_probe_device(const char *filename)
  prio = 100;

  outc:
-qemu_close(fd);
+qemu_close(fd, filename);
  out:
  return prio;
  }
@@ -1105,14 +1105,14 @@ static void floppy_eject(BlockDriverState *bs, bool 
eject_flag)
  int fd;

  if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
  s-fd = -1;
  }
  fd = qemu_open(bs-filename, s-open_flags | O_NONBLOCK);
  if (fd = 0) {
  if (ioctl(fd, FDEJECT, 0)  0)
  perror(FDEJECT);
-qemu_close(fd);
+qemu_close(fd, bs-filename);
  }
  }

@@ -1173,7 +1173,7 @@ static int 

[Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets

2012-07-23 Thread Corey Bryant
When qemu_open is passed a filename of the /dev/fdset/nnn
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com

v2:
 -Get rid of file_open and move dup code to qemu_open
  (kw...@redhat.com)
 -Use strtol wrapper instead of atoi (kw...@redhat.com)

v3:
 -Add note about fd leakage (ebl...@redhat.com)

v4
 -Moved patch to be later in series (lcapitul...@redhat.com)
 -Update qemu_open to check access mode flags and set flags that
  can be set (ebl...@redhat.com, kw...@redhat.com)

v5:
 -This patch was overhauled quite a bit in this version, with
  the addition of fd set and refcount support.
 -Use qemu_set_cloexec() on dup'd fd (ebl...@redhat.com)
 -Modify flags set by fcntl on dup'd fd (ebl...@redhat.com)
 -Reduce syscalls when setting flags for dup'd fd (ebl...@redhat.com)
 -Fix O_RDWR, O_RDONLY, O_WRONLY checks (ebl...@redhat.com)
---
 block/raw-posix.c |   24 +-
 block/raw-win32.c |2 +-
 block/vmdk.c  |4 +-
 block/vpc.c   |2 +-
 block/vvfat.c |   12 ++---
 cutils.c  |5 ++
 monitor.c |   85 +
 monitor.h |4 ++
 osdep.c   |  138 -
 qemu-common.h |3 +-
 qemu-tool.c   |   12 +
 11 files changed, 267 insertions(+), 24 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a172de3..5d0a801 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
 out_free_buf:
 qemu_vfree(s-aligned_buf);
 out_close:
-qemu_close(fd);
+qemu_close(fd, filename);
 return -errno;
 }
 
@@ -376,7 +376,7 @@ static void raw_close(BlockDriverState *bs)
 {
 BDRVRawState *s = bs-opaque;
 if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
 s-fd = -1;
 if (s-aligned_buf != NULL)
 qemu_vfree(s-aligned_buf);
@@ -580,7 +580,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
 if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
 result = -errno;
 }
-if (qemu_close(fd) != 0) {
+if (qemu_close(fd, filename) != 0) {
 result = -errno;
 }
 }
@@ -850,7 +850,7 @@ static int hdev_open(BlockDriverState *bs, const char 
*filename, int flags)
 if (fd  0) {
 bsdPath[strlen(bsdPath)-1] = '1';
 } else {
-qemu_close(fd);
+qemu_close(fd, bsdPath);
 }
 filename = bsdPath;
 }
@@ -889,7 +889,7 @@ static int fd_open(BlockDriverState *bs)
 last_media_present = (s-fd = 0);
 if (s-fd = 0 
 (get_clock() - s-fd_open_time) = FD_OPEN_TIMEOUT) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
 s-fd = -1;
 #ifdef DEBUG_FLOPPY
 printf(Floppy closed\n);
@@ -988,7 +988,7 @@ static int hdev_create(const char *filename, 
QEMUOptionParameter *options)
 else if (lseek(fd, 0, SEEK_END)  total_size * BDRV_SECTOR_SIZE)
 ret = -ENOSPC;
 
-qemu_close(fd);
+qemu_close(fd, filename);
 return ret;
 }
 
@@ -1038,7 +1038,7 @@ static int floppy_open(BlockDriverState *bs, const char 
*filename, int flags)
 return ret;
 
 /* close fd so that we can reopen it as needed */
-qemu_close(s-fd);
+qemu_close(s-fd, filename);
 s-fd = -1;
 s-fd_media_changed = 1;
 
@@ -1070,7 +1070,7 @@ static int floppy_probe_device(const char *filename)
 prio = 100;
 
 outc:
-qemu_close(fd);
+qemu_close(fd, filename);
 out:
 return prio;
 }
@@ -1105,14 +1105,14 @@ static void floppy_eject(BlockDriverState *bs, bool 
eject_flag)
 int fd;
 
 if (s-fd = 0) {
-qemu_close(s-fd);
+qemu_close(s-fd, bs-filename);
 s-fd = -1;
 }
 fd = qemu_open(bs-filename, s-open_flags | O_NONBLOCK);
 if (fd = 0) {
 if (ioctl(fd, FDEJECT, 0)  0)
 perror(FDEJECT);
-qemu_close(fd);
+qemu_close(fd, bs-filename);
 }
 }
 
@@ -1173,7 +1173,7 @@ static int cdrom_probe_device(const char *filename)
 prio = 100;
 
 outc:
-qemu_close(fd);
+qemu_close(fd, filename);
 out: