Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-20 Thread Jan Kara
On Fri 17-04-15 16:44:16, Andreas Dilger wrote:
> On Apr 17, 2015, at 5:31 AM, Jan Kara  wrote:
> > On Wed 15-04-15 09:15:44, Beata Michalska wrote:
> >> Introduce configurable generic interface for file
> >> system-wide event notifications to provide file
> >> systems with a common way of reporting any potential
> >> issues as they emerge.
> >> 
> >> The notifications are to be issued through generic
> >> netlink interface, by a dedicated, for file system
> >> events, multicast group. The file systems might as
> >> well use this group to send their own custom messages.
> >> 
> >> The events have been split into four base categories:
> >> information, warnings, errors and threshold notifications,
> >> with some very basic event types like running out of space
> >> or file system being remounted as read-only.
> >> 
> >> Threshold notifications have been included to allow
> >> triggering an event whenever the amount of free space
> >> drops below a certain level - or levels to be more precise
> >> as two of them are being supported: the lower and the upper
> >> range. The notifications work both ways: once the threshold
> >> level has been reached, an event shall be generated whenever
> >> the number of available blocks goes up again re-activating
> >> the threshold.
> >> 
> >> The interface has been exposed through a vfs. Once mounted,
> >> it serves as an entry point for the set-up where one can
> >> register for particular file system events.
> >> 
> >> Signed-off-by: Beata Michalska 
> >  Thanks for the patches! Some comments are below.
> > 
> >> ---
> >> Documentation/filesystems/events.txt |  254 +++
> >> fs/Makefile                          |    1 +
> >> fs/events/Makefile                   |    6 +
> >> fs/events/fs_event.c                 |  775 ++
> >> fs/events/fs_event.h                 |   27 ++
> >> fs/events/fs_event_netlink.c         |   94 +
> >> fs/namespace.c                       |    1 +
> >> include/linux/fs.h                   |    6 +-
> >> include/linux/fs_event.h             |   69 +++
> >> include/uapi/linux/fs_event.h        |   62 +++
> >> include/uapi/linux/genetlink.h       |    1 +
> >> net/netlink/genetlink.c              |    7 +-
> >> 12 files changed, 1301 insertions(+), 2 deletions(-)
> >> create mode 100644 Documentation/filesystems/events.txt
> >> create mode 100644 fs/events/Makefile
> >> create mode 100644 fs/events/fs_event.c
> >> create mode 100644 fs/events/fs_event.h
> >> create mode 100644 fs/events/fs_event_netlink.c
> >> create mode 100644 include/linux/fs_event.h
> >> create mode 100644 include/uapi/linux/fs_event.h
> >> 
> >> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
> >> new file mode 100644
> >> index 000..c85dd88
> >> --- /dev/null
> >> +++ b/Documentation/filesystems/events.txt
> >> @@ -0,0 +1,254 @@
> >> +
> >> +  Generic file system event notification interface
> >> +
> >> +Document created 09 April 2015 by Beata Michalska 
> >> 
> >> +
> >> +1. The reason behind:
> >> +=
> >> +
> >> +There are many corner cases when things might get messy with the filesystems.
> >> +And it is not always obvious what went wrong and when. Sometimes you might
> >> +get some subtle hints that there is something going on - but by the time
> >> +you realise it, it might be too late as you are already out-of-space
> >> +or the filesystem has been remounted as read-only (e.g.). The generic
> >> +interface for the filesystem events fills the gap by providing a rather
> >> +easy way of real-time notifications triggered whenever something interesting
> >> +happens, allowing filesystems to report events in a common way, as they occur.
> >> +
> >> +2. How does it work:
> >> +
> >> +
> >> +The interface itself has been exposed as fstrace-type Virtual File System,
> >> +primarily to ease the process of setting up the configuration for the file
> >> +system notifications. So for starters it needs to get mounted (obviously):
> >> +
> >> +  mount -t fstrace none /sys/fs/events
> >> +
> >> +This will unveil the single fstrace filesystem entry - the 'config' file,
> >> +through which the notifications are being set up.
> >> +
> >> +Activating notifications for a particular filesystem is as straightforward
> >> +as writing into the 'config' file. Note that by default all events, regardless
> >> +of the actual filesystem type, are being disregarded.
> >  Is there a reason to have a special filesystem for this? Do you expect
> > to extend it with (many) more files? Why not just create a file in sysfs or
> > something like that?
> > 
> >> +Synopsis of config:
> >> +--
> >> +
> >> +  MOUNT EVENT_TYPE [L1] [L2]
> >> +
> >> + MOUNT  : the filesystem's mount point
> >  I'm not quite decided but is mountpoint really the right thing to pass
> > via the interface? They aren't unique (filesystem can be mounted in
> > multiple 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-20 Thread Beata Michalska
Hi,

On 04/18/2015 12:44 AM, Andreas Dilger wrote:
> On Apr 17, 2015, at 5:31 AM, Jan Kara  wrote:
>> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>>> Introduce configurable generic interface for file
>>> system-wide event notifications to provide file
>>> systems with a common way of reporting any potential
>>> issues as they emerge.
>>>
>>> The notifications are to be issued through generic
>>> netlink interface, by a dedicated, for file system
>>> events, multicast group. The file systems might as
>>> well use this group to send their own custom messages.
>>>
>>> The events have been split into four base categories:
>>> information, warnings, errors and threshold notifications,
>>> with some very basic event types like running out of space
>>> or file system being remounted as read-only.
>>>
>>> Threshold notifications have been included to allow
>>> triggering an event whenever the amount of free space
>>> drops below a certain level - or levels to be more precise
>>> as two of them are being supported: the lower and the upper
>>> range. The notifications work both ways: once the threshold
>>> level has been reached, an event shall be generated whenever
>>> the number of available blocks goes up again re-activating
>>> the threshold.
>>>
>>> The interface has been exposed through a vfs. Once mounted,
>>> it serves as an entry point for the set-up where one can
>>> register for particular file system events.
>>>
>>> Signed-off-by: Beata Michalska 
>>  Thanks for the patches! Some comments are below.
>>
>>> ---
>>> Documentation/filesystems/events.txt |  254 +++
>>> fs/Makefile                          |    1 +
>>> fs/events/Makefile                   |    6 +
>>> fs/events/fs_event.c                 |  775 ++
>>> fs/events/fs_event.h                 |   27 ++
>>> fs/events/fs_event_netlink.c         |   94 +
>>> fs/namespace.c                       |    1 +
>>> include/linux/fs.h                   |    6 +-
>>> include/linux/fs_event.h             |   69 +++
>>> include/uapi/linux/fs_event.h        |   62 +++
>>> include/uapi/linux/genetlink.h       |    1 +
>>> net/netlink/genetlink.c              |    7 +-
>>> 12 files changed, 1301 insertions(+), 2 deletions(-)
>>> create mode 100644 Documentation/filesystems/events.txt
>>> create mode 100644 fs/events/Makefile
>>> create mode 100644 fs/events/fs_event.c
>>> create mode 100644 fs/events/fs_event.h
>>> create mode 100644 fs/events/fs_event_netlink.c
>>> create mode 100644 include/linux/fs_event.h
>>> create mode 100644 include/uapi/linux/fs_event.h
>>>
>>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>>> new file mode 100644
>>> index 000..c85dd88
>>> --- /dev/null
>>> +++ b/Documentation/filesystems/events.txt
>>> @@ -0,0 +1,254 @@
>>> +
>>> +   Generic file system event notification interface
>>> +
>>> +Document created 09 April 2015 by Beata Michalska 
>>> +
>>> +1. The reason behind:
>>> +=
>>> +
>>> +There are many corner cases when things might get messy with the filesystems.
>>> +And it is not always obvious what went wrong and when. Sometimes you might
>>> +get some subtle hints that there is something going on - but by the time
>>> +you realise it, it might be too late as you are already out-of-space
>>> +or the filesystem has been remounted as read-only (e.g.). The generic
>>> +interface for the filesystem events fills the gap by providing a rather
>>> +easy way of real-time notifications triggered whenever something interesting
>>> +happens, allowing filesystems to report events in a common way, as they occur.
>>> +
>>> +2. How does it work:
>>> +
>>> +
>>> +The interface itself has been exposed as fstrace-type Virtual File System,
>>> +primarily to ease the process of setting up the configuration for the file
>>> +system notifications. So for starters it needs to get mounted (obviously):
>>> +
>>> +   mount -t fstrace none /sys/fs/events
>>> +
>>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>>> +through which the notifications are being set up.
>>> +
>>> +Activating notifications for a particular filesystem is as straightforward
>>> +as writing into the 'config' file. Note that by default all events, regardless
>>> +of the actual filesystem type, are being disregarded.
>>  Is there a reason to have a special filesystem for this? Do you expect
>> to extend it with (many) more files? Why not just create a file in sysfs or
>> something like that?
>>
>>> +Synopsis of config:
>>> +--
>>> +
>>> +   MOUNT EVENT_TYPE [L1] [L2]
>>> +
>>> + MOUNT  : the filesystem's mount point
>>  I'm not quite decided but is mountpoint really the right thing to pass
>> via the interface? They aren't unique (filesystem can be mounted in
>> multiple places) and more importantly can change over time. So won't it be
>> better to pass major:minor over the interface? These are stable, unique to
>> the filesystem, and userspace can easily get them by calling stat(2) on the
>> desired path (or directly from /proc/self/mountinfo). That could be also
>> used as an fs identifier instead of assigned ID (and thus we won't need
>> those events 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Andreas Dilger
On Apr 17, 2015, at 5:31 AM, Jan Kara  wrote:
> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>> 
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>> 
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>> 
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>> 
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>> 
>> Signed-off-by: Beata Michalska 
>  Thanks for the patches! Some comments are below.
> 
>> ---
>> Documentation/filesystems/events.txt |  254 +++
>> fs/Makefile                          |    1 +
>> fs/events/Makefile                   |    6 +
>> fs/events/fs_event.c                 |  775 ++
>> fs/events/fs_event.h                 |   27 ++
>> fs/events/fs_event_netlink.c         |   94 +
>> fs/namespace.c                       |    1 +
>> include/linux/fs.h                   |    6 +-
>> include/linux/fs_event.h             |   69 +++
>> include/uapi/linux/fs_event.h        |   62 +++
>> include/uapi/linux/genetlink.h       |    1 +
>> net/netlink/genetlink.c              |    7 +-
>> 12 files changed, 1301 insertions(+), 2 deletions(-)
>> create mode 100644 Documentation/filesystems/events.txt
>> create mode 100644 fs/events/Makefile
>> create mode 100644 fs/events/fs_event.c
>> create mode 100644 fs/events/fs_event.h
>> create mode 100644 fs/events/fs_event_netlink.c
>> create mode 100644 include/linux/fs_event.h
>> create mode 100644 include/uapi/linux/fs_event.h
>> 
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska 
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what went wrong and when. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (e.g.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something interesting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notifications are being set up.
>> +
>> +Activating notifications for a particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events, regardless
>> +of the actual filesystem type, are being disregarded.
>  Is there a reason to have a special filesystem for this? Do you expect
> to extend it with (many) more files? Why not just create a file in sysfs or
> something like that?
> 
>> +Synopsis of config:
>> +--
>> +
>> +MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>  I'm not quite decided but is mountpoint really the right thing to pass
> via the interface? They aren't unique (filesystem can be mounted in
> multiple places) and more importantly can change over time. So won't it be
> better to pass major:minor over the interface? These are stable, unique to
> the filesystem, and userspace can easily get them by calling stat(2) on the
> desired path (or directly from /proc/self/mountinfo). 
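
Both lookups Jan mentions are available from userspace today. Below is a minimal Python sketch of each, not part of the patch: the helper names `fs_device_id` and `parse_mountinfo` are made up for illustration. The first wraps stat(2) and splits `st_dev` into major and minor; the second assumes the mountinfo layout documented in proc(5), where the third field is major:minor and the fifth is the mount point.

```python
import os


def fs_device_id(path):
    """Return (major, minor) of the device holding the filesystem at path."""
    st = os.stat(path)  # stat(2): st_dev identifies the containing filesystem
    return os.major(st.st_dev), os.minor(st.st_dev)


def parse_mountinfo(text):
    """Map mount point -> 'major:minor' from /proc/self/mountinfo content."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            # Field 3 is major:minor, field 5 is the mount point.
            # Octal escapes such as \040 in mount points are left undecoded.
            table[fields[4]] = fields[2]
    return table


if __name__ == "__main__":
    major, minor = fs_device_id("/")
    print(f"{major}:{minor}")
    with open("/proc/self/mountinfo") as f:
        print(parse_mountinfo(f.read()).get("/"))
```

If the same mount point appears more than once in mountinfo (stacked mounts), the sketch keeps the last entry, which is a simplification.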

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Andreas Dilger
On Apr 17, 2015, at 11:37 AM, John Spray  wrote:
> On 17/04/2015 17:22, Jan Kara wrote:
>> On Fri 17-04-15 17:08:10, John Spray wrote:
>>> On 17/04/2015 16:43, Jan Kara wrote:
>>> In that case I'm confused -- why would ENOSPC be an appropriate use
>>> of this interface if the mount being entirely blocked would be
>>> inappropriate?  Isn't being unable to service any I/O a more
>>> fundamental and severe thing than being up and healthy but full?
>>> 
>>> Were you intending the interface to be exclusively for data
>>> integrity issues like checksum failures, rather than more general
>>> events about a mount that userspace would probably like to know
>>> about?
>>   Well, I'm not saying we cannot have those events for fs availability /
>> unavailability. I'm just saying I'd like to see some use for that first.
>> I don't want events to be added just because it's possible...
>>
>> For ENOSPC we have thin provisioned storage and the userspace daemon
>> shuffling real storage underneath. So there I know the usecase.
>> 
> 
> Ah, OK.  So I can think of a couple of use cases:
> * a cluster scheduling service (think MPI jobs or docker containers) might 
> check for events like this.  If it can see the cluster filesystem is 
> unavailable, then it can avoid scheduling the job, so that the (multi-node) 
> application does not get hung on one node with a bad mount.  If it sees a 
> mount go bad (unavailable, or client evicted) partway through a job, then it 
> can kill -9 the process that was relying on the bad mount, and go run it 
> somewhere else.
> * Boring but practical case: a nagios health check for checking if mounts are 
> OK.

John,
thanks for chiming in, as I was just about to write the same.  Some users
were just asking yesterday at the Lustre User Group meeting about adding
an interface to notify job schedulers for your #1 point, and I'd much
rather use a generic interface than inventing our own for Lustre.

Cheers, Andreas

> We don't have to invent these event types now of course, but something to 
> bear in mind.  Hopefully if/when any of the distributed filesystems 
> (Lustre/Ceph/etc) choose to implement this, we can look at making the event 
> types common at that time though.
> 
> BTW in any case an interface for filesystem events to userspace will be a 
> useful addition, thank you!
> 
> Cheers,
> John


Cheers, Andreas





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray



On 17/04/2015 17:22, Jan Kara wrote:
> On Fri 17-04-15 17:08:10, John Spray wrote:
>> On 17/04/2015 16:43, Jan Kara wrote:
>> In that case I'm confused -- why would ENOSPC be an appropriate use
>> of this interface if the mount being entirely blocked would be
>> inappropriate?  Isn't being unable to service any I/O a more
>> fundamental and severe thing than being up and healthy but full?
>>
>> Were you intending the interface to be exclusively for data
>> integrity issues like checksum failures, rather than more general
>> events about a mount that userspace would probably like to know
>> about?
>   Well, I'm not saying we cannot have those events for fs availability /
> unavailability. I'm just saying I'd like to see some use for that first.
> I don't want events to be added just because it's possible...
>
> For ENOSPC we have thin provisioned storage and the userspace daemon
> shuffling real storage underneath. So there I know the usecase.



Ah, OK.  So I can think of a couple of use cases:
 * a cluster scheduling service (think MPI jobs or docker containers) 
might check for events like this.  If it can see the cluster filesystem 
is unavailable, then it can avoid scheduling the job, so that the 
(multi-node) application does not get hung on one node with a bad 
mount.  If it sees a mount go bad (unavailable, or client evicted) 
partway through a job, then it can kill -9 the process that was relying 
on the bad mount, and go run it somewhere else.
 * Boring but practical case: a nagios health check for checking if 
mounts are OK.


We don't have to invent these event types now of course, but something 
to bear in mind.  Hopefully if/when any of the distributed filesystems 
(Lustre/Ceph/etc) choose to implement this, we can look at making the 
event types common at that time though.


BTW in any case an interface for filesystem events to userspace will be 
a useful addition, thank you!


Cheers,
John


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 12:29:07, Austin S Hemmelgarn wrote:
> On 2015-04-17 12:22, Jan Kara wrote:
> >On Fri 17-04-15 17:08:10, John Spray wrote:
> >>
> >>On 17/04/2015 16:43, Jan Kara wrote:
> >>>On Fri 17-04-15 15:51:14, John Spray wrote:
> >>>>On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
> >>>>
> >>>>>For some filesystems, it may make sense to differentiate between a
> >>>>>generic warning and an error.  For BTRFS and ZFS for example, if
> >>>>>there is a csum error on a block, this will get automatically
> >>>>>corrected in many configurations, and won't require anything like
> >>>>>fsck to be run, but monitoring applications will still probably
> >>>>>want to be notified.
> >>>>Another key differentiation IMHO is between transient errors (like
> >>>>server is unavailable in a distributed filesystem) that will block
> >>>>the filesystem but might clear on their own, vs. permanent errors
> >>>>like unreadable drives that definitely will not clear until the
> >>>>administrator takes some action.  It's usually a reasonable
> >>>>approximation to call transient issues warnings, and permanent
> >>>>issues errors.
> >>>   So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
> >>>would this have? I wouldn't like the interface to be dumping ground for
> >>>random crap - we have dmesg for that :).
> >>In that case I'm confused -- why would ENOSPC be an appropriate use
> >>of this interface if the mount being entirely blocked would be
> >>inappropriate?  Isn't being unable to service any I/O a more
> >>fundamental and severe thing than being up and healthy but full?
> >>
> >>Were you intending the interface to be exclusively for data
> >>integrity issues like checksum failures, rather than more general
> >>events about a mount that userspace would probably like to know
> >>about?
> >   Well, I'm not saying we cannot have those events for fs availability /
> >unavailability. I'm just saying I'd like to see some use for that first.
> >I don't want events to be added just because it's possible...
> >
> >For ENOSPC we have thin provisioned storage and the userspace daemon
> >shuffling real storage underneath. So there I know the usecase.
> >
> The use-case that immediately comes to mind for me would be diskless
> nodes with root-on-nfs needing to know if they can actually access
> the root filesystem.
  Well, most apps will access the root file system regardless of what we
send over netlink... So I don't see netlink events improving the situation
there too much. You could try to use it for something like failover but
even there I'm not too convinced - just doing some IO, waiting for timeout,
and failing over if IO doesn't complete works just fine for that these
days. That's why I was asking: I didn't see a convincing usecase
myself...

Honza
-- 
Jan Kara 
SUSE Labs, CR
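
The fallback Jan describes (do some IO, wait for a timeout, fail over if it does not complete) could look roughly like this from userspace. This is only a sketch under assumptions: the name `mount_responds` and the 5 second default are invented for illustration, and `statvfs` is used as a stand-in for "some IO", which a real failover check might replace with an actual read or write on the mount.

```python
import os
import threading


def mount_responds(path, timeout_s=5.0):
    """Return True if statvfs(path) succeeds within timeout_s seconds."""
    outcome = {}

    def probe():
        try:
            os.statvfs(path)  # stand-in for "doing some IO" on the mount
            outcome["ok"] = True
        except OSError:
            outcome["ok"] = False

    # A hung mount can block the probe indefinitely, so it runs in a
    # daemon thread that Python does not wait for at shutdown.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    # No entry in outcome means the probe is still blocked: treat as failed.
    return outcome.get("ok", False)


if __name__ == "__main__":
    print(mount_responds("/"))
```

The cost of this approach is the one discussed in the thread: the caller learns about a bad mount only after the timeout expires, and each check may leave a blocked thread behind, whereas an event would arrive without polling.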


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Austin S Hemmelgarn

On 2015-04-17 12:22, Jan Kara wrote:
> On Fri 17-04-15 17:08:10, John Spray wrote:
>>
>> On 17/04/2015 16:43, Jan Kara wrote:
>>> On Fri 17-04-15 15:51:14, John Spray wrote:
>>>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>>>
>>>>> For some filesystems, it may make sense to differentiate between a
>>>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>>>> there is a csum error on a block, this will get automatically
>>>>> corrected in many configurations, and won't require anything like
>>>>> fsck to be run, but monitoring applications will still probably
>>>>> want to be notified.
>>>> Another key differentiation IMHO is between transient errors (like
>>>> server is unavailable in a distributed filesystem) that will block
>>>> the filesystem but might clear on their own, vs. permanent errors
>>>> like unreadable drives that definitely will not clear until the
>>>> administrator takes some action.  It's usually a reasonable
>>>> approximation to call transient issues warnings, and permanent
>>>> issues errors.
>>>   So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
>>> would this have? I wouldn't like the interface to be dumping ground for
>>> random crap - we have dmesg for that :).
>> In that case I'm confused -- why would ENOSPC be an appropriate use
>> of this interface if the mount being entirely blocked would be
>> inappropriate?  Isn't being unable to service any I/O a more
>> fundamental and severe thing than being up and healthy but full?
>>
>> Were you intending the interface to be exclusively for data
>> integrity issues like checksum failures, rather than more general
>> events about a mount that userspace would probably like to know
>> about?
>   Well, I'm not saying we cannot have those events for fs availability /
> unavailability. I'm just saying I'd like to see some use for that first.
> I don't want events to be added just because it's possible...
>
> For ENOSPC we have thin provisioned storage and the userspace daemon
> shuffling real storage underneath. So there I know the usecase.
>
> Honza
>
The use-case that immediately comes to mind for me would be diskless
nodes with root-on-nfs needing to know if they can actually access the
root filesystem.






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 17:08:10, John Spray wrote:
> 
> On 17/04/2015 16:43, Jan Kara wrote:
> >On Fri 17-04-15 15:51:14, John Spray wrote:
> >>On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
> >>
> >>>For some filesystems, it may make sense to differentiate between a
> >>>generic warning and an error.  For BTRFS and ZFS for example, if
> >>>there is a csum error on a block, this will get automatically
> >>>corrected in many configurations, and won't require anything like
> >>>fsck to be run, but monitoring applications will still probably
> >>>want to be notified.
> >>Another key differentiation IMHO is between transient errors (like
> >>server is unavailable in a distributed filesystem) that will block
> >>the filesystem but might clear on their own, vs. permanent errors
> >>like unreadable drives that definitely will not clear until the
> >>administrator takes some action.  It's usually a reasonable
> >>approximation to call transient issues warnings, and permanent
> >>issues errors.
> >   So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
> >would this have? I wouldn't like the interface to be dumping ground for
> >random crap - we have dmesg for that :).
> In that case I'm confused -- why would ENOSPC be an appropriate use
> of this interface if the mount being entirely blocked would be
> inappropriate?  Isn't being unable to service any I/O a more
> fundamental and severe thing than being up and healthy but full?
> 
> Were you intending the interface to be exclusively for data
> integrity issues like checksum failures, rather than more general
> events about a mount that userspace would probably like to know
> about?
  Well, I'm not saying we cannot have those events for fs availability /
unavailability. I'm just saying I'd like to see some use for that first.
I don't want events to be added just because it's possible...

For ENOSPC we have thin provisioned storage and the userspace daemon
shuffling real storage underneath. So there I know the use case.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 06:08 PM, John Spray wrote:
> 
> On 17/04/2015 16:43, Jan Kara wrote:
>> On Fri 17-04-15 15:51:14, John Spray wrote:
>>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>>
>>>> For some filesystems, it may make sense to differentiate between a
>>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>>> there is a csum error on a block, this will get automatically
>>>> corrected in many configurations, and won't require anything like
>>>> fsck to be run, but monitoring applications will still probably
>>>> want to be notified.
>>> Another key differentiation IMHO is between transient errors (like
>>> server is unavailable in a distributed filesystem) that will block
>>> the filesystem but might clear on their own, vs. permanent errors
>>> like unreadable drives that definitely will not clear until the
>>> administrator takes some action.  It's usually a reasonable
>>> approximation to call transient issues warnings, and permanent
>>> issues errors.
>>    So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
>> would this have? I wouldn't like the interface to be dumping ground for
>> random crap - we have dmesg for that :).
> In that case I'm confused -- why would ENOSPC be an appropriate use of this 
> interface if the mount being entirely blocked would be inappropriate?  Isn't 
> being unable to service any I/O a more fundamental and severe thing than 
> being up and healthy but full?
> 
> Were you intending the interface to be exclusively for data integrity issues 
> like checksum failures, rather than more general events about a mount that 
> userspace would probably like to know about?
> 
> John
> 

I think we should support both and leave the decision about what
gets reported to the particular file systems, within reason, of
course. The interface should just hand the events over to user space,
acting as a go-between. I would, though, avoid defining any
filesystem-specific events, keeping the interface as generic as possible.


BR
Beata


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray


On 17/04/2015 16:43, Jan Kara wrote:
> On Fri 17-04-15 15:51:14, John Spray wrote:
>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>
>>> For some filesystems, it may make sense to differentiate between a
>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>> there is a csum error on a block, this will get automatically
>>> corrected in many configurations, and won't require anything like
>>> fsck to be run, but monitoring applications will still probably
>>> want to be notified.
>> Another key differentiation IMHO is between transient errors (like
>> server is unavailable in a distributed filesystem) that will block
>> the filesystem but might clear on their own, vs. permanent errors
>> like unreadable drives that definitely will not clear until the
>> administrator takes some action.  It's usually a reasonable
>> approximation to call transient issues warnings, and permanent
>> issues errors.
>   So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
> would this have? I wouldn't like the interface to be dumping ground for
> random crap - we have dmesg for that :).
In that case I'm confused -- why would ENOSPC be an appropriate use of
this interface if the mount being entirely blocked would be
inappropriate?  Isn't being unable to service any I/O a more fundamental
and severe thing than being up and healthy but full?

Were you intending the interface to be exclusively for data integrity
issues like checksum failures, rather than more general events about a
mount that userspace would probably like to know about?

John


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 15:51:14, John Spray wrote:
> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
> >On 2015-04-17 09:04, Beata Michalska wrote:
> >>On 04/17/2015 01:31 PM, Jan Kara wrote:
> >>>On Wed 15-04-15 09:15:44, Beata Michalska wrote:
> >>>...
> >>>>+static const match_table_t fs_etypes = {
> >>>>+    { FS_EVENT_INFO,    "info"  },
> >>>>+    { FS_EVENT_WARN,    "warn"  },
> >>>>+    { FS_EVENT_THRESH,  "thr"   },
> >>>>+    { FS_EVENT_ERR,     "err"   },
> >>>>+    { 0, NULL },
> >>>>+};
> >>>   Why are there these generic message types? Threshold
> >>>messages make good
> >>>sense to me. But not so much the rest. If they don't have a
> >>>clear meaning,
> >>>it will be a mess. So I also agree with a message like -
> >>>"filesystem has
> >>>trouble, you should probably unmount and run fsck" - that's fine. But
> >>>generic "info" or "warning" doesn't really carry any meaning
> >>>on its own and
> >>>thus seems pretty useless to me. To explain a bit more, AFAIU this
> >>>shouldn't be a generic logging interface where something like severity
> >>>makes sense but rather a relatively specific interface notifying about
> >>>events in filesystem userspace should know about so I expect
> >>>relatively low
> >>>number of types of events, not tens or even hundreds...
> >>>
> >>>Honza
> >>
> >>Getting rid of those would simplify the configuration part, indeed.
> >>So we would be left with 'generic' and threshold events.
> >>I guess I've overdone this part.
> >
> >For some filesystems, it may make sense to differentiate between a
> >generic warning and an error.  For BTRFS and ZFS for example, if
> >there is a csum error on a block, this will get automatically
> >corrected in many configurations, and won't require anything like
> >fsck to be run, but monitoring applications will still probably
> >want to be notified.
> 
> Another key differentiation IMHO is between transient errors (like
> server is unavailable in a distributed filesystem) that will block
> the filesystem but might clear on their own, vs. permanent errors
> like unreadable drives that definitely will not clear until the
> administrator takes some action.  It's usually a reasonable
> approximation to call transient issues warnings, and permanent
> issues errors.
  So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
would this have? I wouldn't like the interface to be dumping ground for
random crap - we have dmesg for that :).

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray

On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
> On 2015-04-17 09:04, Beata Michalska wrote:
>> On 04/17/2015 01:31 PM, Jan Kara wrote:
>>> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>>> ...
>>>> +static const match_table_t fs_etypes = {
>>>> +    { FS_EVENT_INFO,    "info"  },
>>>> +    { FS_EVENT_WARN,    "warn"  },
>>>> +    { FS_EVENT_THRESH,  "thr"   },
>>>> +    { FS_EVENT_ERR,     "err"   },
>>>> +    { 0, NULL },
>>>> +};
>>>   Why are there these generic message types? Threshold messages make good
>>> sense to me. But not so much the rest. If they don't have a clear meaning,
>>> it will be a mess. So I also agree with a message like - "filesystem has
>>> trouble, you should probably unmount and run fsck" - that's fine. But
>>> generic "info" or "warning" doesn't really carry any meaning on its own and
>>> thus seems pretty useless to me. To explain a bit more, AFAIU this
>>> shouldn't be a generic logging interface where something like severity
>>> makes sense but rather a relatively specific interface notifying about
>>> events in filesystem userspace should know about so I expect relatively low
>>> number of types of events, not tens or even hundreds...
>>>
>>> Honza
>>
>> Getting rid of those would simplify the configuration part, indeed.
>> So we would be left with 'generic' and threshold events.
>> I guess I've overdone this part.
>
> For some filesystems, it may make sense to differentiate between a
> generic warning and an error.  For BTRFS and ZFS for example, if there
> is a csum error on a block, this will get automatically corrected in
> many configurations, and won't require anything like fsck to be run,
> but monitoring applications will still probably want to be notified.

Another key differentiation IMHO is between transient errors (like
server is unavailable in a distributed filesystem) that will block the
filesystem but might clear on their own, vs. permanent errors like
unreadable drives that definitely will not clear until the administrator
takes some action.  It's usually a reasonable approximation to call
transient issues warnings, and permanent issues errors.

John






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 09:23:35, Austin S Hemmelgarn wrote:
> On 2015-04-17 09:04, Beata Michalska wrote:
> >On 04/17/2015 01:31 PM, Jan Kara wrote:
> >>On Wed 15-04-15 09:15:44, Beata Michalska wrote:
> >>...
> >>>+static const match_table_t fs_etypes = {
> >>>+  { FS_EVENT_INFO,    "info"  },
> >>>+  { FS_EVENT_WARN,    "warn"  },
> >>>+  { FS_EVENT_THRESH,  "thr"   },
> >>>+  { FS_EVENT_ERR,     "err"   },
> >>>+  { 0, NULL },
> >>>+};
> >>   Why are there these generic message types? Threshold messages make good
> >>sense to me. But not so much the rest. If they don't have a clear meaning,
> >>it will be a mess. So I also agree with a message like - "filesystem has
> >>trouble, you should probably unmount and run fsck" - that's fine. But
> >>generic "info" or "warning" doesn't really carry any meaning on its own and
> >>thus seems pretty useless to me. To explain a bit more, AFAIU this
> >>shouldn't be a generic logging interface where something like severity
> >>makes sense but rather a relatively specific interface notifying about
> >>events in filesystem userspace should know about so I expect relatively low
> >>number of types of events, not tens or even hundreds...
> >>
> >>Honza
> >
> >Getting rid of those would simplify the configuration part, indeed.
> >So we would be left with 'generic' and threshold events.
> >I guess I've overdone this part.
> 
> For some filesystems, it may make sense to differentiate between a
> generic warning and an error.  For BTRFS and ZFS for example, if
> there is a csum error on a block, this will get automatically
> corrected in many configurations, and won't require anything like
> fsck to be run, but monitoring applications will still probably want
> to be notified.
   Sure, but in that case just create an event CORRECTED_CHECKSUM_ERROR and
use that. Then userspace knows what it should do with the event. No need to
hide it behind warning / error category.
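
A minimal sketch of what such specific event types could look like in place of the generic categories. Only FS_EVENT_THRESH and the "thr" token come from the patch; FS_EVENT_CSUM_CORRECTED, the "csum_corrected" token, struct fs_event_name and fs_event_lookup() are hypothetical names made up for illustration, and this is plain userspace C rather than kernel code.

```c
#include <stddef.h>
#include <string.h>

/* One identifier per concrete condition; no severity classes. */
enum fs_event_id {
	FS_EVENT_NONE = 0,
	FS_EVENT_THRESH,          /* from the patch: free-space threshold */
	FS_EVENT_CSUM_CORRECTED,  /* assumption: csum error repaired by the fs */
};

/* Hypothetical token table, modelled loosely on the fs_etypes table. */
struct fs_event_name {
	enum fs_event_id id;
	const char *token;
};

static const struct fs_event_name fs_event_names[] = {
	{ FS_EVENT_THRESH,         "thr" },
	{ FS_EVENT_CSUM_CORRECTED, "csum_corrected" },
	{ FS_EVENT_NONE,           NULL },
};

/* Map a token to an event id; FS_EVENT_NONE means "not recognised". */
enum fs_event_id fs_event_lookup(const char *token)
{
	const struct fs_event_name *e;

	if (!token)
		return FS_EVENT_NONE;
	for (e = fs_event_names; e->token; e++)
		if (strcmp(e->token, token) == 0)
			return e->id;
	return FS_EVENT_NONE;
}
```

With a table like this, a generic token such as "warn" simply does not resolve, so every event a listener can receive has one defined meaning.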

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Austin S Hemmelgarn

On 2015-04-17 09:04, Beata Michalska wrote:
> On 04/17/2015 01:31 PM, Jan Kara wrote:
>> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> ...
>>> +static const match_table_t fs_etypes = {
>>> +   { FS_EVENT_INFO,    "info"  },
>>> +   { FS_EVENT_WARN,    "warn"  },
>>> +   { FS_EVENT_THRESH,  "thr"   },
>>> +   { FS_EVENT_ERR,     "err"   },
>>> +   { 0, NULL },
>>> +};
>>   Why are there these generic message types? Threshold messages make good
>> sense to me. But not so much the rest. If they don't have a clear meaning,
>> it will be a mess. So I also agree with a message like - "filesystem has
>> trouble, you should probably unmount and run fsck" - that's fine. But
>> generic "info" or "warning" doesn't really carry any meaning on its own and
>> thus seems pretty useless to me. To explain a bit more, AFAIU this
>> shouldn't be a generic logging interface where something like severity
>> makes sense but rather a relatively specific interface notifying about
>> events in filesystem userspace should know about so I expect relatively low
>> number of types of events, not tens or even hundreds...
>>
>> Honza
>
> Getting rid of those would simplify the configuration part, indeed.
> So we would be left with 'generic' and threshold events.
> I guess I've overdone this part.
>
For some filesystems, it may make sense to differentiate between a
generic warning and an error.  For BTRFS and ZFS for example, if there
is a csum error on a block, this will get automatically corrected in
many configurations, and won't require anything like fsck to be run, but
monitoring applications will still probably want to be notified.






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 15:04:37, Beata Michalska wrote:
> On 04/17/2015 01:31 PM, Jan Kara wrote:
> > On Wed 15-04-15 09:15:44, Beata Michalska wrote:
> > Also I think that we should make it clear that each event type has
> > different set of arguments. For threshold events they'll be L1 & L2, for
> > other events there may be no arguments, for other events maybe something
> > else...
> > 
> 
> Currently only the threshold events use arguments -  not sure what arguments
> could be used for the remaining notifications. But any suggestions are 
> welcomed.
  Me neither, but someone will surely find something in the future ;)

> > ...
> >> +static const match_table_t fs_etypes = {
> >> +  { FS_EVENT_INFO,    "info"  },
> >> +  { FS_EVENT_WARN,    "warn"  },
> >> +  { FS_EVENT_THRESH,  "thr"   },
> >> +  { FS_EVENT_ERR,     "err"   },
> >> +  { 0, NULL },
> >> +};
> >   Why are there these generic message types? Threshold messages make good
> > sense to me. But not so much the rest. If they don't have a clear meaning,
> > it will be a mess. So I also agree with a message like - "filesystem has
> > trouble, you should probably unmount and run fsck" - that's fine. But
> > generic "info" or "warning" doesn't really carry any meaning on its own and
> > thus seems pretty useless to me. To explain a bit more, AFAIU this
> > shouldn't be a generic logging interface where something like severity
> > makes sense but rather a relatively specific interface notifying about
> > events in filesystem userspace should know about so I expect relatively low
> > number of types of events, not tens or even hundreds...
> > 
> 
> Getting rid of those would simplify the configuration part, indeed.
> So we would be left with 'generic' and threshold events.
> I guess I've overdone this part.
  Well, I would avoid defining anything that's not really used. So
currently you can define threshold events and we start with just those.
When someone hooks up filesystem error paths to send notifications, we can
create an event type for reporting "filesystem corrupted". And so on... We
just have to be careful to document that new event types can be added and
that userspace has to ignore events it does not understand.
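
A rough sketch of the "ignore what you don't understand" rule on the userspace side. The numeric codes, the EV_* names and handle_fs_event() are assumptions for illustration only; the patch does not define them, and real code would take the event code from a received netlink message.

```c
#include <stdio.h>

/* Hypothetical event codes; only the threshold event exists in the patch. */
#define EV_THRESH        1
#define EV_FS_CORRUPTED  2  /* assumption: might be added later */

/* Return 1 if the event was handled, 0 if it was unknown and skipped. */
int handle_fs_event(unsigned int code, const char *dev)
{
	switch (code) {
	case EV_THRESH:
		printf("%s: free space crossed a threshold\n", dev);
		return 1;
	case EV_FS_CORRUPTED:
		printf("%s: filesystem corrupted, fsck needed\n", dev);
		return 1;
	default:
		/* A newer kernel may send types we do not know: skip, don't fail. */
		return 0;
	}
}
```

The point is the default branch: an unknown code is neither an error nor a reason to drop the connection, so new event types can be added without breaking existing listeners.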

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 03:04 PM, Beata Michalska wrote:
> On 04/17/2015 01:31 PM, Jan Kara wrote:
>> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>>> Introduce configurable generic interface for file
>>> system-wide event notifications to provide file
>>> systems with a common way of reporting any potential
>>> issues as they emerge.
>>>
>>> The notifications are to be issued through generic
>>> netlink interface, by a dedicated, for file system
>>> events, multicast group. The file systems might as
>>> well use this group to send their own custom messages.
>>>
>>> The events have been split into four base categories:
>>> information, warnings, errors and threshold notifications,
>>> with some very basic event types like running out of space
>>> or file system being remounted as read-only.
>>>
>>> Threshold notifications have been included to allow
>>> triggering an event whenever the amount of free space
>>> drops below a certain level - or levels to be more precise
>>> as two of them are being supported: the lower and the upper
>>> range. The notifications work both ways: once the threshold
>>> level has been reached, an event shall be generated whenever
>>> the number of available blocks goes up again re-activating
>>> the threshold.
>>>
>>> The interface has been exposed through a vfs. Once mounted,
>>> it serves as an entry point for the set-up where one can
>>> register for particular file system events.
>>>
>>> Signed-off-by: Beata Michalska 
>>   Thanks for the patches! Some comments are below.
>>
>>> ---
>>>  Documentation/filesystems/events.txt |  254 +++
>>>  fs/Makefile  |1 +
>>>  fs/events/Makefile   |6 +
>>>  fs/events/fs_event.c |  775 ++
>>>  fs/events/fs_event.h |   27 ++
>>>  fs/events/fs_event_netlink.c |   94 +
>>>  fs/namespace.c   |1 +
>>>  include/linux/fs.h   |6 +-
>>>  include/linux/fs_event.h |   69 +++
>>>  include/uapi/linux/fs_event.h|   62 +++
>>>  include/uapi/linux/genetlink.h   |1 +
>>>  net/netlink/genetlink.c  |7 +-
>>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>>  create mode 100644 Documentation/filesystems/events.txt
>>>  create mode 100644 fs/events/Makefile
>>>  create mode 100644 fs/events/fs_event.c
>>>  create mode 100644 fs/events/fs_event.h
>>>  create mode 100644 fs/events/fs_event_netlink.c
>>>  create mode 100644 include/linux/fs_event.h
>>>  create mode 100644 include/uapi/linux/fs_event.h
>>>
>>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>>> new file mode 100644
>>> index 000..c85dd88
>>> --- /dev/null
>>> +++ b/Documentation/filesystems/events.txt
>>> @@ -0,0 +1,254 @@
>>> +
>>> +   Generic file system event notification interface
>>> +
>>> +Document created 09 April 2015 by Beata Michalska 
>>> +
>>> +1. The reason behind:
>>> +=
>>> +
>>> +There are many corner cases when things might get messy with the filesystems.
>>> +And it is not always obvious what and when went wrong. Sometimes you might
>>> +get some subtle hints that there is something going on - but by the time
>>> +you realise it, it might be too late as you are already out-of-space
>>> +or the filesystem has been remounted as read-only, for example. The generic
>>> +interface for the filesystem events fills the gap by providing a rather
>>> +easy way of real-time notifications triggered whenever something interesting
>>> +happens, allowing filesystems to report events in a common way, as they occur.
>>> +
>>> +2. How does it work:
>>> +
>>> +
>>> +The interface itself has been exposed as fstrace-type Virtual File System,
>>> +primarily to ease the process of setting up the configuration for the file
>>> +system notifications. So for starters it needs to get mounted (obviously):
>>> +
>>> +   mount -t fstrace none /sys/fs/events
>>> +
>>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>>> +through which the notification are being set-up.
>>> +
>>> +Activating notifications for particular filesystem is as straightforward
>>> +as writing into the 'config' file. Note that by default all events despite
>>> +the actual filesystem type are being disregarded.
>>   Is there a reason to have a special filesystem for this? Do you expect
>> extending it by (many) more files? Why not just creating a file in sysfs or
>> something like that?
> 
> No particular reason here - just for possible future extension if needed.
> I'm totally fine with having a single sysfs entry.
> 

On the other hand, sysfs entries are mostly single-valued or are sets
of values of a single type, so I'm not sure the current configuration
format for the interface would fit there.

>>
>>> +Synopsis of config:
>>> +--
>>> +
>>> +   MOUNT EVENT_TYPE 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 01:31 PM, Jan Kara wrote:
> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska 
>   Thanks for the patches! Some comments are below.
> 
>> ---
>>  Documentation/filesystems/events.txt |  254 +++
>>  fs/Makefile  |1 +
>>  fs/events/Makefile   |6 +
>>  fs/events/fs_event.c |  775 ++
>>  fs/events/fs_event.h |   27 ++
>>  fs/events/fs_event_netlink.c |   94 +
>>  fs/namespace.c   |1 +
>>  include/linux/fs.h   |6 +-
>>  include/linux/fs_event.h |   69 +++
>>  include/uapi/linux/fs_event.h|   62 +++
>>  include/uapi/linux/genetlink.h   |1 +
>>  net/netlink/genetlink.c  |7 +-
>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/filesystems/events.txt
>>  create mode 100644 fs/events/Makefile
>>  create mode 100644 fs/events/fs_event.c
>>  create mode 100644 fs/events/fs_event.h
>>  create mode 100644 fs/events/fs_event_netlink.c
>>  create mode 100644 include/linux/fs_event.h
>>  create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska 
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only, for example. The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something interesting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>   Is there a reason to have a special filesystem for this? Do you expect
> extending it by (many) more files? Why not just creating a file in sysfs or
> something like that?

No particular reason here - just for possible future extension if needed.
I'm totally fine with having a single sysfs entry.

> 
>> +Synopsis of config:
>> +--
>> +
>> +MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>   I'm not quite decided but is mountpoint really the right thing to pass
> via the interface? They aren't unique (filesystem can be mounted in
> multiple places) and more importantly can change over time. So won't it be
> better to pass major:minor over the interface? These are stable, 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Thu 16-04-15 23:56:11, Heinrich Schuchardt wrote:
> On 15.04.2015 09:15, Beata Michalska wrote:
> > Introduce configurable generic interface for file
> > system-wide event notifications to provide file
> > systems with a common way of reporting any potential
> > issues as they emerge.
> > 
> > The notifications are to be issued through generic
> > netlink interface, by a dedicated, for file system
> > events, multicast group. The file systems might as
> > well use this group to send their own custom messages.
> > 
> > The events have been split into four base categories:
> > information, warnings, errors and threshold notifications,
> > with some very basic event types like running out of space
> > or file system being remounted as read-only.
> > 
> > Threshold notifications have been included to allow
> > triggering an event whenever the amount of free space
> > drops below a certain level - or levels to be more precise
> > as two of them are being supported: the lower and the upper
> > range. The notifications work both ways: once the threshold
> > level has been reached, an event shall be generated whenever
> > the number of available blocks goes up again re-activating
> > the threshold.
> > 
> > The interface has been exposed through a vfs. Once mounted,
> > it serves as an entry point for the set-up where one can
> > register for particular file system events.
> 
> Having a framework for notification for file systems is a great idea.
> Your solution covers an important part of the possible application scope.
> 
> Before moving forward I suggest we should analyze if this scope should
> be enlarged.
> 
> Many filesystems are remote (e.g. CIFS/Samba) or distributed over many
> network nodes (e.g. Lustre). How should file system notification work here?
  IMO server <-> client notification is fully within the responsibility of
a particular protocol. The client can then translate the notification via
this interface just fine. So IMHO there's nothing to do in this regard.

> How will fuse file systems be served?
  A similar answer as previously. It's the responsibility of each filesystem to
provide the notification. You would need some way for userspace to notify
the FUSE in kernel which can then relay the information via this interface.
So doable but I don't think we have to do it now...

> The current point of reference is a single mount point.
> Every time I insert an USB stick several file system may be automounted.
> I would like to receive events for these automounted file systems.
  So you'll receive udev / DBus events for the mounts, you can catch these
in a userspace daemon and add appropriate rules to receive events (you
could even make it part of the mounting procedure of your desktop). I don't
think we should magically insert new rules for mounted filesystems since
that's a decision that belongs to userspace.

> A similar case arises when starting new virtual machines. How will I
> receive events on the host system for the file systems of the virtual
> machines?
  IMHO that belongs in userspace and is out of scope for this proposal.

> In your implementation events are received via Netlink.
> Using Netlink for marking mounts for notification would create a much
> more homogenous interface. So why should we use a virtual file system here?
  Hum, that's an interesting idea. Yes, e.g. networking uses netlink to
configure e.g. routing in kernel and in case of this interface, it really
might make the interface nicer.

Honza
-- 
Jan Kara <j...@suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Wed 15-04-15 09:15:44, Beata Michalska wrote:
> Introduce configurable generic interface for file
> system-wide event notifications to provide file
> systems with a common way of reporting any potential
> issues as they emerge.
> 
> The notifications are to be issued through generic
> netlink interface, by a dedicated, for file system
> events, multicast group. The file systems might as
> well use this group to send their own custom messages.
> 
> The events have been split into four base categories:
> information, warnings, errors and threshold notifications,
> with some very basic event types like running out of space
> or file system being remounted as read-only.
> 
> Threshold notifications have been included to allow
> triggering an event whenever the amount of free space
> drops below a certain level - or levels to be more precise
> as two of them are being supported: the lower and the upper
> range. The notifications work both ways: once the threshold
> level has been reached, an event shall be generated whenever
> the number of available blocks goes up again re-activating
> the threshold.
> 
> The interface has been exposed through a vfs. Once mounted,
> it serves as an entry point for the set-up where one can
> register for particular file system events.
> 
> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
  Thanks for the patches! Some comments are below.

> ---
>  Documentation/filesystems/events.txt |  254 +++
>  fs/Makefile                          |    1 +
>  fs/events/Makefile                   |    6 +
>  fs/events/fs_event.c                 |  775 ++
>  fs/events/fs_event.h                 |   27 ++
>  fs/events/fs_event_netlink.c         |   94 +
>  fs/namespace.c                       |    1 +
>  include/linux/fs.h                   |    6 +-
>  include/linux/fs_event.h             |   69 +++
>  include/uapi/linux/fs_event.h        |   62 +++
>  include/uapi/linux/genetlink.h       |    1 +
>  net/netlink/genetlink.c              |    7 +-
>  12 files changed, 1301 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/filesystems/events.txt
>  create mode 100644 fs/events/Makefile
>  create mode 100644 fs/events/fs_event.c
>  create mode 100644 fs/events/fs_event.h
>  create mode 100644 fs/events/fs_event_netlink.c
>  create mode 100644 include/linux/fs_event.h
>  create mode 100644 include/uapi/linux/fs_event.h
> 
> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
> new file mode 100644
> index 000..c85dd88
> --- /dev/null
> +++ b/Documentation/filesystems/events.txt
> @@ -0,0 +1,254 @@
> +
> + Generic file system event notification interface
> +
> +Document created 09 April 2015 by Beata Michalska 
> +
> +1. The reason behind:
> +=
> +
> +There are many corner cases when things might get messy with the filesystems.
> +And it is not always obvious what and when went wrong. Sometimes you might
> +get some subtle hints that there is something going on - but by the time
> +you realise it, it might be too late as you are already out-of-space
> +or the filesystem has been remounted as read-only (i.e.). The generic
> +interface for the filesystem events fills the gap by providing a rather
> +easy way of real-time notifications triggered whenever something intreseting
> +happens, allowing filesystems to report events in a common way, as they occur.
> +
> +2. How does it work:
> +
> +
> +The interface itself has been exposed as fstrace-type Virtual File System,
> +primarily to ease the process of setting up the configuration for the file
> +system notifications. So for starters it needs to get mounted (obviously):
> +
> + mount -t fstrace none /sys/fs/events
> +
> +This will unveil the single fstrace filesystem entry - the 'config' file,
> +through which the notification are being set-up.
> +
> +Activating notifications for particular filesystem is as straightforward
> +as writing into the 'config' file. Note that by default all events despite
> +the actual filesystem type are being disregarded.
  Is there a reason to have a special filesystem for this? Do you expect
extending it by (many) more files? Why not just creating a file in sysfs or
something like that?

> +Synopsis of config:
> +--
> +
> + MOUNT EVENT_TYPE [L1] [L2]
> +
> + MOUNT  : the filesystem's mount point
  I'm not quite decided but is mountpoint really the right thing to pass
via the interface? They aren't unique (filesystem can be mounted in
multiple places) and more importantly can change over time. So won't it be
better to pass major:minor over the interface? These are stable, unique to
the filesystem, and userspace can easily get them by calling stat(2) on the
desired path (or directly from /proc/self/mountinfo). That could be also
used as an fs identifier instead of assigned ID (and thus we won't need
those events about creation of new trace which look 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska

Hi,

On 04/16/2015 11:56 PM, Heinrich Schuchardt wrote:
> On 15.04.2015 09:15, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
> 
> Having a framework for notification for file systems is a great idea.
> Your solution covers an important part of the possible application scope.
> 
> Before moving forward I suggest we should analyze if this scope should
> be enlarged.
> 
> Many filesystems are remote (e.g. CIFS/Samba) or distributed over many
> network nodes (e.g. Lustre). How should file system notification work here?
> 
> How will fuse file systems be served?
> 
> The current point of reference is a single mount point.
> Every time I insert an USB stick several file system may be automounted.
> I would like to receive events for these automounted file systems.
> 
> A similar case arises when starting new virtual machines. How will I
> receive events on the host system for the file systems of the virtual
> machines?

> In your implementation events are received via Netlink.
> Using Netlink for marking mounts for notification would create a much
> more homogenous interface. So why should we use a virtual file system here?
> 
> Best regards
> 
> Heinrich Schuchardt
> 
> 

I'd be more than happy to extend the scope of suggested changes.
I hope I'll be able to collect more comments - in this way there 
is a chance we might get here smth that is really useful, for everyone.

I've tried to make the interface rather flexible, so that new cases
can be easily added - so the notification whenever a file system
is being mounted is definitely doable.

The vfs here merely serves the purpose to configure which type of events
and for which filesystems are to be issued. Having this done through
netlink is also an option, though it needs some more thoughts. The way
notifications are being sent might be extended: so there could be more
than one option for this. We might also want to consider if we want to
have this widely available - everything for everyone. (?)

As for the rest, I must admit I'm not really an fs person, so I assume
there will be more comments and questions like yours. This is also why
any comments/hints/remarks/doubts/issues etc. would be more than just
welcome. I'll try to answer them all, though this will require some
time on my side, thus apologies if I have some delays.


I'll get back to this asap.

BR
Beata






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
Hi,

On 04/16/2015 10:10 PM, Hugh Dickins wrote:
> On Thu, 16 Apr 2015, Beata Michalska wrote:
>> On 04/16/2015 05:46 AM, Eric Sandeen wrote:
>>> On 4/15/15 2:15 AM, Beata Michalska wrote:
>>>> Introduce configurable generic interface for file
>>>> system-wide event notifications to provide file
>>>> systems with a common way of reporting any potential
>>>> issues as they emerge.
>>>>
>>>> The notifications are to be issued through generic
>>>> netlink interface, by a dedicated, for file system
>>>> events, multicast group. The file systems might as
>>>> well use this group to send their own custom messages.
>>>
>>> ...
>>>
>>>> + 4.3 Threshold notifications:
>>>> +
>>>> + #include <linux/fs_event.h>
>>>> + void fs_event_alloc_space(struct super_block *sb, u64 ncount);
>>>> + void fs_event_free_space(struct super_block *sb, u64 ncount);
>>>> +
>>>> + Each filesystme supporting the treshold notifiactions should call
>>>> + fs_event_alloc_space/fs_event_free_space repsectively whenever the
>>>> + ammount of availbale blocks changes.
>>>> + - sb: the filesystem's super block
>>>> + - ncount: number of blocks being acquired/released
>>>
>>> so:
>>>
>>>> +void fs_event_alloc_space(struct super_block *sb, u64 ncount)
>>>> +{
>>>> +  struct fs_trace_entry *en;
>>>> +  s64 count;
>>>> +
>>>> +  spin_lock(&fs_trace_lock);
>>>
>>> Every allocation/free for every supported filesystem system-wide will be
>>> serialized on this global spinlock?  That sounds like a non-starter...
>>>
>>> -Eric
>>>
>> I guess there is plenty of room for improvement as this is an early version.
>> I do agree that this might be a performance bottleneck even though I've tried
>> to keep this to a minimum - it's being taken only for the hashtable look-up.
>> But still...
>> I was considering placing the trace object within the super_block to skip
>> this look-up part but I'd like to gather more comments, especially on the
>> concept itself.
> 
> Sorry, I have no opinion on the netlink fs notifications concept
> itself, not my area of expertise at all.
> 
> No doubt you Cc'ed me for tmpfs: I am very glad you're now trying the
> generic filesystem route, and yes, I'd be happy to have the support
> in tmpfs, thank you - if it is generally agreed to be suitable for
> filesystems; but wouldn't want this as a special for tmpfs.
> 
> However, I must echo Eric's point: please take a look at 7e496299d4d2
> "tmpfs: make tmpfs scalable with percpu_counter for used blocks":
> Tim would be unhappy if you added overhead back into that path.
> 
> (And please Cc linux-fsde...@vger.kernel.org next time you post these.)
> 
> Hugh
> 

Well, the concept of using the netlink interface here is just a part of the
overall idea - so any comments are really welcome here. The more of them, the
better the solution that can be worked out, I believe.

As for the possible overhead: this is the last thing I would want, so I'll
definitely do my best not to introduce any. I will definitely rework this.
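
To make the scalability concern concrete, here is a minimal userspace sketch of threshold tracking that keeps one counter per filesystem instead of serializing every allocation on a global lock. It is not taken from the patch: it assumes C11 atomics as a stand-in for a kernel primitive such as percpu_counter, and `struct fs_space_trace`, `fs_trace_alloc` and `fs_trace_free` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-filesystem state; in the thread's terms this would
 * live in (or hang off) the super_block rather than a global hashtable. */
struct fs_space_trace {
	_Atomic int64_t avail;	/* currently available blocks */
	int64_t lower;		/* threshold, in blocks */
};

/* Account for ncount blocks being allocated.
 * Returns 1 if this call moved avail from above the threshold to at or
 * below it, 0 otherwise. */
static int fs_trace_alloc(struct fs_space_trace *t, int64_t ncount)
{
	assert(ncount >= 0);
	int64_t before = atomic_fetch_sub(&t->avail, ncount);
	int64_t after = before - ncount;

	return before > t->lower && after <= t->lower;
}

/* Account for ncount blocks being freed.
 * Returns 1 if this call moved avail from at or below the threshold to
 * above it (re-arming the threshold), 0 otherwise. */
static int fs_trace_free(struct fs_space_trace *t, int64_t ncount)
{
	assert(ncount >= 0);
	int64_t before = atomic_fetch_add(&t->avail, ncount);
	int64_t after = before + ncount;

	return before <= t->lower && after > t->lower;
}
```

Each atomic read-modify-write returns a distinct previous value, so exactly one caller observes a given crossing and no lock shared between filesystems is needed. A per-CPU counter goes further by avoiding the shared cache line as well, at the cost of approximate reads.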

Thanks for Your comments,

BR
Beata




Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Thu 16-04-15 10:22:45, Beata Michalska wrote:
> On 04/15/2015 09:25 PM, Darrick J. Wong wrote:
> > On Wed, Apr 15, 2015 at 09:15:44AM +0200, Beata Michalska wrote:
> > 
> >> +#define FS_THRESH_LR_REACHED  0x0020  /* The lower range of threshold has been reached */
> >> +#define FS_THRESH_UR_REACHED  0x0040  /* The upper range of threshold has been reached */
> >> +#define FS_ERR_UNKNOWN        0x0080  /* Unknown error */
> >> +#define FS_ERR_RO_REMOUT      0x0100  /* The file system has been remounted as red-only */
> > 
> > _REMOUNT... read-only...
> > 
> >> +#define FS_ERR_ITERNAL        0x0200  /* File system's internal error */
> > 
> > _INTERNAL...
> > 
> > What does FS_ERR_ITERNAL mean?  "programming error"?
> > 
> FS_ERR_ITERNAL is supposed to mean smth that cannot be easily translated
> into generic event code - so smth that is specific for given file system type.
> 
> 
> > How about a separate FS_ERR_CORRUPTED to mean "go run fsck"?
> 
> Sounds like a good idea.
> 
> > 
> > Hmm, these are bit flags... it doesn't make sense that I can send things like
> > FS_INFO_UMOUNT | FS_ERR_RO_REMOUT.
> > 
> 
> You can but you shouldn't. Possibly some sanity checks could be added
> for such cases. I was thinking of possibly merging events for the same
> file system and sending them in one go - so a single message could contain
> multiple events. Though this requires some more thoughts.
  Well, I don't think merging events makes much sense. I don't expect so
many messages to go over this interface that merging would be necessary
to get good performance. And when you merge events, you lose information
about the order - like was it below_limit_info and then above_limit_warn or
the other way around? Also events might carry other data with them, in which
case merging is impossible anyway.

So I'd vote for just not allowing merging and making the message type a simple
enum.
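
As a rough illustration of the enum suggestion (a sketch only; none of these names or values come from the patch), the event type could be a plain enumeration so that every message carries exactly one code:

```c
#include <assert.h>

/* Hypothetical event codes: sequential values, one per message,
 * in place of the FS_* bit flags quoted above. */
enum fs_event_type {
	FS_EVENT_INFO_UMOUNT = 1,
	FS_EVENT_THRESH_LR_REACHED,
	FS_EVENT_THRESH_UR_REACHED,
	FS_EVENT_ERR_RO_REMOUNT,
	FS_EVENT_ERR_CORRUPTED,
	FS_EVENT_TYPE_MAX	/* one past the last valid code */
};

/* Returns 1 if type is one of the known event codes, 0 otherwise. */
static int fs_event_type_valid(unsigned int type)
{
	return type >= FS_EVENT_INFO_UMOUNT && type < FS_EVENT_TYPE_MAX;
}

/* Maps a valid code to a short label; the caller must validate first. */
static const char *fs_event_type_name(unsigned int type)
{
	static const char *const names[FS_EVENT_TYPE_MAX] = {
		[FS_EVENT_INFO_UMOUNT]       = "umount",
		[FS_EVENT_THRESH_LR_REACHED] = "threshold-lower",
		[FS_EVENT_THRESH_UR_REACHED] = "threshold-upper",
		[FS_EVENT_ERR_RO_REMOUNT]    = "remount-ro",
		[FS_EVENT_ERR_CORRUPTED]     = "corrupted",
	};

	assert(fs_event_type_valid(type));
	return names[type];
}
```

With sequential codes there is no defined way to express a combination such as FS_INFO_UMOUNT | FS_ERR_RO_REMOUT, so each event has to go out as its own message and the order of events is kept.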

Honza



Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 01:31 PM, Jan Kara wrote:
> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
>   Thanks for the patches! Some comments are below.
>
>> ---
>>  Documentation/filesystems/events.txt |  254 +++
>>  fs/Makefile                          |    1 +
>>  fs/events/Makefile                   |    6 +
>>  fs/events/fs_event.c                 |  775 ++
>>  fs/events/fs_event.h                 |   27 ++
>>  fs/events/fs_event_netlink.c         |   94 +
>>  fs/namespace.c                       |    1 +
>>  include/linux/fs.h                   |    6 +-
>>  include/linux/fs_event.h             |   69 +++
>>  include/uapi/linux/fs_event.h        |   62 +++
>>  include/uapi/linux/genetlink.h       |    1 +
>>  net/netlink/genetlink.c              |    7 +-
>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/filesystems/events.txt
>>  create mode 100644 fs/events/Makefile
>>  create mode 100644 fs/events/fs_event.c
>>  create mode 100644 fs/events/fs_event.h
>>  create mode 100644 fs/events/fs_event_netlink.c
>>  create mode 100644 include/linux/fs_event.h
>>  create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something intreseting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> + mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>   Is there a reason to have a special filesystem for this? Do you expect
> extending it by (many) more files? Why not just creating a file in sysfs or
> something like that?

No particular reason here - just for possible future extension if needed.
I'm totally fine with having a single sysfs entry.

>
>> +Synopsis of config:
>> +--
>> +
>> + MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>   I'm not quite decided but is mountpoint really the right thing to pass
> via the interface? They aren't unique (filesystem can be mounted in
> multiple places) and more importantly can change over time. So won't it be
> better to pass major:minor over the interface? These are stable, unique to
> the filesystem, and userspace can easily get them by calling stat(2) on the
> desired path (or directly from /proc/self/mountinfo). That could be also
> used 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Thu 16-04-15 23:56:11, Heinrich Schuchardt wrote:
 On 15.04.2015 09:15, Beata Michalska wrote:
  Introduce configurable generic interface for file
  system-wide event notifications to provide file
  systems with a common way of reporting any potential
  issues as they emerge.
  
  The notifications are to be issued through generic
  netlink interface, by a dedicated, for file system
  events, multicast group. The file systems might as
  well use this group to send their own custom messages.
  
  The events have been split into four base categories:
  information, warnings, errors and threshold notifications,
  with some very basic event types like running out of space
  or file system being remounted as read-only.
  
  Threshold notifications have been included to allow
  triggering an event whenever the amount of free space
  drops below a certain level - or levels to be more precise
  as two of them are being supported: the lower and the upper
  range. The notifications work both ways: once the threshold
  level has been reached, an event shall be generated whenever
  the number of available blocks goes up again re-activating
  the threshold.
  
  The interface has been exposed through a vfs. Once mounted,
  it serves as an entry point for the set-up where one can
  register for particular file system events.
 
 Having a framework for notification for file systems is a great idea.
 Your solution covers an important part of the possible application scope.
 
 Before moving forward I suggest we should analyze if this scope should
 be enlarged.
 
 Many filesystems are remote (e.g. CIFS/Samba) or distributed over many
 network nodes (e.g. Lustre). How should file system notification work here?
  IMO server - client notification is fully within the responsibility of
a particular protocol. The client can then translate the notification via
this interface just fine. So IMHO there's nothing to do in this regard.

 How will fuse file systems be served?
  A similar answer as before: it's the responsibility of each filesystem to
provide the notification. You would need some way for userspace to notify
the in-kernel FUSE code, which can then relay the information via this
interface. So it's doable, but I don't think we have to do it now...

 The current point of reference is a single mount point.
 Every time I insert a USB stick, several file systems may be automounted.
 I would like to receive events for these automounted file systems.
  So you'll receive udev / DBus events for the mounts, you can catch these
in a userspace daemon and add appropriate rules to receive events (you
could even make it part of the mounting procedure of your desktop). I don't
think we should magically insert new rules for mounted filesystems since
that's a decision that belongs to userspace.
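
A daemon that reacts to such a mount event still has to identify the new filesystem before it can add a rule for it. One possible way, sketched below, is to read the matching line from /proc/self/mountinfo, whose leading fields are mount ID, parent ID, major:minor, root and mount point (see proc(5)). The helper name parse_mountinfo_line is hypothetical, and the sketch only looks at those leading fields.

```c
#include <stdio.h>

/*
 * Parse the leading fields of one /proc/self/mountinfo line:
 *   mount-ID parent-ID major:minor root mount-point ...
 * Returns 0 on success, -1 if the line does not match.
 * `mnt` must have room for at least 256 bytes.
 */
int parse_mountinfo_line(const char *line, unsigned int *maj_out,
			 unsigned int *min_out, char *mnt)
{
	int id, parent;
	char root[256];

	if (sscanf(line, "%d %d %u:%u %255s %255s",
		   &id, &parent, maj_out, min_out, root, mnt) != 6)
		return -1;
	return 0;
}
```

The daemon would call this for each line it reads and compare the mount point (or the device numbers) with what the udev / DBus event reported.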

 A similar case arises when starting new virtual machines. How will I
 receive events on the host system for the file systems of the virtual
 machines?
  IMHO that belongs in userspace and is out of scope for this proposal.

 In your implementation events are received via Netlink.
 Using Netlink for marking mounts for notification would create a much
 more homogenous interface. So why should we use a virtual file system here?
  Hm, that's an interesting idea. Yes, networking for example uses netlink
to configure routing in the kernel, and in the case of this interface it
really might make things nicer.

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 03:04 PM, Beata Michalska wrote:
 On 04/17/2015 01:31 PM, Jan Kara wrote:
 On Wed 15-04-15 09:15:44, Beata Michalska wrote:
 Introduce configurable generic interface for file
 system-wide event notifications to provide file
 systems with a common way of reporting any potential
 issues as they emerge.

 The notifications are to be issued through generic
 netlink interface, by a dedicated, for file system
 events, multicast group. The file systems might as
 well use this group to send their own custom messages.

 The events have been split into four base categories:
 information, warnings, errors and threshold notifications,
 with some very basic event types like running out of space
 or file system being remounted as read-only.

 Threshold notifications have been included to allow
 triggering an event whenever the amount of free space
 drops below a certain level - or levels to be more precise
 as two of them are being supported: the lower and the upper
 range. The notifications work both ways: once the threshold
 level has been reached, an event shall be generated whenever
 the number of available blocks goes up again re-activating
 the threshold.

 The interface has been exposed through a vfs. Once mounted,
 it serves as an entry point for the set-up where one can
 register for particular file system events.

 Signed-off-by: Beata Michalska b.michal...@samsung.com
   Thanks for the patches! Some comments are below.

 ---
  Documentation/filesystems/events.txt |  254 +++
  fs/Makefile                          |    1 +
  fs/events/Makefile                   |    6 +
  fs/events/fs_event.c                 |  775 ++
  fs/events/fs_event.h                 |   27 ++
  fs/events/fs_event_netlink.c         |   94 +
  fs/namespace.c                       |    1 +
  include/linux/fs.h                   |    6 +-
  include/linux/fs_event.h             |   69 +++
  include/uapi/linux/fs_event.h        |   62 +++
  include/uapi/linux/genetlink.h       |    1 +
  net/netlink/genetlink.c              |    7 +-
  12 files changed, 1301 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/filesystems/events.txt
  create mode 100644 fs/events/Makefile
  create mode 100644 fs/events/fs_event.c
  create mode 100644 fs/events/fs_event.h
  create mode 100644 fs/events/fs_event_netlink.c
  create mode 100644 include/linux/fs_event.h
  create mode 100644 include/uapi/linux/fs_event.h

 diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
 new file mode 100644
 index 000..c85dd88
 --- /dev/null
 +++ b/Documentation/filesystems/events.txt
 @@ -0,0 +1,254 @@
 +
 +   Generic file system event notification interface
 +
 +Document created 09 April 2015 by Beata Michalska b.michal...@samsung.com
 +
 +1. The reason behind:
 +=
 +
 +There are many corner cases when things might get messy with the filesystems.
 +And it is not always obvious what went wrong and when. Sometimes you might
 +get some subtle hints that there is something going on - but by the time
 +you realise it, it might be too late as you are already out-of-space
 +or the filesystem has been remounted as read-only (e.g.). The generic
 +interface for the filesystem events fills the gap by providing a rather
 +easy way of real-time notifications triggered whenever something interesting
 +happens, allowing filesystems to report events in a common way, as they occur.
 +
 +2. How does it work:
 +
 +
 +The interface itself has been exposed as fstrace-type Virtual File System,
 +primarily to ease the process of setting up the configuration for the file
 +system notifications. So for starters it needs to get mounted (obviously):
 +
 +   mount -t fstrace none /sys/fs/events
 +
 +This will unveil the single fstrace filesystem entry - the 'config' file,
 +through which the notification are being set-up.
 +
 +Activating notifications for particular filesystem is as straightforward
 +as writing into the 'config' file. Note that by default all events despite
 +the actual filesystem type are being disregarded.
   Is there a reason to have a special filesystem for this? Do you expect
 to extend it with (many) more files? Why not just create a file in sysfs or
 something like that?
 
 No particular reason here - just for possible future extension if needed.
 I'm totally fine with having a single sysfs entry.
 

On the other hand, sysfs entries are mostly single-valued or are sets
of values of a single type, so I'm not sure we would fit in there
with the current configuration for the interface.
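
To make the mismatch concrete: a rule in the patch's documented format, MOUNT EVENT_TYPE [L1] [L2], mixes a path, a keyword and up to two numbers in one write. A minimal sketch of composing such a line in userspace follows. The helper name format_fs_event_rule is hypothetical, the levels are treated as opaque numbers, and "thr" is the threshold keyword as it appears in the fs_etypes table quoted elsewhere in this thread.

```c
#include <stdio.h>

/*
 * Compose one rule in the form the patch documents:
 *   MOUNT EVENT_TYPE [L1] [L2]
 * Pass n_levels = 0, 1 or 2 to emit no level, L1 only, or both levels.
 * Returns the rule length, or -1 on bad input or if buf is too small.
 * Hypothetical helper, not part of the patch.
 */
int format_fs_event_rule(char *buf, size_t size, const char *mount,
			 const char *type, int n_levels,
			 unsigned long l1, unsigned long l2)
{
	int n;

	switch (n_levels) {
	case 0:
		n = snprintf(buf, size, "%s %s", mount, type);
		break;
	case 1:
		n = snprintf(buf, size, "%s %s %lu", mount, type, l1);
		break;
	case 2:
		n = snprintf(buf, size, "%s %s %lu %lu", mount, type, l1, l2);
		break;
	default:
		return -1;
	}
	/* snprintf reports the untruncated length; reject truncation. */
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting string would then be written to the 'config' file (or whatever entry point the interface ends up with).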


 +Synopsis of config:
 +--
 +
 +   MOUNT EVENT_TYPE [L1] [L2]
 +
 + MOUNT  : the filesystem's mount point
   I'm not quite decided but is mountpoint really the right thing to pass
 via the interface? They aren't unique (filesystem can be mounted in
 multiple places) and more importantly can change over time. So won't 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 09:23:35, Austin S Hemmelgarn wrote:
 On 2015-04-17 09:04, Beata Michalska wrote:
 On 04/17/2015 01:31 PM, Jan Kara wrote:
 On Wed 15-04-15 09:15:44, Beata Michalska wrote:
 ...
 +static const match_table_t fs_etypes = {
 +  { FS_EVENT_INFO,   "info" },
 +  { FS_EVENT_WARN,   "warn" },
 +  { FS_EVENT_THRESH, "thr"  },
 +  { FS_EVENT_ERR,    "err"  },
 +  { 0, NULL },
 +};
Why are there these generic message types? Threshold messages make good
 sense to me. But not so much the rest. If they don't have a clear meaning,
 it will be a mess. So I also agree with a message like - filesystem has
 trouble, you should probably unmount and run fsck - that's fine. But
 generic info or warning doesn't really carry any meaning on its own and
 thus seems pretty useless to me. To explain a bit more, AFAIU this
 shouldn't be a generic logging interface where something like severity
 makes sense but rather a relatively specific interface notifying about
 events in filesystem userspace should know about so I expect relatively low
 number of types of events, not tens or even hundreds...
 
 Honza
 
 Getting rid of those would simplify the configuration part, indeed.
 So we would be left with 'generic' and threshold events.
 I guess I've overdone this part.
 
 For some filesystems, it may make sense to differentiate between a
 generic warning and an error.  For BTRFS and ZFS for example, if
 there is a csum error on a block, this will get automatically
 corrected in many configurations, and won't require anything like
 fsck to be run, but monitoring applications will still probably want
 to be notified.
   Sure, but in that case just create an event CORRECTED_CHECKSUM_ERROR and
use that. Then userspace knows what it should do with the event. No need to
hide it behind warning / error category.

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 15:04:37, Beata Michalska wrote:
 On 04/17/2015 01:31 PM, Jan Kara wrote:
  On Wed 15-04-15 09:15:44, Beata Michalska wrote:
  Also I think that we should make it clear that each event type has
  a different set of arguments. For threshold events they'll be L1 & L2, for
  other events there may be no arguments, for other events maybe something
  else...
  
 
 Currently only the threshold events use arguments - not sure what arguments
 could be used for the remaining notifications. But any suggestions are welcome.
  Me neither, but someone will surely find something in the future ;)

  ...
  +static const match_table_t fs_etypes = {
  +  { FS_EVENT_INFO,   "info" },
  +  { FS_EVENT_WARN,   "warn" },
  +  { FS_EVENT_THRESH, "thr"  },
  +  { FS_EVENT_ERR,    "err"  },
  +  { 0, NULL },
  +};
Why are there these generic message types? Threshold messages make good
  sense to me. But not so much the rest. If they don't have a clear meaning,
  it will be a mess. So I also agree with a message like - filesystem has
  trouble, you should probably unmount and run fsck - that's fine. But
  generic info or warning doesn't really carry any meaning on its own and
  thus seems pretty useless to me. To explain a bit more, AFAIU this
  shouldn't be a generic logging interface where something like severity
  makes sense but rather a relatively specific interface notifying about
  events in filesystem userspace should know about so I expect relatively low
  number of types of events, not tens or even hundreds...
  
 
 Getting rid of those would simplify the configuration part, indeed.
 So we would be left with 'generic' and threshold events.
 I guess I've overdone this part.
  Well, I would avoid defining anything that's not really used. So
currently you can define threshold events and we start with just those.
When someone hooks up filesystem error paths to send notifications, we can
create an event type for reporting that the filesystem is corrupted. And so
on... We just have to be careful to document that new event types can be
added and userspace has to ignore events it does not understand.
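
On the userspace side, "ignore events it does not understand" could look roughly like the sketch below. The event codes are made up for illustration (nothing in this thread has settled on any), and identifying the filesystem by major:minor is an assumption taken from the suggestion made elsewhere in this review.

```c
#include <stdio.h>

/* Hypothetical event codes, for illustration only. */
enum fs_event_code {
	EV_THRESH_LOWER = 1,
	EV_THRESH_UPPER = 2,
};

/*
 * Handle one event. Returns 1 if the event type was recognised and
 * handled, 0 if it was an unknown type and was therefore skipped.
 */
int handle_fs_event(unsigned int code, unsigned int dev_major,
		    unsigned int dev_minor)
{
	switch (code) {
	case EV_THRESH_LOWER:
		printf("%u:%u crossed the lower threshold\n",
		       dev_major, dev_minor);
		return 1;
	case EV_THRESH_UPPER:
		printf("%u:%u crossed the upper threshold\n",
		       dev_major, dev_minor);
		return 1;
	default:
		/* A newer kernel may add types; skip them rather than fail. */
		return 0;
	}
}
```

The point of the default branch is that an old listener keeps working when the kernel later grows, say, a "filesystem corrupted" event.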

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Austin S Hemmelgarn

On 2015-04-17 09:04, Beata Michalska wrote:

On 04/17/2015 01:31 PM, Jan Kara wrote:

On Wed 15-04-15 09:15:44, Beata Michalska wrote:
...

+static const match_table_t fs_etypes = {
+   { FS_EVENT_INFO,   "info" },
+   { FS_EVENT_WARN,   "warn" },
+   { FS_EVENT_THRESH, "thr"  },
+   { FS_EVENT_ERR,    "err"  },
+   { 0, NULL },
+};

   Why are there these generic message types? Threshold messages make good
sense to me. But not so much the rest. If they don't have a clear meaning,
it will be a mess. So I also agree with a message like - filesystem has
trouble, you should probably unmount and run fsck - that's fine. But
generic info or warning doesn't really carry any meaning on its own and
thus seems pretty useless to me. To explain a bit more, AFAIU this
shouldn't be a generic logging interface where something like severity
makes sense but rather a relatively specific interface notifying about
events in filesystem userspace should know about so I expect relatively low
number of types of events, not tens or even hundreds...

Honza


Getting rid of those would simplify the configuration part, indeed.
So we would be left with 'generic' and threshold events.
I guess I've overdone this part.


For some filesystems, it may make sense to differentiate between a 
generic warning and an error.  For BTRFS and ZFS for example, if there 
is a csum error on a block, this will get automatically corrected in 
many configurations, and won't require anything like fsck to be run, but 
monitoring applications will still probably want to be notified.






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Wed 15-04-15 09:15:44, Beata Michalska wrote:
 Introduce configurable generic interface for file
 system-wide event notifications to provide file
 systems with a common way of reporting any potential
 issues as they emerge.
 
 The notifications are to be issued through generic
 netlink interface, by a dedicated, for file system
 events, multicast group. The file systems might as
 well use this group to send their own custom messages.
 
 The events have been split into four base categories:
 information, warnings, errors and threshold notifications,
 with some very basic event types like running out of space
 or file system being remounted as read-only.
 
 Threshold notifications have been included to allow
 triggering an event whenever the amount of free space
 drops below a certain level - or levels to be more precise
 as two of them are being supported: the lower and the upper
 range. The notifications work both ways: once the threshold
 level has been reached, an event shall be generated whenever
 the number of available blocks goes up again re-activating
 the threshold.
 
 The interface has been exposed through a vfs. Once mounted,
 it serves as an entry point for the set-up where one can
 register for particular file system events.
 
 Signed-off-by: Beata Michalska b.michal...@samsung.com
  Thanks for the patches! Some comments are below.

 ---
  Documentation/filesystems/events.txt |  254 +++
  fs/Makefile                          |    1 +
  fs/events/Makefile                   |    6 +
  fs/events/fs_event.c                 |  775 ++
  fs/events/fs_event.h                 |   27 ++
  fs/events/fs_event_netlink.c         |   94 +
  fs/namespace.c                       |    1 +
  include/linux/fs.h                   |    6 +-
  include/linux/fs_event.h             |   69 +++
  include/uapi/linux/fs_event.h        |   62 +++
  include/uapi/linux/genetlink.h       |    1 +
  net/netlink/genetlink.c              |    7 +-
  12 files changed, 1301 insertions(+), 2 deletions(-)
  create mode 100644 Documentation/filesystems/events.txt
  create mode 100644 fs/events/Makefile
  create mode 100644 fs/events/fs_event.c
  create mode 100644 fs/events/fs_event.h
  create mode 100644 fs/events/fs_event_netlink.c
  create mode 100644 include/linux/fs_event.h
  create mode 100644 include/uapi/linux/fs_event.h
 
 diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
 new file mode 100644
 index 000..c85dd88
 --- /dev/null
 +++ b/Documentation/filesystems/events.txt
 @@ -0,0 +1,254 @@
 +
 + Generic file system event notification interface
 +
 +Document created 09 April 2015 by Beata Michalska b.michal...@samsung.com
 +
 +1. The reason behind:
 +=
 +
 +There are many corner cases when things might get messy with the filesystems.
 +And it is not always obvious what went wrong and when. Sometimes you might
 +get some subtle hints that there is something going on - but by the time
 +you realise it, it might be too late as you are already out-of-space
 +or the filesystem has been remounted as read-only (e.g.). The generic
 +interface for the filesystem events fills the gap by providing a rather
 +easy way of real-time notifications triggered whenever something interesting
 +happens, allowing filesystems to report events in a common way, as they occur.
 +
 +2. How does it work:
 +
 +
 +The interface itself has been exposed as fstrace-type Virtual File System,
 +primarily to ease the process of setting up the configuration for the file
 +system notifications. So for starters it needs to get mounted (obviously):
 +
 + mount -t fstrace none /sys/fs/events
 +
 +This will unveil the single fstrace filesystem entry - the 'config' file,
 +through which the notification are being set-up.
 +
 +Activating notifications for particular filesystem is as straightforward
 +as writing into the 'config' file. Note that by default all events despite
 +the actual filesystem type are being disregarded.
  Is there a reason to have a special filesystem for this? Do you expect
to extend it with (many) more files? Why not just create a file in sysfs or
something like that?

 +Synopsis of config:
 +--
 +
 + MOUNT EVENT_TYPE [L1] [L2]
 +
 + MOUNT  : the filesystem's mount point
  I'm not quite decided, but is the mountpoint really the right thing to pass
via the interface? Mountpoints aren't unique (a filesystem can be mounted in
multiple places) and, more importantly, can change over time. So wouldn't it
be better to pass major:minor over the interface? These are stable, unique to
the filesystem, and userspace can easily get them by calling stat(2) on the
desired path (or directly from /proc/self/mountinfo). That could also be
used as an fs identifier instead of an assigned ID (and thus we wouldn't need
those events about the creation of a new trace, which look somewhat strange
to me).
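
For reference, getting major:minor from a path in userspace is short. The sketch below uses stat(2) and the major()/minor() macros from <sys/sysmacros.h> (their location on glibc). The helper name fs_dev_id is hypothetical and is not part of the patch.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>	/* major(), minor() on glibc */

/*
 * Look up the device number of the filesystem that holds `path`.
 * Returns 0 and fills *maj_out / *min_out on success, -1 if stat(2)
 * fails. Hypothetical helper, for illustration only.
 */
int fs_dev_id(const char *path, unsigned int *maj_out, unsigned int *min_out)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	/* st_dev identifies the device containing the file. */
	*maj_out = major(st.st_dev);
	*min_out = minor(st.st_dev);
	return 0;
}
```

A caller would print the pair as "%u:%u" and hand that to the interface instead of the mount point.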

OTOH using major:minor may 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray

On 17/04/2015 14:23, Austin S Hemmelgarn wrote:

On 2015-04-17 09:04, Beata Michalska wrote:

On 04/17/2015 01:31 PM, Jan Kara wrote:

On Wed 15-04-15 09:15:44, Beata Michalska wrote:
...

+static const match_table_t fs_etypes = {
+    { FS_EVENT_INFO,   "info" },
+    { FS_EVENT_WARN,   "warn" },
+    { FS_EVENT_THRESH, "thr"  },
+    { FS_EVENT_ERR,    "err"  },
+    { 0, NULL },
+};
   Why are there these generic message types? Threshold messages make good
sense to me. But not so much the rest. If they don't have a clear meaning,
it will be a mess. So I also agree with a message like - filesystem has
trouble, you should probably unmount and run fsck - that's fine. But
generic info or warning doesn't really carry any meaning on its own and
thus seems pretty useless to me. To explain a bit more, AFAIU this
shouldn't be a generic logging interface where something like severity
makes sense but rather a relatively specific interface notifying about
events in filesystem userspace should know about so I expect relatively low
number of types of events, not tens or even hundreds...

Honza


Getting rid of those would simplify the configuration part, indeed.
So we would be left with 'generic' and threshold events.
I guess I've overdone this part.


For some filesystems, it may make sense to differentiate between a 
generic warning and an error.  For BTRFS and ZFS for example, if there 
is a csum error on a block, this will get automatically corrected in 
many configurations, and won't require anything like fsck to be run, 
but monitoring applications will still probably want to be notified.


Another key differentiation IMHO is between transient errors (like 
server is unavailable in a distributed filesystem) that will block the 
filesystem but might clear on their own, vs. permanent errors like 
unreadable drives that definitely will not clear until the administrator 
takes some action.  It's usually a reasonable approximation to call 
transient issues warnings, and permanent issues errors.


John






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray



On 17/04/2015 17:22, Jan Kara wrote:

On Fri 17-04-15 17:08:10, John Spray wrote:

On 17/04/2015 16:43, Jan Kara wrote:
In that case I'm confused -- why would ENOSPC be an appropriate use
of this interface if the mount being entirely blocked would be
inappropriate?  Isn't being unable to service any I/O a more
fundamental and severe thing than being up and healthy but full?

Were you intending the interface to be exclusively for data
integrity issues like checksum failures, rather than more general
events about a mount that userspace would probably like to know
about?

   Well, I'm not saying we cannot have those events for fs availability /
unavailability. I'm just saying I'd like to see some use for that first.
I don't want events to be added just because it's possible...

For ENOSPC we have thin-provisioned storage and the userspace daemon
shuffling real storage underneath. So there I know the use case.



Ah, OK.  So I can think of a couple of use cases:
 * a cluster scheduling service (think MPI jobs or docker containers) 
might check for events like this.  If it can see the cluster filesystem 
is unavailable, then it can avoid scheduling the job, so that the 
(multi-node) application does not get hung on one node with a bad 
mount.  If it sees a mount go bad (unavailable, or client evicted) 
partway through a job, then it can kill -9 the process that was relying 
on the bad mount, and go run it somewhere else.
 * Boring but practical case: a Nagios health check that verifies 
mounts are OK.


We don't have to invent these event types now of course, but something 
to bear in mind.  Hopefully if/when any of the distributed filesystems 
(Lustre/Ceph/etc) choose to implement this, we can look at making the 
event types common at that time though.


BTW in any case an interface for filesystem events to userspace will be 
a useful addition, thank you!


Cheers,
John


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 12:29:07, Austin S Hemmelgarn wrote:
 On 2015-04-17 12:22, Jan Kara wrote:
 On Fri 17-04-15 17:08:10, John Spray wrote:
 
 On 17/04/2015 16:43, Jan Kara wrote:
 On Fri 17-04-15 15:51:14, John Spray wrote:
 On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
 
 For some filesystems, it may make sense to differentiate between a
 generic warning and an error.  For BTRFS and ZFS for example, if
 there is a csum error on a block, this will get automatically
 corrected in many configurations, and won't require anything like
 fsck to be run, but monitoring applications will still probably
 want to be notified.
 Another key differentiation IMHO is between transient errors (like
 server is unavailable in a distributed filesystem) that will block
 the filesystem but might clear on their own, vs. permanent errors
 like unreadable drives that definitely will not clear until the
 administrator takes some action.  It's usually a reasonable
 approximation to call transient issues warnings, and permanent
 issues errors.
So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
 would this have? I wouldn't like the interface to be dumping ground for
 random crap - we have dmesg for that :).
 In that case I'm confused -- why would ENOSPC be an appropriate use
 of this interface if the mount being entirely blocked would be
 inappropriate?  Isn't being unable to service any I/O a more
 fundamental and severe thing than being up and healthy but full?
 
 Were you intending the interface to be exclusively for data
 integrity issues like checksum failures, rather than more general
 events about a mount that userspace would probably like to know
 about?
Well, I'm not saying we cannot have those events for fs availability /
 unavailability. I'm just saying I'd like to see some use for that first.
 I don't want events to be added just because it's possible...
 
 For ENOSPC we have thin-provisioned storage and the userspace daemon
 shuffling real storage underneath. So there I know the use case.
 
 The use-case that immediately comes to mind for me would be diskless
 nodes with root-on-nfs needing to know if they can actually access
 the root filesystem.
  Well, most apps will access the root file system regardless of what we
send over netlink... So I don't see netlink events improving the situation
there too much. You could try to use it for something like failover, but
even there I'm not too convinced - just doing some IO, waiting for a timeout,
and failing over if the IO doesn't complete works just fine for that these
days. That's why I was asking: I didn't see a convincing use case
myself...

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 15:51:14, John Spray wrote:
 On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
 On 2015-04-17 09:04, Beata Michalska wrote:
 On 04/17/2015 01:31 PM, Jan Kara wrote:
 On Wed 15-04-15 09:15:44, Beata Michalska wrote:
 ...
 +static const match_table_t fs_etypes = {
 +    { FS_EVENT_INFO,   "info" },
 +    { FS_EVENT_WARN,   "warn" },
 +    { FS_EVENT_THRESH, "thr"  },
 +    { FS_EVENT_ERR,    "err"  },
 +    { 0, NULL },
 +};
Why are there these generic message types? Threshold messages make good
 sense to me. But not so much the rest. If they don't have a clear meaning,
 it will be a mess. So I also agree with a message like - filesystem has
 trouble, you should probably unmount and run fsck - that's fine. But
 generic info or warning doesn't really carry any meaning on its own and
 thus seems pretty useless to me. To explain a bit more, AFAIU this
 shouldn't be a generic logging interface where something like severity
 makes sense but rather a relatively specific interface notifying about
 events in filesystem userspace should know about so I expect relatively low
 number of types of events, not tens or even hundreds...
 
 Honza
 
 Getting rid of those would simplify the configuration part, indeed.
 So we would be left with 'generic' and threshold events.
 I guess I've overdone this part.
 
 For some filesystems, it may make sense to differentiate between a
 generic warning and an error.  For BTRFS and ZFS for example, if
 there is a csum error on a block, this will get automatically
 corrected in many configurations, and won't require anything like
 fsck to be run, but monitoring applications will still probably
 want to be notified.
 
 Another key differentiation IMHO is between transient errors (like
 server is unavailable in a distributed filesystem) that will block
 the filesystem but might clear on their own, vs. permanent errors
 like unreadable drives that definitely will not clear until the
 administrator takes some action.  It's usually a reasonable
 approximation to call transient issues warnings, and permanent
 issues errors.
  So you can have events like FS_UNAVAILABLE and FS_AVAILABLE, but what use
would this have? I wouldn't like the interface to be a dumping ground for
random crap - we have dmesg for that :).

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Jan Kara
On Fri 17-04-15 17:08:10, John Spray wrote:
 
 On 17/04/2015 16:43, Jan Kara wrote:
 On Fri 17-04-15 15:51:14, John Spray wrote:
 On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
 
 For some filesystems, it may make sense to differentiate between a
 generic warning and an error.  For BTRFS and ZFS for example, if
 there is a csum error on a block, this will get automatically
 corrected in many configurations, and won't require anything like
 fsck to be run, but monitoring applications will still probably
 want to be notified.
 Another key differentiation IMHO is between transient errors (like
 server is unavailable in a distributed filesystem) that will block
 the filesystem but might clear on their own, vs. permanent errors
 like unreadable drives that definitely will not clear until the
 administrator takes some action.  It's usually a reasonable
 approximation to call transient issues warnings, and permanent
 issues errors.
So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
 would this have? I wouldn't like the interface to be dumping ground for
 random crap - we have dmesg for that :).
 In that case I'm confused -- why would ENOSPC be an appropriate use
 of this interface if the mount being entirely blocked would be
 inappropriate?  Isn't being unable to service any I/O a more
 fundamental and severe thing than being up and healthy but full?
 
 Were you intending the interface to be exclusively for data
 integrity issues like checksum failures, rather than more general
 events about a mount that userspace would probably like to know
 about?
  Well, I'm not saying we cannot have those events for fs availability /
unavailability. I'm just saying I'd like to see some use for that first.
I don't want events to be added just because it's possible...

For ENOSPC we have thin-provisioned storage and the userspace daemon
shuffling real storage underneath. So there I know the use case.

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Austin S Hemmelgarn

On 2015-04-17 12:22, Jan Kara wrote:
> On Fri 17-04-15 17:08:10, John Spray wrote:
>> On 17/04/2015 16:43, Jan Kara wrote:
>>> On Fri 17-04-15 15:51:14, John Spray wrote:
>>>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>>>> For some filesystems, it may make sense to differentiate between a
>>>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>>>> there is a csum error on a block, this will get automatically
>>>>> corrected in many configurations, and won't require anything like
>>>>> fsck to be run, but monitoring applications will still probably
>>>>> want to be notified.
>>>>
>>>> Another key differentiation IMHO is between transient errors (like
>>>> server is unavailable in a distributed filesystem) that will block
>>>> the filesystem but might clear on their own, vs. permanent errors
>>>> like unreadable drives that definitely will not clear until the
>>>> administrator takes some action.  It's usually a reasonable
>>>> approximation to call transient issues warnings, and permanent
>>>> issues errors.
>>>
>>>    So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
>>> would this have? I wouldn't like the interface to be dumping ground for
>>> random crap - we have dmesg for that :).
>>
>> In that case I'm confused -- why would ENOSPC be an appropriate use
>> of this interface if the mount being entirely blocked would be
>> inappropriate?  Isn't being unable to service any I/O a more
>> fundamental and severe thing than being up and healthy but full?
>>
>> Were you intending the interface to be exclusively for data
>> integrity issues like checksum failures, rather than more general
>> events about a mount that userspace would probably like to know
>> about?
>
>    Well, I'm not saying we cannot have those events for fs availability /
> unavailability. I'm just saying I'd like to see some use for that first.
> I don't want events to be added just because it's possible...
>
> For ENOSPC we have thin provisioned storage and the userspace daemon
> shuffling real storage underneath. So there I know the use case.
>
> Honza

The use-case that immediately comes to mind for me would be diskless 
nodes with root-on-nfs needing to know if they can actually access the 
root filesystem.






Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Beata Michalska
On 04/17/2015 06:08 PM, John Spray wrote:
>
> On 17/04/2015 16:43, Jan Kara wrote:
>> On Fri 17-04-15 15:51:14, John Spray wrote:
>>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>>>
>>>> For some filesystems, it may make sense to differentiate between a
>>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>>> there is a csum error on a block, this will get automatically
>>>> corrected in many configurations, and won't require anything like
>>>> fsck to be run, but monitoring applications will still probably
>>>> want to be notified.
>>> Another key differentiation IMHO is between transient errors (like
>>> server is unavailable in a distributed filesystem) that will block
>>> the filesystem but might clear on their own, vs. permanent errors
>>> like unreadable drives that definitely will not clear until the
>>> administrator takes some action.  It's usually a reasonable
>>> approximation to call transient issues warnings, and permanent
>>> issues errors.
>>    So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
>> would this have? I wouldn't like the interface to be dumping ground for
>> random crap - we have dmesg for that :).
> In that case I'm confused -- why would ENOSPC be an appropriate use of this
> interface if the mount being entirely blocked would be inappropriate?  Isn't
> being unable to service any I/O a more fundamental and severe thing than
> being up and healthy but full?
>
> Were you intending the interface to be exclusively for data integrity issues
> like checksum failures, rather than more general events about a mount that
> userspace would probably like to know about?
>
> John
>

I think we should support both and leave the decision about what to
report to the individual file systems, within reason, of course.
The interface should simply hand the events over to user space,
acting as a go-between. I would, though, avoid defining any
filesystem-specific events, keeping the interface as generic as
possible.


BR
Beata


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray


On 17/04/2015 16:43, Jan Kara wrote:
> On Fri 17-04-15 15:51:14, John Spray wrote:
>> On 17/04/2015 14:23, Austin S Hemmelgarn wrote:
>>> For some filesystems, it may make sense to differentiate between a
>>> generic warning and an error.  For BTRFS and ZFS for example, if
>>> there is a csum error on a block, this will get automatically
>>> corrected in many configurations, and won't require anything like
>>> fsck to be run, but monitoring applications will still probably
>>> want to be notified.
>>
>> Another key differentiation IMHO is between transient errors (like
>> server is unavailable in a distributed filesystem) that will block
>> the filesystem but might clear on their own, vs. permanent errors
>> like unreadable drives that definitely will not clear until the
>> administrator takes some action.  It's usually a reasonable
>> approximation to call transient issues warnings, and permanent
>> issues errors.
>
>    So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
> would this have? I wouldn't like the interface to be dumping ground for
> random crap - we have dmesg for that :).

In that case I'm confused -- why would ENOSPC be an appropriate use of 
this interface if the mount being entirely blocked would be 
inappropriate?  Isn't being unable to service any I/O a more fundamental 
and severe thing than being up and healthy but full?


Were you intending the interface to be exclusively for data integrity 
issues like checksum failures, rather than more general events about a 
mount that userspace would probably like to know about?


John


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Andreas Dilger
On Apr 17, 2015, at 11:37 AM, John Spray <john.sp...@redhat.com> wrote:
> On 17/04/2015 17:22, Jan Kara wrote:
>> On Fri 17-04-15 17:08:10, John Spray wrote:
>>> On 17/04/2015 16:43, Jan Kara wrote:
>>> In that case I'm confused -- why would ENOSPC be an appropriate use
>>> of this interface if the mount being entirely blocked would be
>>> inappropriate?  Isn't being unable to service any I/O a more
>>> fundamental and severe thing than being up and healthy but full?
>>>
>>> Were you intending the interface to be exclusively for data
>>> integrity issues like checksum failures, rather than more general
>>> events about a mount that userspace would probably like to know
>>> about?
>>    Well, I'm not saying we cannot have those events for fs availability /
>> unavailability. I'm just saying I'd like to see some use for that first.
>> I don't want events to be added just because it's possible...
>>
>> For ENOSPC we have thin provisioned storage and the userspace daemon
>> shuffling real storage underneath. So there I know the use case.
>>
>
> Ah, OK.  So I can think of a couple of use cases:
> * a cluster scheduling service (think MPI jobs or docker containers) might
>   check for events like this.  If it can see the cluster filesystem is
>   unavailable, then it can avoid scheduling the job, so that the (multi-node)
>   application does not get hung on one node with a bad mount.  If it sees a
>   mount go bad (unavailable, or client evicted) partway through a job, then it
>   can kill -9 the process that was relying on the bad mount, and go run it
>   somewhere else.
> * Boring but practical case: a nagios health check for checking if mounts are
>   OK.

John,
thanks for chiming in, as I was just about to write the same.  Some users
were just asking yesterday at the Lustre User Group meeting about adding
an interface to notify job schedulers for your #1 point, and I'd much
rather use a generic interface than inventing our own for Lustre.

Cheers, Andreas

> We don't have to invent these event types now of course, but something to
> bear in mind.  Hopefully if/when any of the distributed filesystems
> (Lustre/Ceph/etc) choose to implement this, we can look at making the event
> types common at that time though.
>
> BTW in any case an interface for filesystem events to userspace will be a
> useful addition, thank you!
>
> Cheers,
> John


Cheers, Andreas







Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread Andreas Dilger
On Apr 17, 2015, at 5:31 AM, Jan Kara <j...@suse.cz> wrote:
> On Wed 15-04-15 09:15:44, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
>   Thanks for the patches! Some comments are below.
>
>> ---
>> Documentation/filesystems/events.txt |  254 +++
>> fs/Makefile  |1 +
>> fs/events/Makefile   |6 +
>> fs/events/fs_event.c |  775 ++
>> fs/events/fs_event.h |   27 ++
>> fs/events/fs_event_netlink.c |   94 +
>> fs/namespace.c   |1 +
>> include/linux/fs.h   |6 +-
>> include/linux/fs_event.h |   69 +++
>> include/uapi/linux/fs_event.h|   62 +++
>> include/uapi/linux/genetlink.h   |1 +
>> net/netlink/genetlink.c  |7 +-
>> 12 files changed, 1301 insertions(+), 2 deletions(-)
>> create mode 100644 Documentation/filesystems/events.txt
>> create mode 100644 fs/events/Makefile
>> create mode 100644 fs/events/fs_event.c
>> create mode 100644 fs/events/fs_event.h
>> create mode 100644 fs/events/fs_event_netlink.c
>> create mode 100644 include/linux/fs_event.h
>> create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something intreseting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>   Is there a reason to have a special filesystem for this? Do you expect
> extending it by (many) more files? Why not just creating a file in sysfs or
> something like that?
>
>> +Synopsis of config:
>> +--
>> +
>> +MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>   I'm not quite decided but is mountpoint really the right thing to pass
> via the interface? They aren't unique (filesystem can be mounted in
> multiple places) and more importantly can change over time. So won't it be
> better to pass major:minor over the interface? These are stable, unique to
> the filesystem, and userspace can easily get them by calling stat(2) on the
> desired path (or directly from /proc/self/mountinfo). That could be also
> used as an fs identifier instead of assigned ID (and thus we won't need
> those events about creation of new trace which look 
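
The stat(2) route mentioned in the comment above can be sketched in userspace C as follows. This is only an illustration, assuming Linux with glibc; the helper names fs_dev_id() and fs_dev_id_str() are hypothetical and appear nowhere in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>	/* major(), minor() on glibc */

/*
 * Hypothetical helper: fetch the major:minor of the device backing the
 * filesystem that contains 'path'. Returns 0 on success, -1 if stat(2) fails.
 */
static int fs_dev_id(const char *path, unsigned int *maj, unsigned int *min)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	*maj = major(st.st_dev);
	*min = minor(st.st_dev);
	return 0;
}

/* Format the identifier as "major:minor", the form used in mountinfo. */
static int fs_dev_id_str(const char *path, char *buf, size_t len)
{
	unsigned int maj, min;
	int n;

	assert(buf != NULL && len > 0);
	if (fs_dev_id(path, &maj, &min) != 0)
		return -1;
	n = snprintf(buf, len, "%u:%u", maj, min);
	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Two paths on the same filesystem yield the same pair, which is what would make it usable as an identifier independent of where the filesystem happens to be mounted.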

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Heinrich Schuchardt
On 15.04.2015 09:15, Beata Michalska wrote:
> Introduce configurable generic interface for file
> system-wide event notifications to provide file
> systems with a common way of reporting any potential
> issues as they emerge.
> 
> The notifications are to be issued through generic
> netlink interface, by a dedicated, for file system
> events, multicast group. The file systems might as
> well use this group to send their own custom messages.
> 
> The events have been split into four base categories:
> information, warnings, errors and threshold notifications,
> with some very basic event types like running out of space
> or file system being remounted as read-only.
> 
> Threshold notifications have been included to allow
> triggering an event whenever the amount of free space
> drops below a certain level - or levels to be more precise
> as two of them are being supported: the lower and the upper
> range. The notifications work both ways: once the threshold
> level has been reached, an event shall be generated whenever
> the number of available blocks goes up again re-activating
> the threshold.
> 
> The interface has been exposed through a vfs. Once mounted,
> it serves as an entry point for the set-up where one can
> register for particular file system events.

Having a framework for notification for file systems is a great idea.
Your solution covers an important part of the possible application scope.

Before moving forward I suggest we should analyze if this scope should
be enlarged.

Many filesystems are remote (e.g. CIFS/Samba) or distributed over many
network nodes (e.g. Lustre). How should file system notification work here?

How will fuse file systems be served?

The current point of reference is a single mount point.
Every time I insert a USB stick several file systems may be automounted.
I would like to receive events for these automounted file systems.

A similar case arises when starting new virtual machines. How will I
receive events on the host system for the file systems of the virtual
machines?

In your implementation events are received via Netlink.
Using Netlink for marking mounts for notification would create a much
more homogeneous interface. So why should we use a virtual file system here?

Best regards

Heinrich Schuchardt




Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Hugh Dickins
On Thu, 16 Apr 2015, Beata Michalska wrote:
> On 04/16/2015 05:46 AM, Eric Sandeen wrote:
> > On 4/15/15 2:15 AM, Beata Michalska wrote:
> >> Introduce configurable generic interface for file
> >> system-wide event notifications to provide file
> >> systems with a common way of reporting any potential
> >> issues as they emerge.
> >>
> >> The notifications are to be issued through generic
> >> netlink interface, by a dedicated, for file system
> >> events, multicast group. The file systems might as
> >> well use this group to send their own custom messages.
> > 
> > ...
> > 
> >> + 4.3 Threshold notifications:
> >> +
> >> + #include <linux/fs_event.h>
> >> + void fs_event_alloc_space(struct super_block *sb, u64 ncount);
> >> + void fs_event_free_space(struct super_block *sb, u64 ncount);
> >> +
> >> + Each filesystme supporting the treshold notifiactions should call
> >> + fs_event_alloc_space/fs_event_free_space repsectively whenever the
> >> + ammount of availbale blocks changes.
> >> + - sb: the filesystem's super block
> >> + - ncount: number of blocks being acquired/released
> > 
> > so:
> > 
> >> +void fs_event_alloc_space(struct super_block *sb, u64 ncount)
> >> +{
> >> +  struct fs_trace_entry *en;
> >> +  s64 count;
> >> +
> >> +  spin_lock(&fs_trace_lock);
> > 
> > Every allocation/free for every supported filesystem system-wide will be
> > serialized on this global spinlock?  That sounds like a non-starter...
> > 
> > -Eric
> > 
> I guess there is plenty of room for improvement as this is an early version.
> I do agree that this might be a performance bottleneck even though I've tried
> to keep this to a minimum - it's being taken only for hashtable look-up. But
> still...
> I was considering placing the trace object within the super_block to skip
> this look-up part but I'd like to gather more comments, especially on the
> concept itself.

Sorry, I have no opinion on the netlink fs notifications concept
itself, not my area of expertise at all.

No doubt you Cc'ed me for tmpfs: I am very glad you're now trying the
generic filesystem route, and yes, I'd be happy to have the support
in tmpfs, thank you - if it is generally agreed to be suitable for
filesystems; but wouldn't want this as a special for tmpfs.

However, I must echo Eric's point: please take a look at 7e496299d4d2
"tmpfs: make tmpfs scalable with percpu_counter for used blocks":
Tim would be unhappy if you added overhead back into that path.
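
For readers unfamiliar with the idea, the batching that makes a per-CPU counter cheaper than a globally locked one can be sketched in userspace C roughly as below. This is an analogue for illustration only, not the kernel's percpu_counter API and not code from tmpfs or the patch; BATCH, struct shared_counter, struct local_counter and both functions are made-up names.

```c
#include <assert.h>
#include <stdatomic.h>

#define BATCH 32	/* hypothetical flush threshold */

struct shared_counter {
	atomic_llong total;		/* global value, contended when touched */
};

struct local_counter {
	struct shared_counter *sc;
	long long delta;		/* private to one thread (or CPU) */
};

/*
 * Accumulate into the private delta and fold it into the shared total
 * only once its magnitude reaches BATCH, so most updates touch no
 * shared cache line at all.
 */
static void counter_add(struct local_counter *lc, long long n)
{
	lc->delta += n;
	if (lc->delta >= BATCH || lc->delta <= -BATCH) {
		atomic_fetch_add(&lc->sc->total, lc->delta);
		lc->delta = 0;
	}
}

/* Cheap approximate read: can lag by up to BATCH per local counter. */
static long long counter_read_approx(struct shared_counter *sc)
{
	return atomic_load(&sc->total);
}
```

Taking a global spinlock on every block allocation or free would put a shared, contended operation back on each update, which is the overhead the batching is there to avoid.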

(And please Cc linux-fsde...@vger.kernel.org next time you post these.)

Hugh


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Beata Michalska
On 04/16/2015 05:46 AM, Eric Sandeen wrote:
> On 4/15/15 2:15 AM, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
> 
> ...
> 
>> + 4.3 Threshold notifications:
>> +
>> + #include <linux/fs_event.h>
>> + void fs_event_alloc_space(struct super_block *sb, u64 ncount);
>> + void fs_event_free_space(struct super_block *sb, u64 ncount);
>> +
>> + Each filesystme supporting the treshold notifiactions should call
>> + fs_event_alloc_space/fs_event_free_space repsectively whenever the
>> + ammount of availbale blocks changes.
>> + - sb: the filesystem's super block
>> + - ncount: number of blocks being acquired/released
> 
> so:
> 
>> +void fs_event_alloc_space(struct super_block *sb, u64 ncount)
>> +{
>> +struct fs_trace_entry *en;
>> +s64 count;
>> +
>> +spin_lock(&fs_trace_lock);
> 
> Every allocation/free for every supported filesystem system-wide will be
> serialized on this global spinlock?  That sounds like a non-starter...
> 
> -Eric
> 
I guess there is plenty of room for improvement as this is an early version.
I do agree that this might be a performance bottleneck even though I've tried
to keep this to a minimum - it's being taken only for hashtable look-up. But
still...
I was considering placing the trace object within the super_block to skip
this look-up part but I'd like to gather more comments, especially on the
concept itself.

BR
Beata




Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Beata Michalska
On 04/15/2015 09:25 PM, Darrick J. Wong wrote:
> On Wed, Apr 15, 2015 at 09:15:44AM +0200, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
>> ---
>>  Documentation/filesystems/events.txt |  254 +++
>>  fs/Makefile  |1 +
>>  fs/events/Makefile   |6 +
>>  fs/events/fs_event.c |  775 
>> ++
>>  fs/events/fs_event.h |   27 ++
>>  fs/events/fs_event_netlink.c |   94 +
>>  fs/namespace.c   |1 +
>>  include/linux/fs.h   |6 +-
>>  include/linux/fs_event.h |   69 +++
>>  include/uapi/linux/fs_event.h|   62 +++
>>  include/uapi/linux/genetlink.h   |1 +
>>  net/netlink/genetlink.c  |7 +-
>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/filesystems/events.txt
>>  create mode 100644 fs/events/Makefile
>>  create mode 100644 fs/events/fs_event.c
>>  create mode 100644 fs/events/fs_event.h
>>  create mode 100644 fs/events/fs_event_netlink.c
>>  create mode 100644 include/linux/fs_event.h
>>  create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt 
>> b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the 
>> filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something intreseting
>> +happens, allowing filesystems to report events in a common way, as they 
>> occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>> +
>> +Synopsis of config:
>> +--
>> +
>> +MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>> + EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
>> +  at least one type needs to be specified;
>> +  note the comma delimiter and lack of spaces between
>> +  those options
>> + L1 : the threshold limit - lower range
>> + L2 : the threshold limit - upper range
>> +  case enabling threshold notifications the lower level is
>> +  mandatory, whereas the upper one remains optional;
>> +  note though, that as those refer to the number of available
>> +  blocks, the lower level needs to be higher than the upper one
>> +
>> 
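
The two-level threshold behaviour described in the synopsis above can be read as a small state machine. The following userspace C sketch is one possible interpretation, not the patch's implementation: the enum values, struct thr_state and thr_update() are hypothetical, and the exact comparisons (<= for reaching a level, > L1 for re-arming) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

enum thr_event {
	THR_NONE,
	THR_LOWER_REACHED,	/* available blocks fell to L1 or below */
	THR_UPPER_REACHED,	/* available blocks fell to L2 or below */
	THR_REARMED		/* available blocks rose above L1 again */
};

struct thr_state {
	uint64_t l1;		/* lower range, in available blocks */
	uint64_t l2;		/* upper range, l2 < l1; 0 means not set */
	int lower_hit;
	int upper_hit;
};

/* Feed the current number of available blocks; returns the event to emit. */
static enum thr_event thr_update(struct thr_state *t, uint64_t avail)
{
	if (t->l2 && !t->upper_hit && avail <= t->l2) {
		t->upper_hit = t->lower_hit = 1;
		return THR_UPPER_REACHED;
	}
	if (!t->lower_hit && avail <= t->l1) {
		t->lower_hit = 1;
		return THR_LOWER_REACHED;
	}
	if (t->lower_hit && avail > t->l1) {
		/* Space was freed: notify and re-activate both levels. */
		t->lower_hit = t->upper_hit = 0;
		return THR_REARMED;
	}
	return THR_NONE;
}
```

Because both levels count available blocks, L1 has to be the larger number: it is crossed first as the filesystem fills up.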

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Beata Michalska
On 04/15/2015 09:25 PM, Darrick J. Wong wrote:
> On Wed, Apr 15, 2015 at 09:15:44AM +0200, Beata Michalska wrote:
>> Introduce configurable generic interface for file
>> system-wide event notifications to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface, by a dedicated, for file system
>> events, multicast group. The file systems might as
>> well use this group to send their own custom messages.
>>
>> The events have been split into four base categories:
>> information, warnings, errors and threshold notifications,
>> with some very basic event types like running out of space
>> or file system being remounted as read-only.
>>
>> Threshold notifications have been included to allow
>> triggering an event whenever the amount of free space
>> drops below a certain level - or levels to be more precise
>> as two of them are being supported: the lower and the upper
>> range. The notifications work both ways: once the threshold
>> level has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
>> ---
>>  Documentation/filesystems/events.txt |  254 +++
>>  fs/Makefile  |1 +
>>  fs/events/Makefile   |6 +
>>  fs/events/fs_event.c |  775 ++
>>  fs/events/fs_event.h |   27 ++
>>  fs/events/fs_event_netlink.c |   94 +
>>  fs/namespace.c   |1 +
>>  include/linux/fs.h   |6 +-
>>  include/linux/fs_event.h |   69 +++
>>  include/uapi/linux/fs_event.h|   62 +++
>>  include/uapi/linux/genetlink.h   |1 +
>>  net/netlink/genetlink.c  |7 +-
>>  12 files changed, 1301 insertions(+), 2 deletions(-)
>>  create mode 100644 Documentation/filesystems/events.txt
>>  create mode 100644 fs/events/Makefile
>>  create mode 100644 fs/events/fs_event.c
>>  create mode 100644 fs/events/fs_event.h
>>  create mode 100644 fs/events/fs_event_netlink.c
>>  create mode 100644 include/linux/fs_event.h
>>  create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 000..c85dd88
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,254 @@
>> +
>> +Generic file system event notification interface
>> +
>> +Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
>> +
>> +1. The reason behind:
>> +=
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something intreseting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the file
>> +system notifications. So for starters it needs to get mounted (obviously):
>> +
>> +mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events despite
>> +the actual filesystem type are being disregarded.
>> +
>> +Synopsis of config:
>> +--
>> +
>> +MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT  : the filesystem's mount point
>> + EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
>> +  at least one type needs to be specified;
>> +  note the comma delimiter and lack of spaces between
>> +  those options
>> + L1 : the threshold limit - lower range
>> + L2 : the threshold limit - upper range
>> +  case enabling threshold notifications the lower level is
>> +  mandatory, whereas the upper one remains optional;
>> +  note though, that as those refer to the number of available
>> +  blocks, the lower level needs to be higher than the upper one
>> +
>> +Sample request could look like the follwoing:
>> +
>> + echo /sample/mount/point warn,err,thr 71 50 > /sys/fs/events/config
>> +
>> +Multiple request might be specified provided they 

Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Heinrich Schuchardt
On 15.04.2015 09:15, Beata Michalska wrote:
> Introduce configurable generic interface for file
> system-wide event notifications to provide file
> systems with a common way of reporting any potential
> issues as they emerge.
>
> The notifications are to be issued through generic
> netlink interface, by a dedicated, for file system
> events, multicast group. The file systems might as
> well use this group to send their own custom messages.
>
> The events have been split into four base categories:
> information, warnings, errors and threshold notifications,
> with some very basic event types like running out of space
> or file system being remounted as read-only.
>
> Threshold notifications have been included to allow
> triggering an event whenever the amount of free space
> drops below a certain level - or levels to be more precise
> as two of them are being supported: the lower and the upper
> range. The notifications work both ways: once the threshold
> level has been reached, an event shall be generated whenever
> the number of available blocks goes up again re-activating
> the threshold.
>
> The interface has been exposed through a vfs. Once mounted,
> it serves as an entry point for the set-up where one can
> register for particular file system events.

Having a framework for notification for file systems is a great idea.
Your solution covers an important part of the possible application scope.

Before moving forward I suggest we should analyze if this scope should
be enlarged.

Many filesystems are remote (e.g. CIFS/Samba) or distributed over many
network nodes (e.g. Lustre). How should file system notification work here?

How will fuse file systems be served?

The current point of reference is a single mount point.
Every time I insert a USB stick, several file systems may be automounted.
I would like to receive events for these automounted file systems.

A similar case arises when starting new virtual machines. How will I
receive events on the host system for the file systems of the virtual
machines?

In your implementation events are received via Netlink.
Using Netlink for marking mounts for notification would create a much
more homogeneous interface. So why should we use a virtual file system here?

Best regards

Heinrich Schuchardt




Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-16 Thread Hugh Dickins
On Thu, 16 Apr 2015, Beata Michalska wrote:
> On 04/16/2015 05:46 AM, Eric Sandeen wrote:
> > On 4/15/15 2:15 AM, Beata Michalska wrote:
> > > Introduce configurable generic interface for file
> > > system-wide event notifications to provide file
> > > systems with a common way of reporting any potential
> > > issues as they emerge.
> > >
> > > The notifications are to be issued through generic
> > > netlink interface, by a dedicated, for file system
> > > events, multicast group. The file systems might as
> > > well use this group to send their own custom messages.
> >
> > ...
> >
> > > + 4.3 Threshold notifications:
> > > +
> > > + #include <linux/fs_event.h>
> > > + void fs_event_alloc_space(struct super_block *sb, u64 ncount);
> > > + void fs_event_free_space(struct super_block *sb, u64 ncount);
> > > +
> > > + Each filesystem supporting the threshold notifications should call
> > > + fs_event_alloc_space/fs_event_free_space respectively whenever the
> > > + amount of available blocks changes.
> > > + - sb: the filesystem's super block
> > > + - ncount: number of blocks being acquired/released
> >
> > so:
> >
> > > +void fs_event_alloc_space(struct super_block *sb, u64 ncount)
> > > +{
> > > +	struct fs_trace_entry *en;
> > > +	s64 count;
> > > +
> > > +	spin_lock(&fs_trace_lock);
> >
> > Every allocation/free for every supported filesystem system-wide will be
> > serialized on this global spinlock?  That sounds like a non-starter...
> >
> > -Eric
>
> I guess there is plenty of room for improvement, as this is an early version.
> I do agree that this might be a performance bottleneck, even though I've tried
> to keep this to a minimum - the lock is taken only for the hashtable look-up.
> But still...
> I was considering placing the trace object within the super_block to skip
> this look-up part, but I'd like to gather more comments, especially on the
> concept itself.

Sorry, I have no opinion on the netlink fs notifications concept
itself, not my area of expertise at all.

No doubt you Cc'ed me for tmpfs: I am very glad you're now trying the
generic filesystem route, and yes, I'd be happy to have the support
in tmpfs, thank you - if it is generally agreed to be suitable for
filesystems; but wouldn't want this as a special for tmpfs.

However, I must echo Eric's point: please take a look at 7e496299d4d2
("tmpfs: make tmpfs scalable with percpu_counter for used blocks"):
Tim would be unhappy if you added overhead back into that path.
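
For readers unfamiliar with the technique, here is a simplified,
single-threaded userspace model of a batched per-CPU counter, the general
idea behind the kernel's percpu_counter. It is an illustration under stated
assumptions, not the kernel implementation: NR_SLOTS, BATCH and the bc_*
names are made up, and the lock that would protect the shared total is only
marked by a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs (assumption) */
#define BATCH    32	/* fold a local delta into the total at this size */

struct batched_counter {
	int64_t total;			/* shared value, lock-protected in real code */
	int64_t local[NR_SLOTS];	/* per-slot deltas, updated without the lock */
	unsigned long folds;		/* number of updates to the shared total */
};

/* Add a (possibly negative) amount on behalf of one slot. */
static void bc_add(struct batched_counter *c, int slot, int64_t amount)
{
	int64_t v;

	assert(slot >= 0 && slot < NR_SLOTS);
	v = c->local[slot] + amount;
	if (llabs(v) >= BATCH) {
		/* Slow path: a real implementation takes the lock here. */
		c->total += v;
		c->local[slot] = 0;
		c->folds++;
	} else {
		/* Fast path: no shared state is touched. */
		c->local[slot] = v;
	}
}

/* Cheap read of the shared total; ignores deltas still held locally. */
static int64_t bc_read_approx(const struct batched_counter *c)
{
	return c->total;
}

/* Exact read: walks every slot, so it is the expensive operation. */
static int64_t bc_sum(const struct batched_counter *c)
{
	int64_t sum = c->total;
	int i;

	for (i = 0; i < NR_SLOTS; i++)
		sum += c->local[i];
	return sum;
}
```

The point for this thread is that the common case touches only local state,
which a global spinlock in the same path would undo.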

(And please Cc linux-fsde...@vger.kernel.org next time you post these.)

Hugh


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-15 Thread Eric Sandeen
On 4/15/15 2:15 AM, Beata Michalska wrote:
> Introduce configurable generic interface for file
> system-wide event notifications to provide file
> systems with a common way of reporting any potential
> issues as they emerge.
> 
> The notifications are to be issued through generic
> netlink interface, by a dedicated, for file system
> events, multicast group. The file systems might as
> well use this group to send their own custom messages.

...

> + 4.3 Threshold notifications:
> +
> + #include <linux/fs_event.h>
> + void fs_event_alloc_space(struct super_block *sb, u64 ncount);
> + void fs_event_free_space(struct super_block *sb, u64 ncount);
> +
> + Each filesystem supporting the threshold notifications should call
> + fs_event_alloc_space/fs_event_free_space respectively whenever the
> + amount of available blocks changes.
> + - sb: the filesystem's super block
> + - ncount: number of blocks being acquired/released

so:

> +void fs_event_alloc_space(struct super_block *sb, u64 ncount)
> +{
> + struct fs_trace_entry *en;
> + s64 count;
> +
> + spin_lock(&fs_trace_lock);

Every allocation/free for every supported filesystem system-wide will be
serialized on this global spinlock?  That sounds like a non-starter...

-Eric



Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-15 Thread Darrick J. Wong
On Wed, Apr 15, 2015 at 09:15:44AM +0200, Beata Michalska wrote:
> Introduce configurable generic interface for file
> system-wide event notifications to provide file
> systems with a common way of reporting any potential
> issues as they emerge.
> 
> The notifications are to be issued through generic
> netlink interface, by a dedicated, for file system
> events, multicast group. The file systems might as
> well use this group to send their own custom messages.
> 
> The events have been split into four base categories:
> information, warnings, errors and threshold notifications,
> with some very basic event types like running out of space
> or file system being remounted as read-only.
> 
> Threshold notifications have been included to allow
> triggering an event whenever the amount of free space
> drops below a certain level - or levels to be more precise
> as two of them are being supported: the lower and the upper
> range. The notifications work both ways: once the threshold
> level has been reached, an event shall be generated whenever
> the number of available blocks goes up again re-activating
> the threshold.
> 
> The interface has been exposed through a vfs. Once mounted,
> it serves as an entry point for the set-up where one can
> register for particular file system events.
> 
> Signed-off-by: Beata Michalska <b.michal...@samsung.com>
> ---
>  Documentation/filesystems/events.txt |  254 +++
>  fs/Makefile                          |    1 +
>  fs/events/Makefile                   |    6 +
>  fs/events/fs_event.c                 |  775 ++
>  fs/events/fs_event.h                 |   27 ++
>  fs/events/fs_event_netlink.c         |   94 +
>  fs/namespace.c                       |    1 +
>  include/linux/fs.h                   |    6 +-
>  include/linux/fs_event.h             |   69 +++
>  include/uapi/linux/fs_event.h        |   62 +++
>  include/uapi/linux/genetlink.h       |    1 +
>  net/netlink/genetlink.c              |    7 +-
>  12 files changed, 1301 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/filesystems/events.txt
>  create mode 100644 fs/events/Makefile
>  create mode 100644 fs/events/fs_event.c
>  create mode 100644 fs/events/fs_event.h
>  create mode 100644 fs/events/fs_event_netlink.c
>  create mode 100644 include/linux/fs_event.h
>  create mode 100644 include/uapi/linux/fs_event.h
> 
> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
> new file mode 100644
> index 000..c85dd88
> --- /dev/null
> +++ b/Documentation/filesystems/events.txt
> @@ -0,0 +1,254 @@
> +
> + Generic file system event notification interface
> +
> +Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
> +
> +1. The reason behind:
> +=====================
> +
> +There are many corner cases when things might get messy with the filesystems.
> +And it is not always obvious what went wrong and when. Sometimes you might
> +get some subtle hints that there is something going on - but by the time
> +you realise it, it might be too late as you are already out-of-space
> +or the filesystem has been remounted as read-only (for example). The generic
> +interface for the filesystem events fills the gap by providing a rather
> +easy way of real-time notifications triggered whenever something interesting
> +happens, allowing filesystems to report events in a common way, as they occur.
> +
> +2. How does it work:
> +====================
> +
> +The interface itself has been exposed as an fstrace-type Virtual File System,
> +primarily to ease the process of setting up the configuration for the file
> +system notifications. So for starters it needs to get mounted (obviously):
> +
> +	mount -t fstrace none /sys/fs/events
> +
> +This will unveil the single fstrace filesystem entry - the 'config' file,
> +through which the notifications are set up.
> +
> +Activating notifications for a particular filesystem is as straightforward
> +as writing into the 'config' file. Note that by default all events, regardless
> +of the actual filesystem type, are disregarded.
> +
> +Synopsis of config:
> +--
> +
> +	MOUNT EVENT_TYPE [L1] [L2]
> +
> + MOUNT      : the filesystem's mount point
> + EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
> +              at least one type needs to be specified;
> +              note the comma delimiter and lack of spaces between
> +              those options
> + L1         : the threshold limit - lower range
> + L2         : the threshold limit - upper range
> +              when enabling threshold notifications the lower level is
> +              mandatory, whereas the upper one remains optional;
> +              note though, that as those refer to the number of available
> +              blocks, the lower level needs to be higher than the upper one
> +
> +Sample request could look like the following:
> +
> + echo /sample/mount/point warn,err,thr 71 50 > /sys/fs/events/config
> +
> +Multiple request 

[RFC 1/4] fs: Add generic file system event notifications

2015-04-15 Thread Beata Michalska
Introduce configurable generic interface for file
system-wide event notifications to provide file
systems with a common way of reporting any potential
issues as they emerge.

The notifications are to be issued through generic
netlink interface, by a dedicated, for file system
events, multicast group. The file systems might as
well use this group to send their own custom messages.

The events have been split into four base categories:
information, warnings, errors and threshold notifications,
with some very basic event types like running out of space
or file system being remounted as read-only.

Threshold notifications have been included to allow
triggering an event whenever the amount of free space
drops below a certain level - or levels to be more precise
as two of them are being supported: the lower and the upper
range. The notifications work both ways: once the threshold
level has been reached, an event shall be generated whenever
the number of available blocks goes up again re-activating
the threshold.
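
One possible reading of the two-level, two-way behaviour described above is
sketched below as a small userspace state machine. It is not taken from the
patch: the thr_* names and THR_* flags are hypothetical, and it assumes that
"drops below" means strictly less than the level and that a level re-arms
once the count is back at or above it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
	THR_LOWER_HIT     = 1 << 0,	/* available blocks fell below L1 */
	THR_UPPER_HIT     = 1 << 1,	/* available blocks fell below L2 */
	THR_LOWER_REARMED = 1 << 2,	/* climbed back to L1 or above */
	THR_UPPER_REARMED = 1 << 3,	/* climbed back to L2 or above */
};

struct thr_state {
	uint64_t l1;		/* lower range: the larger block count */
	uint64_t l2;		/* upper range: the smaller count, 0 if unused */
	bool lower_fired;
	bool upper_fired;
};

static void thr_init(struct thr_state *t, uint64_t l1, uint64_t l2)
{
	assert(l2 == 0 || l1 > l2);	/* levels count available blocks */
	t->l1 = l1;
	t->l2 = l2;
	t->lower_fired = false;
	t->upper_fired = false;
}

/* Feed the current number of available blocks; returns THR_* event flags. */
static unsigned int thr_update(struct thr_state *t, uint64_t avail)
{
	unsigned int ev = 0;

	if (!t->lower_fired && avail < t->l1) {
		t->lower_fired = true;
		ev |= THR_LOWER_HIT;
	} else if (t->lower_fired && avail >= t->l1) {
		t->lower_fired = false;
		ev |= THR_LOWER_REARMED;
	}

	if (t->l2) {
		if (!t->upper_fired && avail < t->l2) {
			t->upper_fired = true;
			ev |= THR_UPPER_HIT;
		} else if (t->upper_fired && avail >= t->l2) {
			t->upper_fired = false;
			ev |= THR_UPPER_REARMED;
		}
	}
	return ev;
}
```

Each level fires once on the way down and once on the way back up, so a
count hovering below a level does not produce a stream of events.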

The interface has been exposed through a vfs. Once mounted,
it serves as an entry point for the set-up where one can
register for particular file system events.

Signed-off-by: Beata Michalska <b.michal...@samsung.com>
---
 Documentation/filesystems/events.txt |  254 +++
 fs/Makefile                          |    1 +
 fs/events/Makefile                   |    6 +
 fs/events/fs_event.c                 |  775 ++
 fs/events/fs_event.h                 |   27 ++
 fs/events/fs_event_netlink.c         |   94 +
 fs/namespace.c                       |    1 +
 include/linux/fs.h                   |    6 +-
 include/linux/fs_event.h             |   69 +++
 include/uapi/linux/fs_event.h        |   62 +++
 include/uapi/linux/genetlink.h       |    1 +
 net/netlink/genetlink.c              |    7 +-
 12 files changed, 1301 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/filesystems/events.txt
 create mode 100644 fs/events/Makefile
 create mode 100644 fs/events/fs_event.c
 create mode 100644 fs/events/fs_event.h
 create mode 100644 fs/events/fs_event_netlink.c
 create mode 100644 include/linux/fs_event.h
 create mode 100644 include/uapi/linux/fs_event.h

diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
new file mode 100644
index 000..c85dd88
--- /dev/null
+++ b/Documentation/filesystems/events.txt
@@ -0,0 +1,254 @@
+
+   Generic file system event notification interface
+
+Document created 09 April 2015 by Beata Michalska <b.michal...@samsung.com>
+
+1. The reason behind:
+=====================
+
+There are many corner cases when things might get messy with the filesystems.
+And it is not always obvious what went wrong and when. Sometimes you might
+get some subtle hints that there is something going on - but by the time
+you realise it, it might be too late as you are already out-of-space
+or the filesystem has been remounted as read-only (for example). The generic
+interface for the filesystem events fills the gap by providing a rather
+easy way of real-time notifications triggered whenever something interesting
+happens, allowing filesystems to report events in a common way, as they occur.
+
+2. How does it work:
+====================
+
+The interface itself has been exposed as an fstrace-type Virtual File System,
+primarily to ease the process of setting up the configuration for the file
+system notifications. So for starters it needs to get mounted (obviously):
+
+	mount -t fstrace none /sys/fs/events
+
+This will unveil the single fstrace filesystem entry - the 'config' file,
+through which the notifications are set up.
+
+Activating notifications for a particular filesystem is as straightforward
+as writing into the 'config' file. Note that by default all events, regardless
+of the actual filesystem type, are disregarded.
+
+Synopsis of config:
+--
+
+	MOUNT EVENT_TYPE [L1] [L2]
+
+ MOUNT      : the filesystem's mount point
+ EVENT_TYPE : type of events to be enabled: info,warn,err,thr;
+              at least one type needs to be specified;
+              note the comma delimiter and lack of spaces between
+              those options
+ L1         : the threshold limit - lower range
+ L2         : the threshold limit - upper range
+              when enabling threshold notifications the lower level is
+              mandatory, whereas the upper one remains optional;
+              note though, that as those refer to the number of available
+              blocks, the lower level needs to be higher than the upper one
+
+Sample request could look like the following:
+
+ echo /sample/mount/point warn,err,thr 71 50 > /sys/fs/events/config
+
+Multiple requests might be specified provided they are separated with a semicolon.
+
+The configuration itself might be modified at any time. One can add/remove
+particular event types for a given filesystem, modify the threshold levels,
+and remove single or all entries from the 'config' file.
+
+ - 
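
To make the 'config' request format from the documentation above concrete,
the following userspace sketch parses a single "MOUNT EVENT_TYPE [L1] [L2]"
request and applies the stated rules (at least one event type, L1 mandatory
with thr, L1 higher than L2). It is an assumption-laden illustration, not the
parser from the patch: struct fs_request, the EV_* flags and parse_request()
are hypothetical names, and semicolon-separated multiple requests are not
handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { EV_INFO = 1, EV_WARN = 2, EV_ERR = 4, EV_THR = 8 };

struct fs_request {
	char mount[256];		/* the filesystem's mount point */
	unsigned int types;		/* bitmask of EV_* flags */
	unsigned long long l1, l2;	/* threshold levels, 0 if absent */
};

/* Returns 0 on success, -1 on a malformed request. */
static int parse_request(const char *line, struct fs_request *req)
{
	char types[64];
	char *tok;
	int n;

	memset(req, 0, sizeof(*req));
	n = sscanf(line, "%255s %63s %llu %llu",
		   req->mount, types, &req->l1, &req->l2);
	if (n < 2)
		return -1;	/* mount point and event types are mandatory */

	/* Event types are comma-delimited with no spaces in between. */
	for (tok = strtok(types, ","); tok; tok = strtok(NULL, ",")) {
		if (!strcmp(tok, "info"))
			req->types |= EV_INFO;
		else if (!strcmp(tok, "warn"))
			req->types |= EV_WARN;
		else if (!strcmp(tok, "err"))
			req->types |= EV_ERR;
		else if (!strcmp(tok, "thr"))
			req->types |= EV_THR;
		else
			return -1;	/* unknown event type */
	}
	if (!req->types)
		return -1;

	if (req->types & EV_THR) {
		if (n < 3)
			return -1;	/* L1 is mandatory with thr */
		if (n == 4 && req->l1 <= req->l2)
			return -1;	/* L1 must be higher than L2 */
	}
	return 0;
}
```

With the sample request from the documentation, "/sample/mount/point
warn,err,thr 71 50", this yields the warn, err and thr flags with L1 = 71
and L2 = 50.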
