Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray



On 17/04/2015 17:22, Jan Kara wrote:

On Fri 17-04-15 17:08:10, John Spray wrote:

On 17/04/2015 16:43, Jan Kara wrote:
In that case I'm confused -- why would ENOSPC be an appropriate use
of this interface if the mount being entirely blocked would be
inappropriate?  Isn't being unable to service any I/O a more
fundamental and severe thing than being up and healthy but full?

Were you intending the interface to be exclusively for data
integrity issues like checksum failures, rather than more general
events about a mount that userspace would probably like to know
about?

   Well, I'm not saying we cannot have those events for fs availability /
unavailability. I'm just saying I'd like to see some use for that first.
I don't want events to be added just because it's possible...

For ENOSPC we have thin provisioned storage and the userspace daemon
shuffling real storage underneath. So there I know the use case.



Ah, OK.  So I can think of a couple of use cases:
 * a cluster scheduling service (think MPI jobs or docker containers) 
might check for events like this.  If it can see the cluster filesystem 
is unavailable, then it can avoid scheduling the job, so that the 
(multi-node) application does not get hung on one node with a bad 
mount.  If it sees a mount go bad (unavailable, or client evicted) 
partway through a job, then it can kill -9 the process that was relying 
on the bad mount, and go run it somewhere else.
 * Boring but practical case: a Nagios health check that verifies 
mounts are OK.
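
To make the scheduler case a bit more concrete, here is a minimal sketch in C of the decision logic such a service might apply. Everything in it is an assumption for illustration: the event codes (MOUNT_AVAILABLE, MOUNT_UNAVAILABLE, MOUNT_CLIENT_EVICTED), the mount_state struct and both functions are hypothetical names and are not part of the RFC, which does not define availability events at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes; the RFC under discussion does not define these. */
enum mount_event {
    MOUNT_AVAILABLE,
    MOUNT_UNAVAILABLE,
    MOUNT_CLIENT_EVICTED
};

/* What the scheduler should do with a job that relies on the mount. */
enum sched_action {
    SCHED_NONE,          /* nothing to do */
    SCHED_KILL_AND_MOVE  /* kill -9 the job and run it somewhere else */
};

/* Per-mount state tracked by the (hypothetical) scheduling service. */
struct mount_state {
    bool healthy;
};

/* A new job may be placed on this node only while the mount is healthy. */
static bool can_schedule(const struct mount_state *m)
{
    assert(m != NULL);
    return m->healthy;
}

/*
 * Record an event for the mount and report what to do about a job that
 * is currently running on it. Only a healthy -> bad transition with a
 * running job asks for the job to be killed and moved.
 */
static enum sched_action on_mount_event(struct mount_state *m,
                                        enum mount_event ev,
                                        bool job_running)
{
    bool was_healthy;

    assert(m != NULL);
    was_healthy = m->healthy;
    m->healthy = (ev == MOUNT_AVAILABLE);

    if (was_healthy && !m->healthy && job_running)
        return SCHED_KILL_AND_MOVE;
    return SCHED_NONE;
}
```

The service would call can_schedule() before placing a job and on_mount_event() for each notification it receives. How the notifications actually reach userspace is left out on purpose, since that is what the patch set is still working out.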


We don't have to invent these event types now of course, but it is 
something to bear in mind.  Hopefully if/when any of the distributed 
filesystems (Lustre/Ceph/etc) choose to implement this, we can look at 
making the event types common at that time.


BTW in any case an interface for filesystem events to userspace will be 
a useful addition, thank you!


Cheers,
John
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray


On 17/04/2015 16:43, Jan Kara wrote:

On Fri 17-04-15 15:51:14, John Spray wrote:

On 17/04/2015 14:23, Austin S Hemmelgarn wrote:


For some filesystems, it may make sense to differentiate between a
generic warning and an error.  For BTRFS and ZFS for example, if
there is a csum error on a block, this will get automatically
corrected in many configurations, and won't require anything like
fsck to be run, but monitoring applications will still probably
want to be notified.

Another key differentiation IMHO is between transient errors (like
server is unavailable in a distributed filesystem) that will block
the filesystem but might clear on their own, vs. permanent errors
like unreadable drives that definitely will not clear until the
administrator takes some action.  It's usually a reasonable
approximation to call transient issues warnings, and permanent
issues errors.

   So you can have events like FS_UNAVAILABLE and FS_AVAILABLE but what use
would this have? I wouldn't like the interface to be a dumping ground for
random crap - we have dmesg for that :).

In that case I'm confused -- why would ENOSPC be an appropriate use of 
this interface if the mount being entirely blocked would be 
inappropriate?  Isn't being unable to service any I/O a more fundamental 
and severe thing than being up and healthy but full?


Were you intending the interface to be exclusively for data integrity 
issues like checksum failures, rather than more general events about a 
mount that userspace would probably like to know about?


John


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray

On 17/04/2015 14:23, Austin S Hemmelgarn wrote:

On 2015-04-17 09:04, Beata Michalska wrote:

On 04/17/2015 01:31 PM, Jan Kara wrote:

On Wed 15-04-15 09:15:44, Beata Michalska wrote:
...

+static const match_table_t fs_etypes = {
+	{ FS_EVENT_INFO,   "info" },
+	{ FS_EVENT_WARN,   "warn" },
+	{ FS_EVENT_THRESH, "thr"  },
+	{ FS_EVENT_ERR,    "err"  },
+	{ 0, NULL },
+};
   Why are there these generic message types? Threshold messages make good
sense to me. But not so much the rest. If they don't have a clear meaning,
it will be a mess. So I also agree with a message like - "filesystem has
trouble, you should probably unmount and run fsck" - that's fine. But
generic "info" or "warning" doesn't really carry any meaning on its own and
thus seems pretty useless to me. To explain a bit more, AFAIU this
shouldn't be a generic logging interface where something like severity
makes sense but rather a relatively specific interface notifying about
events in filesystem userspace should know about so I expect relatively low
number of types of events, not tens or even hundreds...

Honza


Getting rid of those would simplify the configuration part, indeed.
So we would be left with 'generic' and threshold events.
I guess I've overdone this part.


For some filesystems, it may make sense to differentiate between a 
generic warning and an error.  For BTRFS and ZFS for example, if there 
is a csum error on a block, this will get automatically corrected in 
many configurations, and won't require anything like fsck to be run, 
but monitoring applications will still probably want to be notified.


Another key differentiation IMHO is between transient errors (like 
server is unavailable in a distributed filesystem) that will block the 
filesystem but might clear on their own, vs. permanent errors like 
unreadable drives that definitely will not clear until the administrator 
takes some action.  It's usually a reasonable approximation to call 
transient issues warnings, and permanent issues errors.
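
As a rough illustration of that approximation, here is a minimal sketch in C. The names FS_EVENT_WARN and FS_EVENT_ERR follow the fs_etypes table quoted above, but the numeric values are placeholders, and struct fs_fault and fs_fault_severity() are hypothetical helpers invented for this example, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Names follow the fs_etypes table quoted above; the numeric values
 * are placeholders chosen for this sketch.
 */
enum fs_event_type {
    FS_EVENT_WARN = 1,
    FS_EVENT_ERR  = 2
};

/* Hypothetical description of a fault, as a filesystem might see it. */
struct fs_fault {
    bool auto_corrected;        /* e.g. csum error repaired from a good copy */
    bool may_clear_on_its_own;  /* e.g. server unavailable in a cluster fs */
};

/*
 * Transient or self-healed faults are reported as warnings; anything
 * that needs the administrator to act is reported as an error.
 */
static enum fs_event_type fs_fault_severity(const struct fs_fault *f)
{
    assert(f != NULL);
    if (f->auto_corrected || f->may_clear_on_its_own)
        return FS_EVENT_WARN;
    return FS_EVENT_ERR;
}
```

With this mapping a corrected csum error and an unreachable server both come out as warnings, while an unreadable drive comes out as an error. It is only an approximation, as said above; a real filesystem would likely need more than two booleans to decide.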


John





