The issue I would be concerned about with these races is that rsyslog can issue large writes to the OS. The OS can then split those writes on block-size boundaries when it sends them on to the filesystem. Since the block boundaries won't match up with line boundaries, you can end up with parts of lines from different systems combined into one line in the resulting file.
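
To illustrate the splitting described above, here is a rough simulation in Python. It is only a sketch: the 16-byte block size, the message text, and the strict alternation between the two writers are all assumptions made so the effect is easy to see, and the helper names are hypothetical. A real OS or filesystem may order the pieces differently.

```python
def split_into_blocks(data, block_size):
    """Split one large write into block-sized pieces, as a lower layer might."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def interleave(blocks_a, blocks_b):
    """Alternate blocks from two writers (one possible ordering among many)."""
    out = []
    for i in range(max(len(blocks_a), len(blocks_b))):
        if i < len(blocks_a):
            out.append(blocks_a[i])
        if i < len(blocks_b):
            out.append(blocks_b[i])
    return b"".join(out)


BLOCK_SIZE = 16  # assumed, and deliberately tiny

# Each writer hands the OS one large buffer holding two complete lines.
writer_a = b"hostA: first message\nhostA: second message\n"
writer_b = b"hostB: first message\nhostB: second message\n"

merged = interleave(split_into_blocks(writer_a, BLOCK_SIZE),
                    split_into_blocks(writer_b, BLOCK_SIZE))

# No bytes are lost, but the line boundaries no longer belong to one writer.
for line in merged.split(b"\n"):
    print(line)
```

Every byte from both writers is still in the file, but the first line of the result starts with a fragment from one host and ends with text from the other.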

This is what typically happens on a local filesystem if you have two processes writing to the same file without explicit locking between them. Having the disk (or in this case, GlusterFS) try to guess what that locking should be between two processes just doesn't work well.
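
For reference, explicit locking between cooperating writers on a local Unix filesystem could look roughly like the sketch below. `append_locked` is a hypothetical helper, not anything rsyslog provides, and it uses advisory `flock` locks, which only help if every writer takes the same lock. Whether such locks are honored between clients on different machines depends on the distributed filesystem, so treat this as an illustration of the idea rather than a fix for GlusterFS.

```python
import fcntl
import os
import tempfile


def append_locked(path, lines):
    """Append complete lines while holding an exclusive advisory lock."""
    data = "".join(line.rstrip("\n") + "\n" for line in lines).encode()
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            view = memoryview(data)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


# Usage: two "writers" appending to the same file, one after the other.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "shared.log")
    append_locked(log_path, ["hostA: first message", "hostA: second message"])
    append_locked(log_path, ["hostB: first message"])
    with open(log_path) as f:
        contents = f.read()

print(contents, end="")
```

Because the whole batch is written while the lock is held, another writer using the same helper cannot insert its data in the middle of a batch.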

You can fake it without explicit locking if you make sure that all of your writes are smaller than the block size, you disable buffering in the language, and you make sure that the application never writes part of a line. But rsyslog can handle lines larger than the block size (if the message size is set large enough), and it can output multiple messages at once, so I would not expect rsyslog to work well in these conditions.
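
As a sketch of those three conditions (writes smaller than the block size, no language-level buffering, never a partial line), an application could do something like the following. The 4096-byte block size is an assumption, since the real value depends on the filesystem, and `write_whole_line` is a hypothetical helper. Even with all three conditions met, this offers no guarantee on a distributed filesystem.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed; the real value depends on the filesystem


def write_whole_line(fd, message):
    """Write exactly one complete line with a single unbuffered write call."""
    data = message.rstrip("\n").encode() + b"\n"
    if len(data) >= BLOCK_SIZE:
        # A line this long could be split across blocks, so refuse it.
        raise ValueError("line is not smaller than the block size")
    written = os.write(fd, data)  # os.write bypasses Python's own buffering
    if written != len(data):
        raise OSError("short write: a partial line reached the file")
    return written


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shared.log")
    # O_APPEND asks the OS to place each write at the current end of file.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        write_whole_line(fd, "hostA: short message")
        try:
            write_whole_line(fd, "x" * BLOCK_SIZE)
            rejected = False
        except ValueError:
            rejected = True
    finally:
        os.close(fd)
    with open(path) as f:
        contents = f.read()

print(rejected)
print(contents, end="")
```

The oversized line is rejected before anything is written, which is exactly the restriction rsyslog cannot accept if it is configured for large messages or batched output.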

David Lang

On Thu, 28 Mar 2013, Gregory Patmore wrote:

It seems GlusterFS 'should' handle the races with its locking translator, but like 
every distributed <add noun here>, it has its throughput limits.

So I guess it would depend on the load of your use case.

A quick Google search turned up some stories of race condition drama with 
GlusterFS ( https://www.google.com/search?q=glusterfs+race+conditions ), but 
I'm sure it would be just as easy to find similar stories for any filesystem.

Greg


On Mar 28, 2013, at 5:03 AM, David Lang <[email protected]> wrote:

On Wed, 27 Mar 2013, Jiann-Ming Su wrote:

Is there a version of rsyslog that can properly use distributed filesystems 
like GlusterFS?  For example, I have two nodes each running rsyslog but also 
sharing a GlusterFS filesystem.  Can those independent rsyslog processes 
running on different servers write to the same file on the GlusterFS?

rsyslog doesn't have any code in it that cares what filesystem it uses.

What would it have to do to "properly use distributed filesystems like 
GlusterFS"?

I would expect that different distributed filesystems have different behavior 
if multiple clients are writing to the same file at the same time.

If nothing else, without explicit locking, I would expect that you will end up 
with race conditions between the different writers, which can cause writes from 
the different writers to be intermingled (and intermingled in chunks that make 
sense to the OS layer, not to the application layer).

I know that some distributed filesystems 'handle' this problem by only allowing 
one machine/process to have a given file open for writing at a time; other 
attempts to open the file either fail or block. What does GlusterFS do?

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.