On 10/09/2013 11:22 AM, Pruner, Anne (Anne) wrote:

I'm evaluating Gluster for use in our product, and I want to make sure I understand the failover behavior. What I'm seeing isn't great, but from the docs I've read it doesn't look like this is what everyone else is experiencing.

Is this normal?

Setup:

-one volume, distributed, replicated (2), with two bricks on two different servers

-35,000 files on the volume, about 1MB each, all in one directory (I'm open to changing this, if that's the problem. ls -l takes a /really/ long time)

-volume is mounted (mount -t glusterfs) on server1
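For reference, a setup like the one described above would typically be created along these lines (volume name, brick paths, and mount point here are hypothetical placeholders, not taken from the original message):

```shell
# Hypothetical names: volume "gv0", brick path /data/brick1, mount point /mnt/gv0
# Two-server replica-2 volume, then a FUSE mount on server1
gluster volume create gv0 replica 2 server1:/data/brick1 server2:/data/brick1
gluster volume start gv0
mount -t glusterfs server1:/gv0 /mnt/gv0
```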

Procedure:

-I stop glusterd and glusterfsd on server1, then send a few files to the volume. This is fine: I can write and read the files.

-I start glusterd on server1, and this starts glusterfsd. This triggers self-heal.

-Send a file to the server, and try to read it.

-Sending takes a *couple of minutes*.  Reading is immediate.

-Once self-heal is done, subsequent sends and reads are immediate.
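One way to confirm the slow sends coincide with pending self-heal is to watch the heal queue while repeating the procedure (volume name "gv0" is a hypothetical placeholder):

```shell
# Hypothetical volume name "gv0": list entries still pending self-heal
# on each brick; the slow write should line up with a non-empty queue
gluster volume heal gv0 info
```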

I tried profiling this operation, and it seems like it's stuck on locking the file:

[Profiling deleted]

Any ideas?

Thanks,

Anne


What I suspect is happening is those 35k files are all being checked for self-heal before the directory can be regarded as clean and ready to lock. An easy way to test this would be to try writing to a file in a nearly empty directory and see if you get the same results.
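The test suggested above could be run roughly like this (mount point and directory names are hypothetical placeholders; `dd` with `conv=fsync` forces the write to actually hit the bricks before `time` reports):

```shell
# Hypothetical mount point /mnt/gv0: compare write latency in a fresh,
# nearly empty directory vs. the existing 35k-file directory during heal
mkdir -p /mnt/gv0/empty-test
time dd if=/dev/zero of=/mnt/gv0/empty-test/probe bs=1M count=1 conv=fsync
time dd if=/dev/zero of=/mnt/gv0/bigdir/probe bs=1M count=1 conv=fsync
```

If only the second write stalls, that points at the per-directory heal check rather than the volume as a whole.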

If you are using a current kernel, or an EL kernel with current backports, mounting with use-readdirp=on will make directory reads faster. Not sure how much faster with 35k files, though; I'd be interested in finding out.
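A sketch of the suggested mount, assuming the same hypothetical volume/mount names as above (the option is passed at FUSE mount time, not set on the volume):

```shell
# Hypothetical names; requires readdirplus support in the client kernel
mount -t glusterfs -o use-readdirp=on server1:/gv0 /mnt/gv0

# or the equivalent /etc/fstab entry:
# server1:/gv0  /mnt/gv0  glusterfs  use-readdirp=on  0 0
```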
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users