[Gluster-users] MapReduce on glusterfs in Hadoop
Hi All,

Thanks a lot for taking the time to answer my question. I am trying to implement a file system in Hadoop under the org.apache.hadoop.fs package, something similar to KFS, GlusterFS, etc. I noticed that the README.txt of glusterfs mentions:

# ./bin/start-mapred.sh

If the map/reduce job/task trackers are up, all I/O will be done to GlusterFS.

So, suppose my input files are scattered across different nodes (GlusterFS servers); how do I (a Hadoop client with GlusterFS plugged in) issue a MapReduce job? Moreover, after issuing a MapReduce job, would my Hadoop client fetch all the data from the different servers to my local machine and then run MapReduce, or would it start the TaskTracker daemons on the machine(s) where the input file(s) are located and run MapReduce there?

Please correct me if I am wrong, but I assume the locations of the input files to MapReduce are returned by the function getFileBlockLocations(FileStatus file, long start, long len).

Thank you very much for your time and for helping me out.

Regards,
Nikhil
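[Editorial note on the getFileBlockLocations hypothesis: this is indeed the Hadoop FileSystem API that FileInputFormat consults when computing input splits, and the hosts a plugged-in filesystem returns there are what the JobTracker uses for data-local task scheduling. Below is a minimal, generic sketch of querying it; it is not code from the glusterfs-hadoop plugin, and the class name is invented for illustration.]

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocationProbe {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Pass a path on whichever FileSystem is configured (HDFS, GlusterFS, ...).
            Path input = new Path(args[0]);
            FileSystem fs = input.getFileSystem(conf);
            FileStatus stat = fs.getFileStatus(input);
            // FileInputFormat calls this to decide where each split's data lives,
            // so that tasks can be scheduled close to the data.
            BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
            for (BlockLocation loc : locs) {
                System.out.println("offset=" + loc.getOffset()
                        + " length=" + loc.getLength()
                        + " hosts=" + Arrays.toString(loc.getHosts()));
            }
        }
    }

Whether tasks actually run on the machines holding the data then depends on what the plugged-in FileSystem returns here and on where TaskTrackers are running.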
[Gluster-users] help posix_fallocate too slow on glusterfs client
Dear gluster experts,

Recently I have encountered a problem with posix_fallocate performance on a glusterfs client. I use posix_fallocate to allocate a file of a specified size on the glusterfs client. For example, creating a file of 1907658896 bytes takes about 20 seconds on glusterfs, but on local xfs or ext4 it takes less than 1 second. What is the problem, and how can I improve posix_fallocate performance on glusterfs? Thank you very much.

BTW, the volume info is as below:

sudo gluster volume info volume_e

Volume Name: volume_e
Type: Distributed-Replicate
Volume ID: 81702024-f327-4ae1-b06a-1f2b877d5ebb
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: glusterfs-test-dev001.qiyi.virtual:/mnt/xfsd/volume_e
Brick2: glusterfs-test-dev002.qiyi.virtual:/mnt/xfsd/volume_e
Brick3: glusterfs-test-dev003.qiyi.virtual:/mnt/xfsd/volume_e
Brick4: glusterfs-test-dev004.qiyi.virtual:/mnt/xfsd/volume_e

--
符永涛
[Gluster-users] Sync status of underlying bricks for replicated volume
Hi Gluster Experts,

I am trying to integrate glusterfs as a replication file system in my product. As background, I will be creating a distributed replicated (glusterfs) volume on two bricks, each on a different server. Can you please guide me on how to find out the sync status of all bricks under one volume, i.e., at a given point in time, whether all underlying bricks are in sync, out of sync, or still synchronizing? Please note I am new to glusterfs.

Thanks in anticipation.

Regards,
Sejal
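[Editorial note, not from the original thread: on GlusterFS 3.3 and later, one common starting point for checking replica sync state is the self-heal info CLI; the volume name below is a placeholder, and the exact subcommands should be verified against the installed version.]

To list entries still pending self-heal on each brick (no entries generally means the replicas are in sync):

    gluster volume heal <VOLNAME> info

To list entries that failed to heal or are in split-brain:

    gluster volume heal <VOLNAME> info heal-failed
    gluster volume heal <VOLNAME> info split-brain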
Re: [Gluster-users] I/O error repaired only by owner or root access
Hi Dan,

Could you please provide the following info: (1) the exact permissions of the file you are accessing and its parent directory, (2) the user from which 'ls -l' is issued, and (3) the owner of the file and of the parent directory. You could open a bug for it if it is seen several times.

Regards,
Rajesh Amaravathi
Software Engineer, GlusterFS
Red Hat Inc.

----- Original Message -----
From: Dan Bretherton d.a.brether...@reading.ac.uk
To: gluster-users gluster-users@gluster.org
Sent: Thursday, February 21, 2013 8:16:40 PM
Subject: [Gluster-users] I/O error repaired only by owner or root access

Dear All-

Several users are having a lot of trouble reading files belonging to other users. Here is an example.

[sms05dab@jupiter ~]$ ls -l /users/gcs/WORK/ORCA1/ORCA1-R07-MEAN/Ctl
ls: /users/gcs/WORK/ORCA1/ORCA1-R07-MEAN/Ctl: Input/output error

The corresponding nfs.log messages are shown below.

[2013-02-21 12:11:39.204659] W [nfs3.c:727:nfs3svc_getattr_stat_cbk] 0-nfs: fe2ba5b8: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN => -1 (Invalid argument)
[2013-02-21 12:11:39.204778] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: fe2ba5b8, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)
[2013-02-21 12:11:39.215345] I [dht-common.c:954:dht_lookup_everywhere_cbk] 0-nemo2-dht: deleting stale linkfile /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN on nemo2-replicate-0
[2013-02-21 12:11:39.225674] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-nemo2-client-1: remote operation failed: Permission denied
[2013-02-21 12:11:39.225786] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-nemo2-client-0: remote operation failed: Permission denied
[2013-02-21 12:11:39.681029] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-nemo2-client-18: remote operation failed: Permission denied. Path: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN (1662aa0a-d43b-4c2e-9be9-407eb7a89e85)
[2013-02-21 12:11:39.681400] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-nemo2-client-19: remote operation failed: Permission denied. Path: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN (1662aa0a-d43b-4c2e-9be9-407eb7a89e85)
[2013-02-21 12:11:39.682268] W [nfs3.c:1627:nfs3svc_readlink_cbk] 0-nfs: 2ca5b8: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN => -1 (Invalid argument)
[2013-02-21 12:11:39.682338] W [nfs3-helpers.c:3403:nfs3_log_readlink_res] 0-nfs-nfsv3: XID: 2ca5b8, READLINK: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument), target: (null)

I managed to access the same directory as the owner (or as any user with write access, including root) without any trouble, and after that, access from my normal user account was fine as well. The permissions on the directory allowed read access by everyone, but the "Permission denied" messages in nfs.log indicate that some sort of operation is not being allowed when the directory is accessed by other users. I have seen this happen with files and directories, and with both the GlusterFS native client and NFS.

I presume this is a bug; I would be grateful if someone could confirm this. I would file a bug report, but the trouble is that I don't know how to reproduce the problem that causes the I/O error in the first place. It only happens with some files and directories, not all. Would a bug report without any way to reproduce the error be any use, and can anyone suggest a way to dig deeper (e.g. looking at xattrs) next time I come across an example?

-Dan.
--
Dan Bretherton
ESSC Computer System Manager
Department of Meteorology
Harry Pitt Building, 3 Earley Gate
University of Reading
Reading, RG6 7BE (or RG6 6AL for postal service deliveries), UK
Tel. +44 118 378 5205, Fax: +44 118 378 6413
--
## Please sponsor me to run in VSO's 30km Race to the Eye ##
## http://www.justgiving.com/DanBretherton ##
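[Editorial note, not part of the original thread: as a hedged illustration of the xattr check Dan mentions, the GlusterFS xattrs can be dumped directly on each brick. The brick path below is a placeholder; run this on each server against the file's location on the brick, not on the mount.]

    getfattr -d -m . -e hex <brick-path>/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN

Mismatched trusted.gfid values between replicas, or non-zero trusted.afr.* changelog values, are the usual signs of pending or failed self-heal on that entry.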
[Gluster-users] Is having millions of directories unusual?
Dear All-

I have just discovered that the volume I've been having the most trouble with contains over 3.3 million directories. Most of these are output from a storm tracking model that produces output in a large number of directories, each containing a handful of small files. Is this number of directories in a distributed-replicate volume unusual, and should I expect it to cause problems for GlusterFS? The layout fix operation that follows the addition of new bricks takes a very long time (weeks to months) and seems to result in a high CPU load. The volume has a capacity of 52TB and the bricks are 3.3TB in size.

-Dan.

--
Dan Bretherton
ESSC Computer System Manager
Department of Meteorology
Harry Pitt Building, 3 Earley Gate
University of Reading
Reading, RG6 7BE (or RG6 6AL for postal service deliveries), UK
Tel. +44 118 378 5205, Fax: +44 118 378 6413
--
## Please sponsor me to run in VSO's 30km Race to the Eye ##
## http://www.justgiving.com/DanBretherton ##
Re: [Gluster-users] MapReduce on glusterfs in Hadoop
Hi Nikhil,

The FUSE mount is what allows the filesystem to access distributed files in Gluster: that is, GlusterFS has its own FUSE mount, and GlusterFileSystem wraps that in Hadoop FileSystem semantics. The MapReduce jobs are invoked using custom core-site and mapred-site XML entries which specify GlusterFileSystem as the default filesystem.

On Feb 22, 2013, at 3:17 AM, Nikhil Agarwal nikaga...@gmail.com wrote:

> So, suppose my input files are scattered across different nodes (GlusterFS servers); how do I (a Hadoop client with GlusterFS plugged in) issue a MapReduce job?
> Please correct me if I am wrong, but I assume the locations of the input files to MapReduce are returned by the function getFileBlockLocations(FileStatus file, long start, long len).
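[Editorial note: to make the "custom core-site / mapred-site" point concrete, here is a minimal, hedged sketch of the kind of configuration involved, expressed programmatically. The fs.glusterfs.impl value and the glusterfs:// URI follow Hadoop's usual fs.<scheme>.impl convention; the exact keys, class name, and any plugin-specific settings such as the mount point should be taken from the glusterfs-hadoop plugin's own README rather than from this sketch.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class GlusterConfSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumed property/class names; verify against the plugin's documentation.
            conf.set("fs.glusterfs.impl",
                     "org.apache.hadoop.fs.glusterfs.GlusterFileSystem");
            // Make GlusterFS the default filesystem (Hadoop 1.x property name).
            conf.set("fs.default.name", "glusterfs://gluster-server:7000");
            // With equivalent entries in core-site.xml, job input/output paths
            // resolve to GlusterFS, so the job/task trackers started by
            // start-mapred.sh do their I/O there.
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Default filesystem: " + fs.getUri());
        }
    }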
Re: [Gluster-users] Bug in log rotation for 3.3.1
Yes, I am aware. Louis' packaging is part of the official Debian development effort, and bugs against it are tracked at Launchpad.

Toby Corkindale toby.corkind...@strategicdata.com.au wrote:

> On 22/02/13 11:18, Joe Julian wrote:
>> On 02/20/2013 05:05 PM, Toby Corkindale wrote:
>>> logrotate.d/glusterfs-common (in the Debian package for 3.3.1) is faulty. It rotates the log files, but it doesn't tell glusterd to re-open them, so it continues to write to what is now .1 (which later gets gzipped and corrupted).
>>>
>>> I also note that the Debian packages do not include the man pages for any of the gluster binaries! (gluster, glusterfs, glusterd)
>>
>> Please file a bug report at https://bugs.launchpad.net/ubuntu/+source/glusterfs/+filebug
>
> Oh, I think you misunderstand -- I'm talking about the officially-provided-by-glusterfs Debian packages here, not the ones created by Debian or Ubuntu themselves.
[Gluster-users] Peer Probe
Hey,

I was wondering if I could get a bit of help. I installed a new InfiniBand card into my servers, but I'm unable to get it to come up as a peer. Is there something I'm missing?

[root@fpsgluster testvault]# gluster peer probe fpsgluster2ib
Probe on host fpsgluster2ib port 0 already in peer list

[root@fpsgluster testvault]# yum list installed | grep gluster
glusterfs.x86_64                   3.3.1-1.el6    installed
glusterfs-devel.x86_64             3.3.1-1.el6    installed
glusterfs-fuse.x86_64              3.3.1-1.el6    installed
glusterfs-geo-replication.x86_64   3.3.1-1.el6    installed
glusterfs-rdma.x86_64              3.3.1-1.el6    installed
glusterfs-server.x86_64            3.3.1-1.el6    installed

Thanks.
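[Editorial note, not part of the original message: the "already in peer list" response means glusterd already considers that host part of the pool, possibly recorded under a different name or address. The standard commands for inspecting and, hypothetically, resetting that state are shown below; detaching removes the peer from the pool and should only be done if no volumes use bricks on it.]

Show how the existing peers are currently recorded (hostname, UUID, connection state):

    gluster peer status

If the peer is recorded under another name and needs to be re-added over the new InfiniBand hostname, one option is:

    gluster peer detach fpsgluster2ib
    gluster peer probe fpsgluster2ib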