[Gluster-users] MapReduce on glusterfs in Hadoop

2013-02-22 Thread Nikhil Agarwal
Hi All,



Thanks a lot for taking the time to answer my question.

I am trying to implement a file system in Hadoop under the org.apache.hadoop.fs
package, something similar to KFS, glusterfs, etc. I wanted to ask about the
following, which is mentioned in the README.txt of glusterfs:



 # ./bin/start-mapred.sh

  If the map/reduce job/task trackers are up, all I/O will be done to
GlusterFS.



So, suppose my input files are scattered across different nodes (glusterfs
servers). How do I (a Hadoop client with glusterfs plugged in) issue a
MapReduce command?

Moreover, after issuing a MapReduce command, would my Hadoop client fetch
all the data from the different servers to my local machine and then run
MapReduce, or would it start the TaskTracker daemons on the machine(s) where
the input file(s) are located and run MapReduce there?

Please correct me if I am wrong, but I suppose that the location of the input
files is returned to MapReduce by the function
getFileBlockLocations(FileStatus file, long start, long len).
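
As a rough illustration of the locality question, here is a simplified model in Python. It is not Hadoop code. In Hadoop, the input format asks the file system for the hosts of each input range (through getFileBlockLocations) and the scheduler then prefers a TaskTracker running on one of those hosts, falling back to a non-local one otherwise. The names Block and assign_splits, the least-loaded tie-breaking rule, and the host names are all hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    """One input range and the hosts that hold it (hypothetical model)."""
    offset: int
    length: int
    hosts: List[str]


def assign_splits(blocks: List[Block], trackers: List[str]) -> Dict[int, str]:
    """Map each block offset to a tracker, preferring a data-local one.

    Falls back to the least-loaded tracker when no tracker is local.
    """
    load = {t: 0 for t in trackers}
    placement = {}
    for block in blocks:
        local = [t for t in trackers if t in block.hosts]
        candidates = local or trackers
        # Pick the least-loaded candidate; ties go to the first in list order.
        chosen = min(candidates, key=lambda t: load[t])
        load[chosen] += 1
        placement[block.offset] = chosen
    return placement


if __name__ == "__main__":
    blocks = [
        Block(0, 64, ["server1", "server2"]),
        Block(64, 64, ["server3", "server4"]),
        Block(128, 64, ["server9"]),  # no tracker runs on server9
    ]
    print(assign_splits(blocks, ["server1", "server3"]))
```

The point of the model is that the host list reported per block only expresses a preference: a file system that reports no useful hosts still works, but every task then reads its input over the network.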



Thank you very much for your time and helping me out.



Regards,

Nikhil
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] help posix_fallocate too slow on glusterfs client

2013-02-22 Thread 符永涛
Dear gluster experts,

Recently I have encountered a problem with posix_fallocate
performance on the glusterfs client.
I use posix_fallocate to allocate a file of a specified size on the
glusterfs client. For example, if I create a file of 1907658896 bytes,
it takes about 20 seconds on glusterfs, but on local
xfs or ext4 it takes less than 1 second.
What is the problem? How can I improve posix_fallocate performance on glusterfs?
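
A small timing harness can make the comparison repeatable. The sketch below uses Python's os.posix_fallocate (available on Unix only); the function name timed_fallocate and the temporary target path are hypothetical, and the path should be replaced with one on the glusterfs mount and one on the local file system. One possible explanation for the gap, which is an assumption and not something confirmed in this thread, is that when a file system offers no native fallocate operation the C library emulates it by writing into every block, which over a network mount turns one call into a very large number of small writes.

```python
import os
import tempfile
import time


def timed_fallocate(path: str, size: int) -> float:
    """Preallocate `size` bytes at `path` and return the elapsed seconds."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        start = time.monotonic()
        os.posix_fallocate(fd, 0, size)  # Unix only
        return time.monotonic() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical target: use a directory on the glusterfs mount instead
    # of a temporary directory to reproduce the measurement.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "prealloc.bin")
        elapsed = timed_fallocate(target, 16 * 1024 * 1024)
        print(f"{os.path.getsize(target)} bytes in {elapsed:.3f}s")
```

Running the same script against both mounts, with the same size, gives numbers that can be attached to a bug report.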

Thank you very much.

BTW, the volume info is as below:
sudo gluster volume info volume_e

Volume Name: volume_e
Type: Distributed-Replicate
Volume ID: 81702024-f327-4ae1-b06a-1f2b877d5ebb
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: glusterfs-test-dev001.qiyi.virtual:/mnt/xfsd/volume_e
Brick2: glusterfs-test-dev002.qiyi.virtual:/mnt/xfsd/volume_e
Brick3: glusterfs-test-dev003.qiyi.virtual:/mnt/xfsd/volume_e
Brick4: glusterfs-test-dev004.qiyi.virtual:/mnt/xfsd/volume_e

-- 
符永涛

[Gluster-users] Sync status of underlying bricks for replicated volume

2013-02-22 Thread Sejal1 S
Hi Gluster Experts,


I am trying to integrate glusterfs as a replication file system in my
product.
As background, I will be creating a distributed replicated (glusterfs)
volume on two bricks located on two different servers.

Can you please guide me on how to find the sync status of all bricks under
one volume, i.e. at a given point in time, whether all underlying bricks are
in sync, out of sync, or whether synchronization is in progress?
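
One way to look at replica state directly is to read the replicate translator's changelog extended attributes on the brick copies of a file. The sketch below assumes those attributes are named trusted.afr.<volume>-client-<N> and hold 12 bytes, read as three big-endian 32-bit counters for pending data, metadata, and entry operations; that layout is stated here as an assumption and should be checked against the version in use. The function names are hypothetical, reading trusted.* attributes needs root on the brick server, and os.listxattr is Linux only. From the command line, "gluster volume heal <VOLNAME> info" is the usual starting point in 3.3.

```python
import os
import struct
from typing import Dict, Tuple


def decode_afr_pending(value: bytes) -> Tuple[int, int, int]:
    """Decode a 12-byte changelog value into (data, metadata, entry) counts."""
    if len(value) != 12:
        raise ValueError(f"expected 12 bytes, got {len(value)}")
    return struct.unpack(">III", value)


def pending_on_brick(path: str) -> Dict[str, Tuple[int, int, int]]:
    """Read every trusted.afr.* xattr of `path` (a file on a brick; needs root)."""
    result = {}
    for name in os.listxattr(path):
        if name.startswith("trusted.afr."):
            result[name] = decode_afr_pending(os.getxattr(path, name))
    return result


if __name__ == "__main__":
    # Synthetic value for illustration: two pending data operations.
    sample = bytes.fromhex("000000020000000000000000")
    print(decode_afr_pending(sample))
```

Under that assumption, all-zero counters on every replica of a file suggest nothing is pending for it, while non-zero counters on one brick suggest the other copy still needs healing.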

Please note that I am new to glusterfs.


Thanks in anticipation.

Regards
Sejal 

Re: [Gluster-users] I/O error repaired only by owner or root access

2013-02-22 Thread Rajesh Amaravathi
Hi Dan,
Could you please provide the following info
(1) the exact permissions of the file you are accessing and
its parent directory,
(2) the user from which 'ls -l' is issued, and
(3) the owner of the file, and the parent directory.

You could open a bug for it if it is seen several times.
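
The three items requested above can be gathered in one go. The following is a minimal sketch, assuming a Unix host; the function names describe and report are hypothetical, and the path is whatever file or directory shows the error.

```python
import grp
import os
import pwd
import stat
import sys


def describe(path: str) -> str:
    """Return 'mode owner:group path' for `path`, without following symlinks."""
    st = os.lstat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry on this host
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return f"{stat.filemode(st.st_mode)} {owner}:{group} {path}"


def report(path: str) -> list:
    """Describe the calling user, the path, and its parent directory."""
    parent = os.path.dirname(os.path.abspath(path))
    return [
        f"effective uid: {os.geteuid()}",
        describe(path),
        describe(parent),
    ]


if __name__ == "__main__":
    for line in report(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(line)
```

Running it once as the affected user and once as the owner gives two outputs that can be compared side by side.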

Regards, 
Rajesh Amaravathi, 
Software Engineer, GlusterFS 
RedHat Inc. 

- Original Message -
 From: Dan Bretherton d.a.brether...@reading.ac.uk
 To: gluster-users gluster-users@gluster.org
 Sent: Thursday, February 21, 2013 8:16:40 PM
 Subject: [Gluster-users] I/O error repaired only by owner or root access
 
 Dear All-
 Several users are having a lot of trouble reading files belonging to
 other users.  Here is an example.
 
 [sms05dab@jupiter ~]$ ls -l /users/gcs/WORK/ORCA1/ORCA1-R07-MEAN/Ctl
 ls: /users/gcs/WORK/ORCA1/ORCA1-R07-MEAN/Ctl: Input/output error
 
 The corresponding nfs.log messages are shown below.
 
 [2013-02-21 12:11:39.204659] W [nfs3.c:727:nfs3svc_getattr_stat_cbk] 0-nfs: fe2ba5b8: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN = -1 (Invalid argument)
 [2013-02-21 12:11:39.204778] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: fe2ba5b8, GETATTR: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)
 [2013-02-21 12:11:39.215345] I [dht-common.c:954:dht_lookup_everywhere_cbk] 0-nemo2-dht: deleting stale linkfile /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN on nemo2-replicate-0
 [2013-02-21 12:11:39.225674] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-nemo2-client-1: remote operation failed: Permission denied
 [2013-02-21 12:11:39.225786] W [client3_1-fops.c:592:client3_1_unlink_cbk] 0-nemo2-client-0: remote operation failed: Permission denied
 [2013-02-21 12:11:39.681029] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-nemo2-client-18: remote operation failed: Permission denied. Path: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN (1662aa0a-d43b-4c2e-9be9-407eb7a89e85)
 [2013-02-21 12:11:39.681400] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-nemo2-client-19: remote operation failed: Permission denied. Path: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN (1662aa0a-d43b-4c2e-9be9-407eb7a89e85)
 [2013-02-21 12:11:39.682268] W [nfs3.c:1627:nfs3svc_readlink_cbk] 0-nfs: 2ca5b8: /gorgon/users/gcs/WORK/ORCA1/ORCA1-R07-MEAN = -1 (Invalid argument)
 [2013-02-21 12:11:39.682338] W [nfs3-helpers.c:3403:nfs3_log_readlink_res] 0-nfs-nfsv3: XID: 2ca5b8, READLINK: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument), target: (null)
 
 I managed to access the same directory as the owner (or any user with
 write access including root) without any trouble, and after that access
 from my normal user account was fine as well.  The permissions on the
 directory allowed read access by everyone, but the "Permission denied"
 messages in nfs.log indicate that some sort of operation is not being
 allowed when the directory is accessed by other users.  I have seen
 this happen with files and directories, and with the GlusterFS native
 client and NFS.
 
 I presume this is a bug; I would be grateful if someone could confirm
 this.  I would file a bug report, but the trouble is that I don't know
 how to reproduce the problem that causes the I/O error in the first
 place.  It only happens with some files and directories, not all.  Would
 a bug report without any way to reproduce the error be any use, and can
 anyone suggest a way to dig deeper (e.g. looking at xattrs) next time I
 come across an example?
 
 -Dan.
 
 --
 Dan Bretherton
 ESSC Computer System Manager
 Department of Meteorology
 Harry Pitt Building, 3 Earley Gate
 University of Reading
 Reading, RG6 7BE (or RG6 6AL for postal service deliveries)
 UK
 Tel. +44 118 378 5205, Fax: +44 118 378 6413
 --
 ## Please sponsor me to run in VSO's 30km Race to the Eye ##
 ##http://www.justgiving.com/DanBretherton ##
 


[Gluster-users] Is having millions of directories unusual?

2013-02-22 Thread Dan Bretherton

Dear All-
I have just discovered that the volume I've been having the most trouble 
with contains over 3.3 million directories.  Most of these are output 
from a storm tracking model that produces output in a large number of 
directories, each containing a handful of small files.  Is this number 
of directories in a distributed-replicate volume unusual, and should I 
expect it to cause problems for GlusterFS?  The layout fix operation 
that follows the addition of new bricks takes a very long time 
(weeks-months) and seems to result in a high CPU load.  The volume has a 
capacity of 52TB and the bricks are 3.3TB in size.


-Dan.

--
Dan Bretherton
ESSC Computer System Manager
Department of Meteorology
Harry Pitt Building, 3 Earley Gate
University of Reading
Reading, RG6 7BE (or RG6 6AL for postal service deliveries)
UK
Tel. +44 118 378 5205, Fax: +44 118 378 6413
--
## Please sponsor me to run in VSO's 30km Race to the Eye ##
##http://www.justgiving.com/DanBretherton ##



Re: [Gluster-users] MapReduce on glusterfs in Hadoop

2013-02-22 Thread Jay Vyas
Hi nithin: 

The FUSE mount is what allows the file system to access distributed files in
Gluster: that is, GlusterFS has its own FUSE mount, and GlusterFileSystem
wraps that in Hadoop FileSystem semantics.

Meanwhile, the MapReduce jobs are invoked using custom core-site and
mapred-site XML nodes which specify GlusterFileSystem as the DFS.


Re: [Gluster-users] Bug in log rotation for 3.3.1

2013-02-22 Thread Joe Julian
Yes, I am aware. Louis' packaging is part of the official Debian development,
and bugs against it are tracked at Launchpad.

Toby Corkindale toby.corkind...@strategicdata.com.au wrote:

On 22/02/13 11:18, Joe Julian wrote:
 On 02/20/2013 05:05 PM, Toby Corkindale wrote:
 logrotate.d/glusterfs-common (in the debian package for 3.3.1) is faulty.
 It rotates the log files, but it doesn't tell glusterd to re-open
 them, so it continues to write to what is now .1 (and then later it
 gets gzipped and corrupted)

 I also note that the debian packages do not include the man pages for
 any of the gluster binaries! (gluster, glusterfs, glusterd)
 Please file a bug report at
 https://bugs.launchpad.net/ubuntu/+source/glusterfs/+filebug

Oh, I think you misunderstand -- I'm talking about the 
officially-provided-by-glusterfs debian packages here, not the ones 
created by Debian or Ubuntu themselves.
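
The behaviour described in the original report (a daemon that keeps writing to the file that is now .1) follows from how open file descriptors work, and can be reproduced without glusterd. The sketch below is a stand-alone demonstration in Python; the file name daemon.log and the function name are hypothetical, and the "daemon" is just a file handle that is never reopened.

```python
import os
import tempfile


def rotate_without_reopen(directory: str) -> dict:
    """Rename an open log file, keep writing, and report where the bytes went."""
    log = os.path.join(directory, "daemon.log")
    rotated = log + ".1"
    with open(log, "a") as writer:       # stands in for the long-running daemon
        writer.write("before rotation\n")
        writer.flush()
        os.rename(log, rotated)          # what a rename-based rotation does
        with open(log, "a"):             # rotation creates a fresh, empty log
            pass
        writer.write("after rotation\n")  # daemon still holds the old descriptor
        writer.flush()
    with open(rotated) as fh:
        rotated_lines = fh.read().splitlines()
    return {"new_log_size": os.path.getsize(log), "rotated_lines": rotated_lines}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(rotate_without_reopen(tmp))
```

The new log stays empty and both lines end up in the renamed file. The usual remedies are to have the rotation step tell the daemon to reopen its logs, or to copy and truncate the file in place; which of these suits glusterd is not stated in this thread.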

[Gluster-users] Peer Probe

2013-02-22 Thread Tony Saenz
Hey,

I was wondering if I could get a bit of help. I installed a new InfiniBand
card into my servers, but I'm unable to get it to come up as a peer. Is there
something I'm missing?

[root@fpsgluster testvault]# gluster peer probe fpsgluster2ib
Probe on host fpsgluster2ib port 0 already in peer list

[root@fpsgluster testvault]# yum list installed | grep gluster
glusterfs.x86_64                   3.3.1-1.el6  installed
glusterfs-devel.x86_64             3.3.1-1.el6  installed
glusterfs-fuse.x86_64              3.3.1-1.el6  installed
glusterfs-geo-replication.x86_64   3.3.1-1.el6  installed
glusterfs-rdma.x86_64              3.3.1-1.el6  installed
glusterfs-server.x86_64            3.3.1-1.el6  installed

Thanks. 