Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

2013-07-11 Thread Brian Candler
On 11/07/2013 20:01, Greg Scott wrote: Well…OK. The “on my own” comment came from me after a long time and a lot of work trying to figure this out. I just went back and checked – I posted the original question on 7/8 at 8:18 PM. I asked a follow-up question 7/9 at 12:27 PM, roughly 16 hours

Re: [Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

2013-07-10 Thread Brian Candler
On 10/07/2013 06:26, Greg Scott wrote: Bummer. Looks like I'm on my own with this one. I'm afraid this is the problem with gluster: everything works great on the happy path, but as soon as anything goes wrong, you're stuffed. There is neither recovery procedure documentation, nor detailed

Re: [Gluster-users] Giving up [ was: Re: read-subvolume]

2013-07-10 Thread Brian Candler
On 10/07/2013 13:58, Jeff Darcy wrote: 2d. it needs a fast validation scanner which verifies that data is where it should be and is identical everywhere (md5sum). How fast is fast? What would be an acceptable time for such a scan on a volume containing (let's say) ten million files? What

[Gluster-users] linux-kernel-tuning-for-glusterfs

2013-05-20 Thread Brian Candler
http://community.gluster.org/a/linux-kernel-tuning-for-glusterfs/ now gives a 404. Does anyone know where it's gone or have a copy? Thanks, Brian. ___ Gluster-users mailing list Gluster-users@gluster.org

Re: [Gluster-users] linux-kernel-tuning-for-glusterfs

2013-05-20 Thread Brian Candler
On Mon, May 20, 2013 at 01:49:32PM -0400, John Mark Walker wrote: In the meantime, you can access the article here: http://www.gluster.org/community/documentation/index.php/Linux_Kernel_Tuning Thank you! ___ Gluster-users mailing list

Re: [Gluster-users] gluster volume create failure

2013-05-14 Thread Brian Candler
On Tue, May 14, 2013 at 08:56:02PM +0200, John Smith wrote: Thanks. Is there a preferred naming convention to go along with it ? It's up to you, but I used /exports/brick1/myvol, /exports/brick2/myvol and it seemed logical to me. There's an important benefit to doing this. If the filesystem
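A minimal sketch of that layout for a two-server replica (hostnames, device names and the volume name are illustrative, not from the thread):

    # on each server: one filesystem per disk/array, mounted under /exports
    mkdir -p /exports/brick1
    mount /dev/sdb1 /exports/brick1          # device name is an assumption
    mkdir -p /exports/brick1/myvol           # brick directory named after the volume

    # then, from either server:
    gluster volume create myvol replica 2 \
        server1:/exports/brick1/myvol server2:/exports/brick1/myvol
    gluster volume start myvol

The benefit Brian goes on to describe is presumably the usual one: if /exports/brick1 is not mounted, the myvol subdirectory does not exist, so gluster cannot silently write brick data into the root filesystem.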

Re: [Gluster-users] Disappointing documentation?

2013-03-07 Thread Brian Candler
On Wed, Mar 06, 2013 at 05:16:45PM -0800, Joe Julian wrote: Somewhere along the way, the decision was made to only have an un-editable pdf as the only form of documentation provided by Gluster during the transition from their own company to being owned by Red Hat. There is HTML. It used to be

Re: [Gluster-users] GlusterFS performance

2013-03-05 Thread Brian Candler
On Tue, Mar 05, 2013 at 12:01:35PM +1100, Toby Corkindale wrote: I have to ask -- what are you moving to now, Brian? Nothing clever: NFSv4 to the storage bricks, behind-time replication using rsync, application-layer distribution of files between bricks. We may have a future need for a storage

Re: [Gluster-users] GlusterFS performance

2013-03-01 Thread Brian Candler
On Fri, Mar 01, 2013 at 03:30:07PM +0600, Nikita A Kardashin wrote: If I try to execute the above command inside a virtual machine (KVM), the first time all goes right - about 900MB/s (cache effect, I think), but if I run this test again on the existing file, the task (dd) hangs and can be

Re: [Gluster-users] GlusterFS performance

2013-02-27 Thread Brian Candler
On Wed, Feb 27, 2013 at 02:46:28PM +, Robert van Leeuwen wrote: You could try to trunk the network on your client but 10Gbit ethernet (I suggest using fiber, because of latency issues with copper 10Gbit) Aside: 10G with SFP+ direct-attach cables also works well, even though it's copper.

Re: [Gluster-users] Peer Probe

2013-02-26 Thread Brian Candler
On Mon, Feb 25, 2013 at 06:28:01PM +, Tony Saenz wrote: Any help please? The regular NICs are fine which is what it currently sees but I'd like to move them over to the Infiniband cards. ... [root@fpsgluster testvault]# gluster peer probe fpsgluster2ib Probe on host fpsgluster2ib port 0

Re: [Gluster-users] Gluster Virtual Appliance

2013-02-25 Thread Brian Candler
On Sun, Feb 24, 2013 at 12:52:53PM +0100, Gandalf Corvotempesta wrote: Recap: 4 physical nodes, each node will host at least 10 VMs plus 1 gluster VM. Each VM should boot from the gluster VM. By boot from I guess you mean that the VM's root device, e.g. hda/vda, will be a disk image file stored on

Re: [Gluster-users] server3_1-fops.c:1240 (Cannot allocate memory)

2013-02-13 Thread Brian Candler
On Tue, Feb 12, 2013 at 07:19:43PM +0100, samuel wrote: Both nodes have enough disk space (1.3T) but looks like used memory is quite high: Not true. node1: total used free shared buffers cached Mem: 4047680 4012700 34980

Re: [Gluster-users] gluster-information

2013-02-08 Thread Brian Candler
On Fri, Feb 08, 2013 at 12:33:16AM +0100, VHosting Solution wrote: For autoreplicate the file accross all nodes, what is the directory that I must use for work? You must mount the volume on a client, and work with it there. ___ Gluster-users

Re: [Gluster-users] NFS availability

2013-01-31 Thread Brian Candler
On Thu, Jan 31, 2013 at 09:18:26AM +0100, Stephan von Krawczynski wrote: The client will still fail (in most cases) since host1 (if I follow you) is part of the gluster groupset. Certainly if it's a distributed-only, maybe not if it's a dist/repl gluster. But if host1 goes down, the

Re: [Gluster-users] Fw: performance evaluation of distributed storage systems

2013-01-22 Thread Brian Candler
I'm a phd student and as a part of my research I've compared performance of different distributed storage systems (Gluster, Openstack, Compuverde). Compuverde? That's new to me. Oh wow. Software defined storage just got 400 % more efficient. Compuverde Gateway read and writes

Re: [Gluster-users] Data migration and rebalance

2013-01-20 Thread Brian Candler
On Sat, Jan 19, 2013 at 04:04:43PM -0500, Jeff Darcy wrote: On 01/19/2013 01:43 PM, F. Ozbek wrote: try moosefs. http://www.moosefs.org/ we tried both gluster and ceph, they both failed in many ways. moosefs passed the same tests with flying colors. moose is your friend. Don't you

Re: [Gluster-users] unexpected data beyond EOF in block %u of relation \%s\

2013-01-15 Thread Brian Candler
On Tue, Jan 15, 2013 at 05:49:00PM -0300, Targino Silveira wrote: I fixed my problem mounting partition with NFS. Would you care to share how you fixed it? Thanks, Brian. ___ Gluster-users mailing list Gluster-users@gluster.org

Re: [Gluster-users] ownership of link file changed to root:root after reboot brick service

2013-01-15 Thread Brian Candler
[Aside: please do not cross-post. If you have a problem using glusterfs, and you are not looking at the source code and proposing a specific patch, then I suggest glusterfs-users is the appropriate place] On Wed, Jan 16, 2013 at 02:53:40PM +0800, huangql wrote: I have encountered a

Re: [Gluster-users] tuning guide

2013-01-12 Thread Brian Candler
On Sat, Jan 12, 2013 at 03:29:05AM +0100, Papp Tamas wrote: At this moment I want to test gluster as an underlying storage for ESX and Oracle VM. Is it a reliable option anyway? I haven't read so much here about this capability, only Xenserver and KVM. Oracle VM is Xen, I believe.

Re: [Gluster-users] gluster stripe at xfs

2013-01-11 Thread Brian Candler
On Fri, Jan 11, 2013 at 06:12:48PM +0800, 符永涛 wrote: I recommend you not use stripe. I didn't do a full test and didn't investigate it, but from my experience there's no need to use stripe. Considering the following: 1 performance (when I tested it, performance was not good for stripe) 2 data

Re: [Gluster-users] self-heal failed

2013-01-11 Thread Brian Candler
On Fri, Jan 11, 2013 at 04:18:37PM -0500, Liang Ma wrote: For option 2, is this UUID for the peer host? What do you mean by the peer host? Unless the cluster has only two nodes, there are multiple peers. As per the example given: if server3 is the one which has failed, then you need to find

Re: [Gluster-users] self-heal failed

2013-01-10 Thread Brian Candler
On Thu, Jan 10, 2013 at 12:50:48PM -0500, Liang Ma wrote: I assume to replace a failed replicate disk or node should be a standard procedure, isn't it? I couldn't find anything related to this in the 3.3 manual. You'd have thought so, wouldn't you :-( I know of two options. (1) If the

Re: [Gluster-users] frequent split-brain detected, aborting selfheal; background meta-data self-heal failed

2013-01-09 Thread Brian Candler
On Tue, Jan 08, 2013 at 03:02:19PM -0500, Jeff Darcy wrote: [1] http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/ [2] http://hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-present/ These are helpful articles, thank you. It seems to me that the main risk of

Re: [Gluster-users] frequent split-brain detected, aborting selfheal; background meta-data self-heal failed

2013-01-08 Thread Brian Candler
On Tue, Jan 08, 2013 at 06:44:13PM +0100, Tomasz Chmielewski wrote: Attribute afr.shared-client-0 has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub Attribute afr.shared-client-1 has a 12 byte value for /data/gluster/lfd/techstudiolfc/pub Perhaps that would be useful, too

Re: [Gluster-users] difference between S.O

2013-01-07 Thread Brian Candler
On Mon, Jan 07, 2013 at 01:01:20AM -0300, Targino Silveira wrote: I'll start a GlusterFS setup with two servers running Ubuntu 12.04 LTS. I made some tests, and via apt-get I don't get the latest version of gluster; do I need to compile it, or can I get a .deb to install? For Ubuntu 12.04: apt-get
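The reply is truncated; at the time, the usual options for 12.04 were the .deb packages from download.gluster.org or semiosis's PPA. A hedged sketch - the PPA name is from memory and may have changed:

    sudo apt-get install python-software-properties
    sudo add-apt-repository ppa:semiosis/ubuntu-glusterfs-3.3   # assumed PPA name
    sudo apt-get update
    sudo apt-get install glusterfs-server glusterfs-client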

Re: [Gluster-users] how well will this work

2013-01-02 Thread Brian Candler
On Fri, Dec 28, 2012 at 10:14:19AM -0800, Joe Julian wrote: In my configuration, 1 server has 4 drives (well, 5, but one's the OS). Each drive has one gpt partition. I create an lvm volume group that holds all four huge partitions. For any one GlusterFS volume I create 4 lvm logical volumes:

Re: [Gluster-users] Meta-discussion

2013-01-02 Thread Brian Candler
On Thu, Dec 27, 2012 at 06:53:46PM -0500, John Mark Walker wrote: I invite all sorts of disagreeable comments, and I'm all for public discussion of things - as can be seen in this list's archives. But, for better or worse, we've chosen the approach that we have. Anyone who would like to

Re: [Gluster-users] Meta-discussion

2013-01-02 Thread Brian Candler
On Wed, Jan 02, 2013 at 09:03:59AM -0500, Jeff Darcy wrote: * A general principles of operation guide - not a whole book, but more than bits scattered among slide presentations and wiki pages. Let's say something that would be on the order of 15-50 pages printed out. Personally I'd like to

Re: [Gluster-users] how well will this work

2012-12-27 Thread Brian Candler
On Wed, Dec 26, 2012 at 11:24:25PM -0500, Miles Fidelman wrote: I find myself trying to expand a 2-node high-availability cluster to a 4-node cluster. I'm running Xen virtualization, and currently using DRBD to mirror data, and pacemaker to failover cleanly. Not answering your question

Re: [Gluster-users] Gluster machines slowing down over time

2012-12-12 Thread Brian Candler
On Tue, Dec 11, 2012 at 04:13:15PM +, Tom Hall wrote: I have 2 gluster servers in replicated mode on EC2 with ~4G RAM CPU and RAM look fine but over time the system becomes sluggish, particularly networking. I notice sshing into the machine takes ages and running

Re: [Gluster-users] New GlusterFS Config with 6 x Dell R720xd's and 12x3TB storage

2012-12-04 Thread Brian Candler
On Mon, Dec 03, 2012 at 11:29:51PM +, Mike Hanby wrote: Each of the 6 servers now have 10 3TB LUNs that physically exist on a RAID 6. You mean you combined the 12 3TB drives into a 30TB RAID6 array, and then partitioned that 30TB array into 10 x 3TB partitions / logical volumes? I've

Re: [Gluster-users] New GlusterFS Config with 6 x Dell R720xd's and 12x3TB storage

2012-12-04 Thread Brian Candler
On Tue, Dec 04, 2012 at 04:33:30PM +, Mike Hanby wrote: Thanks Brian. I'm going to test the multiple volumes suggestion today. Regarding the underlying storage, the 12 x 3TB disks are configured as a single RAID6 that is presented to the OS as 10 x 3TB disks via the Dell PERC

[Gluster-users] gluster peer status messed up

2012-12-03 Thread Brian Candler
I have three machines, all Ubuntu 12.04 running gluster 3.3.1. storage1 192.168.6.70 on 10G, 192.168.5.70 on 1G storage2 192.168.6.71 on 10G, 192.168.5.71 on 1G storage3 192.168.6.72 on 10G, 192.168.5.72 on 1G Each machine has two NICs, but on each host, /etc/hosts lists the 10G
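For context, a sketch of the naming scheme being described, with the gluster hostnames resolving only to the 10G addresses:

    # /etc/hosts on every node
    192.168.6.70  storage1
    192.168.6.71  storage2
    192.168.6.72  storage3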

Re: [Gluster-users] gluster peer status messed up

2012-12-03 Thread Brian Candler
On Mon, Dec 03, 2012 at 01:44:47PM +, Brian Candler wrote: So this all looks broken, and as I can't find any gluster documentation saying what these various states mean, I'm not sure how to proceed. Any suggestions? Update. On storage1 and storage3 I killed all glusterfs(d) processes, did

Re: [Gluster-users] New GlusterFS Config with 6 x Dell R720xd's and 12x3TB storage

2012-12-02 Thread Brian Candler
On Fri, Nov 30, 2012 at 07:21:54PM +, Mike Hanby wrote: We have the following hardware that we are going to use for a GlusterFS cluster. 6 x Dell R720xd's (16 cores, 96G) Heavily over-specified, especially the RAM. Having such large amounts of RAM can even cause problems if

Re: [Gluster-users] geo-replicated Master Master cluster

2012-11-27 Thread Brian Candler
On Sun, Nov 25, 2012 at 08:59:04PM +0400, Zohair Raza wrote: I have only two machines and I want something like master-master concept, I tried that but can not succeed. Gluster won't do this for you. You could look at something like 'unison' or 'csync', but they may not scale up to your

Re: [Gluster-users] Inviting comments on my plans

2012-11-18 Thread Brian Candler
On Sat, Nov 17, 2012 at 11:04:33AM -0700, Shawn Heisey wrote: Dell R720xd servers with two internal OS drives and 12 hot-swap external 3.5 inch bays. Fedora 18 alpha, to be upgraded to Fedora 18 when it is released. I would strongly recommend *against* Fedora in any production environment,

Re: [Gluster-users] Inviting comments on my plans

2012-11-18 Thread Brian Candler
On Sun, Nov 18, 2012 at 09:27:41AM -0700, Shawn Heisey wrote: Having half the drives on BTRFS was my primary reason for Fedora. OK, I understand. You could perhaps look at what the kernel in ubuntu 12.04 LTS is like. (Note: kernels 3.2.0-30 to -33 have a serious problem with hotswap locking up

Re: [Gluster-users] Avoid Split-brain and other stuff

2012-11-16 Thread Brian Candler
On Fri, Nov 16, 2012 at 08:37:01AM +0100, Martin Emrich wrote: Any multi-master replication suffers from exactly the same split-brain scenarios as you described earlier. That would be perfectly acceptable, as long as it would heal deterministically (last one wins, or renamed conflicting

Re: [Gluster-users] Avoid Split-brain and other stuff

2012-11-15 Thread Brian Candler
On Wed, Nov 14, 2012 at 03:29:28PM +0100, Martin Emrich wrote: (By the way, I also wanted to try geo-replication (which might suffice for my needs with a tight-enough schedule), but I was not able to create a volume with only one brick… I certainly have created a volume with one

[Gluster-users] Replacing a failed server - 3.3

2012-11-15 Thread Brian Candler
I recently had to swap a server for a new one. The info at http://gluster.org/community/documentation/index.php/Gluster_3.2:_Brick_Restoration_-_Replace_Crashed_Server has not been updated for 3.3. It works if you change it as follows: grep server3 /var/lib/glusterd/peers/* echo UUID=...
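The commands above are cut off by the archive; a sketch of the full 3.3-era procedure as it is commonly described (UUID, hostnames and volume name are placeholders, not taken from Brian's post):

    # on a surviving peer: find the failed server's old UUID
    grep -l server3 /var/lib/glusterd/peers/* | xargs cat

    # on the rebuilt server: reuse that UUID before rejoining the pool
    service glusterd stop
    echo "UUID=<old-uuid-of-server3>" > /var/lib/glusterd/glusterd.info
    service glusterd start

    # re-establish peering, restart glusterd so volume info syncs, then heal
    gluster peer probe server1
    service glusterd restart
    gluster volume heal myvol full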

Re: [Gluster-users] Avoid Split-brain and other stuff

2012-11-15 Thread Brian Candler
On Thu, Nov 15, 2012 at 11:45:12AM +0100, Martin Emrich wrote: I get this message: gluster volume create gv1-kl replica 1 transport tcp /bricks/brick2 replica count should be greater than 1 Usage: volume create NEW-VOLNAME [stripe COUNT] [replica COUNT] [transport tcp|rdma|tcp,rdma]
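In other words, a single-brick volume is created by omitting the replica keyword entirely. A sketch using the brick path from the message (the hostname is added, since bricks must be given as host:path):

    gluster volume create gv1-kl transport tcp server1:/bricks/brick2
    gluster volume start gv1-kl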

Re: [Gluster-users] Avoid Split-brain and other stuff

2012-11-15 Thread Brian Candler
On Thu, Nov 15, 2012 at 03:12:09PM +0100, Martin Emrich wrote: Hi! Apologies if this is a dumb comment, but isn't master-slave, by definition, only in one direction? If you could write on all nodes, it would be multi-master, yes?   I would say it depends on the definition.

Re: [Gluster-users] Gluster in a cluster

2012-11-15 Thread Brian Candler
On Thu, Nov 15, 2012 at 09:30:52AM -0600, Jerome wrote: My problem is when a node reboots accidentally, or for some administration task: the node reinstalls itself, and the gluster volume begins to fail. I detected that the UUID of a machine is generated during the installation, so I developed some

Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-05 Thread Brian Candler
If your disks are 1TB with XFS then try mount -o inode64 This has the effect of sequential writes into the same directory being localised next to each other (within the same allocation group). When you skip to the next directory you will probably get a different allocation group. Without this,
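A sketch of applying the option (device and mount point are placeholders); note that inode64 only affects newly created files and directories, existing ones stay where they are:

    # remount the brick filesystem with inode64
    umount /exports/brick1
    mount -o inode64 /dev/sdb1 /exports/brick1

    # or persistently, via the /etc/fstab entry for the brick
    /dev/sdb1  /exports/brick1  xfs  inode64,noatime  0 2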

Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-05 Thread Brian Candler
On Mon, Nov 05, 2012 at 01:10:40PM -0500, Jonathan Lefman wrote: Thanks Brian. I tried what you recommended. At first I was very encouraged when I saw things moving across the wire. But about 15 minutes into the transfer things ground to a halt. I am currently running across a

Re: [Gluster-users] Very slow directory listing and high CPU usage on replicated volume

2012-11-02 Thread Brian Candler
On Thu, Nov 01, 2012 at 08:03:21PM -0400, Jonathan Lefman wrote: Soon after loading up about 100 MB of small files (about 300kb each), the drive usage is at 1.1T. That is very odd. What do you get if you run du and df on the individual bricks themselves? 100MB is only ~330 files of 300KB
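A sketch of the check being asked for, run on each storage node (brick paths are illustrative):

    df -h /exports/brick*              # filesystem-level usage per brick
    du -sh /exports/brick*/myvol       # what the brick directories actually hold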

Re: [Gluster-users] Best practices for creating bricks

2012-11-01 Thread Brian Candler
On Wed, Oct 31, 2012 at 03:59:15PM -0400, Kushnir, Michael (NIH/NLM/LHC) [C] wrote: I am working with several Dell PE720xd. I have 24 disks per server at my disposal with a high end raid card with 1GB RAM and BBC. I will be building a distributed-replicated volume. Is it better for me

Re: [Gluster-users] Block replication with glusterfs for NFS failover

2012-10-25 Thread Brian Candler
On Thu, Oct 25, 2012 at 10:23:33AM +0200, Runar Ingebrigtsen wrote: Does that mean the bauer-power article [1] about how healing fails is inaccurate? AFAICS that page doesn't say anything about how gluster healing works. He says he had data corruption, and I do not doubt that. However

Re: [Gluster-users] Block replication with glusterfs for NFS failover

2012-10-24 Thread Brian Candler
On Wed, Oct 24, 2012 at 12:47:36AM +0200, Runar Ingebrigtsen wrote: I'm sorry - I am aware of that. The part of the document I was meaning to reference was the block-by-block replication that was pointed out as a requirement for NFS connection handover. I should have pointed out what I meant

Re: [Gluster-users] Block replication with glusterfs for NFS failover

2012-10-24 Thread Brian Candler
On Wed, Oct 24, 2012 at 11:19:13AM +0200, Runar Ingebrigtsen wrote: GlusterFS replication works at a different layer: each glusterfs brick sits on top of a local filesystem, and the operations are at the level of files (roughly open file named X, seek, read file, write file) rather than

Re: [Gluster-users] Is there a way to force a brick in a replica set to automatically self heal after it goes down and comes back up?

2012-10-24 Thread Brian Candler
On Wed, Oct 24, 2012 at 02:20:42PM -0400, Kushnir, Michael (NIH/NLM/LHC) [C] wrote: I am thinking of a scenario where brick #2 in a two-brick replica set goes down and then comes back up. It will be inconsistent with brick #1. From what I understand, missing and changed files are

Re: [Gluster-users] Block replication with glusterfs for NFS failover

2012-10-23 Thread Brian Candler
On Tue, Oct 23, 2012 at 01:55:01PM +0200, Runar Ingebrigtsen wrote: to enable NFS state transfer between hosts in a failover SAN, it is necessary to have the NFS state data on the exact same blocks on both storage nodes: You are reading a document for something which is not glusterfs.

Re: [Gluster-users] GlusterFS failover with UCarp

2012-10-22 Thread Brian Candler
On Thu, Oct 18, 2012 at 06:48:42PM +0200, Runar Ingebrigtsen wrote: The connection break behavior is to be expected - the TCP connection doesn't handle the switch of host. I didn't expect the NFS client to go stale. I can't answer this directly but I did notice something in the 3.3.1

Re: [Gluster-users] New Download Server in the Works

2012-10-17 Thread Brian Candler
On Tue, Oct 16, 2012 at 01:46:15PM -0400, John Mark Walker wrote: The new download server is up at http://download.gluster.org/ Currently you'll find 3.3.1 repos for Fedora, Centos and RHEL. Other builds will filter in slowly over time - if you urgently need builds for other releases,

[Gluster-users] Where have Ubuntu packages gone?

2012-10-15 Thread Brian Candler
Previously I downloaded pre-built Ubuntu packages from http://download.gluster.org/pub/gluster/glusterfs/LATEST/Ubuntu/12.04/ but I am now getting connection timed out from this URL. If I follow the download link from www.gluster.org then click on Latest GA version of GlusterFS (ver 3.3) - DEB, RPM and

Re: [Gluster-users] Where have Ubuntu packages gone?

2012-10-15 Thread Brian Candler
On Mon, Oct 15, 2012 at 09:56:56AM -0400, John Mark Walker wrote: I apologize for the inconvenience - I am currently working up an announcement in which all will be explained. Suffice to say, this has been a fun weekend. Thanks for the update - and sorry to hear about the fun you're dealing

[Gluster-users] Renaming a file in a distributed volume

2012-10-13 Thread Brian Candler
In a distributed volume (glusterfs 3.3), files within a directory are assigned to a brick by a hash of their filename, correct? So what happens if you do mv foo bar? Does the file get copied to another brick? Is this no longer an atomic operation? Thanks, Brian.

[Gluster-users] Moving disks

2012-10-13 Thread Brian Candler
Suppose I want to move all the disks for a brick from one server to another, can this be done without having to copy all the data? Example: * A distributed-replicated volume comprises six bricks (each of which is 8 disks in a RAID array). Those bricks are on s1,s2,s3,s4,s5,s6 * I get two new

Re: [Gluster-users] Renaming a file in a distributed volume

2012-10-13 Thread Brian Candler
On Sat, Oct 13, 2012 at 05:05:08PM +0200, Stephan von Krawczynski wrote: On Sat, 13 Oct 2012 15:52:56 +0100 Brian Candler b.cand...@pobox.com wrote: In a distributed volume (glusterfs 3.3), files within a directory are assigned to a brick by a hash of their filename, correct? So what

Re: [Gluster-users] RAID 0 with Cache v/s NO-RAID

2012-10-03 Thread Brian Candler
On Wed, Oct 03, 2012 at 08:37:55PM +0530, Indivar Nair wrote: Sorry, couldn't reply earlier, was indisposed the last few days. Thanks for the input Brian, especially on the 'map' translator. It led me to another one called the 'switch' scheduler that seems to do exactly what I

Re: [Gluster-users] RAID 0 with Cache v/s NO-RAID

2012-09-28 Thread Brian Candler
On Fri, Sep 28, 2012 at 08:58:55AM +0530, Indivar Nair wrote: We were trying to cater to both large file (100MB - 2GB) read speed and small file (10-50MB) read+write speed. With Gluster, we were thinking of setting the individual stripe size to 50MB so that each volume could hold a

Re: [Gluster-users] RAID 0 with Cache v/s NO-RAID

2012-09-27 Thread Brian Candler
On Thu, Sep 27, 2012 at 10:08:12PM +0530, Indivar Nair wrote: We were trying to define our storage spec for Gluster and was wondering which would be better purely from a performance perspective. 1. Use a simple 24 Disk JBOD with SAS Controller and export each hard disk as an

Re: [Gluster-users] XFS and MD RAID

2012-09-18 Thread Brian Candler
On Mon, Sep 10, 2012 at 09:29:25AM +0800, Jack Wang wrote: Hi Brian, below patch should fix your bug. John reports: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202] [..] Call Trace: [8141782a] scsi_remove_target+0xda/0x1f0 [81421de5]

Re: [Gluster-users] XFS and MD RAID

2012-09-11 Thread Brian Candler
On Mon, Sep 10, 2012 at 06:43:44PM +0100, Brian Candler wrote: It has been running fine for the last 7 hours or so. I have purposely sent some dd reads to the two failed drives in the server - I see the errors on those drives in dmesg, but activity on the remaining drives has not been affected

Re: [Gluster-users] XFS and MD RAID

2012-09-11 Thread Brian Candler
On Tue, Sep 11, 2012 at 09:51:28AM +0100, Brian Candler wrote: So this patch looks good to me. I hope it can find its way to the production Ubuntu kernel soon. FYI, I have opened a bug report at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1049013

Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Brian Candler
On Sun, Sep 09, 2012 at 09:28:47PM +0100, Andrei Mikhailovsky wrote: While trying to figure out the cause of the bottleneck I've realised that the bottleneck is coming from the client side, as running concurrent tests from two clients would give me about 650mb/s per client.

Re: [Gluster-users] XFS and MD RAID

2012-09-10 Thread Brian Candler
On Mon, Sep 10, 2012 at 09:29:25AM +0800, Jack Wang wrote: below patch should fix your bug. Thank you Jack - that was a very quick response! I'm building a new kernel with this patch now and will report back. However, I think the existence of this bug suggests that Linux with software RAID is

Re: [Gluster-users] Throughout over infiniband

2012-09-10 Thread Brian Candler
On Mon, Sep 10, 2012 at 10:03:14AM +0200, Stephan von Krawczynski wrote: Yes - so in workloads where you have many concurrent clients, this isn't a problem. It's only a problem if you have a single client doing a lot of sequential operations. That is not correct for most cases. GlusterFS

Re: [Gluster-users] XFS and MD RAID

2012-09-10 Thread Brian Candler
On Mon, Sep 10, 2012 at 11:03:41AM +0200, Stephan von Krawczynski wrote: Brian, please re-think this. What you call a stable kernel (Ubuntu 3.2.0-30) is indeed very old. I am talking about the official kernel for the Ubuntu 12.04 Long Term Support server release. If you're saying that Ubuntu

Re: [Gluster-users] XFS and MD RAID

2012-09-10 Thread Brian Candler
On Mon, Sep 10, 2012 at 09:39:18AM +0100, Brian Candler wrote: On Mon, Sep 10, 2012 at 09:29:25AM +0800, Jack Wang wrote: below patch should fix your bug. Thank you Jack - that was a very quick response! I'm building a new kernel with this patch now and will report back. It has been

Re: [Gluster-users] Unexpected Gluster behavior on startup without backing stores mounted

2012-09-07 Thread Brian Candler
On Thu, Sep 06, 2012 at 03:19:53PM -0400, Whit Blauvelt wrote: Here's the unexpected behavior: Gluster restored the nfs export based on the backing store. But without that backing store really mounted, it used /mnt/xyz, which at that point was a local subdirectory of /. This is not optimal. An

Re: [Gluster-users] Feature Request, Console Manger: Command History

2012-09-06 Thread Brian Candler
On Wed, Sep 05, 2012 at 08:53:27PM -0700, Eric wrote: I compiled Gluster by following the very simple directions that were provided in ./INSTALL: 1. ./configure 2. make 3. make install FWIW: There doesn't appear to be anything in the Makefile about readline. $ grep

Re: [Gluster-users] migration operations: Stopping a migration

2012-09-06 Thread Brian Candler
On Wed, Sep 05, 2012 at 10:40:58PM -0700, Eric wrote: FYI: While working on another project, I discovered that the file system attributes had been restored to their original state (or at least what I remember their original state to be): I would be delighted if someone could write a

Re: [Gluster-users] XFS and MD RAID

2012-09-03 Thread Brian Candler
On Wed, Aug 29, 2012 at 09:06:28AM -0400, Joe Landman wrote: We've found modern LSI HBA and RAID gear have had issues with occasional events that seem to be more firmware bugs or driver bugs than anything else. The gear is stable for very light usage, but when pushed hard (without driver/fw

Re: [Gluster-users] Samba NFS Gluster leads to runaway lock problem after many stable months

2012-08-30 Thread Brian Candler
On Thu, Aug 30, 2012 at 09:44:30AM -0400, Whit Blauvelt wrote: See: http://www.linux-archive.org/centos/639945-xfs-inode64-nfs.html Given that my XFS Gluster NFS mount does not have inode64 set (although it's only 500G, not the 2T barrier beyond which that's evidently strictly required),

[Gluster-users] XFS and MD RAID

2012-08-29 Thread Brian Candler
Does anyone have any experience running gluster with XFS and MD RAID as the backend, and/or LSI HBAs, especially bad experience? In a test setup (Ubuntu 12.04, gluster 3.3.0, 24 x SATA HD on LSI Megaraid controllers, MD RAID) I can cause XFS corruption just by throwing some bonnie++ load at the

Re: [Gluster-users] XFS and MD RAID

2012-08-29 Thread Brian Candler
On Wed, Aug 29, 2012 at 08:47:22AM -0400, Brian Foster wrote: We have a few servers with 12 drive LSI RAID controllers we use for gluster (running XFS on RHEL6.2). I don't recall seeing major issues, but to be fair these particular systems see more hacking/dev/unit test work than longevity or

Re: [Gluster-users] Ownership changed to root

2012-08-28 Thread Brian Candler
On Tue, Aug 28, 2012 at 10:01:16AM +0200, Stephan von Krawczynski wrote: Again, let me note two things: - the current code has a lot more (other) problems than the 2.X tree, that is why we won't use that. - if one has to look at the code to find out the basic problem he is not the target

Re: [Gluster-users] Typical setup questions

2012-08-28 Thread Brian Candler
On Tue, Aug 28, 2012 at 10:29:55AM -0500, Matt Weil wrote: Since we are on the subject of hardware, what would be the perfect fit for a gluster brick? We were looking at a PowerEdge C2100 Rack Server. Looks fine to me; UK website doesn't give the pricing so I imagine it's pretty expensive :-)

Re: [Gluster-users] Ownership changed to root

2012-08-27 Thread Brian Candler
On Mon, Aug 27, 2012 at 03:08:21PM +0200, Stephan von Krawczynski wrote: The gluster version is 2.X and cannot be changed. Ah, that's the important bit. If you have a way to replicate the problem with current code it will be easier to get someone to look at it. AFAIK the glusterfsd versions

Re: [Gluster-users] Ownership changed to root

2012-08-26 Thread Brian Candler
On Fri, Aug 24, 2012 at 07:45:35PM -0600, Joe Topjian wrote: This removed mdadm and LVM out of the equation and the problem went away. I then tried with just LVM and still did not see this problem. Unfortunately I don't have enough hardware at the moment to create another RAID1

Re: [Gluster-users] Ownership changed to root

2012-08-26 Thread Brian Candler
On Sun, Aug 26, 2012 at 03:50:16PM +0200, Stephan von Krawczynski wrote: I'd like to point you to [Gluster-devel] Specific bug question dated a few days ago, where I describe a trivial situation when owner changes on a brick can occur, asking if someone can point me to a patch for that. I guess

Re: [Gluster-users] Typical setup questions

2012-08-24 Thread Brian Candler
On Fri, Aug 24, 2012 at 10:51:24AM -0500, Matt Weil wrote: I am curious what is used typically for the file system replication and how do you make sure that it is consistent. So for example when using large 3TB+ sata/NL-sas drives. Is it typical to replicate three times to get similar

Re: [Gluster-users] Gluster failure testing

2012-08-15 Thread Brian Candler
On Tue, Aug 14, 2012 at 08:19:27PM -0700, stephen pierce wrote: I let both clients run for a while, then I stop one client. I then reset the brick/server that is not active (the other one is servicing the HTTP traffic) now. Do you mean that client1 sends HTTP traffic to brick/server1,

Re: [Gluster-users] Problem mounting Gluster volume [3.3]

2012-08-15 Thread Brian Candler
On Wed, Aug 15, 2012 at 12:06:09AM +0200, Paolo Di Tommaso wrote: Hi, I'm mounting using the following sudo mount -t glusterfs master:/vol1 /soft The command should be right since, it works in on the server node (master) but it is failing on a client. Also I'm using

Re: [Gluster-users] ext4 issue explained

2012-08-15 Thread Brian Candler
On Wed, Aug 15, 2012 at 01:37:32AM -0700, Joe Julian wrote: Do you have a link to any info on that issue? Does it only affect RedHat, or does it also affect distros running new kernels? I am using ext4 rather than xfs because I was reliably able to make machines running xfs lock up (these

Re: [Gluster-users] ext4 issue explained

2012-08-15 Thread Brian Candler
On Wed, Aug 15, 2012 at 08:19:16AM -0400, Jeff Darcy wrote: On August 15, 2012 5:28:58 AM Brian Candler b.cand...@pobox.com wrote: Many thanks. I'm on Ubuntu 12.04 with a 3.2.0 kernel - so this shouldn't affect me, as long as the Ubuntu people don't backport this patch as well That would

Re: [Gluster-users] question about list directory missing files or hang

2012-08-14 Thread Brian Candler
On Tue, Aug 14, 2012 at 12:19:10AM -0700, Joe Julian wrote: I'm betting that your bricks are formatted ext4. If they are, you have a bug due to a recent structure change in ext4. If that is the problem, you can downgrade your kernel to before they backported the change (not sure

Re: [Gluster-users] Problem mounting Gluster volume [3.3]

2012-08-14 Thread Brian Candler
On Tue, Aug 14, 2012 at 04:56:17PM +0200, Paolo Di Tommaso wrote: The log reports the error (I've attached below the reported log file) : connection to failed (No route to host) ^^^ Missing hostname here. What exact command are you typing to try and mount? Also
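Not from the thread, but the usual first checks from the client side when the mount log shows an empty hostname and No route to host:

    getent hosts master        # does the client resolve the server name at all?
    ping -c1 master            # basic reachability
    nc -vz master 24007        # glusterd's management port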

Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Brian Candler
On Mon, Aug 13, 2012 at 09:40:49AM +, Fernando Frediani (Qube) wrote: I think Gluster, as it stands now at its current level of development, is more for multimedia and archival files, not for small files nor for running virtual machines. It still requires a fair amount of

Re: [Gluster-users] Gluster speed sooo slow

2012-08-13 Thread Brian Candler
One thing that is new to Gluster and that in my opinion could contribute to increase performance is the Distributed-Stripped volumes, Only if you have a single huge file, and you are doing a large read or write to it - i.e. exactly the opposite case of lots of small files. for other usages I

Re: [Gluster-users] Replica with non-uniform bricks

2012-08-13 Thread Brian Candler
On Mon, Aug 13, 2012 at 06:03:36PM +0200, s19n wrote: So the question is: what is the expected behaviour when two bricks with different sizes are coupled to form a replica? Will the larger brick keep accepting writes even after the smallest brick has been filled up? I haven't tested, but I'd

Re: [Gluster-users] 1/4 glusterfsd's runs amok; performance suffers;

2012-08-11 Thread Brian Candler
On Sat, Aug 11, 2012 at 12:11:39PM +0100, Nux! wrote: On 10.08.2012 22:16, Harry Mangalam wrote: pbs3:/dev/md127 8.2T 5.9T 2.3T 73% /bducgl --- Harry, The name of that md device (127) indicated there may be something dodgy going on there. A device shouldn't be named 127 unless some

Re: [Gluster-users] 1/4 glusterfsd's runs amok; performance suffers;

2012-08-11 Thread Brian Candler
On Sat, Aug 11, 2012 at 08:31:51AM -0700, Harry Mangalam wrote: Re the size difference, I'll explicitly rebalance the brick after the fix-layout finishes, but I'm even more worried about this fantastic increase in CPU usage and its effect on user performance. This presumably means you

Re: [Gluster-users] Problem creating volume

2012-08-10 Thread Brian Candler
On Fri, Aug 10, 2012 at 11:12:54AM +0200, Jeff Williams wrote: Also, I noticed that even when you successfully create a volume and then delete it, it still leaves the extended attributes on the directories. Is this by design or should I report this as a bug (seems like a bug to me!).
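For reference, the usual cleanup before reusing such a directory as a brick is to strip the gluster attributes and the .glusterfs directory; a sketch with a placeholder brick path (attribute names as used by gluster 3.3):

    setfattr -x trusted.glusterfs.volume-id /exports/brick1/myvol
    setfattr -x trusted.gfid /exports/brick1/myvol
    rm -rf /exports/brick1/myvol/.glusterfs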

Re: [Gluster-users] EBADFD with large number of concurrent files

2012-08-07 Thread Brian Candler
On Tue, Aug 07, 2012 at 05:18:24AM -0400, Shishir Gowda wrote: Can you please provide the client(mnt) log files? Also if possible can you take the state dumps and attach them in the mail gluster volume statedump volname The o/p will be files in /tmp/brick-path.pid.dump.x It is now looking

[Gluster-users] EBADFD with large number of concurrent files

2012-08-06 Thread Brian Candler
I have an application where there are 48 processes, and each one opens 1000 files (different files for all 48 processes). They are opened onto a distributed gluster volume, distributed between two nodes. It works initially, but after a while, some of the processes abort. perror prints File

Re: [Gluster-users] Gluster 3.3, brick crashed

2012-07-31 Thread Brian Candler
On Tue, Jul 31, 2012 at 02:04:25PM +0200, Christian Wittwer wrote: b) Can I just restart glusterd on that node to trigger the self healing? I would double-check that the underlying filesystem on unic-prd-os-compute4:/data/brick0 is OK first. Look for errors in dmesg; look at your RAID
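A sketch of those checks on the brick server (device name is a placeholder):

    dmesg | egrep -i 'xfs|error|i/o'   # kernel-level filesystem or disk errors
    cat /proc/mdstat                   # MD RAID health, if software RAID is used
    umount /data/brick0                # xfs_repair must run on an unmounted fs
    xfs_repair -n /dev/sdb1            # -n: report only, change nothing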
