Re: [Gluster-devel] epel-7 mock broken in rpm.t (due to ftp.redhat.com change?)

2014-06-12 Thread Justin Clift
At a guess, it'll be somewhere on:

  https://git.centos.org

No idea of specifics though.

+ Justin

On 13/06/2014, at 7:21 AM, Harshavardhana wrote:
> Interesting - looks like all the sources have been moved. Do we know where?
> 
> On Thu, Jun 12, 2014 at 10:48 PM, Justin Clift  wrote:
>> Hi Kaleb,
>> 
>> This just started showing up in rpm.t test output:
>> 
>>  ERROR: 
>> Exception(/home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/glusterfs-3.5qa2-0.621.gita22a2f0.el6.src.rpm)
>>  Config(epel-7-x86_64) 0 minutes 2 seconds
>>  INFO: Results and/or logs in: 
>> /home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/mock.d/epel-7-x86_64
>>  INFO: Cleaning up build root ('clean_on_failure=True')
>>  Start: lock buildroot
>>  Start: clean chroot
>>  INFO: chroot (/var/lib/mock/epel-7-x86_64) unlocked and deleted
>>  Finish: clean chroot
>>  Finish: lock buildroot
>>  ERROR: Command failed:
>>   # ['/usr/bin/yum', '--installroot', '/var/lib/mock/epel-7-x86_64/root/', 
>> 'install', '@buildsys-build']
>> 
>>  http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/repodata/repomd.xml
>>  : [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not 
>> Found"
>>  Trying other mirror.
>>  Error: Cannot retrieve repository metadata (repomd.xml) for repository: el. 
>> Please verify its path and try again
>> 
>> Seems to be due to ftp.redhat.com changing their layout or something, which
>> has broken mock.
>> 
>> Guessing we'll need to disable epel-7 testing until this gets
>> fixed.
>> 
>> + Justin
>> 
>> --
>> GlusterFS - http://www.gluster.org
>> 
>> An open source, distributed file system scaling to several
>> petabytes, and handling thousands of clients.
>> 
>> My personal twitter: twitter.com/realjustinclift
>> 
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> 
> 
> 
> -- 
> Religious confuse piety with mere ritual, the virtuous confuse
> regulation with outcomes

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] epel-7 mock broken in rpm.t (due to ftp.redhat.com change?)

2014-06-12 Thread Harshavardhana
Interesting - looks like all the sources have been moved. Do we know where?

On Thu, Jun 12, 2014 at 10:48 PM, Justin Clift  wrote:
> Hi Kaleb,
>
> This just started showing up in rpm.t test output:
>
>   ERROR: 
> Exception(/home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/glusterfs-3.5qa2-0.621.gita22a2f0.el6.src.rpm)
>  Config(epel-7-x86_64) 0 minutes 2 seconds
>   INFO: Results and/or logs in: 
> /home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/mock.d/epel-7-x86_64
>   INFO: Cleaning up build root ('clean_on_failure=True')
>   Start: lock buildroot
>   Start: clean chroot
>   INFO: chroot (/var/lib/mock/epel-7-x86_64) unlocked and deleted
>   Finish: clean chroot
>   Finish: lock buildroot
>   ERROR: Command failed:
># ['/usr/bin/yum', '--installroot', '/var/lib/mock/epel-7-x86_64/root/', 
> 'install', '@buildsys-build']
>
>   http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/repodata/repomd.xml
>   : [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not 
> Found"
>   Trying other mirror.
>   Error: Cannot retrieve repository metadata (repomd.xml) for repository: el. 
> Please verify its path and try again
>
> Seems to be due to ftp.redhat.com changing their layout or something, which
> has broken mock.
>
> Guessing we'll need to disable epel-7 testing until this gets
> fixed.
>
> + Justin
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel



-- 
Religious confuse piety with mere ritual, the virtuous confuse
regulation with outcomes
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bug-857330/normal.t failure

2014-06-12 Thread Kaushal M
I've got no logs so I can't confirm it. But it is most likely the same
issue we found.

~kaushal

On Thu, Jun 12, 2014 at 10:49 PM, Pranith Kumar Karampuri
 wrote:
> Kaushal,
> Could you check if this is the same rebalance failure we
> discovered?
>
> Pranith
>
> On 06/12/2014 10:35 PM, Justin Clift wrote:
>>
>> This one seems like a "proper" failure.  Is it on your radar?
>>
>>Test Summary Report
>>---
>>./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests: 24
>> Failed: 1)
>>  Failed test:  13
>>
>>http://build.gluster.org/job/rackspace-regression/123/console
>>
>> I've disconnected that slave from Jenkins, so it can be logged
>> into remotely (via SSH) and checked, if that's helpful.
>>
>> Let me know, and I'll send you the ssh password.
>>
>> + Justin
>>
>> --
>> GlusterFS - http://www.gluster.org
>>
>> An open source, distributed file system scaling to several
>> petabytes, and handling thousands of clients.
>>
>> My personal twitter: twitter.com/realjustinclift
>>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] epel-7 mock broken in rpm.t (due to ftp.redhat.com change?)

2014-06-12 Thread Justin Clift
Hi Kaleb,

This just started showing up in rpm.t test output:

  ERROR: 
Exception(/home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/glusterfs-3.5qa2-0.621.gita22a2f0.el6.src.rpm)
 Config(epel-7-x86_64) 0 minutes 2 seconds
  INFO: Results and/or logs in: 
/home/jenkins/root/workspace/rackspace-regression-2GB/rpmbuild-mock.d/mock.d/epel-7-x86_64
  INFO: Cleaning up build root ('clean_on_failure=True')
  Start: lock buildroot
  Start: clean chroot
  INFO: chroot (/var/lib/mock/epel-7-x86_64) unlocked and deleted
  Finish: clean chroot
  Finish: lock buildroot
  ERROR: Command failed: 
   # ['/usr/bin/yum', '--installroot', '/var/lib/mock/epel-7-x86_64/root/', 
'install', '@buildsys-build']

  http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/repodata/repomd.xml
  : [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not 
Found"
  Trying other mirror.
  Error: Cannot retrieve repository metadata (repomd.xml) for repository: el. 
Please verify its path and try again

Seems to be due to ftp.redhat.com changing their layout or something, which
has broken mock.
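
A quick way to confirm from a slave that the layout really changed is to probe
the URL straight from the mock log, and to see which mock config references it
(just a manual check under the usual /etc/mock layout, not a fix):

  # Does the old beta repo still serve metadata?  (currently returns 404)
  curl -sI http://ftp.redhat.com/pub/redhat/rhel/beta/7/x86_64/os/repodata/repomd.xml | head -n 1

  # Which mock config on the slave points at it?
  grep -rl 'ftp.redhat.com/pub/redhat/rhel/beta/7' /etc/mock/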

Guessing we'll need to disable epel-7 testing until this gets
fixed.

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] glusterfs split-brain problem

2014-06-12 Thread Pranith Kumar Karampuri

hi,
Could you let us know exactly what problem you are running into?

Pranith
On 06/13/2014 09:27 AM, Krishnan Parthasarathi wrote:

Hi,
Pranith, who is the AFR maintainer, would be the best person to answer this
question. CC'ing Pranith and gluster-devel.

Krish

- Original Message -

Hi Krishnan Parthasarathi,

Could you tell me which glusterfs version has significant improvements for the
glusterfs split-brain problem?
Can you point me to the relevant links?

Thank you very much!




justgluste...@gmail.com



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] glusterfs split-brain problem

2014-06-12 Thread Krishnan Parthasarathi
Hi,
Pranith, who is the AFR maintainer, would be the best person to answer this 
question. CC'ing Pranith and gluster-devel.

Krish

- Original Message -
> Hi Krishnan Parthasarathi,
> 
> Could you tell me which glusterfs version has significant improvements for the
> glusterfs split-brain problem?
> Can you point me to the relevant links?
> 
> Thank you very much!
> 
> 
> 
> 
> justgluste...@gmail.com
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread RAGHAVENDRA TALUR
Thank you Pranith :)


On Fri, Jun 13, 2014 at 9:18 AM, Justin Clift  wrote:

> Thanks. :)
>
> + Justin
>
>
> On 13/06/2014, at 4:46 AM, Pranith Kumar Karampuri wrote:
> > Found the issue. It may take a while to get to it. Maybe we should
> > redesign rename in AFR for this. For now I've reverted it.
> >
> > Pranith
> > On 06/13/2014 02:16 AM, Justin Clift wrote:
> >> This one seems to be happening a lot now.  The last 3 failures (across
> >> different nodes) were from this test.
> >>
> >> Log files here:
> >>
> >>
> http://slave2.cloud.gluster.org/logs/glusterfs-logs-20140612%3a18%3a53%3a06.tgz
> >>
> >> (am installing Nginx on the slaves now, for easy log retrieval as
> recommended
> >> by Kaushal M)
> >>
> >> + Justin
>
> --
> GlusterFS - http://www.gluster.org
>
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
>
> My personal twitter: twitter.com/realjustinclift
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>



-- 
*Raghavendra Talur *
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread Justin Clift
Thanks. :)

+ Justin


On 13/06/2014, at 4:46 AM, Pranith Kumar Karampuri wrote:
> Found the issue. It may take a while to get to it. Maybe we should redesign
> rename in AFR for this. For now I've reverted it.
> 
> Pranith
> On 06/13/2014 02:16 AM, Justin Clift wrote:
>> This one seems to be happening a lot now.  The last 3 failures (across
>> different nodes) were from this test.
>> 
>> Log files here:
>> 
>>   
>> http://slave2.cloud.gluster.org/logs/glusterfs-logs-20140612%3a18%3a53%3a06.tgz
>> 
>> (am installing Nginx on the slaves now, for easy log retrieval as recommended
>> by Kaushal M)
>> 
>> + Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread Pranith Kumar Karampuri
Found the issue. It may take a while to get to it. Maybe we should
redesign rename in AFR for this. For now I've reverted it.


Pranith
On 06/13/2014 02:16 AM, Justin Clift wrote:

This one seems to be happening a lot now.  The last 3 failures (across
different nodes) were from this test.

Log files here:

   
http://slave2.cloud.gluster.org/logs/glusterfs-logs-20140612%3a18%3a53%3a06.tgz

(am installing Nginx on the slaves now, for easy log retrieval as recommended
by Kaushal M)

+ Justin


On 12/06/2014, at 1:23 PM, Pranith Kumar Karampuri wrote:

Thanks for reporting. Will take a look.

Pranith

On 06/12/2014 05:52 PM, Raghavendra Talur wrote:

Hi Pranith,

This test failed for my patch set today and seems to be a spurious failure.
Here is the console output for the run.
http://build.gluster.org/job/rackspace-regression/107/consoleFull

Could you please have a look at it?

--
Thanks!
Raghavendra Talur | Red Hat Storage Developer | Bangalore |+918039245176


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Justin Clift
On 12/06/2014, at 6:47 PM, Pranith Kumar Karampuri wrote:
> On 06/12/2014 11:16 PM, Anand Avati wrote:

>> The client can actually be fixed to be compatible with both old and new 
>> servers. We can change the errno from ESTALE to ENOENT before doing the GFID 
>> mismatch check in client_lookup_cbk.
> But this will require a two hop upgrade. Is that normal/acceptable?


Thoughts on this from Corvid Tech:

> From: "David F. Robinson" 
> Subject: Re: Rolling upgrades discussion on mailing list
> Date: 12 June 2014 11:01:27 PM GMT+01:00
> 
> Read through the discussion.  Not sure I fully understood it, but here are my 
> thoughts regardless...
> 
> - We have roughly 1000-nodes (HPC setup) and 100 workstations that hopefully 
> will be using gluster as the primary storage system (if we can resolve the 
> few outstanding issues)... If these all had gluster-3.4.x rpm's and I wanted 
> to upgrade to 3.5.x rpm's, I don't particular care if I had to incrementally 
> update the packages (3.4.5 --> 3.4.6 --> 3.5.0 --> 3.5.1).  Requiring a 
> minimum version seems reasonable and workable to me as long as there is a 
> documented process to update.
> - What I wouldn't want to do if it was avoidable was to have to stop all i/o 
> on all of the hpc-nodes and workstations for the update.  If I could do 
> incremental 'rpm -Uvh' for the various versions "without" killing the 
> currently ongoing i/o, that is what would be most important.  It wasn't clear 
> to me if this was not possible at all, or not possible unless you upgraded 
> all of the clients before upgrading the server.
> - If the problem is that you have to update all of the clients BEFORE 
> updating the server, that doesn't sound unworkable as long as there is  a 
> mechanism to list all clients that are connected and display the gluster 
> version each client is using.
> - Also, a clearly documented process to update would be great.  We have lots 
> of questions and taking the entire storage device offline to do an upgrade 
> isn't really feasible.  My assumption after reading the link you sent is 
> that we would have to upgrade all of the client software (assumes that a 
> newer client can still write to an older server version... i.e. client 3.5.2 
> can write to a 3.4.5 server).  This isn't a big deal and seems reasonable, if 
> there is some method to find all of the connected clients.
> 
> - From the server side, I would like to be able to take one of the 
> nodes/bricks offline nicely (i.e. have the current i/o finish, accept no more 
> i/o requests, and then take the node offline for writes) and do the update. 
> Then bring that node back up, let the heal finish, and move on to the next 
> node/brick.  We are told that simply taking a node offline will not interrupt 
> the i/o if you are using a replica.  We have experienced mixed results with 
> this.  I believe that the active i/o to the brick fails but additional i/o 
> fails over to the other brick.  A nicer process to take a brick offline would 
> be desirable.
>- The other suggestion I have seen for server software updates is to 
> migrate all of the data from a node/brick to the other bricks of the system 
> (gluster remove-brick xxx), remove this brick from the pool, do the software 
> update, and then add it back to the pool. The issue here is that each of our 
> bricks is roughly 100TB. Moving all of that data to the other bricks in the 
> system will take a very long time. Not a practical solution for us...
> 
> 
> I don't mind having to use a process (incrementally upgrade all clients, take 
> bricks offline, update bricks),  as long as I didn't have to turn off all i/o 
> to the primary storage system for the entire company in order to execute an 
> upgrade... The number one requirement for us would be to have a process to do 
> an upgrade on a "live" system... Who cares if the process is painful... That 
> is the IT guy's problem :)...
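
(For reference, the remove-brick migration David mentions above looks roughly
like the following; the volume and brick names here are only placeholders:)

  # drain the brick, wait for the migration to finish, then drop it
  gluster volume remove-brick myvol server2:/bricks/b1 start
  gluster volume remove-brick myvol server2:/bricks/b1 status   # repeat until "completed"
  gluster volume remove-brick myvol server2:/bricks/b1 commit

  # after upgrading that node, add the brick back and rebalance
  gluster volume add-brick myvol server2:/bricks/b1
  gluster volume rebalance myvol start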


+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread Justin Clift
This one seems to be happening a lot now.  The last 3 failures (across
different nodes) were from this test.

Log files here:

  
http://slave2.cloud.gluster.org/logs/glusterfs-logs-20140612%3a18%3a53%3a06.tgz

(am installing Nginx on the slaves now, for easy log retrieval as recommended
by Kaushal M)

+ Justin


On 12/06/2014, at 1:23 PM, Pranith Kumar Karampuri wrote:
> Thanks for reporting. Will take a look.
> 
> Pranith
> 
> On 06/12/2014 05:52 PM, Raghavendra Talur wrote:
>> Hi Pranith,
>> 
>> This test failed for my patch set today and seems to be a spurious failure.
>> Here is the console output for the run.
>> http://build.gluster.org/job/rackspace-regression/107/consoleFull
>> 
>> Could you please have a look at it?
>> 
>> -- 
>> Thanks!
>> Raghavendra Talur | Red Hat Storage Developer | Bangalore |+918039245176
>> 
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Please don't start jobs on rackspace-regression yet

2014-06-12 Thread Justin Clift
Hi all,

Please don't start rackspace-regression testing jobs in Jenkins
yet.

I'm very manually changing settings on them, running a job, then
rebooting each node (not fun) while I try to make them more
reliable.

If you want a Gerrit CR run on one of the nodes, let me know
which one(s) and I'll get it done.

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Anand Avati
On Thu, Jun 12, 2014 at 10:47 AM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
> On 06/12/2014 11:16 PM, Anand Avati wrote:
>
>
>
>
> On Thu, Jun 12, 2014 at 10:39 AM, Ravishankar N 
> wrote:
>
>> On 06/12/2014 08:19 PM, Justin Clift wrote:
>>
>>> On 12/06/2014, at 2:22 PM, Ravishankar N wrote:
>>> 
>>>
 But we will still hit the problem when rolling upgrade is performed
 from 3.4 to 3.5,  unless the clients are also upgraded to 3.5

>>>
>>> Could we introduce a client side patch into (say) 3.4.5 that helps
>>> with this?
>>>
>>  But the client side patch is needed only if Avati's server (posix) fix
>> is present. And that is present only in 3.5 and not 3.4 .
>
>
>
>  The client can actually be fixed to be compatible with both old and new
> servers. We can change the errno from ESTALE to ENOENT before doing the
> GFID mismatch check in client_lookup_cbk.
>
> But this will require a two hop upgrade. Is that normal/acceptable?
>

One hop is always better than two hops. But at least we have a way out. It
is not at all unreasonable to have clients be of a minimum version for
rolling upgrades to work (upgrade still works no matter what the client
versions are if they are doing read-only access).
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Pranith Kumar Karampuri


On 06/12/2014 11:16 PM, Anand Avati wrote:




On Thu, Jun 12, 2014 at 10:39 AM, Ravishankar N <ravishan...@redhat.com> wrote:


On 06/12/2014 08:19 PM, Justin Clift wrote:

On 12/06/2014, at 2:22 PM, Ravishankar N wrote:


But we will still hit the problem when rolling upgrade is
performed
from 3.4 to 3.5,  unless the clients are also upgraded to 3.5


Could we introduce a client side patch into (say) 3.4.5 that helps
with this?

But the client side patch is needed only if Avati's server (posix)
fix is present. And that is present only in 3.5 and not 3.4 .



The client can actually be fixed to be compatible with both old and 
new servers. We can change the errno from ESTALE to ENOENT before 
doing the GFID mismatch check in client_lookup_cbk.

But this will require a two hop upgrade. Is that normal/acceptable?

Pranith


Thanks


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Justin Clift
On 12/06/2014, at 6:39 PM, Ravishankar N wrote:
> On 06/12/2014 08:19 PM, Justin Clift wrote:
>> On 12/06/2014, at 2:22 PM, Ravishankar N wrote:
>> 
>>> But we will still hit the problem when rolling upgrade is performed
>>> from 3.4 to 3.5,  unless the clients are also upgraded to 3.5
>> 
>> Could we introduce a client side patch into (say) 3.4.5 that helps
>> with this?
> But the client side patch is needed only if Avati's server (posix) fix is 
> present. And that is present only in 3.5 and not 3.4 .

No worries.  Was a thought. :)

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Anand Avati
On Thu, Jun 12, 2014 at 10:39 AM, Ravishankar N 
wrote:

> On 06/12/2014 08:19 PM, Justin Clift wrote:
>
>> On 12/06/2014, at 2:22 PM, Ravishankar N wrote:
>> 
>>
>>> But we will still hit the problem when rolling upgrade is performed
>>> from 3.4 to 3.5,  unless the clients are also upgraded to 3.5
>>>
>>
>> Could we introduce a client side patch into (say) 3.4.5 that helps
>> with this?
>>
> But the client side patch is needed only if Avati's server (posix) fix is
> present. And that is present only in 3.5 and not 3.4 .



The client can actually be fixed to be compatible with both old and new
servers. We can change the errno from ESTALE to ENOENT before doing the
GFID mismatch check in client_lookup_cbk.

Thanks
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Anand Avati
On Thu, Jun 12, 2014 at 10:33 AM, Vijay Bellur  wrote:

> On 06/12/2014 06:52 PM, Ravishankar N wrote:
>
>> Hi Vijay,
>>
>> Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1]
>> when a parent gfid (entry) is not present on the brick. In a
>> replicate set up, this causes a problem because AFR gives more priority
>> to ESTALE than ENOENT, causing IO to fail [2]. The fix is in progress at
>> [3] and is client-side specific , and would make it to 3.5.2
>>
>> But we will still hit the problem when rolling upgrade is performed from
>> 3.4 to 3.5, unless the clients are also upgraded to 3.5. To elaborate with
>> an example:
>>
>> 0) Create a 1x2 volume using 2 nodes and mount it from client. All
>> machines are glusterfs 3.4
>> 1) Perform for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz
>> -C $i& done
>> 2) While this is going on, kill one of the nodes in the replica pair and
>> upgrade it to glusterfs 3.5 (simulating rolling upgrade)
>> 3) After a while, kill all tar processes
>> 4) Create a backup directory and move all 1..30 dirs inside 'backup'
>> 5) Start the untar processes in 1) again
>> 6) Bring up the upgraded node. Tar fails with estale errors.
>>
>> Essentially the errors occur because [3] is a client side fix. But
>> rolling upgrades are targeted at servers while the older clients still
>> need to access them without issues.
>>
>> A solution is to have a fix in the posix translator wherein the newer
>> client passes its version (3.5) to posix_lookup(), which then sends
>> ESTALE if version is 3.5 or newer but sends ENOENT instead if it is an
>> older client. Does this seem okay?
>>
>>
> Cannot think of a better solution to this. Seamless rolling upgrades are
> necessary for us and the proposed fix does seem okay for that reason.
>
> Thanks,
> Vijay
>
>
I also like Justin's proposal of having fixes in 3.4.X and requiring
clients to be at least 3.4.X in order for rolling upgrades to 3.5.Y to work.
This way we can add the "special fix" in the 3.4.X client (just like the 3.5.2
client). Ravi's proposal "works", but all LOOKUPs will carry an extra xattr,
and we will be carrying forward the compat code burden for a very long
time, whereas a 3.4.X client fix will remain in the 3.4 branch.

Thanks
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Pranith Kumar Karampuri


On 06/12/2014 11:09 PM, Ravishankar N wrote:

On 06/12/2014 08:19 PM, Justin Clift wrote:

On 12/06/2014, at 2:22 PM, Ravishankar N wrote:


But we will still hit the problem when rolling upgrade is performed
from 3.4 to 3.5,  unless the clients are also upgraded to 3.5


Could we introduce a client side patch into (say) 3.4.5 that helps
with this?
But the client side patch is needed only if Avati's server (posix) fix 
is present. And that is present only in 3.5 and not 3.4 .



Then mandate that 3.4 -> 3.5 rolling upgrades have to be on 3.4.5
first?
The idea of a rolling upgrade is that customers don't have to stop 
their (hundreds of?) clients from accessing the gluster volume, and 
the replica configuration ensures HA during a rolling upgrade of the 
servers. When they can afford downtime of their application, they 
stop the clients and upgrade them as well. So the rolling upgrade 
solution has to be independent of client-side features.


3.4.x -> 3.4.5 should be "no issues" yeah?


After the fix on the server side i.e. posix xlator, no issue.

Pranith



+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Ravishankar N

On 06/12/2014 08:19 PM, Justin Clift wrote:

On 12/06/2014, at 2:22 PM, Ravishankar N wrote:


But we will still hit the problem when rolling upgrade is performed
from 3.4 to 3.5,  unless the clients are also upgraded to 3.5


Could we introduce a client side patch into (say) 3.4.5 that helps
with this?
But the client side patch is needed only if Avati's server (posix) fix 
is present. And that is present only in 3.5 and not 3.4 .



Then mandate that 3.4 -> 3.5 rolling upgrades have to be on 3.4.5
first?
The idea of a rolling upgrade is that customers don't have to stop their 
(hundreds of?) clients from accessing the gluster volume, and the 
replica configuration ensures HA during a rolling upgrade of the servers. When 
they can afford downtime of their application, they stop the clients 
and upgrade them as well. So the rolling upgrade solution has to be 
independent of client-side features.


3.4.x -> 3.4.5 should be "no issues" yeah?

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Vijay Bellur

On 06/12/2014 06:52 PM, Ravishankar N wrote:

Hi Vijay,

Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1]
when a parent gfid (entry) is not present on the brick. In a
replicate set up, this causes a problem because AFR gives more priority
to ESTALE than ENOENT, causing IO to fail [2]. The fix is in progress at
[3] and is client-side specific , and would make it to 3.5.2

But we will still hit the problem when rolling upgrade is performed from
3.4 to 3.5, unless the clients are also upgraded to 3.5. To elaborate with
an example:

0) Create a 1x2 volume using 2 nodes and mount it from client. All
machines are glusterfs 3.4
1) Perform for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz
-C $i& done
2) While this is going on, kill one of the nodes in the replica pair and
upgrade it to glusterfs 3.5 (simulating rolling upgrade)
3) After a while, kill all tar processes
4) Create a backup directory and move all 1..30 dirs inside 'backup'
5) Start the untar processes in 1) again
6) Bring up the upgraded node. Tar fails with estale errors.

Essentially the errors occur because [3] is a client side fix. But
rolling upgrades are targeted at servers while the older clients still
need to access them without issues.

A solution is to have a fix in the posix translator wherein the newer
client passes its version (3.5) to posix_lookup(), which then sends
ESTALE if version is 3.5 or newer but sends ENOENT instead if it is an
older client. Does this seem okay?



Cannot think of a better solution to this. Seamless rolling upgrades are 
necessary for us and the proposed fix does seem okay for that reason.


Thanks,
Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] bug-857330/normal.t failure

2014-06-12 Thread Pranith Kumar Karampuri

Kaushal,
Could you check if this is the same rebalance failure we 
discovered?


Pranith
On 06/12/2014 10:35 PM, Justin Clift wrote:

This one seems like a "proper" failure.  Is it on your radar?

   Test Summary Report
   ---
   ./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests: 24 Failed: 
1)
 Failed test:  13

   http://build.gluster.org/job/rackspace-regression/123/console

I've disconnected that slave from Jenkins, so it can be logged
into remotely (via SSH) and checked, if that's helpful.

Let me know, and I'll send you the ssh password.

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] bug-857330/normal.t failure

2014-06-12 Thread Justin Clift
This one seems like a "proper" failure.  Is it on your radar?

  Test Summary Report
  ---
  ./tests/bugs/bug-857330/normal.t(Wstat: 0 Tests: 24 Failed: 1)
Failed test:  13

  http://build.gluster.org/job/rackspace-regression/123/console

I've disconnected that slave from Jenkins, so it can be logged
into remotely (via SSH) and checked, if that's helpful.

Let me know, and I'll send you the ssh password.

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Spurious regression of tests/basic/mgmt_v3-locks.t

2014-06-12 Thread Pranith Kumar Karampuri

Avra/Poornima,
 Please look into this.

Patch              ==> http://review.gluster.com/#/c/6483/9
Author             ==> Poornima pguru...@redhat.com
Build triggered by ==> amarts
Build-url          ==> http://build.gluster.org/job/regression/4847/consoleFull
Download-log-at    ==> http://build.gluster.org:443/logs/regression/glusterfs-logs-20140612:08:37:44.tgz
Test written by    ==> Author: Avra Sengupta 
Test written by   ==> Author: Avra Sengupta 

./tests/basic/mgmt_v3-locks.t [11, 12, 13]
0 #!/bin/bash
1
2 . $(dirname $0)/../include.rc
3 . $(dirname $0)/../cluster.rc
4
5 function check_peers {
6 $CLI_1 peer status | grep 'Peer in Cluster (Connected)' | wc -l
7 }
8
9 function volume_count {
   10 local cli=$1;
   11 if [ $cli -eq '1' ] ; then
   12 $CLI_1 volume info | grep 'Volume Name' | wc -l;
   13 else
   14 $CLI_2 volume info | grep 'Volume Name' | wc -l;
   15 fi
   16 }
   17
   18 function volinfo_field()
   19 {
   20 local vol=$1;
   21 local field=$2;
   22
   23 $CLI_1 volume info $vol | grep "^$field: " | sed 's/.*: //';
   24 }
   25
   26 function two_diff_vols_create {
   27 # Both volume creates should be successful
   28 $CLI_1 volume create $V0 $H1:$B1/$V0 $H2:$B2/$V0 $H3:$B3/$V0 &
   29 PID_1=$!
   30
   31 $CLI_2 volume create $V1 $H1:$B1/$V1 $H2:$B2/$V1 $H3:$B3/$V1 &
   32 PID_2=$!
   33
   34 wait $PID_1 $PID_2
   35 }
   36
   37 function two_diff_vols_start {
   38 # Both volume starts should be successful
   39 $CLI_1 volume start $V0 &
   40 PID_1=$!
   41
   42 $CLI_2 volume start $V1 &
   43 PID_2=$!
   44
   45 wait $PID_1 $PID_2
   46 }
   47
   48 function two_diff_vols_stop_force {
   49 # Force stop, so that if rebalance from the
   50 # remove bricks is in progress, stop can
   51 # still go ahead. Both volume stops should
   52 # be successful
   53 $CLI_1 volume stop $V0 force &
   54 PID_1=$!
   55
   56 $CLI_2 volume stop $V1 force &
   57 PID_2=$!
   58
   59 wait $PID_1 $PID_2
   60 }
   61
   62 function same_vol_remove_brick {
   63
   64 # Running two same vol commands at the same time can result in
   65 # two success', two failures, or one success and one failure, all
   66 # of which are valid. The only thing that shouldn't happen is a
   67 # glusterd crash.
   68
   69 local vol=$1
   70 local brick=$2
   71 $CLI_1 volume remove-brick $1 $2 start &
   72 $CLI_2 volume remove-brick $1 $2 start
   73 }
   74
   75 cleanup;
   76
   77 TEST launch_cluster 3;
   78 TEST $CLI_1 peer probe $H2;
   79 TEST $CLI_1 peer probe $H3;
   80
   81 EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers
   82
   83 two_diff_vols_create
   84 EXPECT 'Created' volinfo_field $V0 'Status';
   85 EXPECT 'Created' volinfo_field $V1 'Status';
   86
   87 two_diff_vols_start
   88 EXPECT 'Started' volinfo_field $V0 'Status';
   89 EXPECT 'Started' volinfo_field $V1 'Status';
   90
   91 same_vol_remove_brick $V0 $H2:$B2/$V0
   92 # Checking glusterd crashed or not after same volume remove brick
   93 # on both nodes.
   94 EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers
   95
   96 same_vol_remove_brick $V1 $H2:$B2/$V1
   97 # Checking glusterd crashed or not after same volume remove brick
   98 # on both nodes.
   99 EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers
  100
  101 $CLI_1 volume set $V0 diagnostics.client-log-level DEBUG &
  102 $CLI_1 volume set $V1 diagnostics.client-log-level DEBUG
  103 kill_glusterd 3
  104 $CLI_1 volume status $V0
  105 $CLI_2 volume status $V1
  106 $CLI_1 peer status
**107 EXPECT_WITHIN $PROBE_TIMEOUT 1 check_peers
**108 EXPECT 'Started' volinfo_field $V0 'Status';
**109 EXPECT 'Started' volinfo_field $V1 'Status';
  110
  111 TEST $glusterd_3
  112 $CLI_1 volume status $V0
  113 $CLI_2 volume status $V1
  114 $CLI_1 peer status
  115 #EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers
  116 #EXPECT 'Started' volinfo_field $V0 'Status';
  117 #EXPECT 'Started' volinfo_field $V1 'Status';
  118 #two_diff_vols_stop_force
  119 #EXPECT_WITHIN $PROBE_TIMEOUT 2 check_peers
  120 cleanup;

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Justin Clift
On 12/06/2014, at 2:22 PM, Ravishankar N wrote:

> But we will still hit the problem when rolling upgrade is performed
> from 3.4 to 3.5,  unless the clients are also upgraded to 3.5


Could we introduce a client side patch into (say) 3.4.5 that helps
with this?

Then mandate that 3.4 -> 3.5 rolling upgrades have to be on 3.4.5
first?

3.4.x -> 3.4.5 should be "no issues" yeah?

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Please use http://build.gluster.org/job/rackspace-regression/

2014-06-12 Thread Justin Clift
On 12/06/2014, at 10:22 AM, Pranith Kumar Karampuri wrote:
> Hi guys,
> Rackspace slaves are in action now, thanks to Justin. Please use the URL
> in the subject to run the regressions. I have already shifted some jobs to Rackspace.


Good thinking, but please hold off on this for now.

The slaves are hugely unreliable (lots of hanging), at the
moment. :(

Rebooting each slave after each run seems to help, but that's
not a real solution.

I'll be adjusting and tweaking their settings throughout the day
in order to improve it.

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-12 Thread Ravishankar N

Hi Vijay,

Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1] 
when a parent gfid (entry) is not present on the brick. In a 
replicate setup, this causes a problem because AFR gives more priority 
to ESTALE than ENOENT, causing IO to fail [2]. The fix is in progress at 
[3], is client-side specific, and would make it to 3.5.2.


But we will still hit the problem when rolling upgrade is performed from 
3.4 to 3.5, unless the clients are also upgraded to 3.5. To elaborate with 
an example:


0) Create a 1x2 volume using 2 nodes and mount it from client. All 
machines are glusterfs 3.4
1) Perform for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz 
-C $i& done
2) While this is going on, kill one of the nodes in the replica pair and 
upgrade it to glusterfs 3.5 (simulating rolling upgrade)

3) After a while, kill all tar processes
4) Create a backup directory and move all 1..30 dirs inside 'backup'
5) Start the untar processes in 1) again
6) Bring up the upgraded node. Tar fails with estale errors.
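
(The same steps as a rough shell sketch, run on the client; the host names,
volume name and tarball path below are only placeholders:)

  mount -t glusterfs node1:/testvol /mnt/testvol
  cd /mnt/testvol

  # step 1: parallel untars into 30 directories
  for i in {1..30}; do mkdir $i; tar xf /tmp/glusterfs-3.5git.tar.gz -C $i & done

  # step 2: on one server of the replica pair, stop gluster and upgrade it to
  # 3.5, but keep it down for now (simulates the rolling-upgrade window)

  # steps 3-5: stop the load, move the directories aside, restart the load
  pkill -f 'tar xf /tmp/glusterfs-3.5git.tar.gz'
  mkdir backup && mv {1..30} backup/
  for i in {1..30}; do mkdir $i; tar xf /tmp/glusterfs-3.5git.tar.gz -C $i & done

  # step 6: start glusterd on the upgraded node; the tars then fail with ESTALE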

Essentially the errors occur because [3] is a client side fix. But 
rolling upgrades are targeted at servers while the older clients still 
need to access them without issues.


A solution is to have a fix in the posix translator wherein the newer 
client passes its version (3.5) to posix_lookup(), which then sends 
ESTALE if version is 3.5 or newer but sends ENOENT instead if it is an 
older client. Does this seem okay?


[1] http://review.gluster.org/6318
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1106408
[3] http://review.gluster.org/#/c/8015/
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread Pranith Kumar Karampuri

Thanks for reporting. Will take a look.

Pranith

On 06/12/2014 05:52 PM, Raghavendra Talur wrote:

Hi Pranith,

This test failed for my patch set today and seems to be a spurious 
failure.

Here is the console output for the run.
http://build.gluster.org/job/rackspace-regression/107/consoleFull

Could you please have a look at it?

--
Thanks!
Raghavendra Talur | Red Hat Storage Developer | Bangalore |+918039245176



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Spurious regression test failure in ./tests/bugs/bug-1101143.t

2014-06-12 Thread Raghavendra Talur
Hi Pranith, 

This test failed for my patch set today and seems to be a spurious failure. 
Here is the console output for the run. 
http://build.gluster.org/job/rackspace-regression/107/consoleFull 

Could you please have a look at it? 

-- 
Thanks! 
Raghavendra Talur | Red Hat Storage Developer | Bangalore | +918039245176 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t

2014-06-12 Thread Pranith Kumar Karampuri

Thanks a lot for the quick resolution, Sachin.

Pranith
On 06/12/2014 04:38 PM, Sachin Pandit wrote:

http://review.gluster.org/#/c/8041/ is merged upstream.

~ Sachin.
- Original Message -
From: "Sachin Pandit" 
To: "Raghavendra Talur" 
Cc: "Pranith Kumar Karampuri" , "Gluster Devel" 

Sent: Thursday, June 12, 2014 12:58:44 PM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Patch link http://review.gluster.org/#/c/8041/.

~ Sachin.

- Original Message -
From: "Raghavendra Talur" 
To: "Pranith Kumar Karampuri" 
Cc: "Sachin Pandit" , "Gluster Devel" 

Sent: Thursday, June 12, 2014 10:46:14 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Sachin and I looked at the failure.

Current guess is that glusterd_2 had not yet completed the handshake with
glusterd_1 and hence did not know about the option set.

KP suggested that instead of having a sleep before this command,
we could get peer status and verify that it is 1 and then get the
vol info. Although even this does not make the test fully deterministic,
we will be closer to it. Sachin will send out a patch for the same.

Raghavendra Talur

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Sachin Pandit" 
Cc: "Gluster Devel" 
Sent: Thursday, June 12, 2014 9:54:03 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Check the logs to find the reason.

Pranith.
On 06/12/2014 09:24 AM, Sachin Pandit wrote:

I am not hitting this even after running the test case in a loop.
I'll update in this thread once I find out the root cause of the failure.

~ Sachin

- Original Message -
From: "Sachin Pandit" 
To: "Pranith Kumar Karampuri" 
Cc: "Gluster Devel" 
Sent: Thursday, June 12, 2014 8:50:40 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

I will look into this.

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Gluster Devel" 
Cc: rta...@redhat.com, span...@redhat.com
Sent: Wednesday, June 11, 2014 9:08:44 PM
Subject: spurious regression failure in tests/bugs/bug-1104642.t

Raghavendra/Sachin,
Could one of you guys take a look at this please.

pk1@localhost - ~/workspace/gerrit-repo (master)
21:04:46 :) ⚡ ~/.scripts/regression.py
http://build.gluster.org/job/regression/4831/consoleFull
Patch ==> http://review.gluster.com/#/c/7994/2
Author ==> Raghavendra Talur rta...@redhat.com
Build triggered by ==> amarts
Build-url ==> http://build.gluster.org/job/regression/4831/consoleFull
Download-log-at ==>
http://build.gluster.org:443/logs/regression/glusterfs-logs-20140611:08:39:04.tgz
Test written by ==> Author: Sachin Pandit 

./tests/bugs/bug-1104642.t [13]
0 #!/bin/bash
1
2 . $(dirname $0)/../include.rc
3 . $(dirname $0)/../volume.rc
4 . $(dirname $0)/../cluster.rc
5
6
7 function get_value()
8 {
9 local key=$1
10 local var="CLI_$2"
11
12 eval cli_index=\$$var
13
14 $cli_index volume info | grep "^$key"\
15 | sed 's/.*: //'
16 }
17
18 cleanup
19
20 TEST launch_cluster 2
21
22 TEST $CLI_1 peer probe $H2;
23 EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
24
25 TEST $CLI_1 volume create $V0 $H1:$B1/${V0}0 $H2:$B2/${V0}1
26 EXPECT "$V0" get_value 'Volume Name' 1
27 EXPECT "Created" get_value 'Status' 1
28
29 TEST $CLI_1 volume start $V0
30 EXPECT "Started" get_value 'Status' 1
31
32 #Bring down 2nd glusterd
33 TEST kill_glusterd 2
34
35 #set the volume all options from the 1st glusterd
36 TEST $CLI_1 volume set all cluster.server-quorum-ratio 80
37
38 #Bring back the 2nd glusterd
39 TEST $glusterd_2
40
41 #Verify whether the value has been synced
42 EXPECT '80' get_value 'cluster.server-quorum-ratio' 1
***43 EXPECT '80' get_value 'cluster.server-quorum-ratio' 2
44
45 cleanup;

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t

2014-06-12 Thread Sachin Pandit
http://review.gluster.org/#/c/8041/ is merged upstream.

~ Sachin.
- Original Message -
From: "Sachin Pandit" 
To: "Raghavendra Talur" 
Cc: "Pranith Kumar Karampuri" , "Gluster Devel" 

Sent: Thursday, June 12, 2014 12:58:44 PM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Patch link http://review.gluster.org/#/c/8041/.

~ Sachin.

- Original Message -
From: "Raghavendra Talur" 
To: "Pranith Kumar Karampuri" 
Cc: "Sachin Pandit" , "Gluster Devel" 

Sent: Thursday, June 12, 2014 10:46:14 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Sachin and I looked at the failure.

Current guess is that glusterd_2 had not yet completed the handshake with
glusterd_1 and hence did not know about the option set.

KP suggested that instead of having a sleep before this command,
we could get peer status and verify that it is 1 and then get the 
vol info. Although even this does not make the test fully deterministic,
we will be closer to it. Sachin will send out a patch for the same.

Raghavendra Talur 

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Sachin Pandit" 
Cc: "Gluster Devel" 
Sent: Thursday, June 12, 2014 9:54:03 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Check the logs to find the reason.

Pranith.
On 06/12/2014 09:24 AM, Sachin Pandit wrote:
> I am not hitting this even after running the test case in a loop.
> I'll update in this thread once I find out the root cause of the failure.
>
> ~ Sachin
>
> - Original Message -
> From: "Sachin Pandit" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Thursday, June 12, 2014 8:50:40 AM
> Subject: Re: [Gluster-devel] spurious regression failure in   
> tests/bugs/bug-1104642.t
>
> I will look into this.
>
> - Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Gluster Devel" 
> Cc: rta...@redhat.com, span...@redhat.com
> Sent: Wednesday, June 11, 2014 9:08:44 PM
> Subject: spurious regression failure in tests/bugs/bug-1104642.t
>
> Raghavendra/Sachin,
> Could one of you guys take a look at this please.
>
> pk1@localhost - ~/workspace/gerrit-repo (master)
> 21:04:46 :) ⚡ ~/.scripts/regression.py
> http://build.gluster.org/job/regression/4831/consoleFull
> Patch ==> http://review.gluster.com/#/c/7994/2
> Author ==> Raghavendra Talur rta...@redhat.com
> Build triggered by ==> amarts
> Build-url ==> http://build.gluster.org/job/regression/4831/consoleFull
> Download-log-at ==>
> http://build.gluster.org:443/logs/regression/glusterfs-logs-20140611:08:39:04.tgz
> Test written by ==> Author: Sachin Pandit 
>
> ./tests/bugs/bug-1104642.t [13]
> 0 #!/bin/bash
> 1
> 2 . $(dirname $0)/../include.rc
> 3 . $(dirname $0)/../volume.rc
> 4 . $(dirname $0)/../cluster.rc
> 5
> 6
> 7 function get_value()
> 8 {
> 9 local key=$1
> 10 local var="CLI_$2"
> 11
> 12 eval cli_index=\$$var
> 13
> 14 $cli_index volume info | grep "^$key"\
> 15 | sed 's/.*: //'
> 16 }
> 17
> 18 cleanup
> 19
> 20 TEST launch_cluster 2
> 21
> 22 TEST $CLI_1 peer probe $H2;
> 23 EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
> 24
> 25 TEST $CLI_1 volume create $V0 $H1:$B1/${V0}0 $H2:$B2/${V0}1
> 26 EXPECT "$V0" get_value 'Volume Name' 1
> 27 EXPECT "Created" get_value 'Status' 1
> 28
> 29 TEST $CLI_1 volume start $V0
> 30 EXPECT "Started" get_value 'Status' 1
> 31
> 32 #Bring down 2nd glusterd
> 33 TEST kill_glusterd 2
> 34
> 35 #set the volume all options from the 1st glusterd
> 36 TEST $CLI_1 volume set all cluster.server-quorum-ratio 80
> 37
> 38 #Bring back the 2nd glusterd
> 39 TEST $glusterd_2
> 40
> 41 #Verify whether the value has been synced
> 42 EXPECT '80' get_value 'cluster.server-quorum-ratio' 1
> ***43 EXPECT '80' get_value 'cluster.server-quorum-ratio' 2
> 44
> 45 cleanup;
>
> Pranith
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel

-- 
Thanks! 
Raghavendra Talur | Red Hat Storage Developer | Bangalore | +918039245176 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Request for merging patch

2014-06-12 Thread Vijay Bellur

On 06/12/2014 01:35 PM, Pranith Kumar Karampuri wrote:

Vijay,
Could you merge this patch please.

http://review.gluster.org/7928



Done, thanks.

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Please use http://build.gluster.org/job/rackspace-regression/

2014-06-12 Thread Pranith Kumar Karampuri

Hi guys,
     Rackspace slaves are in action now, thanks to Justin. Please use 
the URL in the subject to run the regressions. I have already shifted some jobs 
to Rackspace.


Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Request for merging patch

2014-06-12 Thread Pranith Kumar Karampuri

Vijay,
   Could you merge this patch please.

http://review.gluster.org/7928

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] spurious regression failure in tests/bugs/bug-1104642.t

2014-06-12 Thread Sachin Pandit
Patch link http://review.gluster.org/#/c/8041/.

~ Sachin.

- Original Message -
From: "Raghavendra Talur" 
To: "Pranith Kumar Karampuri" 
Cc: "Sachin Pandit" , "Gluster Devel" 

Sent: Thursday, June 12, 2014 10:46:14 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Sachin and I looked at the failure.

Current guess is that glusterd_2 had not yet completed the handshake with
glusterd_1 and hence did not know about the option set.

KP suggested that instead of having a sleep before this command,
we could get peer status and verify that it is 1 and then get the 
vol info. Although even this does not make the test fully deterministic,
we will be closer to it. Sachin will send out a patch for the same.

Raghavendra Talur 

- Original Message -
From: "Pranith Kumar Karampuri" 
To: "Sachin Pandit" 
Cc: "Gluster Devel" 
Sent: Thursday, June 12, 2014 9:54:03 AM
Subject: Re: [Gluster-devel] spurious regression failure in 
tests/bugs/bug-1104642.t

Check the logs to find the reason.

Pranith.
On 06/12/2014 09:24 AM, Sachin Pandit wrote:
> I am not hitting this even after running the test case in a loop.
> I'll update in this thread once I find out the root cause of the failure.
>
> ~ Sachin
>
> - Original Message -
> From: "Sachin Pandit" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Thursday, June 12, 2014 8:50:40 AM
> Subject: Re: [Gluster-devel] spurious regression failure in   
> tests/bugs/bug-1104642.t
>
> I will look into this.
>
> - Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Gluster Devel" 
> Cc: rta...@redhat.com, span...@redhat.com
> Sent: Wednesday, June 11, 2014 9:08:44 PM
> Subject: spurious regression failure in tests/bugs/bug-1104642.t
>
> Raghavendra/Sachin,
> Could one of you guys take a look at this please.
>
> pk1@localhost - ~/workspace/gerrit-repo (master)
> 21:04:46 :) ⚡ ~/.scripts/regression.py
> http://build.gluster.org/job/regression/4831/consoleFull
> Patch ==> http://review.gluster.com/#/c/7994/2
> Author ==> Raghavendra Talur rta...@redhat.com
> Build triggered by ==> amarts
> Build-url ==> http://build.gluster.org/job/regression/4831/consoleFull
> Download-log-at ==>
> http://build.gluster.org:443/logs/regression/glusterfs-logs-20140611:08:39:04.tgz
> Test written by ==> Author: Sachin Pandit 
>
> ./tests/bugs/bug-1104642.t [13]
> 0 #!/bin/bash
> 1
> 2 . $(dirname $0)/../include.rc
> 3 . $(dirname $0)/../volume.rc
> 4 . $(dirname $0)/../cluster.rc
> 5
> 6
> 7 function get_value()
> 8 {
> 9 local key=$1
> 10 local var="CLI_$2"
> 11
> 12 eval cli_index=\$$var
> 13
> 14 $cli_index volume info | grep "^$key"\
> 15 | sed 's/.*: //'
> 16 }
> 17
> 18 cleanup
> 19
> 20 TEST launch_cluster 2
> 21
> 22 TEST $CLI_1 peer probe $H2;
> 23 EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count
> 24
> 25 TEST $CLI_1 volume create $V0 $H1:$B1/${V0}0 $H2:$B2/${V0}1
> 26 EXPECT "$V0" get_value 'Volume Name' 1
> 27 EXPECT "Created" get_value 'Status' 1
> 28
> 29 TEST $CLI_1 volume start $V0
> 30 EXPECT "Started" get_value 'Status' 1
> 31
> 32 #Bring down 2nd glusterd
> 33 TEST kill_glusterd 2
> 34
> 35 #set the volume all options from the 1st glusterd
> 36 TEST $CLI_1 volume set all cluster.server-quorum-ratio 80
> 37
> 38 #Bring back the 2nd glusterd
> 39 TEST $glusterd_2
> 40
> 41 #Verify whether the value has been synced
> 42 EXPECT '80' get_value 'cluster.server-quorum-ratio' 1
> ***43 EXPECT '80' get_value 'cluster.server-quorum-ratio' 2
> 44
> 45 cleanup;
>
> Pranith
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel

-- 
Thanks! 
Raghavendra Talur | Red Hat Storage Developer | Bangalore | +918039245176 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] XFS kernel panic bug?

2014-06-12 Thread Niels de Vos
On Thu, Jun 12, 2014 at 07:26:25AM +0100, Justin Clift wrote:
> On 12/06/2014, at 6:58 AM, Niels de Vos wrote:
> 
> > If you capture a vmcore (needs kdump installed and configured), we may 
> > be able to see the cause more clearly.

Oh, these seem to be Xen hosts. I don't think kdump (mainly kexec) works 
on Xen. You would need to run xen-dump (or something like that) on the 
Dom0; for that, you'll have to call Rackspace support, and I have no 
idea how they handle such requests...

> That does help, and so will Harsha's suggestion too probably. :)

That is indeed a solution that can mostly prevent such memory 
deadlocks. Those options can be configured to push out the 
outstanding data earlier to the loop devices, and to the underlying XFS 
filesystem that holds the backing files for the loop devices.
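
For example, something along these lines makes the kernel push dirty data out
much earlier (Harsha's exact suggestion isn't quoted here; these are the
generic writeback knobs, with placeholder values):

  sysctl -w vm.dirty_background_ratio=5     # start background writeback sooner
  sysctl -w vm.dirty_ratio=10               # throttle writers before dirty pages pile up
  sysctl -w vm.dirty_expire_centisecs=1000  # treat dirty data as old after 10 seconds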

Cheers,
Niels

> I'll look into it properly later on today.
> 
> For the moment, I've rebooted the other slaves which seems to put them into
> an ok state for a few runs.
> 
> Also just started some rackspace-regression runs on them, using the ones
> queued up in the normal regression queue.
> 
> The results are being updated live into Gerrit now (+1/-1/MERGE CONFLICT).
> 
> So, if you see any regression runs pass on the slaves, it's worth removing
> the corresponding job from the main regression queue.  That'll help keep
> the queue shorter for today at least. :)
> 
> Btw - Happy vacation Niels :)
> 
> /me goes to bed
> 
> + Justin
> 
> --
> GlusterFS - http://www.gluster.org
> 
> An open source, distributed file system scaling to several
> petabytes, and handling thousands of clients.
> 
> My personal twitter: twitter.com/realjustinclift
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel