Re: [Gluster-devel] Introduce GD_OP_VERSION_5_10 and increase GD_OP_VERSION_MAX to this version

2020-01-29 Thread Atin Mukherjee
There’s no hard requirement to have an op-version tagged against a release
unless you introduce a new volume option. What you need to do is introduce a
new macro with a higher value than the current max op-version, and set the
max op-version to that same value - just like what is done when a new volume
option is introduced. If you introduce a new macro with a value lower than
the max op-version, you’re asking for trouble with backward compatibility
and with managing a heterogeneous cluster.
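
For illustration only, the pattern would look roughly like this against the
5.x headers - a sketch, where the option key and xlator below are
placeholders rather than actual code:

/* libglusterfs/src/globals.h: add the new macro above the current maximum
 * (50400), then point GD_OP_VERSION_MAX at it. */
#define GD_OP_VERSION_5_10 51000 /* Op-version for GlusterFS 5.10 */

#define GD_OP_VERSION_MAX GD_OP_VERSION_5_10

/* xlators/mgmt/glusterd/src/glusterd-volume-set.c: the new volume option
 * entry then carries the new op-version. */
{.key = "features.my-new-option",   /* placeholder key */
 .voltype = "features/my-xlator",   /* placeholder xlator */
 .op_version = GD_OP_VERSION_5_10},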

HTH

On Wed, 29 Jan 2020 at 14:28, David Spisla  wrote:

> Dear Gluster Devels,
> I am using Gluster 5.10 and want to introduce a new volume option.
> Therefore I want to set a proper GD_OP_VERSION for it. In the Gluster 5.10
> source code there is no macro defined for 51000.
>
> At the same time, GD_OP_VERSION_MAX is set to 50400. I would do
> something like this:
>
> 1. Change in libglusterfs/src/globals.h (line 47):
> #define GD_OP_VERSION_MAX GD_OP_VERSION_5_10
> 2. Add a line to the same header file:
> #define GD_OP_VERSION_5_10 51000 /* Op-version for GlusterFS 5.10 */
>
> Do you think this is fine?
>
> 3. libglusterfs/src/common-utils.c (line 2036):
> On the other hand, there is an if-branch there which uses GD_OP_VERSION_5_4,
> which is currently the GD_OP_VERSION_MAX. Why is it used here, and should I
> also increase it to GD_OP_VERSION_5_10?
>
> Regards
> David Spisla
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
> --
--Atin
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Release-8] Thin-Arbiter: Unique-ID requirement

2020-01-14 Thread Atin Mukherjee
From a design perspective, 2 is a better choice. However, I'd like to see a
design for how the cluster id will be generated and maintained (covering peer
addition/deletion scenarios, node replacement, etc.).

On Tue, Jan 14, 2020 at 1:42 PM Amar Tumballi  wrote:

> Hello,
>
> As we are gearing up for Release-8 and its planning, I wanted to bring up
> one of my favorite topics: 'Thin-Arbiter' (or Tie-Breaker/Metro-Cluster,
> etc.).
>
> We shipped thin-arbiter in v7.0 itself, and it works great when there is
> just one Gluster cluster. I am talking about a situation which involves
> multiple Gluster clusters, and easier management of thin-arbiter nodes.
> (Ref: https://github.com/gluster/glusterfs/issues/763)
>
> I am working towards hosting a thin-arbiter node service (free of cost) to
> which any gluster deployment can connect, saving the cost of the additional
> replica that is required today to avoid split-brain situations. The
> tie-breaker storage and process needs are so small that we could easily
> handle every gluster deployment to date on just one machine. When I looked
> at the code with this goal, I found that the current implementation doesn't
> support it, mainly because it uses the 'volumename' in the file it creates.
> This is fine for one cluster, as we don't allow duplicate volume names
> within a single cluster, and OK for multiple clusters as long as volume
> names don't collide.
>
> To resolve this properly, we have 2 options (as per my thinking now) to
> make it a truly global service.
>
> 1. Add a 'volume-id' option in the AFR volume itself, so each instance picks
> up the volume-id and uses it in the thin-arbiter name. A variant of this is
> submitted for review - https://review.gluster.org/23723 - but as it uses the
> volume-id from io-stats, that particular patch fails in brick-mux and
> shd-mux scenarios. A proper enhancement of this patch is providing the
> 'volume-id' option in AFR itself, so that glusterd (while generating
> volfiles) sends the proper vol-id to each instance.
>
> Pros: Minimal code changes to the above patch.
> Cons: One more option to AFR (not exposed to users).
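>
> As a purely illustrative sketch (field values are placeholders, not the
> actual patch), the hidden option would be one more entry in AFR's options
> table, which glusterd would fill in while generating the volfile:
>
>     {
>         .key = {"volume-id"},
>         .type = GF_OPTION_TYPE_STR,
>         .default_value = "",
>         .description = "UUID of the parent volume, used by thin-arbiter "
>                        "to derive a globally unique id-file name.",
>     },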
>
> 2. Add *cluster-id* to glusterd, and pass it to all processes. Let
> replicate use this in the thin-arbiter file. This too will solve the issue.
>
> Pros: A cluster-id is good to have in any distributed system, especially
> when there are deployments with 3 nodes each in different clusters.
> Identifying bricks and services as part of a cluster is better.
>
> Cons: The code changes are larger, and they are in the glusterd component.
>
> On another note, option 1 above is purely for the Thin-Arbiter feature,
> whereas option 2 would also be useful for debugging and for other solutions
> which involve multiple clusters.
>
> Let me know what you all think about this. It would be good to discuss it in
> next week's meeting and take it to completion.
>
> Regards,
> Amar
> ---
> https://kadalu.io
> Storage made easy for k8s
>
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Modifying gluster's logging mechanism

2019-11-21 Thread Atin Mukherjee
This is definitely a good start. In fact, the experiment you have done, which
indicates a 20% run-time performance improvement without the logger, puts this
work firmly in the ‘worth a try’ category. The only thing we need to be
mindful of here is the ordering of the logs, which has to be provided either
by a tool or by the logger itself taking care of it.

On Thu, 21 Nov 2019 at 18:34, Barak Sason Rofman 
wrote:

> Hello Gluster community,
>
> My name is Barak and I joined RH Gluster development in August.
> Shortly after my arrival, I identified a potential problem with
> gluster’s logging mechanism, and I’d like to bring the matter up for
> discussion.
>
> The general concept of the current mechanism is that every worker thread
> that needs to log a message has to contend for a mutex which guards the log
> file, write the message, flush the data, and then release the mutex.
> I see two design / implementation problems with that mechanism:
>
>1.
>
>The mutex that guards the log file is likely under constant contention.
>2.
>
>The fact that each worker thread perform the IO by himself, thus
>slowing his "real" work.
>
>
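> In pseudo-C, the write path every logging call currently takes is
> essentially the following (illustrative only - not the exact gf_log code):
>
>     #include <pthread.h>
>     #include <stdio.h>
>
>     static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
>     static FILE *log_fp;                 /* the single, shared log file    */
>
>     void log_msg(const char *msg)
>     {
>         pthread_mutex_lock(&log_lock);   /* every thread contends here     */
>         fprintf(log_fp, "%s\n", msg);    /* the worker does the I/O itself */
>         fflush(log_fp);
>         pthread_mutex_unlock(&log_lock); /* only now can the next one log  */
>     }
>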
> Initial tests, done by removing logging from the regression testing, *show
> an improvement of about 20% in run time*. This indicates we’re taking a
> pretty heavy performance hit just because of the logging activity.
>
> In addition to these problems, the logging module is due for an upgrade:
>
>1.
>
>There are dozens of APIs in the logger, much of them are deprecated -
>this makes it very hard for new developers to keep evolving the project.
>2.
>
>One of the key points for Gluster-X, presented in October at
>Bangalore, is the switch to a structured logging all across gluster.
>
>
> Given these points, I believe we’re in a position that allows us to
> upgrade the logging mechanism by both switching to structured logging
> across the project AND replacing the logging system itself, thus “killing
> two birds with one stone”.
>
> Moreover, if the upgrade is successful, the new logger mechanism might be
> adopted by other teams in Red Hat, which would lead to uniform logging
> activity across different products.
>
> I’d like to propose a logging utility I’ve been working on for the past
> few weeks.
> This project is still a work in progress (much work still needs to be done
> on it), but I’d like to bring this matter up now so that, if the community
> wants to advance on that front, we can collaborate and shape the logger to
> best suit the community’s needs.
>
> An overview of the system:
>
> The logger provides several (number and size are user-defined)
> pre-allocated buffers which threads can 'register' to and receive a private
> buffer. In addition, a single, shared buffer is also pre-allocated (size is
> user-defined). The number of buffers and their sizes are modifiable at
> runtime (not yet implemented).
>
> Worker threads write messages in one of 3 ways, described next, and an
> internal logger thread constantly iterates over the existing buffers and
> drains the data to the log file.
>
> As all allocations are done at the initialization stage, no special
> treatment is needed for "out of memory" cases.
>
> The following writing levels exist:
>
>1.
>
>Level 1 - Lockless writing: Lockless writing is achieved by assigning
>each thread a private ring buffer. A worker threads write to that buffer
>and the logger thread drains that buffer into a log file.
>
> In case the private ring buffer is full and not yet drained, or in case
> the worker thread has not registered for a private buffer, we fall back to
> the following writing methods:
>
>1.
>
>Level 2 - Shared buffer writing: The worker thread will write it's
>data into a buffer that's shared across all threads. This is done in a
>synchronized manner.
>
> In case the private ring buffer is full and not yet drained AND the shared
> ring buffer is full and not yet drained, or in case the worker thread has
> not registered for a private buffer, we fall back to the last writing
>
>1.
>
>Level 3 - Direct write: This is the slowest form of writing - the
>worker thread directly write to the log file.
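>
> Put together, the path a worker thread takes when logging is roughly the
> following (an illustrative sketch with made-up helper and type names - not
> the actual Lockless_Logger API):
>
>     void log_msg(logger_t *lg, ring_buf_t *priv, const char *msg)
>     {
>         /* Level 1: lock-free write into the thread's private ring buffer. */
>         if (priv && ring_buf_try_write(priv, msg))
>             return;
>
>         /* Level 2: synchronized write into the single shared buffer. */
>         if (shared_buf_locked_write(lg->shared, msg))
>             return;
>
>         /* Level 3: slowest path - write directly to the log file. */
>         direct_file_write(lg->log_fp, msg);
>     }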
>
> The idea behind this utility is to reduce the impact of logging on runtime
> as much as possible. Part of this reduction comes at the cost of having to
> parse and reorganize the messages in the log files using a dedicated tool
> (yet to be implemented), as there is no guarantee on the order of logged
> messages.
>
> The full logger project is hosted on:
> https://github.com/BarakSason/Lockless_Logger
>
> For project documentation visit:
> https://baraksason.github.io/Lockless_Logger/
>
> I thank you all for reading through my suggestion and I’m looking forward
> to your feedback,
> --
> *Barak Sason Rofman*
>
> Gluster Storage Development
>
> Red Hat Israel 
>
> 34 Jerusalem rd. Ra'anana, 43501
>
> bsaso...@redhat.c

Re: [Gluster-devel] Upstream nightly build on Centos is failing with glusterd crash

2019-08-27 Thread Atin Mukherjee
This issue is fixed now. Thanks to Nithya for root causing and fixing it.

On Fri, Aug 23, 2019 at 11:19 AM Bala Konda Reddy Mekala 
wrote:

> Hi,
> On a fresh installation with the nightly build [1], "systemctl start
> glusterd" crashes glusterd (coredump). A bug was filed [2], and the
> centos-ci for glusto-tests is currently blocked because of it. Please
> look into it.
>
> Thanks,
> Bala
>
> [1] http://artifacts.ci.centos.org/gluster/nightly/master/7/x86_64/
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1744420
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4710

2019-08-26 Thread Atin Mukherjee
For the last few days I have been trying to understand the nightly failures
we were seeing even after addressing the 'port already in use' issue. Here's
the analysis:

From the console output of
https://build.gluster.org/job/regression-test-burn-in/4710/consoleFull



19:51:56 Started by upstream project "nightly-master" build number 843
19:51:56 originally caused by:
19:51:56   Started by timer
19:51:56 Running as SYSTEM
19:51:57 Building remotely on builder209.aws.gluster.org (centos7) in workspace /home/jenkins/root/workspace/regression-test-burn-in
19:51:58 No credentials specified
19:51:58  > git rev-parse --is-inside-work-tree # timeout=10
19:51:58 Fetching changes from the remote Git repository
19:51:58  > git config remote.origin.url git://review.gluster.org/glusterfs.git # timeout=10
19:51:58 Fetching upstream changes from git://review.gluster.org/glusterfs.git
19:51:58  > git --version # timeout=10
19:51:58  > git fetch --tags --progress git://review.gluster.org/glusterfs.git refs/heads/master # timeout=10
19:52:01  > git rev-parse origin/master^{commit} # timeout=10
19:52:01 Checking out Revision a31fad885c30cbc1bea652349c7d52bac1414c08 (origin/master)
19:52:01  > git config core.sparsecheckout # timeout=10
19:52:01  > git checkout -f a31fad885c30cbc1bea652349c7d52bac1414c08 # timeout=10
19:52:02 Commit message: "tests: heal-info add --xml option for more coverage"
19:52:02  > git rev-list --no-walk a31fad885c30cbc1bea652349c7d52bac1414c08 # timeout=10
19:52:02 [regression-test-burn-in] $ /bin/bash /tmp/jenkins7274529097702336737.sh
19:52:02 Start time Mon Aug 26 14:22:02 UTC 2019




The latest commit which it picked up as part of the git checkout is quite
old, and hence we continue to see similar failures in the latest nightly
runs even though they have already been addressed by commit c370c70:

commit c370c70f77079339e2cfb7f284f3a2fb13fd2f97
Author: Mohit Agrawal 
Date:   Tue Aug 13 18:45:43 2019 +0530

rpc: glusterd start is failed and throwing an error Address already in
use

Problem: Some of the .t are failed due to bind is throwing
 an error EADDRINUSE

Solution: After killing all gluster processes .t is trying
  to start glusterd but somehow if kernel has not cleaned
  up resources(socket) then glusterd startup is failed due to
  bind system call failure.To avoid the issue retries to call
  bind 10 times to execute system call succesfully

Change-Id: Ia5fd6b788f7b211c1508c1b7304fc08a32266629
Fixes: bz#1743020
Signed-off-by: Mohit Agrawal 

So the (puzzling) question is - why are we picking up an old commit?

In my local setup when I run the following command I do see the latest
commit id being picked up:

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git
rev-parse origin/master^{commit} # timeout=10
7926992e65d0a07fdc784a6e45740306d9b4a9f2

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git show
7926992e65d0a07fdc784a6e45740306d9b4a9f2
commit 7926992e65d0a07fdc784a6e45740306d9b4a9f2 (origin/master,
origin/HEAD, master)
Author: Sanju Rakonde 
Date:   Mon Aug 26 12:38:40 2019 +0530

glusterd: Unused value coverity fix

CID: 1288765
updates: bz#789278

Change-Id: Ie6b01f81339769f44d82fd7c32ad0ed1a697c69c
Signed-off-by: Sanju Rakonde 



-- Forwarded message -
From: 
Date: Mon, Aug 26, 2019 at 11:32 PM
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4710
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4710/display/redirect>

--
[...truncated 4.18 MB...]
./tests/features/lock-migration/lkmigration-set-option.t  -  7 second
./tests/bugs/upcall/bug-1458127.t  -  7 second
./tests/bugs/transport/bug-873367.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/replicate/bug-986905.t  -  7 second
./tests/bugs/replicate/bug-921231.t  -  7 second
./tests/bugs/replicate/bug-1132102.t  -  7 second
./tests/bugs/replicate/bug-1037501.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/posix/bug-1122028.t  -  7 second
./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
./tests/bugs/fuse/bug-983477.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/distribute/bug-1086228.t  -  7 second
./tests/bugs/cli/bug-1087487.t  -  7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/ctime/ctime-noatime.t  -  7 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  7 second
./tests/basic/afr/ta.t  -  7 second
./tests/basic/afr/ta-shd.t  -  7 second
./tests/basic/afr/root-squash-self-heal.t  -  7 second
./tests/basic/afr/granular-esh/add-brick.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  6

Re: [Gluster-devel] [Gluster-users] [Gluster-Maintainers] glusterfs-7.0rc0 released

2019-08-26 Thread Atin Mukherjee
On Mon, Aug 26, 2019 at 11:18 AM Rinku Kothiya  wrote:

> Hi,
>
> Release-7 RC0 packages are built. This is a good time to start testing the
> release bits, and reporting any issues on bugzilla.
> Do post on the lists any testing done and feedback for the same.
>
> We have about 2 weeks to GA of release-6 barring any major blockers
> uncovered during the test phase. Please take this time to help make the
> release effective, by testing the same.
>

I believe you meant release-7 here :-)

I'd like to request that, just like for release-6, we pay some attention to
the upgrade testing paths (release-4/release-5/release-6 to release-7) and
report back issues here (along with bugzilla links).


> Packages for Fedora 29, Fedora 30, RHEL 8 are at
> https://download.gluster.org/pub/gluster/glusterfs/qa-releases/7.0rc0/
>
> Packages are signed. The public key is at
> https://download.gluster.org/pub/gluster/glusterfs/6/rsa.pub
>
> Packages for Stretch,Bullseye and CentOS7 will be there as soon as they
> get built.
>
> Regards
> Rinku
>
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Assistance setting up Gluster

2019-07-23 Thread Atin Mukherjee
Sanju - can you please help Barak?

From a quick glance at the log, it seems that this wasn’t a clean setup.

Barak - can you please have an empty /var/lib/glusterd/ and start over
again? Also make sure that there’s no glusterd process already running.

On Mon, 22 Jul 2019 at 14:40, Barak Sason  wrote:

> Greetings Yaniv,
>
> Thank you very much for your response.
>
> As you suggested, I'm installing an additional VM (CentOS) on which I'll
> try the repo you suggested in order to get Gluster up and running. I'll
> post an update on this later today, as it'll take a bit of time to get the
> VM ready.
>
> In addition, I'll post the RHEL problem in a separate thread, as you
> requested.
>
> In the meantime, let's focus on the Ubuntu problem.
> I'm attaching the log file from Ubuntu, corresponding to running the 'sudo
> glusterd' command (attachment - glusterd.log).
> Regarding your question about running manually - I've followed the
> instructions specified in the INSTALL.txt file which comes with the repo
> and specifies the following steps for installation:
> 1- ./autogen.sh
> 2- ./configure
> 3- make install
> Please let me know if this is somehow incorrect.
>
> I kindly thank you for your time and effort,
>
> Barak
>
> On Mon, Jul 22, 2019 at 8:10 AM Yaniv Kaul  wrote:
>
>>
>>
>> On Mon, Jul 22, 2019 at 1:20 AM Barak Sason  wrote:
>>
>>> Hello everyone,
>>>
>>> My name is Barak and I'll soon be joining the Gluster development team
>>> as a part of Red Hat.
>>>
>>
>> Hello and welcome to the Gluster community.
>>
>>>
>>> In preparation for my upcoming employment I've been trying to get
>>> Gluster up and running on my system, but I came across some technical
>>> difficulties.
>>> I'd appreciate any assistance you may provide.
>>>
>>> I have 2 VMs on my PC - Ubuntu 18, which I used for previous
>>> development, and RHEL 8, of which I installed a fresh copy just days ago.
>>>
>>
>> 2 VMs is really minimal. You should use more.
>>
>>> The copy of Gluster code I'm working with is a clone of the master
>>> repository.
>>>
>>> On Ubuntu the installation completed, but running the command 'sudo
>>> glusterd' does nothing. Debugging with gdb shows that the program
>>> terminates very early due to an error.
>>> At glusterfsd.c:2878 (the main method) there is a call to the 'daemonize'
>>> method. At glusterfsd.c:2568 a call to sys_read fails with errno 17.
>>> I'm unsure why this happens and I was unable to solve this.
>>> I've tried running 'sudo glusterd -N' in order to deactivate
>>> daemonization, but this also fails, at glusterfsd.c:2712
>>> ('glusterfs_process_volfp' method). I was unable to solve this issue either.
>>>
>>> On RHEL, running ./configure results in an error regarding 'rpcgen'.
>>> Running ./configure --without-libtirpc was unhelpful and results in the
>>> same error.
>>>
>>
>> I'd separate the two issues into two different email threads, as they may
>> or may not be related.
>> Please provide logs for each.
>> Why are you running glusterd manually, btw?
>>
>> You may want to take a look at https://github.com/mykaul/vg - which is a
>> simple way to set up Gluster on CentOS 7 VMs for testing. I have not tried
>> it for some time - let me know how it works for you.
>> Y.
>>
>>>
>>> As of right now I'm unable to proceed so I ask for your assistance.
>>>
>>> Thank you all very much.
>>> ___
>>>
>>> Community Meeting Calendar:
>>>
>>> APAC Schedule -
>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>> Bridge: https://bluejeans.com/836554017
>>>
>>> NA/EMEA Schedule -
>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>> Bridge: https://bluejeans.com/486278655
>>>
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
> --
- Atin (atinm)
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Re-Compile glusterd1 and add it to the stack

2019-07-15 Thread Atin Mukherjee
David - I don't see a GF_OPTION_INIT in init() of read-only.c. How is
that working even when you're compiling the entire source?
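
For reference, the pattern I mean is the xlator's init() pulling its declared
option into its private structure via GF_OPTION_INIT - roughly like the sketch
below, where the option key, private struct and mem-type are placeholders and
not read-only.c's actual code:

typedef struct {
    gf_boolean_t my_flag;  /* placeholder private field */
} my_priv_t;

int32_t
init(xlator_t *this)
{
    my_priv_t *priv = NULL;

    priv = GF_CALLOC(1, sizeof(*priv), gf_common_mt_char); /* placeholder mem-type */
    if (!priv)
        return -1;

    /* Read the "my-new-option" key declared in this xlator's options[]
     * table into priv->my_flag; jump to 'out' if parsing fails. */
    GF_OPTION_INIT("my-new-option", priv->my_flag, bool, out);

    this->private = priv;
    return 0;

out:
    GF_FREE(priv);
    return -1;
}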

On Mon, Jul 15, 2019 at 7:40 PM David Spisla  wrote:

> Hello Vijay,
> There is a patch file attached; you can see the code there. I oriented
> myself on this:
> https://review.gluster.org/#/c/glusterfs/+/18633/
>
> As you can see, there is no additional code in glusterd-volgen.c. Both
> glusterd-volgen.c and glusterd-volume-set.c will be compiled into
> glusterd.so.
> The problem is still that my new option is not available if I only
> re-compile glusterd.so. Compiling and using the whole RPMs works.
>
> Is it really not possible to just re-compile glusterd.so?
>
> Regards
> David Spisla
>
> On Thu, 11 Jul 2019 at 20:35, Vijay Bellur <
> vbel...@redhat.com> wrote:
>
>> Hi David,
>>
>> If the option is related to a particular translator, you would need to
>> add that option to the options table of the translator and add code in
>> glusterd-volgen.c to generate that option in the volfiles.
>>
>> Would it be possible to share the code diff that you are trying out?
>>
>> Regards,
>> Vijay
>>
>> On Wed, Jul 10, 2019 at 3:11 AM David Spisla  wrote:
>>
>>> Hello Gluster Devels,
>>>
>>> I added a custom volume option to glusterd-volume-set.c. I could build my
>>> own RPMs, but I don't want that; I only want to add the newly compiled
>>> glusterd to the stack. I tried copying glusterd.so to
>>> /usr/lib64/glusterfs/x.x/xlator/mgmt. After this, glusterd runs normally
>>> and I can create volumes, but my new option is not in the vol files, and
>>> if I try to start the volume it fails.
>>>
>>> It seems that I need to add some other files to the stack. Any idea?
>>>
>>> Regards
>>> David Spisla
>>> ___
>>>
>>> Community Meeting Calendar:
>>>
>>> APAC Schedule -
>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>> Bridge: https://bluejeans.com/836554017
>>>
>>> NA/EMEA Schedule -
>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>> Bridge: https://bluejeans.com/486278655
>>>
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> ___
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/836554017
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/486278655
>>
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Requesting reviews [Re: Release 7 Branch Created]

2019-07-15 Thread Atin Mukherjee
Please ensure:
1. The commit message explains the motive behind this change.
2. I always feel more confident kicking off a review if a patch has passed
regression. Can you please ensure that the Verified flag is put up?

On Mon, Jul 15, 2019 at 5:27 PM Jiffin Tony Thottan 
wrote:

> Hi,
>
> The "Add Ganesha HA bits back to glusterfs code repo"[1] is targeted for
> glusterfs-7. Requesting maintainers to review below two patches
>
> [1]
> https://review.gluster.org/#/q/topic:ref-663+(status:open+OR+status:merged)
>
> Regards,
>
> Jiffin
>
> On 15/07/19 5:23 PM, Jiffin Thottan wrote:
> >
> > - Original Message -
> > From: "Rinku Kothiya" 
> > To: maintain...@gluster.org, gluster-devel@gluster.org, "Shyam
> Ranganathan" 
> > Sent: Wednesday, July 3, 2019 10:30:58 AM
> > Subject: [Gluster-devel] Release 7 Branch Created
> >
> > Hi Team,
> >
> > Release 7 branch has been created in upstream.
> >
> > ## Schedule
> >
> > Currently, working backwards on the schedule, here's what we
> > have:
> > - Announcement: Week of Aug 4th, 2019
> > - GA tagging: Aug-02-2019
> > - RC1: On demand before GA
> > - RC0: July-03-2019
> > - Late features cut-off: Week of June-24th, 2018
> > - Branching (feature cutoff date): June-17-2018
> >
> > Regards
> > Rinku
> >
> > ___
> >
> > Community Meeting Calendar:
> >
> > APAC Schedule -
> > Every 2nd and 4th Tuesday at 11:30 AM IST
> > Bridge: https://bluejeans.com/836554017
> >
> > NA/EMEA Schedule -
> > Every 1st and 3rd Tuesday at 01:00 PM EDT
> > Bridge: https://bluejeans.com/486278655
> >
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
> >
>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Release 5.7 or 5.8

2019-07-14 Thread Atin Mukherjee
On Fri, 12 Jul 2019 at 21:14, Niels de Vos  wrote:

> On Thu, Jul 11, 2019 at 01:02:48PM +0530, Hari Gowtham wrote:
> > Hi,
> >
> > We came across a build issue with release 5.7. It was related to the
> > Python version.
> > A fix for it has been posted [
> > https://review.gluster.org/#/c/glusterfs/+/23028 ]
> > Once we take this fix in, we need to go ahead with tagging and releasing it.
> > Though we have tagged 5.7, we weren't able to package 5.7 because of
> > this issue.
> >
> > Now the question is whether to create 5.7.1 or go with 5.8, as recreating
> > a tag isn't an option.
> > My take is to create 5.8 and mark 5.7 obsolete, for the reasons below:
> > *) We have moved on to using 5.x. Going back to 5.x.y will be confusing.
> > *) 5.8 is also due, as we got delayed a lot by this issue.
> >
> > If anyone has a different opinion, please let us know so we can decide and
> > go ahead with the best option.
>
> I would go with 5.7.1. However, if 5.8 would be tagged around the same
> time, then do only 5.8.


Since 5.8 is nearing, let's do 5.8 instead of 5.7.1?


>
> Niels
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
> --
- Atin (atinm)
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Rebase your patches to avoid fedora-smoke failure

2019-07-13 Thread Atin Mukherjee
With https://review.gluster.org/23033 now merged, we should be unblocked on
the fedora-smoke failure. I request all patch owners to rebase their
respective patches to get unblocked.
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4631

2019-07-02 Thread Atin Mukherjee
Can we check the following failure?

1 test(s) failed
./tests/00-geo-rep/georep-basic-dr-rsync-arbiter.t

-- Forwarded message -
From: 
Date: Mon, Jul 1, 2019 at 11:48 PM
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4631
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4631/display/redirect?page=changes
>

Changes:

[Amar Tumballi] rfc.sh: Improve bug identification

[Amar Tumballi] glusterd: fix clang scan defects

[Amar Tumballi] core: use multiple servers while mounting a volume using
ipv6

--
[...truncated 3.99 MB...]
./tests/bugs/replicate/bug-1561129-enospc.t  -  9 second
./tests/bugs/replicate/bug-1221481-allow-fops-on-dir-split-brain.t  -  9
second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  9 second
./tests/bugs/protocol/bug-1321578.t  -  9 second
./tests/bugs/posix/bug-1122028.t  -  9 second
./tests/bugs/glusterfs/bug-861015-log.t  -  9 second
./tests/bugs/glusterfs/bug-848251.t  -  9 second
./tests/bugs/glusterd/bug-949930.t  -  9 second
./tests/bugs/gfapi/bug-1032894.t  -  9 second
./tests/bugs/fuse/bug-983477.t  -  9 second
./tests/bugs/ec/bug-1227869.t  -  9 second
./tests/bugs/distribute/bug-1088231.t  -  9 second
./tests/bugs/cli/bug-1022905.t  -  9 second
./tests/bugs/changelog/bug-1321955.t  -  9 second
./tests/bugs/changelog/bug-1208470.t  -  9 second
./tests/bugs/bug-1371806_2.t  -  9 second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  9 second
./tests/basic/quota-nfs.t  -  9 second
./tests/basic/md-cache/bug-1317785.t  -  9 second
./tests/basic/glusterd/thin-arbiter-volume-probe.t  -  9 second
./tests/basic/fop-sampling.t  -  9 second
./tests/basic/ctime/ctime-noatime.t  -  9 second
./tests/basic/changelog/changelog-rename.t  -  9 second
./tests/basic/afr/stale-file-lookup.t  -  9 second
./tests/basic/afr/root-squash-self-heal.t  -  9 second
./tests/basic/afr/afr-read-hash-mode.t  -  9 second
./tests/basic/afr/add-brick-self-heal.t  -  9 second
./tests/bugs/upcall/bug-1369430.t  -  8 second
./tests/bugs/transport/bug-873367.t  -  8 second
./tests/bugs/snapshot/bug-1260848.t  -  8 second
./tests/bugs/snapshot/bug-1064768.t  -  8 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  8 second
./tests/bugs/shard/bug-1258334.t  -  8 second
./tests/bugs/replicate/bug-986905.t  -  8 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
8 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/replicate/bug-1037501.t  -  8 second
./tests/bugs/quota/bug-1104692.t  -  8 second
./tests/bugs/posix/bug-1175711.t  -  8 second
./tests/bugs/md-cache/afr-stale-read.t  -  8 second
./tests/bugs/glusterfs/bug-902610.t  -  8 second
./tests/bugs/glusterfs/bug-872923.t  -  8 second
./tests/bugs/glusterd/bug-1242875-do-not-pass-volinfo-quota.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/bug-1258069.t  -  8 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  8
second
./tests/basic/xlator-pass-through-sanity.t  -  8 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  8 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  8 second
./tests/basic/fencing/fencing-crash-conistency.t  -  8 second
./tests/basic/ec/statedump.t  -  8 second
./tests/basic/distribute/file-create.t  -  8 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  8 second
./tests/basic/afr/tarissue.t  -  8 second
./tests/basic/afr/ta-read.t  -  8 second
./tests/basic/afr/granular-esh/add-brick.t  -  8 second
./tests/gfid2path/gfid2path_fuse.t  -  7 second
./tests/bugs/shard/bug-1259651.t  -  7 second
./tests/bugs/replicate/bug-767585-gfid.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/replicate/bug-1101647.t  -  7 second
./tests/bugs/quota/bug-1287996.t  -  7 second
./tests/bugs/quota/bug-1243798.t  -  7 second
./tests/bugs/nfs/bug-915280.t  -  7 second
./tests/bugs/io-cache/bug-858242.t  -  7 second
./tests/bugs/glusterd/quorum-value-check.t  -  7 second
./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  7 second
./tests/bugs/distribute/bug-884597.t  -  7 second
./tests/bugs/distribute/bug-1368012.t  -  7 second
./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t  -  7 second
./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  7 second
./tests/bugs/bug-1702299.t  -  7 second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/bitrot/bug-1221914.t  -  7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/ec/ec-read-policy.t  -  7 second
./tests/basic/ec/ec-anonymous-fd.t  -  7 second
./tests/basic/afr/ta.t  -  7 second
./tests/basic/afr/ta-shd.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  6 second
./

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4632

2019-07-02 Thread Atin Mukherjee
Can we check these failures please?

2 test(s) failed
./tests/bugs/glusterd/bug-1699339.t
./tests/bugs/glusterd/bug-857330/normal.t

-- Forwarded message -
From: 
Date: Wed, Jul 3, 2019 at 12:08 AM
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4632
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4632/display/redirect?page=changes
>

Changes:

[Amar Tumballi] Removing one top command from gluster v help

[Amar Tumballi] glusterfs-fops: fix the modularity

[Sheetal Pamecha] cli: Remove Wformat-truncation compiler warning

[Nithya Balachandran] cluster/dht:  Fixed a memleak in dht_rename_cbk

--
[...truncated 4.02 MB...]
./tests/bugs/ec/bug-1227869.t  -  9 second
./tests/bugs/distribute/bug-1122443.t  -  9 second
./tests/bugs/cli/bug-1022905.t  -  9 second
./tests/bugs/changelog/bug-1321955.t  -  9 second
./tests/bugs/changelog/bug-1208470.t  -  9 second
./tests/bugs/bug-1258069.t  -  9 second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  9 second
./tests/bitrot/bug-1207627-bitrot-scrub-status.t  -  9 second
./tests/basic/xlator-pass-through-sanity.t  -  9 second
./tests/basic/md-cache/bug-1317785.t  -  9 second
./tests/basic/glusterd/thin-arbiter-volume-probe.t  -  9 second
./tests/basic/afr/stale-file-lookup.t  -  9 second
./tests/basic/afr/root-squash-self-heal.t  -  9 second
./tests/basic/afr/add-brick-self-heal.t  -  9 second
./tests/features/readdir-ahead.t  -  8 second
./tests/bugs/snapshot/bug-1260848.t  -  8 second
./tests/bugs/snapshot/bug-1064768.t  -  8 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  8 second
./tests/bugs/shard/bug-1260637.t  -  8 second
./tests/bugs/replicate/bug-986905.t  -  8 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/replicate/bug-1101647.t  -  8 second
./tests/bugs/replicate/bug-1037501.t  -  8 second
./tests/bugs/protocol/bug-1321578.t  -  8 second
./tests/bugs/posix/bug-1175711.t  -  8 second
./tests/bugs/nfs/bug-915280.t  -  8 second
./tests/bugs/io-cache/bug-858242.t  -  8 second
./tests/bugs/glusterfs/bug-872923.t  -  8 second
./tests/bugs/glusterfs/bug-848251.t  -  8 second
./tests/bugs/glusterd/bug-1696046.t  -  8 second
./tests/bugs/glusterd/bug-1242875-do-not-pass-volinfo-quota.t  -  8 second
./tests/bugs/fuse/bug-983477.t  -  8 second
./tests/bugs/distribute/bug-1088231.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/bug-1371806_2.t  -  8 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  8
second
./tests/bitrot/br-stub.t  -  8 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  8 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  8 second
./tests/basic/fop-sampling.t  -  8 second
./tests/basic/fencing/fencing-crash-conistency.t  -  8 second
./tests/basic/ec/statedump.t  -  8 second
./tests/basic/distribute/file-create.t  -  8 second
./tests/basic/ctime/ctime-noatime.t  -  8 second
./tests/basic/changelog/changelog-rename.t  -  8 second
./tests/basic/afr/tarissue.t  -  8 second
./tests/basic/afr/ta-read.t  -  8 second
./tests/basic/afr/granular-esh/add-brick.t  -  8 second
./tests/basic/afr/afr-read-hash-mode.t  -  8 second
./tests/bugs/upcall/bug-1369430.t  -  7 second
./tests/bugs/shard/bug-1342298.t  -  7 second
./tests/bugs/shard/bug-1259651.t  -  7 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/quota/bug-1243798.t  -  7 second
./tests/bugs/quota/bug-1104692.t  -  7 second
./tests/bugs/nfs/bug-1116503.t  -  7 second
./tests/bugs/md-cache/afr-stale-read.t  -  7 second
./tests/bugs/glusterd/quorum-value-check.t  -  7 second
./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t  -  7 second
./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  7 second
./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t  -  7 second
./tests/bugs/distribute/bug-884597.t  -  7 second
./tests/bugs/distribute/bug-1368012.t  -  7 second
./tests/bugs/core/bug-1699025-brick-mux-detach-brick-fd-issue.t  -  7 second
./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  7 second
./tests/bugs/bug-1702299.t  -  7 second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/basic/ec/ec-read-policy.t  -  7 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  7 second
./tests/basic/afr/ta.t  -  7 second
./tests/basic/afr/ta-shd.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/basic/afr/arbiter-remove-brick.t  -  7 second
./tests/gfid2path/gfid2path_nfs.t  -  6 second
./tests/gfid2path/get-gfid-to-path.t  -  6 second
./tests/bugs/snapshot/bug-1178079.t  -  6 second
./tests/bugs/shard/bug-1272986.t  -  6 second
./tests/bugs/shard/bug-1258334.t  -  6 second
./tests/bugs/replicate/bug-7

[Gluster-devel] Fwd: Details on the Scan.coverity.com June 2019 Upgrade

2019-06-12 Thread Atin Mukherjee
FYI - no scans for 3-4 days starting from June 17th, for the upgrade. After
that we may have to make some changes to use the new build tool.

-- Forwarded message -
From: Peter Degen-Portnoy 
Date: Thu, 13 Jun 2019 at 00:48
Subject: Details on the Scan.coverity.com June 2019 Upgrade


June 17, 9 a.m. MDT
Coverity Scan 2019 Upgrade


Dear Atin Mukherjee,
Thank you for being an active user of scan.coverity.com. We have some
important news to share with you.

As you know, the version of Coverity used by the Scan website is somewhat
out of date. So we’re pleased to announce that we’re upgrading to the
latest stable production version.

We’re currently verifying the upgrade. Here’s what you can expect:

We plan to start the upgrade *Monday, June 17, around 9 a.m. MDT*. We
expect the process to last 3–4 days.

During this time, scan.coverity.com may be offline and unavailable. If
possible, we'll provide access to scan.coverity.com in read-only mode.

After the upgrade, you should use the new Build tool that matches the
upgraded version of Coverity. Specifically, the build tool from Coverity
8.7 will no longer be supported.

You can find details about the upgrade and the new build tool on the Scan
Status Community page
(https://community.synopsys.com/s/topic/0TO2H001CN7WAM/coverity-scan-status).
You can also subscribe to scan.coverity.com status updates on that page by
clicking the "Follow" button and selecting "Every Post."

Please take a look at the information on the Scan Status Community page. If
you have any questions about the upgrade, post them on the Synopsys Software
Integrity Community (https://community.synopsys.com/s). We'll answer as soon
as we can.

Sincerely yours,
The Scan Administrators

scan-ad...@coverity.com



© 2019 Synopsys, Inc. All Rights Reserved
690 E Middlefield Rd, Mountain View, CA 94043



-- 
--Atin
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



[Gluster-devel] https://build.gluster.org/job/centos7-regression/6404/consoleFull - Problem accessing //job/centos7-regression/6404/consoleFull. Reason: Not found

2019-06-11 Thread Atin Mukherjee
https://bugzilla.redhat.com/show_bug.cgi?id=1719174

The patch which failed the regression is https://review.gluster.org/22851 .
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359

2019-06-10 Thread Atin Mukherjee
On Fri, Jun 7, 2019 at 10:07 AM Amar Tumballi Suryanarayan <
atumb...@redhat.com> wrote:

> I got time to test subdir-mount.t failing in the brick-mux scenario.
>
> I noticed some issues where I need further help from the glusterd team.
>
> subdir-mount.t expects the 'hook' script to run after add-brick to make sure
> the required subdirectories are healed and present in the new bricks. This
> is important, as the subdir mount expects the subdirs to exist for a
> successful mount.
>
> But in the case of a brick-mux setup, I see that in some cases (6/10) the hook
> script (add-brick/post-hook/S13-create-subdir-mount.sh) started getting
> executed 20 seconds after the add-brick command finished. Due to this,
> the mount which we execute after add-brick failed.
>
> My question is: what is making the post hook script run so late?
>

It's not only the add-brick post hook. Given that post hook scripts are
async in nature, I see that the respective hook scripts of the
create/start/set volume operations also executed quite late, which is very
surprising unless some thread has been stuck for quite a while. Unfortunately,
for both Mohit and me the issue isn't reproducible locally. Mohit will give it
a try on the softserve infra, but at this point in time there's no conclusive
evidence; the analysis continues.

Amar - would it be possible for you to do a git blame given you can
reproduce this? May 31 nightly (
https://build.gluster.org/job/regression-test-with-multiplex/1359/) is when
this test started failing.


> I can recreate the issues locally on my laptop too.
>
>
> On Sat, Jun 1, 2019 at 4:55 PM Atin Mukherjee  wrote:
>
>> subdir-mount.t has started failing in brick mux regression nightly. This
>> needs to be fixed.
>>
>> Raghavendra - did we manage to get any further clue on uss.t failure?
>>
>> -- Forwarded message -
>> From: 
>> Date: Fri, 31 May 2019 at 23:34
>> Subject: [Gluster-Maintainers] Build failed in Jenkins:
>> regression-test-with-multiplex #1359
>> To: , , ,
>> , 
>>
>>
>> See <
>> https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes
>> >
>>
>> Changes:
>>
>> [atin] glusterd: add an op-version check
>>
>> [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper
>> function
>>
>> [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop
>>
>> [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs
>>
>> [Kotresh H R] tests/geo-rep: Add EC volume test case
>>
>> [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock
>>
>> [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send
>> reconfigure
>>
>> [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep
>>
>> [atin] glusterd: Optimize code to copy dictionary in handshake code path
>>
>> --
>> [...truncated 3.18 MB...]
>> ./tests/basic/afr/stale-file-lookup.t  -  9 second
>> ./tests/basic/afr/granular-esh/replace-brick.t  -  9 second
>> ./tests/basic/afr/granular-esh/add-brick.t  -  9 second
>> ./tests/basic/afr/gfid-mismatch.t  -  9 second
>> ./tests/performance/open-behind.t  -  8 second
>> ./tests/features/ssl-authz.t  -  8 second
>> ./tests/features/readdir-ahead.t  -  8 second
>> ./tests/bugs/upcall/bug-1458127.t  -  8 second
>> ./tests/bugs/transport/bug-873367.t  -  8 second
>> ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  8 second
>> ./tests/bugs/replicate/bug-1132102.t  -  8 second
>> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
>> -  8 second
>> ./tests/bugs/quota/bug-1104692.t  -  8 second
>> ./tests/bugs/posix/bug-1360679.t  -  8 second
>> ./tests/bugs/posix/bug-1122028.t  -  8 second
>> ./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
>> ./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
>> ./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
>> ./tests/bugs/glusterd/bug-1696046.t  -  8 second
>> ./tests/bugs/fuse/bug-983477.t  -  8 second
>> ./tests/bugs/ec/bug-1227869.t  -  8 second
>> ./tests/bugs/distribute/bug-1088231.t  -  8 second
>> ./tests/bugs/distribute/bug-1086228.t  -  8 second
>> ./tests/bugs/cli/bug-1087487.t  -  8 second
>> ./tests/bugs/cli/bug-1022905.t  -  8 second
>> ./tests/bugs/bug-1258069.t  -  8 second
>> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
>> -  8 second
>> ./tests/basic/xlator-pass-through-sanity.t  -  8 second
>> ./t

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1357

2019-06-01 Thread Atin Mukherjee
Rafi - tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t seems to
be failing often. Can you please investigate the reason for this spurious
failure?

-- Forwarded message -
From: 
Date: Thu, 30 May 2019 at 23:22
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #1357
To: , 


See <
https://build.gluster.org/job/regression-test-with-multiplex/1357/display/redirect?page=changes
>

Changes:

[Xavi Hernandez] tests: add tests for different signal handling

[Xavi Hernandez] marker: remove some unused functions

[Xavi Hernandez] glusterd: coverity fix

--
[...truncated 2.92 MB...]
./tests/basic/ec/ec-root-heal.t  -  9 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  9 second
./tests/basic/afr/ta.t  -  9 second
./tests/basic/afr/gfid-mismatch.t  -  9 second
./tests/performance/open-behind.t  -  8 second
./tests/features/ssl-authz.t  -  8 second
./tests/features/readdir-ahead.t  -  8 second
./tests/features/lock-migration/lkmigration-set-option.t  -  8 second
./tests/bugs/replicate/bug-921231.t  -  8 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/posix/bug-990028.t  -  8 second
./tests/bugs/posix/bug-1360679.t  -  8 second
./tests/bugs/nfs/bug-915280.t  -  8 second
./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
./tests/bugs/glusterfs/bug-872923.t  -  8 second
./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
./tests/bugs/glusterd/bug-1696046.t  -  8 second
./tests/bugs/distribute/bug-1088231.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/bug-1258069.t  -  8 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  8
second
./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
-  8 second
./tests/basic/quota-nfs.t  -  8 second
./tests/basic/ec/statedump.t  -  8 second
./tests/basic/ctime/ctime-noatime.t  -  8 second
./tests/basic/afr/ta-shd.t  -  8 second
./tests/basic/afr/arbiter-remove-brick.t  -  8 second
./tests/line-coverage/cli-peer-and-volume-operations.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  7 second
./tests/gfid2path/block-mount-access.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  7 second
./tests/bugs/transport/bug-873367.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/snapshot/bug-1064768.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  7 second
./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  7 second
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/replicate/bug-1101647.t  -  7 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  7 second
./tests/bugs/quota/bug-1104692.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/posix/bug-1122028.t  -  7 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  7 second
./tests/bugs/glusterfs/bug-848251.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/distribute/bug-884597.t  -  7 second
./tests/bugs/distribute/bug-1122443.t  -  7 second
./tests/bugs/changelog/bug-1208470.t  -  7 second
./tests/bugs/bug-1371806_2.t  -  7 second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
./tests/bitrot/bug-1221914.t  -  7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/xlator-pass-through-sanity.t  -  7 second
./tests/basic/trace.t  -  7 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  7 second
./tests/basic/distribute/file-create.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/bugs/shard/bug-1342298.t  -  6 second
./tests/bugs/shard/bug-1272986.t  -  6 second
./tests/bugs/shard/bug-1259651.t  -  6 second
./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
./tests/bugs/replicate/bug-1325792.t  -  6 second
./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t  -  6 second
./tests/bugs/quota/bug-1243798.t  -  6 second
./tests/bugs/protocol/bug-1321578.t  -  6 second
./tests/bugs/posix/bug-765380.t  -  6 second
./tests/bugs/nfs/bug-877885.t  -  6 second
./tests/bugs/nfs/bug-847622.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
./tests/bugs/md-cache/bug-1211863_unlink.t  -  6 second
./tests/bugs/io-stats/bug-1598548.t  -  6 second
./tests/bugs/io-cache/bug-858242.t  -  6 second
./tests/bugs/glusterfs/bug-893378.t  -  6 second
./tests/bugs/glusterfs/bug-856455.t  -  6 second
./tests/bugs/glusterd/quorum-value-

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1359

2019-06-01 Thread Atin Mukherjee
subdir-mount.t has started failing in brick mux regression nightly. This
needs to be fixed.

Raghavendra - did we manage to get any further clue on uss.t failure?

-- Forwarded message -
From: 
Date: Fri, 31 May 2019 at 23:34
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #1359
To: , , , <
amukh...@redhat.com>, 


See <
https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes
>

Changes:

[atin] glusterd: add an op-version check

[atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper
function

[atin] glusterd/svc: Stop stale process using the glusterd_proc_stop

[Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs

[Kotresh H R] tests/geo-rep: Add EC volume test case

[Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock

[Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send
reconfigure

[Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep

[atin] glusterd: Optimize code to copy dictionary in handshake code path

--
[...truncated 3.18 MB...]
./tests/basic/afr/stale-file-lookup.t  -  9 second
./tests/basic/afr/granular-esh/replace-brick.t  -  9 second
./tests/basic/afr/granular-esh/add-brick.t  -  9 second
./tests/basic/afr/gfid-mismatch.t  -  9 second
./tests/performance/open-behind.t  -  8 second
./tests/features/ssl-authz.t  -  8 second
./tests/features/readdir-ahead.t  -  8 second
./tests/bugs/upcall/bug-1458127.t  -  8 second
./tests/bugs/transport/bug-873367.t  -  8 second
./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  8 second
./tests/bugs/quota/bug-1104692.t  -  8 second
./tests/bugs/posix/bug-1360679.t  -  8 second
./tests/bugs/posix/bug-1122028.t  -  8 second
./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
./tests/bugs/glusterd/bug-1696046.t  -  8 second
./tests/bugs/fuse/bug-983477.t  -  8 second
./tests/bugs/ec/bug-1227869.t  -  8 second
./tests/bugs/distribute/bug-1088231.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/cli/bug-1022905.t  -  8 second
./tests/bugs/bug-1258069.t  -  8 second
./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
-  8 second
./tests/basic/xlator-pass-through-sanity.t  -  8 second
./tests/basic/quota-nfs.t  -  8 second
./tests/basic/glusterd/arbiter-volume.t  -  8 second
./tests/basic/ctime/ctime-noatime.t  -  8 second
./tests/line-coverage/cli-peer-and-volume-operations.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-767585-gfid.t  -  7 second
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/nfs/bug-915280.t  -  7 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  7 second
./tests/bugs/md-cache/bug-1211863_unlink.t  -  7 second
./tests/bugs/glusterfs/bug-848251.t  -  7 second
./tests/bugs/distribute/bug-1122443.t  -  7 second
./tests/bugs/changelog/bug-1208470.t  -  7 second
./tests/bugs/bug-1702299.t  -  7 second
./tests/bugs/bug-1371806_2.t  -  7 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  7
second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  7 second
./tests/basic/fencing/fencing-crash-conistency.t  -  7 second
./tests/basic/distribute/file-create.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/bugs/snapshot/bug-1178079.t  -  6 second
./tests/bugs/snapshot/bug-1064768.t  -  6 second
./tests/bugs/shard/bug-1342298.t  -  6 second
./tests/bugs/shard/bug-1259651.t  -  6 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
6 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  6 second
./tests/bugs/replicate/bug-1325792.t  -  6 second
./tests/bugs/replicate/bug-1101647.t  -  6 second
./tests/bugs/quota/bug-1243798.t  -  6 second
./tests/bugs/protocol/bug-1321578.t  -  6 second
./tests/bugs/nfs/bug-877885.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
./tests/bugs/md-cache/bug-1476324.

[Gluster-devel] tests are timing out in master branch

2019-05-14 Thread Atin Mukherjee
There are random tests which are timing out after 200 secs. My belief is
that this is either a major regression introduced by some recent commit, or
that the builders have become extremely slow, which I highly doubt. I'd
request that we first figure out the cause, get master back to its proper
health and then get back to the review/merge queue.

Sanju has already started looking into
/tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t to understand
which test specifically is hanging and consuming more time.
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-infra] is_nfs_export_available from nfs.rc failing too often?

2019-05-08 Thread Atin Mukherjee
On Wed, May 8, 2019 at 7:38 PM Atin Mukherjee  wrote:

> builder204 needs to be fixed, too many failures, mostly none of the
> patches are passing regression.
>

And with that builder201 joins the pool,
https://build.gluster.org/job/centos7-regression/5943/consoleFull


> On Wed, May 8, 2019 at 9:53 AM Atin Mukherjee  wrote:
>
>>
>>
>> On Wed, May 8, 2019 at 7:16 AM Sanju Rakonde  wrote:
>>
>>> Deepshikha,
>>>
>>> I see the failure here[1] which ran on builder206. So, we are good.
>>>
>>
>> Not really,
>> https://build.gluster.org/job/centos7-regression/5909/consoleFull failed
>> on builder204 for similar reasons I believe?
>>
>> I am bit more worried on this issue being resurfacing more often these
>> days. What can we do to fix this permanently?
>>
>>
>>> [1] https://build.gluster.org/job/centos7-regression/5901/consoleFull
>>>
>>> On Wed, May 8, 2019 at 12:23 AM Deepshikha Khandelwal <
>>> dkhan...@redhat.com> wrote:
>>>
>>>> Sanju, can you please give us more info about the failures.
>>>>
>>>> I see the failures occurring on just one of the builder (builder206).
>>>> I'm taking it back offline for now.
>>>>
>>>> On Tue, May 7, 2019 at 9:42 PM Michael Scherer 
>>>> wrote:
>>>>
>>>>> Le mardi 07 mai 2019 à 20:04 +0530, Sanju Rakonde a écrit :
>>>>> > Looks like is_nfs_export_available started failing again in recent
>>>>> > centos-regressions.
>>>>> >
>>>>> > Michael, can you please check?
>>>>>
>>>>> I will try but I am leaving for vacation tonight, so if I find nothing,
>>>>> until I leave, I guess Deepshika will have to look.
>>>>>
>>>>> > On Wed, Apr 24, 2019 at 5:30 PM Yaniv Kaul  wrote:
>>>>> >
>>>>> > >
>>>>> > >
>>>>> > > On Tue, Apr 23, 2019 at 5:15 PM Michael Scherer <
>>>>> > > msche...@redhat.com>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Le lundi 22 avril 2019 à 22:57 +0530, Atin Mukherjee a écrit :
>>>>> > > > > Is this back again? The recent patches are failing regression
>>>>> > > > > :-\ .
>>>>> > > >
>>>>> > > > So, on builder206, it took me a while to find that the issue is
>>>>> > > > that
>>>>> > > > nfs (the service) was running.
>>>>> > > >
>>>>> > > > ./tests/basic/afr/tarissue.t failed, because the nfs
>>>>> > > > initialisation
>>>>> > > > failed with a rather cryptic message:
>>>>> > > >
>>>>> > > > [2019-04-23 13:17:05.371733] I
>>>>> > > > [socket.c:991:__socket_server_bind] 0-
>>>>> > > > socket.nfs-server: process started listening on port (38465)
>>>>> > > > [2019-04-23 13:17:05.385819] E
>>>>> > > > [socket.c:972:__socket_server_bind] 0-
>>>>> > > > socket.nfs-server: binding to  failed: Address already in use
>>>>> > > > [2019-04-23 13:17:05.385843] E
>>>>> > > > [socket.c:974:__socket_server_bind] 0-
>>>>> > > > socket.nfs-server: Port is already in use
>>>>> > > > [2019-04-23 13:17:05.385852] E [socket.c:3788:socket_listen] 0-
>>>>> > > > socket.nfs-server: __socket_server_bind failed;closing socket 14
>>>>> > > >
>>>>> > > > I found where this came from, but a few things surprised me:
>>>>> > > >
>>>>> > > > - the order of the prints is different from the order in the code
>>>>> > > >
>>>>> > >
>>>>> > > Indeed strange...
>>>>> > >
>>>>> > > > - the message on "started listening" didn't take into account the
>>>>> > > > fact
>>>>> > > > that bind failed on:
>>>>> > > >
>>>>> > >
>>>>> > > Shouldn't it bail out if it failed to bind?
>>>>> > > Some missing 'goto out' around line 975/976?
>>>>> > > Y.
>>>>> > >
>>>

Re: [Gluster-devel] [Gluster-infra] is_nfs_export_available from nfs.rc failing too often?

2019-05-08 Thread Atin Mukherjee
builder204 needs to be fixed; there are too many failures, and almost none of
the patches are passing regression.

On Wed, May 8, 2019 at 9:53 AM Atin Mukherjee  wrote:

>
>
> On Wed, May 8, 2019 at 7:16 AM Sanju Rakonde  wrote:
>
>> Deepshikha,
>>
>> I see the failure here[1] which ran on builder206. So, we are good.
>>
>
> Not really,
> https://build.gluster.org/job/centos7-regression/5909/consoleFull failed
> on builder204 for similar reasons I believe?
>
> I am bit more worried on this issue being resurfacing more often these
> days. What can we do to fix this permanently?
>
>
>> [1] https://build.gluster.org/job/centos7-regression/5901/consoleFull
>>
>> On Wed, May 8, 2019 at 12:23 AM Deepshikha Khandelwal <
>> dkhan...@redhat.com> wrote:
>>
>>> Sanju, can you please give us more info about the failures.
>>>
>>> I see the failures occurring on just one of the builder (builder206).
>>> I'm taking it back offline for now.
>>>
>>> On Tue, May 7, 2019 at 9:42 PM Michael Scherer 
>>> wrote:
>>>
>>>> Le mardi 07 mai 2019 à 20:04 +0530, Sanju Rakonde a écrit :
>>>> > Looks like is_nfs_export_available started failing again in recent
>>>> > centos-regressions.
>>>> >
>>>> > Michael, can you please check?
>>>>
>>>> I will try but I am leaving for vacation tonight, so if I find nothing,
>>>> until I leave, I guess Deepshika will have to look.
>>>>
>>>> > On Wed, Apr 24, 2019 at 5:30 PM Yaniv Kaul  wrote:
>>>> >
>>>> > >
>>>> > >
>>>> > > On Tue, Apr 23, 2019 at 5:15 PM Michael Scherer <
>>>> > > msche...@redhat.com>
>>>> > > wrote:
>>>> > >
>>>> > > > Le lundi 22 avril 2019 à 22:57 +0530, Atin Mukherjee a écrit :
>>>> > > > > Is this back again? The recent patches are failing regression
>>>> > > > > :-\ .
>>>> > > >
>>>> > > > So, on builder206, it took me a while to find that the issue is
>>>> > > > that
>>>> > > > nfs (the service) was running.
>>>> > > >
>>>> > > > ./tests/basic/afr/tarissue.t failed, because the nfs
>>>> > > > initialisation
>>>> > > > failed with a rather cryptic message:
>>>> > > >
>>>> > > > [2019-04-23 13:17:05.371733] I
>>>> > > > [socket.c:991:__socket_server_bind] 0-
>>>> > > > socket.nfs-server: process started listening on port (38465)
>>>> > > > [2019-04-23 13:17:05.385819] E
>>>> > > > [socket.c:972:__socket_server_bind] 0-
>>>> > > > socket.nfs-server: binding to  failed: Address already in use
>>>> > > > [2019-04-23 13:17:05.385843] E
>>>> > > > [socket.c:974:__socket_server_bind] 0-
>>>> > > > socket.nfs-server: Port is already in use
>>>> > > > [2019-04-23 13:17:05.385852] E [socket.c:3788:socket_listen] 0-
>>>> > > > socket.nfs-server: __socket_server_bind failed;closing socket 14
>>>> > > >
>>>> > > > I found where this came from, but a few things surprised me:
>>>> > > >
>>>> > > > - the order of the prints is different from the order in the code
>>>> > > >
>>>> > >
>>>> > > Indeed strange...
>>>> > >
>>>> > > > - the message on "started listening" didn't take into account the
>>>> > > > fact
>>>> > > > that bind failed on:
>>>> > > >
>>>> > >
>>>> > > Shouldn't it bail out if it failed to bind?
>>>> > > Some missing 'goto out' around line 975/976?
>>>> > > Y.
>>>> > >
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > >
>>>>
>>>> https://github.com/gluster/glusterfs/blob/master/rpc/rpc-transport/socket/src/socket.c#L967
>>>> > > >
>>>> > > > The message about port 38465 also threw me off the track. The
>>>> > > > real
>>>> > > > issue is that the service nfs was already running, and I couldn't
>>>> > > > find
&

Re: [Gluster-devel] [Gluster-infra] is_nfs_export_available from nfs.rc failing too often?

2019-05-07 Thread Atin Mukherjee
On Wed, May 8, 2019 at 7:16 AM Sanju Rakonde  wrote:

> Deepshikha,
>
> I see the failure here[1] which ran on builder206. So, we are good.
>

Not really,
https://build.gluster.org/job/centos7-regression/5909/consoleFull failed on
builder204 for similar reasons, I believe?

I am a bit more worried about this issue resurfacing more often these
days. What can we do to fix this permanently?


> [1] https://build.gluster.org/job/centos7-regression/5901/consoleFull
>
> On Wed, May 8, 2019 at 12:23 AM Deepshikha Khandelwal 
> wrote:
>
>> Sanju, can you please give us more info about the failures.
>>
>> I see the failures occurring on just one of the builder (builder206). I'm
>> taking it back offline for now.
>>
>> On Tue, May 7, 2019 at 9:42 PM Michael Scherer 
>> wrote:
>>
>>> Le mardi 07 mai 2019 à 20:04 +0530, Sanju Rakonde a écrit :
>>> > Looks like is_nfs_export_available started failing again in recent
>>> > centos-regressions.
>>> >
>>> > Michael, can you please check?
>>>
>>> I will try but I am leaving for vacation tonight, so if I find nothing,
>>> until I leave, I guess Deepshika will have to look.
>>>
>>> > On Wed, Apr 24, 2019 at 5:30 PM Yaniv Kaul  wrote:
>>> >
>>> > >
>>> > >
>>> > > On Tue, Apr 23, 2019 at 5:15 PM Michael Scherer <
>>> > > msche...@redhat.com>
>>> > > wrote:
>>> > >
>>> > > > Le lundi 22 avril 2019 à 22:57 +0530, Atin Mukherjee a écrit :
>>> > > > > Is this back again? The recent patches are failing regression
>>> > > > > :-\ .
>>> > > >
>>> > > > So, on builder206, it took me a while to find that the issue is
>>> > > > that
>>> > > > nfs (the service) was running.
>>> > > >
>>> > > > ./tests/basic/afr/tarissue.t failed, because the nfs
>>> > > > initialisation
>>> > > > failed with a rather cryptic message:
>>> > > >
>>> > > > [2019-04-23 13:17:05.371733] I
>>> > > > [socket.c:991:__socket_server_bind] 0-
>>> > > > socket.nfs-server: process started listening on port (38465)
>>> > > > [2019-04-23 13:17:05.385819] E
>>> > > > [socket.c:972:__socket_server_bind] 0-
>>> > > > socket.nfs-server: binding to  failed: Address already in use
>>> > > > [2019-04-23 13:17:05.385843] E
>>> > > > [socket.c:974:__socket_server_bind] 0-
>>> > > > socket.nfs-server: Port is already in use
>>> > > > [2019-04-23 13:17:05.385852] E [socket.c:3788:socket_listen] 0-
>>> > > > socket.nfs-server: __socket_server_bind failed;closing socket 14
>>> > > >
>>> > > > I found where this came from, but a few things surprised me:
>>> > > >
>>> > > > - the order of the prints is different from the order in the code
>>> > > >
>>> > >
>>> > > Indeed strange...
>>> > >
>>> > > > - the message on "started listening" didn't take into account the
>>> > > > fact
>>> > > > that bind failed on:
>>> > > >
>>> > >
>>> > > Shouldn't it bail out if it failed to bind?
>>> > > Some missing 'goto out' around line 975/976?
>>> > > Y.
>>> > >
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>>
>>> https://github.com/gluster/glusterfs/blob/master/rpc/rpc-transport/socket/src/socket.c#L967
>>> > > >
>>> > > > The message about port 38465 also threw me off the track. The
>>> > > > real
>>> > > > issue is that the service nfs was already running, and I couldn't
>>> > > > find
>>> > > > anything listening on port 38465
>>> > > >
>>> > > > once I do service nfs stop, it no longer failed.
>>> > > >
>>> > > > So far, I don't know why nfs.service was activated.
>>> > > >
>>> > > > But at least, 206 should be fixed, and we know a bit more about what
>>> > > > could be causing some failures.
>>> > > >
>>> > > >
>>> > > >
>>> > > &g
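
For context on the "goto out" question above: the bail-out being suggested
would, in standalone form, look roughly like the sketch below. All names here
are hypothetical and this is not the actual socket.c code; it only
illustrates refusing to print the "listening" message once bind() has failed.

/*
 * Minimal standalone sketch (hypothetical names, not the actual socket.c
 * code): a bind() failure is treated as fatal, so no "listening" message
 * is ever printed for a port we do not actually own.
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

static int
server_listen(uint16_t port)
{
    int ret = -1;
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) {
        perror("socket");
        goto out;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        /* "Address already in use" lands here when another service
         * (e.g. a stray kernel NFS server) already owns the port. */
        perror("bind");
        goto out;          /* bail out: do NOT claim we are listening */
    }

    if (listen(sock, 10) != 0) {
        perror("listen");
        goto out;
    }

    printf("process started listening on port (%u)\n", (unsigned)port);
    ret = 0;

out:
    if (ret != 0 && sock >= 0)
        close(sock);       /* drop the half-initialised socket */
    return ret;
}

int
main(void)
{
    /* 38465 is just the port number mentioned in the logs above. */
    return server_listen(38465) == 0 ? 0 : 1;
}

When the port is already taken, this prints only the bind error and exits
non-zero, which is the behaviour the quoted log output was missing.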

Re: [Gluster-devel] [Gluster-users] Meeting Details on footer of the gluster-devel and gluster-user mailing list

2019-05-07 Thread Atin Mukherjee
On Wed, May 8, 2019 at 9:45 AM Atin Mukherjee  wrote:

>
>
> On Wed, May 8, 2019 at 12:08 AM Vijay Bellur  wrote:
>
>>
>>
>> On Tue, May 7, 2019 at 11:15 AM FNU Raghavendra Manjunath <
>> rab...@redhat.com> wrote:
>>
>>>
>>> + 1 to this.
>>>
>>
>> I have updated the footer of gluster-devel. If that looks ok, we can
>> extend it to gluster-users too.
>>
>> In case of a month with 5 Tuesdays, we can skip the 5th Tuesday and
>> always stick to the first 4 Tuesdays of every month. That will help in
>> describing the community meeting schedule better. If we want to keep the
>> schedule running on alternate Tuesdays, please let me know and the mailing
>> list footers can be updated accordingly :-).
>>
>>
>>> There is also one more thing. For some reason, the community meeting is
>>> not visible in my calendar (especially NA region). I am not sure if anyone
>>> else is also facing this issue.
>>>
>>
>> I did face this issue. Realized that we had a meeting today and showed up
>> at the meeting a while later but did not see many participants. Perhaps,
>> the calendar invite has to be made a recurring one.
>>
>
> We'd need to explicitly import the invite and add it to our calendar,
> otherwise it doesn't reflect.
>

And you're right that the last series wasn't a recurring one either.


>
>> Thanks,
>> Vijay
>>
>>
>>>
>>> Regards,
>>> Raghavendra
>>>
>>> On Tue, May 7, 2019 at 5:19 AM Ashish Pandey 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> While we send a mail on the gluster-devel or gluster-users mailing list,
>>>> the following content gets auto-generated and placed at the end of the mail.
>>>>
>>>> Gluster-users mailing list
>>>> gluster-us...@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>> Gluster-devel mailing list
>>>> Gluster-devel@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>
>>>> In a similar way, is it possible to attach the meeting schedule and link at
>>>> the end of every such mail?
>>>> Like this -
>>>>
>>>> Meeting schedule -
>>>>
>>>>
>>>>- APAC friendly hours
>>>>   - Tuesday 14th May 2019, 11:30AM IST
>>>>   - Bridge: https://bluejeans.com/836554017
>>>>   - NA/EMEA
>>>>   - Tuesday 7th May 2019, 01:00 PM EDT
>>>>   - Bridge: https://bluejeans.com/486278655
>>>>
>>>> Or just a link to meeting minutes details??
>>>>  
>>>> https://github.com/gluster/community/tree/master/meetings
>>>>
>>>> This will help developers and users of the community to know when and
>>>> where the meetings happen and how to attend them.
>>>>
>>>> ---
>>>> Ashish
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ___
>>>> Gluster-users mailing list
>>>> gluster-us...@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>> ___
>>> Gluster-users mailing list
>>> gluster-us...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>> ___
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/836554017
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/486278655
>>
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] [Gluster-users] Meeting Details on footer of the gluster-devel and gluster-user mailing list

2019-05-07 Thread Atin Mukherjee
On Wed, May 8, 2019 at 12:08 AM Vijay Bellur  wrote:

>
>
> On Tue, May 7, 2019 at 11:15 AM FNU Raghavendra Manjunath <
> rab...@redhat.com> wrote:
>
>>
>> + 1 to this.
>>
>
> I have updated the footer of gluster-devel. If that looks ok, we can
> extend it to gluster-users too.
>
> In case of a month with 5 Tuesdays, we can skip the 5th Tuesday and always
> stick to the first 4 Tuesdays of every month. That will help in describing
> the community meeting schedule better. If we want to keep the schedule
> running on alternate Tuesdays, please let me know and the mailing list
> footers can be updated accordingly :-).
>
>
>> There is also one more thing. For some reason, the community meeting is
>> not visible in my calendar (especially NA region). I am not sure if anyone
>> else is also facing this issue.
>>
>
> I did face this issue. Realized that we had a meeting today and showed up
> at the meeting a while later but did not see many participants. Perhaps,
> the calendar invite has to be made a recurring one.
>

We'd need to explicitly import the invite and add it to our calendar;
otherwise it doesn't show up there.


> Thanks,
> Vijay
>
>
>>
>> Regards,
>> Raghavendra
>>
>> On Tue, May 7, 2019 at 5:19 AM Ashish Pandey  wrote:
>>
>>> Hi,
>>>
>>> While we send a mail on the gluster-devel or gluster-users mailing list,
>>> the following content gets auto-generated and placed at the end of the mail.
>>>
>>> Gluster-users mailing list
>>> gluster-us...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> In a similar way, is it possible to attach the meeting schedule and link at
>>> the end of every such mail?
>>> Like this -
>>>
>>> Meeting schedule -
>>>
>>>
>>>- APAC friendly hours
>>>   - Tuesday 14th May 2019, 11:30AM IST
>>>   - Bridge: https://bluejeans.com/836554017
>>>   - NA/EMEA
>>>   - Tuesday 7th May 2019, 01:00 PM EDT
>>>   - Bridge: https://bluejeans.com/486278655
>>>
>>> Or just a link to meeting minutes details??
>>>  
>>> https://github.com/gluster/community/tree/master/meetings
>>>
>>> This will help developers and users of the community to know when and where
>>> the meetings happen and how to attend them.
>>>
>>> ---
>>> Ashish
>>>
>>>
>>>
>>>
>>>
>>>
>>> ___
>>> Gluster-users mailing list
>>> gluster-us...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>> ___
>> Gluster-users mailing list
>> gluster-us...@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
> ___
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/836554017
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/486278655
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
___

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/836554017

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/486278655

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Coverity scan - how does it ignore dismissed defects & annotations?

2019-05-03 Thread Atin Mukherjee
On Fri, 3 May 2019 at 16:07, Amar Tumballi Suryanarayan 
wrote:

>
>
> On Fri, May 3, 2019 at 3:17 PM Atin Mukherjee  wrote:
>
>>
>>
>> On Fri, 3 May 2019 at 14:59, Xavi Hernandez  wrote:
>>
>>> Hi Atin,
>>>
>>> On Fri, May 3, 2019 at 10:57 AM Atin Mukherjee 
>>> wrote:
>>>
>>>> I'm bit puzzled on the way coverity is reporting the open defects on
>>>> GD1 component. As you can see from [1], technically we have 6 open defects
>>>> and all of the rest are being marked as dismissed. We tried to put some
>>>> additional annotations in the code through [2] to see if coverity starts
>>>> feeling happy but the result doesn't change. I still see in the report it
>>>> complaints about open defect of GD1 as 25 (7 as High, 18 as medium and 1 as
>>>> Low). More interestingly yesterday's report claimed we fixed 8 defects,
>>>> introduced 1, but the overall count remained as 102. I'm not able to
>>>> connect the dots of this puzzle, can anyone?
>>>>
>>>
>>> Maybe we need to modify all dismissed CID's so that Coverity considers
>>> them again and, hopefully, mark them as solved with the newer updates. They
>>> have been manually marked to be ignored, so they are still there...
>>>
>>
>> After yesterday’s run I set the severity for all of them to see if
>> modifications to these CIDs make any difference or not. So fingers crossed
>> till the next report comes :-) .
>>
>
> If you noticed the previous day's report, it was 101 'Open defects' and 65
> 'Dismissed' (which means they are not 'fixed in code', but dismissed as
> false positives or ignored in the CID dashboard).
>
> Now, it is 57 'Dismissed', which means, your patch has actually fixed 8
> defects.
>
>
>>
>>
>>> Just a thought, I'm not sure how this really works.
>>>
>>
>> Same here, I don’t understand the exact workflow and hence seeking
>> additional ideas.
>>
>>
> Looks like we should consider overall open defects as Open + Dismissed.
>

This is why I'm concerned. There are defects which we clearly can't or don't
want to fix, and in that case, even though they are marked as dismissed, the
overall open defect count doesn't come down. So we'd never be able to get
below the total number of dismissed defects :-( .

However, today's report brings the overall count down to 97 from 102.
Coverity claimed we fixed 0 defects since the last scan, which means somehow
my update to those GD1 dismissed defects did the trick for 5 defects. This
continues to be a great puzzle for me!


>
>>
>>> Xavi
>>>
>>>
>>>>
>>>> [1] https://scan.coverity.com/projects/gluster-glusterfs/view_defects
>>>> [2] https://review.gluster.org/#/c/22619/
>>>> ___
>>>> Gluster-devel mailing list
>>>> Gluster-devel@gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> --
>> - Atin (atinm)
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> --
> Amar Tumballi (amarts)
>
-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Coverity scan - how does it ignore dismissed defects & annotations?

2019-05-03 Thread Atin Mukherjee
On Fri, 3 May 2019 at 14:59, Xavi Hernandez  wrote:

> Hi Atin,
>
> On Fri, May 3, 2019 at 10:57 AM Atin Mukherjee 
> wrote:
>
>> I'm bit puzzled on the way coverity is reporting the open defects on GD1
>> component. As you can see from [1], technically we have 6 open defects and
>> all of the rest are being marked as dismissed. We tried to put some
>> additional annotations in the code through [2] to see if coverity starts
>> feeling happy but the result doesn't change. I still see in the report it
>> complaints about open defect of GD1 as 25 (7 as High, 18 as medium and 1 as
>> Low). More interestingly yesterday's report claimed we fixed 8 defects,
>> introduced 1, but the overall count remained as 102. I'm not able to
>> connect the dots of this puzzle, can anyone?
>>
>
> Maybe we need to modify all dismissed CID's so that Coverity considers
> them again and, hopefully, mark them as solved with the newer updates. They
> have been manually marked to be ignored, so they are still there...
>

After yesterday’s run I set the severity for all of them to see if
modifications to these CIDs make any difference or not. So fingers crossed
till the next report comes :-) .


> Just a thought, I'm not sure how this really works.
>

Same here, I don’t understand the exact workflow and hence seeking
additional ideas.


> Xavi
>
>
>>
>> [1] https://scan.coverity.com/projects/gluster-glusterfs/view_defects
>> [2] https://review.gluster.org/#/c/22619/
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
> --
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Coverity scan - how does it ignore dismissed defects & annotations?

2019-05-03 Thread Atin Mukherjee
I'm a bit puzzled about the way Coverity is reporting the open defects on the
GD1 component. As you can see from [1], technically we have 6 open defects and
all of the rest are being marked as dismissed. We tried to put some
additional annotations in the code through [2] to see if Coverity starts
feeling happy, but the result doesn't change. I still see the report
complaining about 25 open defects for GD1 (7 High, 18 Medium and 1 Low).
More interestingly, yesterday's report claimed we fixed 8 defects and
introduced 1, but the overall count remained at 102. I'm not able to
connect the dots of this puzzle; can anyone?

[1] https://scan.coverity.com/projects/gluster-glusterfs/view_defects
[2] https://review.gluster.org/#/c/22619/
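
For readers unfamiliar with the annotations referenced in [2]: Coverity
suppressions are usually plain comments of the form coverity[event-tag]
placed just before the flagged line. The snippet below is only a generic,
self-contained illustration of that pattern (the helper and the chosen tag
are invented), not the actual change in [2].

/* Generic illustration of a Coverity suppression annotation; the helper
 * and the event tag are examples, not the change made in [2]. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *
dup_string(const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);

    if (!copy)
        return NULL;
    memcpy(copy, src, len);
    return copy;
}

int
main(void)
{
    char *msg = dup_string("annotation example");

    if (!msg)
        return 1;
    printf("%s\n", msg);
    /* coverity[leaked_storage] : the process exits immediately, so this
     * intentional leak of 'msg' is dismissed rather than fixed. */
    return 0;
}
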
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Should we enable contention notification by default ?

2019-05-02 Thread Atin Mukherjee
On Thu, 2 May 2019 at 20:38, Xavi Hernandez  wrote:

> On Thu, May 2, 2019 at 4:06 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Thu, 2 May 2019 at 19:14, Xavi Hernandez 
>> wrote:
>>
>>> On Thu, 2 May 2019, 15:37 Milind Changire,  wrote:
>>>
>>>> On Thu, May 2, 2019 at 6:44 PM Xavi Hernandez 
>>>> wrote:
>>>>
>>>>> Hi Ashish,
>>>>>
>>>>> On Thu, May 2, 2019 at 2:17 PM Ashish Pandey 
>>>>> wrote:
>>>>>
>>>>>> Xavi,
>>>>>>
>>>>>> I would like to keep this option (features.lock-notify-contention)
>>>>>> enabled by default.
>>>>>> However, I can see that there is one more option which will impact
>>>>>> the working of this option which is "notify-contention-delay"
>>>>>>
>>>>>
>>>> Just a nit. I wish the option was called "notify-contention-interval"
>>>> The "delay" part doesn't really emphasize where the delay would be put
>>>> in.
>>>>
>>>
>>> It makes sense. Maybe we can also rename it or add a second name
>>> (alias). If there are no objections, I will send a patch with the change.
>>>
>>> Xavi
>>>
>>>
>>>>
>>>>>  .description = "This value determines the minimum amount of time "
>>>>>> "(in seconds) between upcall contention
>>>>>> notifications "
>>>>>> "on the same inode. If multiple lock requests are
>>>>>> "
>>>>>> "received during this period, only one upcall
>>>>>> will "
>>>>>> "be sent."},
>>>>>>
>>>>>> I am not sure what the best value for this option should be if we
>>>>>> want to keep features.lock-notify-contention ON by default.
>>>>>> It looks like if we set the value of notify-contention-delay higher,
>>>>>> say 5 sec, it will wait for that much time to send the upcall
>>>>>> notification, which does not look good.
>>>>>>
>>>>>
>>>>> No, the first notification is sent immediately. What this option does
>>>>> is to define the minimum interval between notifications. This interval is
>>>>> per lock. This is done to avoid storms of notifications if many requests
>>>>> come referencing the same lock.
>>>>>
>>>>> Is my understanding correct?
>>>>>> What will be impact of this value and what should be the default
>>>>>> value of this option?
>>>>>>
>>>>>
>>>>> I think the current default value of 5 seconds seems good enough. If
>>>>> there are many bricks, each brick could send a notification per lock. 1000
>>>>> bricks would mean a client would receive 1000 notifications every 5
>>>>> seconds. It doesn't seem too much, but in those cases 10, and considering
>>>>> we could have other locks, maybe a higher value could be better.
>>>>>
>>>>> Xavi
>>>>>
>>>>>
>>>>>>
>>>>>> ---
>>>>>> Ashish
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *From: *"Xavi Hernandez" 
>>>>>> *To: *"gluster-devel" 
>>>>>> *Cc: *"Pranith Kumar Karampuri" , "Ashish
>>>>>> Pandey" , "Amar Tumballi" 
>>>>>> *Sent: *Thursday, May 2, 2019 4:15:38 PM
>>>>>> *Subject: *Should we enable contention notification by default ?
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> there's a feature in the locks xlator that sends a notification to
>>>>>> current owner of a lock when another client tries to acquire the same 
>>>>>> lock.
>>>>>> This way the current owner is made aware of the contention and can 
>>>>>> release
>>>>>> the lock as soon as possible to allow the other client to proceed.
>>>>>>
>>>>>> This is specia
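
To make the throttling described above concrete: per lock, the first
contention notification goes out immediately and later ones are suppressed
until the configured interval has elapsed. The standalone sketch below
approximates that logic with invented types and names; it is not the locks
xlator implementation.

/* Per-lock contention-notification throttling, approximated with
 * hypothetical types: the first upcall is sent immediately, later ones
 * only once the configured interval has passed. */
#include <stdio.h>
#include <time.h>

struct lock_state {
    time_t last_notified;      /* 0 means "never notified" */
};

static int
maybe_notify_contention(struct lock_state *lk, time_t now, double interval)
{
    if (lk->last_notified != 0 &&
        difftime(now, lk->last_notified) < interval)
        return 0;              /* throttled: upcall suppressed */

    lk->last_notified = now;   /* send one upcall and remember when */
    return 1;
}

int
main(void)
{
    struct lock_state lk = {0};
    double interval = 5.0;     /* seconds, mirrors the 5 second default */
    time_t base = time(NULL);
    int offsets[] = {0, 1, 2, 6};

    /* Simulate contending lock requests arriving at t = 0, 1, 2 and 6s. */
    for (int i = 0; i < 4; i++) {
        int sent = maybe_notify_contention(&lk, base + offsets[i], interval);
        printf("t=%ds -> %s\n", offsets[i],
               sent ? "notification sent" : "suppressed");
    }
    return 0;
}

With a 5-second interval, the requests at 0, 1, 2 and 6 seconds produce
exactly two notifications (at 0 and at 6 seconds), so a client is not flooded
even when many requests reference the same lock.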

Re: [Gluster-devel] Should we enable contention notification by default ?

2019-05-02 Thread Atin Mukherjee
On Thu, 2 May 2019 at 19:14, Xavi Hernandez  wrote:

> On Thu, 2 May 2019, 15:37 Milind Changire,  wrote:
>
>> On Thu, May 2, 2019 at 6:44 PM Xavi Hernandez 
>> wrote:
>>
>>> Hi Ashish,
>>>
>>> On Thu, May 2, 2019 at 2:17 PM Ashish Pandey 
>>> wrote:
>>>
 Xavi,

 I would like to keep this option (features.lock-notify-contention)
 enabled by default.
 However, I can see that there is one more option which will impact the
 working of this option which is "notify-contention-delay"

>>>
>> Just a nit. I wish the option was called "notify-contention-interval"
>> The "delay" part doesn't really emphasize where the delay would be put in.
>>
>
> It makes sense. Maybe we can also rename it or add a second name (alias).
> If there are no objections, I will send a patch with the change.
>
> Xavi
>
>
>>
>>>  .description = "This value determines the minimum amount of time "
 "(in seconds) between upcall contention
 notifications "
 "on the same inode. If multiple lock requests are "
 "received during this period, only one upcall will "
 "be sent."},

 I am not sure what the best value for this option should be if we want
 to keep features.lock-notify-contention ON by default.
 It looks like if we set the value of notify-contention-delay higher, say
 5 sec, it will wait for that much time to send the upcall
 notification, which does not look good.

>>>
>>> No, the first notification is sent immediately. What this option does is
>>> to define the minimum interval between notifications. This interval is per
>>> lock. This is done to avoid storms of notifications if many requests come
>>> referencing the same lock.
>>>
>>> Is my understanding correct?
 What will be impact of this value and what should be the default value
 of this option?

>>>
>>> I think the current default value of 5 seconds seems good enough. If
>>> there are many bricks, each brick could send a notification per lock. 1000
>>> bricks would mean a client would receive 1000 notifications every 5
>>> seconds. It doesn't seem too much, but in those cases 10, and considering
>>> we could have other locks, maybe a higher value could be better.
>>>
>>> Xavi
>>>
>>>

 ---
 Ashish






 --
 *From: *"Xavi Hernandez" 
 *To: *"gluster-devel" 
 *Cc: *"Pranith Kumar Karampuri" , "Ashish Pandey"
 , "Amar Tumballi" 
 *Sent: *Thursday, May 2, 2019 4:15:38 PM
 *Subject: *Should we enable contention notification by default ?

 Hi all,

 there's a feature in the locks xlator that sends a notification to
 current owner of a lock when another client tries to acquire the same lock.
 This way the current owner is made aware of the contention and can release
 the lock as soon as possible to allow the other client to proceed.

 This is especially useful when eager-locking is used and multiple
 clients access the same files and directories. Currently both replicated
 and dispersed volumes use eager-locking and can use contention notification
 to force an early release of the lock.

 Eager-locking reduces the number of network requests required for each
 operation, improving performance, but could add delays to other clients
 while it keeps the inode or entry locked. With the contention notification
 feature we avoid this delay, so we get the best performance with minimal
 issues in multiclient environments.

 Currently the contention notification feature is controlled by the
 'features.lock-notify-contention' option and it's disabled by default.
 Should we enable it by default ?

 I don't see any reason to keep it disabled by default. Does anyone
 foresee any problem ?

>>>
Is it a server-only option? Otherwise it will break backward compatibility
if we rename the key. If an alias can fix this, that's a better choice,
but I'm not sure it solves all the problems.


 Regards,

 Xavi

 ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>>
>>
>> --
>> Milind
>>
>> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Weekly Untriaged Bugs

2019-04-28 Thread Atin Mukherjee
While I understand this report captures bugs filed in the last week that do
not have the 'Triaged' keyword, does it make better sense to exclude bugs
which aren't in the NEW state?

I believe the intention of this report is to check which bugs haven't
been looked at by maintainers/developers yet. BZs which are already fixed
or in ASSIGNED/POST state need not feature in this list, as otherwise it
gives a false impression that too many bugs are going unnoticed, which
isn't the reality. Thoughts?

On Mon, 22 Apr 2019 at 07:15,  wrote:

> [...truncated 6 lines...]
> https://bugzilla.redhat.com/1699023 / core: Brick is not able to detach
> successfully in brick_mux environment
> https://bugzilla.redhat.com/1695416 / core: client log flooding with
> intentional socket shutdown message when a brick is down
> https://bugzilla.redhat.com/1695480 / core: Global Thread Pool
> https://bugzilla.redhat.com/1694943 / core: parallel-readdir slows down
> directory listing
> https://bugzilla.redhat.com/1700295 / core: The data couldn't be flushed
> immediately even with O_SYNC in glfs_create or with
> glfs_fsync/glfs_fdatasync after glfs_write.
> https://bugzilla.redhat.com/1698861 / disperse: Renaming a directory when
> 2 bricks of multiple disperse subvols are down leaves both old and new dirs
> on the bricks.
> https://bugzilla.redhat.com/1697293 / distribute: DHT: print hash and
> layout values in hexadecimal format in the logs
> https://bugzilla.redhat.com/1701039 / distribute: gluster replica 3
> arbiter Unfortunately data not distributed equally
> https://bugzilla.redhat.com/1697971 / fuse: Segfault in FUSE process,
> potential use after free
> https://bugzilla.redhat.com/1694139 / glusterd: Error waiting for job
> 'heketi-storage-copy-job' to complete on one-node k3s deployment.
> https://bugzilla.redhat.com/1695099 / glusterd: The number of glusterfs
> processes keeps increasing, using all available resources
> https://bugzilla.redhat.com/1692349 / project-infrastructure:
> gluster-csi-containers job is failing
> https://bugzilla.redhat.com/1698716 / project-infrastructure: Regression
> job did not vote for https://review.gluster.org/#/c/glusterfs/+/22366/
> https://bugzilla.redhat.com/1698694 / project-infrastructure: regression
> job isn't voting back to gerrit
> https://bugzilla.redhat.com/1699712 / project-infrastructure: regression
> job is voting Success even in case of failure
> https://bugzilla.redhat.com/1693385 / project-infrastructure: request to
> change the version of fedora in fedora-smoke-job
> https://bugzilla.redhat.com/1695484 / project-infrastructure: smoke fails
> with "Build root is locked by another process"
> https://bugzilla.redhat.com/1693184 / replicate: A brick
> process(glusterfsd) died with 'memory violation'
> https://bugzilla.redhat.com/1698566 / selfheal: shd crashed while
> executing ./tests/bugs/core/bug-1432542-mpx-restart-crash.t in CI
> https://bugzilla.redhat.com/1699309 / snapshot: Gluster snapshot fails
> with systemd autmounted bricks
> https://bugzilla.redhat.com/1696633 / tests: GlusterFs v4.1.5 Tests from
> /tests/bugs/ module failing on Intel
> https://bugzilla.redhat.com/1697812 / website: mention a pointer to all
> the mailing lists available under glusterfs project(
> https://www.gluster.org/community/)
> [...truncated 2 lines...]___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] BZ updates

2019-04-23 Thread Atin Mukherjee
Absolutely agree and I definitely think this would help going forward.

On Wed, Apr 24, 2019 at 8:45 AM Nithya Balachandran 
wrote:

> All,
>
> When working on a bug, please ensure that you update the BZ with any
> relevant information as well as the RCA. I have seen several BZs in the
> past which report crashes, however they do not have a bt or RCA captured.
> Having this information in the BZ makes it much easier to see if a newly
> reported issue has already been fixed.
>
> I propose that maintainers merge patches only if the BZs are updated with
> required information. It will take some time to make this a habit but it
> will pay off in the end.
>
> Regards,
> Nithya
> ___
> maintainers mailing list
> maintain...@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] is_nfs_export_available from nfs.rc failing too often?

2019-04-22 Thread Atin Mukherjee
Is this back again? The recent patches are failing regression :-\ .

On Wed, 3 Apr 2019 at 19:26, Michael Scherer  wrote:

> Le mercredi 03 avril 2019 à 16:30 +0530, Atin Mukherjee a écrit :
> > On Wed, Apr 3, 2019 at 11:56 AM Jiffin Thottan 
> > wrote:
> >
> > > Hi,
> > >
> > > is_nfs_export_available is just a wrapper around "showmount"
> > > command AFAIR.
> > > I saw following messages in console output.
> > >  mount.nfs: rpc.statd is not running but is required for remote
> > > locking.
> > > 05:06:55 mount.nfs: Either use '-o nolock' to keep locks local, or
> > > start
> > > statd.
> > > 05:06:55 mount.nfs: an incorrect mount option was specified
> > >
> > > It looks to me like rpcbind may not be running on the machine.
> > > Usually rpcbind starts automatically on machines; I don't know
> > > whether that
> > > can happen or not.
> > >
> >
> > That's precisely what the question is. Why are we suddenly seeing this
> > happening so frequently? Today I saw at least 4 to 5 such failures
> > already.
> >
> > Deepshika - Can you please help in inspecting this?
>
> So we think (we are not sure) that the issue is a bit complex.
>
> What we were investigating was a nightly run failure on aws. When the build
> crashes, the builder is restarted, since that's the easiest way to clean
> everything (since even with a perfect test suite that would clean up after
> itself, we could always end up in a corrupt state on the system, WRT
> mount, fs, etc).
>
> In turn, this seems to cause trouble on aws, since cloud-init or
> something renames the eth0 interface to ens5, without cleaning up the
> network configuration.
>
> So the network init script fails (because the image says "start eth0" and
> that's not present), but fails in a weird way. Network is initialised
> and working (we can connect), but the dhclient process is not in the
> right cgroup, and network.service is in a failed state. Restarting
> the network didn't work. In turn, this means that rpc-statd refuses to start
> (due to systemd dependencies), which seems to impact various NFS tests.
>
> We have also seen that on some builders, rpcbind picks up some IPv6
> autoconfiguration, but we can't reproduce that, and there is no IPv6
> set up anywhere. I suspect the network.service failure is somehow
> involved, but I fail to see how. In turn, rpcbind.socket not starting
> could cause NFS test troubles.
>
> Our current stop-gap fix was to fix all the builders one by one: remove
> the config, kill the rogue dhclient, restart the network service.
>
> However, we can't be sure this is going to fix the problem long term,
> since this only manifests after a crash of the test suite, and that
> doesn't happen so often. (Plus, it was working before, some day in the
> past, until something made this fail, and I do not know if that was a
> system upgrade, or a test change, or both.)
>
> So we are still looking at it to get a complete understanding of the
> issue, but so far, we hacked our way to make it work (or so I
> think).
>
> Deepshika is working to fix it long term, by fixing the issue regarding
> eth0/ens5 with a new base image.
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>
>
> --
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

2019-04-17 Thread Atin Mukherjee
On Wed, 17 Apr 2019 at 10:53, Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> Hi,
>
> In my recent test, I found that there is a very severe glusterfsd memory
> leak when the socket ssl option is enabled.
>

What gluster version are you testing? Would you be able to continue your
investigation and share the root cause?

-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Wed, Apr 17, 2019 at 12:33 AM Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Tue, Apr 16, 2019 at 10:27 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
>>> wrote:
>>>
>>>> Status: Tagging pending
>>>>
>>>> Waiting on patches:
>>>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>>>   https://review.gluster.org/c/glusterfs/+/22579
>>>
>>>
>>> The regression doesn't pass for the mainline patch. I believe master is
>>> broken now. With latest master sdfs-sanity.t always fail. We either need to
>>> fix it or mark it as bad test.
>>>
>>
>> commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
>> Author: Pranith Kumar K 
>> Date:   Mon Apr 1 11:14:56 2019 +0530
>>
>>features/locks: error-out {inode,entry}lk fops with all-zero lk-owner
>>
>>Problem:
>>Sometimes we find that developers forget to assign lk-owner for an
>>inodelk/entrylk/lk before writing code to wind these fops. locks
>>xlator at the moment allows this operation. This leads to multiple
>>threads in the same client being able to get locks on the inode
>>because lk-owner is same and transport is same. So isolation
>>with locks can't be achieved.
>>
>>Fix:
>>Disallow locks with lk-owner zero.
>>
>>fixes bz#1624701
>>Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
>>Signed-off-by: Pranith Kumar K 
>>
>> With the above commit sdfs-sanity.t started failing. But when I looked at
>> the last regression vote at
>> https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw
>> it voted back positive but the bell rang when I saw the overall regression
>> took less than 2 hours and when I opened the regression link I saw the test
>> actually failed but still this job voted back +1 at gerrit.
>>
>> *Deepshika* - *This is a bad CI bug we have now and have to be addressed
>> at earliest. Please take a look at
>> https://build.gluster.org/job/centos7-regression/5568/consoleFull and
>> investigate why the regression vote wasn't negative.*
>>
>> Pranith - I request you to investigate on the sdfs-sanity.t failure
>> because of this patch.
>>
>
> sdfs is supposed to serialize entry fops by taking an entrylk, but all the
> locks are being taken with an all-zero lk-owner. In essence, sdfs doesn't
> achieve its goal of mutual exclusion when conflicting operations are
> executed by the same client, because two locks on the same entry with the
> same all-zero owner will both be granted. The patch which led to the
> sdfs-sanity.t failure treats inodelk/entrylk/lk fops with an all-zero
> lk-owner as an invalid request to prevent these kinds of bugs, so it
> exposed the bug in sdfs. I
> sent a fix for sdfs @ https://review.gluster.org/#/c/glusterfs/+/22582
>
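
The shape of the sdfs fix described above is essentially "derive a non-zero
lk-owner per call frame before winding the entrylk". The sketch below
illustrates that idea with made-up types; the real patch is at the review
link above.

/* Made-up types; a sketch of giving each call frame its own non-zero lock
 * owner so two in-flight operations never share the all-zero owner. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct lk_owner {
    uint8_t data[8];
};

static void
set_lk_owner_from_ptr(struct lk_owner *owner, const void *frame)
{
    uintptr_t val = (uintptr_t)frame;

    memset(owner->data, 0, sizeof(owner->data));
    memcpy(owner->data, &val,
           sizeof(val) < sizeof(owner->data) ? sizeof(val)
                                             : sizeof(owner->data));
}

int
main(void)
{
    int frame_a, frame_b;          /* stand-ins for two call frames */
    struct lk_owner owner_a, owner_b;

    set_lk_owner_from_ptr(&owner_a, &frame_a);
    set_lk_owner_from_ptr(&owner_b, &frame_b);

    /* Distinct owners mean the entrylk calls they guard can actually
     * exclude each other instead of silently sharing one lock. */
    printf("owners differ: %s\n",
           memcmp(&owner_a, &owner_b, sizeof(owner_a)) ? "yes" : "no");
    return 0;
}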

Since this patch hasn't passed regression, and now that I see
tests/bugs/replicate/bug-1386188-sbrain-fav-child.t hanging and timing out
in the latest nightly regression runs because of the above commit (tested
locally and confirmed), I still request that we first revert this commit,
get master back to a stable state and then put back the required fixes.


>
>> *@Maintainers - Please open up every regression link to see the actual
>> status of the job and don't blindly trust on the +1 vote back at gerrit
>> till this is addressed.*
>>
>> As per the policy, I'm going to revert this commit, watch out for the
>> patch. I request this to be directly pushed with out waiting for the
>> regression vote as we had done before in such breakage. Amar/Shyam - I
>> believe you have this permission?
>>
>
>>
>>> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
>>> tests/basic/sdfs-sanity.t ..
>>> 1..7
>>> ok 1, LINENUM:8
>>> ok 2, LINENUM:9
>>> ok 3, LINENUM:11
>>> ok 4, LINENUM:12
>>> ok 5, LINENUM:13
>>> ok 6, LINENUM:16
>>> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
>>> argument
>>> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
>>> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
>>> not ok 7 , LINENUM:20
>>> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
>>> Fail

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Tue, Apr 16, 2019 at 10:26 PM Atin Mukherjee  wrote:

>
>
> On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
>> wrote:
>>
>>> Status: Tagging pending
>>>
>>> Waiting on patches:
>>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>>   https://review.gluster.org/c/glusterfs/+/22579
>>
>>
>> The regression doesn't pass for the mainline patch. I believe master is
>> broken now. With latest master sdfs-sanity.t always fail. We either need to
>> fix it or mark it as bad test.
>>
>
> commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
> Author: Pranith Kumar K 
> Date:   Mon Apr 1 11:14:56 2019 +0530
>
>features/locks: error-out {inode,entry}lk fops with all-zero lk-owner
>
>Problem:
>Sometimes we find that developers forget to assign lk-owner for an
>inodelk/entrylk/lk before writing code to wind these fops. locks
>xlator at the moment allows this operation. This leads to multiple
>threads in the same client being able to get locks on the inode
>because lk-owner is same and transport is same. So isolation
>with locks can't be achieved.
>
>Fix:
>Disallow locks with lk-owner zero.
>
>fixes bz#1624701
>Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
>Signed-off-by: Pranith Kumar K 
>
> With the above commit sdfs-sanity.t started failing. But when I looked at
> the last regression vote at
> https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw
> it voted back positive but the bell rang when I saw the overall regression
> took less than 2 hours and when I opened the regression link I saw the test
> actually failed but still this job voted back +1 at gerrit.
>
> *Deepshika* - *This is a bad CI bug we have now and have to be addressed
> at earliest. Please take a look at
> https://build.gluster.org/job/centos7-regression/5568/consoleFull and
> investigate why the regression vote wasn't negative.*
>
> Pranith - I request you to investigate on the sdfs-sanity.t failure
> because of this patch.
>
> *@Maintainers - Please open up every regression link to see the actual
> status of the job and don't blindly trust on the +1 vote back at gerrit
> till this is addressed.*
>
> As per the policy, I'm going to revert this commit, watch out for the
> patch.
>

https://review.gluster.org/#/c/glusterfs/+/22581/
Please review and merge it.

Also, since we're already close to 23:00 in the IST timezone, I need help from
folks in other timezones in getting
https://review.gluster.org/#/c/glusterfs/+/22578/ rebased and marked
verified +1 once the above fix is merged. This is a blocker for
glusterfs-6.1, as otherwise the ctime feature option tuning isn't honoured.

I request this to be directly pushed with out waiting for the regression
> vote as we had done before in such breakage. Amar/Shyam - I believe you
> have this permission?
>
>
>> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
>> tests/basic/sdfs-sanity.t ..
>> 1..7
>> ok 1, LINENUM:8
>> ok 2, LINENUM:9
>> ok 3, LINENUM:11
>> ok 4, LINENUM:12
>> ok 5, LINENUM:13
>> ok 6, LINENUM:16
>> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
>> argument
>> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
>> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
>> not ok 7 , LINENUM:20
>> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
>> Failed 1/7 subtests
>>
>> Test Summary Report
>> ---
>> tests/basic/sdfs-sanity.t (Wstat: 0 Tests: 7 Failed: 1)
>>   Failed test:  7
>> Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  0.58 cusr
>> 0.67 csys =  1.27 CPU)
>> Result: FAIL
>>
>>
>>>
>>> Following patches will not be taken in if CentOS regression does not
>>> pass by tomorrow morning Eastern TZ,
>>> (Pranith/KingLongMee) - cluster-syncop: avoid duplicate unlock of
>>> inodelk/entrylk
>>>   https://review.gluster.org/c/glusterfs/+/22385
>>> (Aravinda) - geo-rep: IPv6 support
>>>   https://review.gluster.org/c/glusterfs/+/22488
>>> (Aravinda) - geo-rep: fix integer config validation
>>>   https://review.gluster.org/c/glusterfs/+/22489
>>>
>>> Tracker bug status:
>>> (Ravi) - Bug 1693155 - Excessive AFR messages from gluster showing in
>>> RHGSWA.
>&g

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee  wrote:

>
>
> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
> wrote:
>
>> Status: Tagging pending
>>
>> Waiting on patches:
>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>   https://review.gluster.org/c/glusterfs/+/22579
>
>
> The regression doesn't pass for the mainline patch. I believe master is
> broken now. With latest master sdfs-sanity.t always fail. We either need to
> fix it or mark it as bad test.
>

commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
Author: Pranith Kumar K 
Date:   Mon Apr 1 11:14:56 2019 +0530

   features/locks: error-out {inode,entry}lk fops with all-zero lk-owner

   Problem:
   Sometimes we find that developers forget to assign lk-owner for an
   inodelk/entrylk/lk before writing code to wind these fops. locks
   xlator at the moment allows this operation. This leads to multiple
   threads in the same client being able to get locks on the inode
   because lk-owner is same and transport is same. So isolation
   with locks can't be achieved.

   Fix:
   Disallow locks with lk-owner zero.

   fixes bz#1624701
   Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
   Signed-off-by: Pranith Kumar K 
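
A bare-bones, self-contained illustration of the validation this commit
message describes (reject lock fops whose lk-owner is all zeroes) could look
like the following; the struct and function names are invented and this is
not the actual features/locks patch.

/* Invented names; a minimal version of "disallow locks with lk-owner
 * zero", not the actual features/locks change. */
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct lk_owner {
    uint8_t data[8];
};

static int
validate_lk_owner(const struct lk_owner *owner)
{
    static const struct lk_owner zero = {{0}};

    if (memcmp(owner, &zero, sizeof(zero)) == 0) {
        /* An all-zero owner would let unrelated threads of the same
         * client share locks, so treat the request as invalid. */
        return -EINVAL;
    }
    return 0;
}

int
main(void)
{
    struct lk_owner bad = {{0}};
    struct lk_owner good = {{0x1b, 0x2a}};

    printf("all-zero owner : %d\n", validate_lk_owner(&bad));  /* -EINVAL */
    printf("non-zero owner : %d\n", validate_lk_owner(&good)); /* 0 */
    return 0;
}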

With the above commit sdfs-sanity.t started failing. But when I looked at
the last regression vote at
https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw it
voted back positive but the bell rang when I saw the overall regression
took less than 2 hours and when I opened the regression link I saw the test
actually failed but still this job voted back +1 at gerrit.

*Deepshika* - *This is a bad CI bug we have now and have to be addressed at
earliest. Please take a look at
https://build.gluster.org/job/centos7-regression/5568/consoleFull
<https://build.gluster.org/job/centos7-regression/5568/consoleFull> and
investigate why the regression vote wasn't negative.*

Pranith - I request you to investigate the sdfs-sanity.t failure caused by
this patch.

*@Maintainers - Please open up every regression link to see the actual
status of the job, and don't blindly trust the +1 vote back at gerrit till
this is addressed.*

As per the policy, I'm going to revert this commit; watch out for the
patch. I request that it be pushed directly without waiting for the
regression vote, as we have done before for such breakages. Amar/Shyam - I
believe you have this permission?


> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
> tests/basic/sdfs-sanity.t ..
> 1..7
> ok 1, LINENUM:8
> ok 2, LINENUM:9
> ok 3, LINENUM:11
> ok 4, LINENUM:12
> ok 5, LINENUM:13
> ok 6, LINENUM:16
> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
> argument
> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
> not ok 7 , LINENUM:20
> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
> Failed 1/7 subtests
>
> Test Summary Report
> ---
> tests/basic/sdfs-sanity.t (Wstat: 0 Tests: 7 Failed: 1)
>   Failed test:  7
> Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  0.58 cusr
> 0.67 csys =  1.27 CPU)
> Result: FAIL
>
>
>>
>> Following patches will not be taken in if CentOS regression does not
>> pass by tomorrow morning Eastern TZ,
>> (Pranith/KingLongMee) - cluster-syncop: avoid duplicate unlock of
>> inodelk/entrylk
>>   https://review.gluster.org/c/glusterfs/+/22385
>> (Aravinda) - geo-rep: IPv6 support
>>   https://review.gluster.org/c/glusterfs/+/22488
>> (Aravinda) - geo-rep: fix integer config validation
>>   https://review.gluster.org/c/glusterfs/+/22489
>>
>> Tracker bug status:
>> (Ravi) - Bug 1693155 - Excessive AFR messages from gluster showing in
>> RHGSWA.
>>   All patches are merged, but none of the patches adds the "Fixes"
>> keyword, assume this is an oversight and that the bug is fixed in this
>> release.
>>
>> (Atin) - Bug 1698131 - multiple glusterfsd processes being launched for
>> the same brick, causing transport endpoint not connected
>>   No work has occurred post logs upload to bug, restart of bricks and
>> possibly glusterd is the existing workaround when the bug is hit. Moving
>> this out of the tracker for 6.1.
>>
>> (Xavi) - Bug 1699917 - I/O error on writes to a disperse volume when
>> replace-brick is executed
>>   Very recent bug (15th April), does not seem to have any critical data
>> corruption or service availability issues, planning on not waiting for
>> the fix in 6.1
>>
>> - Shyam
>> On 4/6/19 4:38 AM, Atin Mukherjee wrote:
>> > Hi Mohit,

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
wrote:

> Status: Tagging pending
>
> Waiting on patches:
> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>   https://review.gluster.org/c/glusterfs/+/22579


The regression doesn't pass for the mainline patch. I believe master is
broken now. With the latest master, sdfs-sanity.t always fails. We either
need to fix it or mark it as a bad test.

root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
tests/basic/sdfs-sanity.t ..
1..7
ok 1, LINENUM:8
ok 2, LINENUM:9
ok 3, LINENUM:11
ok 4, LINENUM:12
ok 5, LINENUM:13
ok 6, LINENUM:16
mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid argument
stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
not ok 7 , LINENUM:20
FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
Failed 1/7 subtests

Test Summary Report
---
tests/basic/sdfs-sanity.t (Wstat: 0 Tests: 7 Failed: 1)
  Failed test:  7
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  0.58 cusr  0.67
csys =  1.27 CPU)
Result: FAIL


>
> Following patches will not be taken in if CentOS regression does not
> pass by tomorrow morning Eastern TZ,
> (Pranith/KingLongMee) - cluster-syncop: avoid duplicate unlock of
> inodelk/entrylk
>   https://review.gluster.org/c/glusterfs/+/22385
> (Aravinda) - geo-rep: IPv6 support
>   https://review.gluster.org/c/glusterfs/+/22488
> (Aravinda) - geo-rep: fix integer config validation
>   https://review.gluster.org/c/glusterfs/+/22489
>
> Tracker bug status:
> (Ravi) - Bug 1693155 - Excessive AFR messages from gluster showing in
> RHGSWA.
>   All patches are merged, but none of the patches adds the "Fixes"
> keyword, assume this is an oversight and that the bug is fixed in this
> release.
>
> (Atin) - Bug 1698131 - multiple glusterfsd processes being launched for
> the same brick, causing transport endpoint not connected
>   No work has occurred post logs upload to bug, restart of bricks and
> possibly glusterd is the existing workaround when the bug is hit. Moving
> this out of the tracker for 6.1.
>
> (Xavi) - Bug 1699917 - I/O error on writes to a disperse volume when
> replace-brick is executed
>   Very recent bug (15th April), does not seem to have any critical data
> corruption or service availability issues, planning on not waiting for
> the fix in 6.1
>
> - Shyam
> On 4/6/19 4:38 AM, Atin Mukherjee wrote:
> > Hi Mohit,
> >
> > https://review.gluster.org/22495 should get into 6.1 as it’s a
> > regression. Can you please attach the respective bug to the tracker Ravi
> > pointed out?
> >
> >
> > On Sat, 6 Apr 2019 at 12:00, Ravishankar N  > <mailto:ravishan...@redhat.com>> wrote:
> >
> > Tracker bug is https://bugzilla.redhat.com/show_bug.cgi?id=1692394,
> in
> > case anyone wants to add blocker bugs.
> >
> >
> > On 05/04/19 8:03 PM, Shyam Ranganathan wrote:
> > > Hi,
> > >
> > > Expected tagging date for release-6.1 is on April, 10th, 2019.
> > >
> > > Please ensure required patches are backported and also are passing
> > > regressions and are appropriately reviewed for easy merging and
> > tagging
> > > on the date.
> > >
> > > Thanks,
> > > Shyam
> > > ___
> > > Gluster-devel mailing list
> > > Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
> > > https://lists.gluster.org/mailman/listinfo/gluster-devel
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> >
> > --
> > - Atin (atinm)
> >
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] test failure reports for last 15 days

2019-04-10 Thread Atin Mukherjee
And now for the last 15 days:

https://fstat.gluster.org/summary?start_date=2019-03-25&end_date=2019-04-10

./tests/bitrot/bug-1373520.t 18  ==> Fixed through
https://review.gluster.org/#/c/glusterfs/+/22481/, I don't see this failing
in brick mux post 5th April
./tests/bugs/ec/bug-1236065.t 17  ==> happens only in brick mux, needs
analysis.
./tests/basic/uss.t 15  ==> happens in both brick mux and non
brick mux runs, test just simply times out. Needs urgent analysis.
./tests/basic/ec/ec-fix-openfd.t 13  ==> Fixed through
https://review.gluster.org/#/c/22508/ , patch merged today.
./tests/basic/volfile-sanity.t  8  ==> Some race, though this succeeds
in second attempt every time.

There are plenty more tests with around 5 instances of failure each. We
need all maintainers/owners to look through these failures and fix them;
we certainly don't want to get into a stage where master is unstable and
we have to lock down the merges till all these failures are resolved. So
please help.

(Please note that fstat shows retries as failures too, which in a way is
correct.)


On Tue, Feb 26, 2019 at 5:27 PM Atin Mukherjee  wrote:

> [1] captures the test failures report since last 30 days and we'd need
> volunteers/component owners to see why the number of failures are so high
> against few tests.
>
> [1]
> https://fstat.gluster.org/summary?start_date=2019-01-26&end_date=2019-02-25&job=all
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] SHD crash in https://build.gluster.org/job/centos7-regression/5510/consoleFull

2019-04-10 Thread Atin Mukherjee
Rafi mentioned to me earlier that this will be fixed through
https://review.gluster.org/22468 . This crash is seen more often in the
nightly regressions these days. The patch needs review, and I'd request the
respective maintainers to take a look at it.

On Wed, Apr 10, 2019 at 5:08 PM Nithya Balachandran 
wrote:

> Hi,
>
> My patch is unlikely to have caused this as the changes are only in dht.
> Can someone take a look?
>
> Thanks,
> Nithya
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-06 Thread Atin Mukherjee
Hi Mohit,

https://review.gluster.org/22495 should get into 6.1 as it’s a regression.
Can you please attach the respective bug to the tracker Ravi pointed out?


On Sat, 6 Apr 2019 at 12:00, Ravishankar N  wrote:

> Tracker bug is https://bugzilla.redhat.com/show_bug.cgi?id=1692394, in
> case anyone wants to add blocker bugs.
>
>
> On 05/04/19 8:03 PM, Shyam Ranganathan wrote:
> > Hi,
> >
> > Expected tagging date for release-6.1 is on April, 10th, 2019.
> >
> > Please ensure required patches are backported and also are passing
> > regressions and are appropriately reviewed for easy merging and tagging
> > on the date.
> >
> > Thanks,
> > Shyam
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
> --
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-infra] rebal-all-nodes-migrate.t always fails now

2019-04-04 Thread Atin Mukherjee
Thanks misc. I have always seen a pattern where, on a reattempt (recheck
centos), the same builder is picked up many times even though builders are
supposed to be picked in a round-robin manner.

On Thu, Apr 4, 2019 at 7:24 PM Michael Scherer  wrote:

> On Thursday, 4 April 2019 at 15:19 +0200, Michael Scherer wrote:
> > On Thursday, 4 April 2019 at 13:53 +0200, Michael Scherer wrote:
> > > On Thursday, 4 April 2019 at 16:13 +0530, Atin Mukherjee wrote:
> > > > Based on what I have seen that any multi node test case will fail
> > > > and
> > > > the
> > > > above one is picked first from that group and If I am correct
> > > > none
> > > > of
> > > > the
> > > > code fixes will go through the regression until this is fixed. I
> > > > suspect it
> > > > to be an infra issue again. If we look at
> > > > https://review.gluster.org/#/c/glusterfs/+/22501/ &
> > > > https://build.gluster.org/job/centos7-regression/5382/ peer
> > > > handshaking is
> > > > stuck as 127.1.1.1 is unable to receive a response back, did we
> > > > end
> > > > up
> > > > having firewall and other n/w settings screwed up? The test never
> > > > fails
> > > > locally.
> > >
> > > The firewall didn't change, and has had since the start a line:
> > > "-A INPUT -i lo -j ACCEPT", so all traffic on the localhost interface
> > > works. (I am not even sure that netfilter does anything meaningful on
> > > the loopback interface, but maybe I am wrong, and I am not keen on
> > > looking at kernel code for that).
> > >
> > >
> > > Ping seems to work fine as well, so we can exclude a routing issue.
> > >
> > Maybe we should look at the socket: does it listen on a specific
> > address or not?
> >
> > So, I did look at the first 20 failures, removed all not related to
> > rebal-all-nodes-migrate.t and saw that all were run on builder203, which
> > was freshly reinstalled. As Deepshika noticed today, this one had an
> > issue with ipv6, the 2nd issue we were tracking.
> >
> > Summary: the rpcbind.socket systemd unit listens on ipv6 despite ipv6
> > being disabled, and the fix is to reload systemd. We have so far no idea
> > why it happens, but suspect this might be related to the network issue
> > we did identify, as it happens only after a reboot, which happens only
> > if a build is cancelled/crashed/aborted.
> >
> > I applied the workaround on builder203, so if the culprit is that
> > specific issue, I guess that's fixed.
> >
> > I started a test to see how it go:
> > https://build.gluster.org/job/centos7-regression/5383/
>
> The test did just pass, so I would assume the problem was local to
> builder203. Not sure why it was always selected, except that this
> was the only one that failed, so it was always free to pick up new jobs.
>
> Maybe we should increase the number of builders so this doesn't happen,
> as I guess the other builders were busy at that time?
>
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] rebal-all-nodes-migrate.t always fails now

2019-04-04 Thread Atin Mukherjee
Based on what I have seen, any multi-node test case will fail, and the
above one is picked first from that group; if I am correct, none of the
code fixes will go through regression until this is fixed. I suspect it
to be an infra issue again. If we look at
https://review.gluster.org/#/c/glusterfs/+/22501/ &
https://build.gluster.org/job/centos7-regression/5382/ peer handshaking is
stuck as 127.1.1.1 is unable to receive a response back. Did we end up
with the firewall and other n/w settings getting screwed up? The test never
fails locally.

15:51:21 Number of Peers: 2
15:51:21
15:51:21 Hostname: 127.1.1.2
15:51:21 Uuid: 0e689ca8-d522-4b2f-b437-9dcde3579401
15:51:21 State: Accepted peer request (Connected)
15:51:21
15:51:21 Hostname: 127.1.1.3
15:51:21 Uuid: a83a3bfa-729f-4a1c-8f9a-ae7d04ee4544
15:51:21 State: Accepted peer request (Connected)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] shd multiplexing patch has introduced coverity defects

2019-04-03 Thread Atin Mukherjee
Based on yesterday's coverity scan report, 6 defects were introduced by the
shd multiplexing patch. Could you address them, Rafi?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] is_nfs_export_available from nfs.rc failing too often?

2019-04-03 Thread Atin Mukherjee
On Wed, Apr 3, 2019 at 11:56 AM Jiffin Thottan  wrote:

> Hi,
>
> is_nfs_export_available is just a wrapper around the "showmount" command,
> AFAIR. I saw the following messages in the console output:
>  mount.nfs: rpc.statd is not running but is required for remote locking.
> 05:06:55 mount.nfs: Either use '-o nolock' to keep locks local, or start
> statd.
> 05:06:55 mount.nfs: an incorrect mount option was specified
>
> To me it looks like rpcbind may not be running on the machine.
> Usually rpcbind starts automatically on machines; I don't know whether
> that can happen or not.
>

That's precisely the question. Why are we suddenly seeing this happen so
frequently? Today I saw at least 4 to 5 such failures already.

Deepshika - Can you please help in inspecting this?


> Regards,
> Jiffin
>
>
> - Original Message -
> From: "Atin Mukherjee" 
> To: "gluster-infra" , "Gluster Devel" <
> gluster-devel@gluster.org>
> Sent: Wednesday, April 3, 2019 10:46:51 AM
> Subject: [Gluster-devel] is_nfs_export_available from nfs.rc failing too
>   often?
>
> I'm observing the above test function failing too often because of which
> arbiter-mount.t test fails in many regression jobs. Such frequency of
> failures wasn't there earlier. Does anyone know what has changed recently
> to cause these failures in regression? I also hear when such failure
> happens a reboot is required, is that true and if so why?
>
> One of the reference :
> https://build.gluster.org/job/centos7-regression/5340/consoleFull
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] is_nfs_export_available from nfs.rc failing too often?

2019-04-02 Thread Atin Mukherjee
I'm observing the above test function failing too often, because of which
the arbiter-mount.t test fails in many regression jobs. This frequency of
failures wasn't there earlier. Does anyone know what has changed recently
to cause these failures in regression? I also hear that when such a failure
happens a reboot is required; is that true, and if so, why?

One of the reference :
https://build.gluster.org/job/centos7-regression/5340/consoleFull
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Backporting important fixes in release branches

2019-04-02 Thread Atin Mukherjee
Of late, my observation has been that we're failing to backport
critical/important fixes into the release branches, and we only do a course
correction when users discover the problems, which isn't a great
experience. I request all developers and maintainers to pay some attention
to (a) deciding which patches from mainline should be backported to which
release branches and (b) doing so right away once the patches are merged in
the mainline branch instead of waiting to do them later.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [ovirt-users] oVirt Survey 2019 results

2019-04-02 Thread Atin Mukherjee
Thanks Sahina for including the Gluster community mailing lists.

As Sahina already mentioned, we had a strong focus on the upgrade testing
path before releasing glusterfs-6. We conducted a test day and, along with
the functional pieces, tested upgrade paths from 3.12, 4 & 5 to release-6;
we encountered problems but fixed them before releasing glusterfs-6. So
overall this experience should definitely improve with glusterfs-6.

On Tue, 2 Apr 2019 at 15:16, Sahina Bose  wrote:

>
>
> On Tue, Apr 2, 2019 at 12:07 PM Sandro Bonazzola 
> wrote:
>
>> Thanks to the 143 participants to oVirt Survey 2019!
>> The survey is now closed and results are publicly available at
>> https://bit.ly/2JYlI7U
>> We'll analyze collected data in order to improve oVirt thanks to your
>> feedback.
>>
>> As a first step after reading the results I'd like to invite the 30
>> persons who replied they're willing to contribute code to send an email to
>> de...@ovirt.org introducing themselves: we'll be more than happy to
>> welcome them and helping them getting started.
>>
>> I would also like to invite the 17 people who replied they'd like to help
>> organizing oVirt events in their area to either get in touch with me or
>> introduce themselves to us...@ovirt.org so we can discuss about events
>> organization.
>>
>> Last but not least I'd like to invite the 38 people willing to contribute
>> documentation and the one willing to contribute localization to introduce
>> themselves to de...@ovirt.org.
>>
>
> Thank you all for the feedback.
> I was looking at the feedback specific to Gluster. While it's
> disheartening to see "Gluster weakest link in oVirt", I can understand
> where the feedback and frustration is coming from.
>
> Over the past month and in this survey, the common themes that have come up
> - Ensure smoother upgrades for the hyperconverged deployments with
> GlusterFS.  The oVirt 4.3 release with upgrade to gluster 5.3 caused
> disruption for many users and we want to ensure this does not happen again.
> To this end, we are working on adding upgrade tests to OST based CI .
> Contributions are welcome.
>
> - improve performance on gluster storage domain. While we have seen
> promising results with gluster 6 release this is an ongoing effort. Please
> help this effort with inputs on the specific workloads and usecases that
> you run, gathering data and running tests.
>
> - deployment issues. We have worked to improve the deployment flow in 4.3
> by adding pre-checks and changing to gluster-ansible role based deployment.
> We would love to hear specific issues that you're facing around this -
> please raise bugs if you haven't already (
> https://bugzilla.redhat.com/enter_bug.cgi?product=cockpit-ovirt)
>
>
>
>> Thanks!
>> --
>>
>> SANDRO BONAZZOLA
>>
>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>
>> Red Hat EMEA 
>>
>> sbona...@redhat.com
>> 
>> ___
>> Users mailing list -- us...@ovirt.org
>> To unsubscribe send an email to users-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/us...@ovirt.org/message/4N5DYCXY2S6ZAUI7BWD4DEKZ6JL6MSGN/
>>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] Quick update on glusterd's volume scalability improvements

2019-03-29 Thread Atin Mukherjee
On Sat, 30 Mar 2019 at 08:06, Vijay Bellur  wrote:

>
>
> On Fri, Mar 29, 2019 at 6:42 AM Atin Mukherjee 
> wrote:
>
>> All,
>>
>> As many of you already know that the design logic with which GlusterD
>> (here on to be referred as GD1) was implemented has some fundamental
>> scalability bottlenecks at design level, especially around it's way of
>> handshaking configuration meta data and replicating them across all the
>> peers. While the initial design was adopted with a factor in mind that GD1
>> will have to deal with just few tens of nodes/peers and volumes, the
>> magnitude of the scaling bottleneck this design can bring in was never
>> realized and estimated.
>>
>> Ever since Gluster has been adopted in container storage land as one of
>> the storage backends, the business needs have changed. From tens of
>> volumes, the requirements have translated to hundreds and now to thousands.
>> We introduced brick multiplexing which had given some relief to have a
>> better control on the memory footprint when having many number of
>> bricks/volumes hosted in the node, but this wasn't enough. In one of our (I
>> represent Red Hat) customer's deployment  we had seen on a 3 nodes cluster,
>> whenever the number of volumes go beyond ~1500 and for some reason if one
>> of the storage pods get rebooted, the overall time it takes to complete the
>> overall handshaking (not only in a factor of n X n peer handshaking but
>> also the number of volume iterations, building up the dictionary and
>> sending it over the write) consumes a huge time as part of the handshaking
>> process, the hard timeout of an rpc request which is 10 minutes gets
>> expired and we see cluster going into a state where none of the cli
>> commands go through and get stuck.
>>
>> With such problem being around and more demand of volume scalability, we
>> started looking into these areas in GD1 to focus on improving (a) volume
>> scalability (b) node scalability. While (b) is a separate topic for some
>> other day we're going to focus on more on (a) today.
>>
>> While looking into this volume scalability problem with a deep dive, we
>> realized that most of the bottleneck which was causing the overall delay in
>> the friend handshaking and exchanging handshake packets between peers in
>> the cluster was iterating over the in-memory data structures of the
>> volumes, putting them into the dictionary sequentially. With 2k like
>> volumes the function glusterd_add_volumes_to_export_dict () was quite
>> costly and most time consuming. From pstack output when glusterd instance
>> was restarted in one of the pods, we could always see that control was
>> iterating in this function. Based on our testing on a 16 vCPU, 32 GB RAM 3
>> nodes cluster, this function itself took almost *7.5 minutes . *The
>> bottleneck is primarily because of sequential iteration of volumes,
>> sequentially updating the dictionary with lot of (un)necessary keys.
>>
>> So what we tried out was making this loop to work on a worker thread
>> model so that multiple threads can process a range of volume list and not
>> all of them so that we can get more parallelism within glusterd. But with
>> that we still didn't see any improvement and the primary reason for that
>> was our dictionary APIs need locking. So the next idea was to actually make
>> threads work on multiple dictionaries and then once all the volumes are
>> iterated the subsequent dictionaries to be merged into a single one. Along
>> with these changes there are few other improvements done on skipping
>> comparison of snapshots if there's no snap available, excluding tiering
>> keys if the volume type is not tier. With this enhancement [1] we see the
>> overall time it took to complete building up the dictionary from the
>> in-memory structure is *2 minutes 18 seconds* which is close*  ~3x*
>> improvement. We firmly believe that with this improvement, we should be
>> able to scale up to 2000 volumes on a 3 node cluster and that'd help our
>> users to get benefited with supporting more PVCs/volumes.
>>
>> Patch [1] is still in testing and might undergo few minor changes. But we
>> welcome you for review and comment on it. We plan to get this work
>> completed, tested and release in glusterfs-7.
>>
>> Last but not the least, I'd like to give a shout to Mohit Agrawal (In cc)
>> for all the work done on this for last few days. Thank you Mohit!
>>
>>
>
> This sounds good! Thank you for the update on this work.
>
> Did you ever consider using etcd with G

[Gluster-devel] Quick update on glusterd's volume scalability improvements

2019-03-29 Thread Atin Mukherjee
All,

As many of you already know, the design logic with which GlusterD
(hereafter referred to as GD1) was implemented has some fundamental
scalability bottlenecks at the design level, especially around its way of
handshaking configuration metadata and replicating it across all the peers.
While the initial design was adopted with the assumption that GD1 would
have to deal with just a few tens of nodes/peers and volumes, the magnitude
of the scaling bottleneck this design can bring in was never realized or
estimated.

Ever since Gluster was adopted in container storage land as one of the
storage backends, the business needs have changed. From tens of volumes,
the requirements have grown to hundreds and now to thousands. We introduced
brick multiplexing, which gave some relief by providing better control over
the memory footprint when many bricks/volumes are hosted on a node, but
this wasn't enough. In one of our (I represent Red Hat) customer's
deployments we had seen, on a 3 node cluster, that whenever the number of
volumes goes beyond ~1500 and one of the storage pods gets rebooted for
some reason, the overall handshaking (not only a factor of the n x n peer
handshaking but also the number of volume iterations, building up the
dictionary and sending it over the wire) consumes so much time that the
hard timeout of an rpc request, which is 10 minutes, expires and we see the
cluster going into a state where none of the cli commands go through and
they get stuck.

With such a problem around and more demand for volume scalability, we
started looking into these areas in GD1 to focus on improving (a) volume
scalability and (b) node scalability. While (b) is a separate topic for
some other day, we're going to focus more on (a) today.

On a deep dive into this volume scalability problem, we realized that most
of the bottleneck causing the overall delay in the friend handshaking and
the exchange of handshake packets between peers in the cluster was
iterating over the in-memory data structures of the volumes and putting
them into the dictionary sequentially. With around 2k volumes the function
glusterd_add_volumes_to_export_dict () was quite costly and the most time
consuming. From pstack output when the glusterd instance was restarted in
one of the pods, we could always see that control was iterating in this
function. Based on our testing on a 16 vCPU, 32 GB RAM 3 node cluster, this
function itself took almost *7.5 minutes*. The bottleneck is primarily due
to the sequential iteration of volumes, sequentially updating the
dictionary with lots of (un)necessary keys.

So what we tried out was making this loop work on a worker thread model so
that multiple threads can each process a range of the volume list rather
than all of it, giving us more parallelism within glusterd. But with that
we still didn't see any improvement, and the primary reason was that our
dictionary APIs need locking. So the next idea was to make the threads work
on multiple dictionaries and, once all the volumes are iterated, merge the
individual dictionaries into a single one. Along with these changes there
are a few other improvements: skipping the comparison of snapshots if no
snap is available, and excluding tiering keys if the volume type is not
tier. With this enhancement [1] the overall time it takes to build up the
dictionary from the in-memory structure is *2 minutes 18 seconds*, which is
close to a *~3x* improvement. We firmly believe that with this improvement
we should be able to scale up to 2000 volumes on a 3 node cluster, and
that'd help our users benefit from supporting more PVCs/volumes.

Patch [1] is still in testing and might undergo a few minor changes, but
we welcome your reviews and comments on it. We plan to get this work
completed, tested and released in glusterfs-7.

Last but not least, I'd like to give a shout-out to Mohit Agrawal (in cc)
for all the work done on this over the last few days. Thank you Mohit!

[1] https://review.gluster.org/#/c/glusterfs/+/22445/
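
To illustrate the approach (this is not the glusterd code from [1]; the
types and helpers below are simplified stand-ins for dict_t and the volume
list), a minimal sketch of the per-thread dictionary model could look like
this:

#include <pthread.h>
#include <stdio.h>

#define NUM_VOLS    2000
#define NUM_WORKERS 4

/* Stand-in for a per-thread dictionary; the real code would use dict_t. */
typedef struct {
    int keys;
} mini_dict_t;

typedef struct {
    int         start, end; /* half-open range of volume indices */
    mini_dict_t dict;       /* private to this worker: no locking needed */
} worker_arg_t;

static void
add_volume_keys(mini_dict_t *d, int volume_index)
{
    (void)volume_index;
    d->keys += 1;           /* real code adds many keys per volume */
}

static void *
worker(void *arg)
{
    worker_arg_t *w = arg;
    for (int i = w->start; i < w->end; i++)
        add_volume_keys(&w->dict, i);
    return NULL;
}

int
main(void)
{
    pthread_t    tids[NUM_WORKERS];
    worker_arg_t args[NUM_WORKERS];
    int          per = NUM_VOLS / NUM_WORKERS;

    for (int t = 0; t < NUM_WORKERS; t++) {
        args[t].start = t * per;
        args[t].end = (t == NUM_WORKERS - 1) ? NUM_VOLS : (t + 1) * per;
        args[t].dict.keys = 0;
        pthread_create(&tids[t], NULL, worker, &args[t]);
    }

    mini_dict_t merged = { 0 };
    for (int t = 0; t < NUM_WORKERS; t++) {
        pthread_join(tids[t], NULL);
        merged.keys += args[t].dict.keys; /* merge once per worker */
    }
    printf("merged %d keys from %d workers\n", merged.keys, NUM_WORKERS);
    return 0;
}

The key point is that each worker owns its dictionary, so the locked dict
APIs are never contended, and the only serialized work left is the final
merge.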
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] requesting review available gluster* plugins in sos

2019-03-22 Thread Atin Mukherjee
On Fri, 22 Mar 2019 at 20:07, Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Wed, Mar 20, 2019 at 10:00 AM Atin Mukherjee 
> wrote:
> >
> > From glusterd perspective couple of enhancements I'd propose to be added
> (a) to capture get-state dump and make it part of sosreport . Off late, we
> have seen get-state dump has been very helpful in debugging few cases apart
> from it's original purpose of providing source of cluster/volume
> information for tendrl (b) capture glusterd statedump
> >
>
> How large can these payloads be? One of the challenges I've heard is
> that users are often challenged when attempting to push large ( > 5GB)
> payloads making the total size of the sos archive fairly big.


get-state and glusterd statedump are mere text files of a few KBs. Your
example refers to a brick statedump file with many multiplexed bricks in a
single process.


> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] GF_CALLOC to GF_MALLOC conversion - is it safe?

2019-03-21 Thread Atin Mukherjee
All,

In the last few releases of glusterfs, with stability as a primary theme,
there have been lots of changes around code optimization with the
expectation that such changes will help gluster provide better performance.
While many of these changes do help, of late we have started seeing some
adverse effects from them, one especially being the calloc to malloc
conversions. While I do understand that a malloc call will eliminate the
extra memset bottleneck which calloc bears, with recent kernels having
strong in-built compiler optimizations I am not sure whether that makes any
significant difference. But as I mentioned earlier, if this isn't done
carefully it can potentially introduce a lot of bugs, and I'm writing this
email to share one such experience.

Sanju & I had been struggling for the last two days to figure out why
https://review.gluster.org/#/c/glusterfs/+/22388/ wasn't working on Sanju's
system while the same fix ran with no problems in my gluster containers.
After spending a significant amount of time, what we figured out is that a
malloc call [1] (which was a calloc earlier) is the culprit here. As you
all can see, in this function we allocate txn_id and copy event->txn_id
into it through gf_uuid_copy (). But when we were debugging this stepwise
through gdb, txn_id wasn't copied with the exact event->txn_id and had some
junk values, which made glusterd_clear_txn_opinfo get invoked with a wrong
txn_id later on, so the leaks that the fix was originally meant to plug
remained the same.

This was quite painful to debug and we had to spend some time to figure it
out. Considering we have converted many such calls in the past, I'd urge
that we review all such conversions and see if there are any side effects
to them. Otherwise we might end up running into many potential memory
related bugs later on. OTOH, going forward I'd request every patch
owner/maintainer to pay special attention to these conversions and check
that they are really beneficial and error free. IMO, the general guideline
should be: for bigger buffers, malloc makes better sense but has to be done
carefully; for smaller sizes, we stick to calloc.

What do others think about it?

[1]
https://github.com/gluster/glusterfs/blob/master/xlators/mgmt/glusterd/src/glusterd-op-sm.c#L5681
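
To make the failure mode concrete, here is a minimal, self-contained
illustration with plain calloc/malloc (GF_CALLOC/GF_MALLOC are wrappers
over these); the struct and field names below are made up for the example
and are not the actual glusterd types:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char txn_id[16];  /* uuid-sized blob, as in the op-sm example */
    int           is_synctask; /* flag that later code consults */
} txn_ctx_t;

int
main(void)
{
    unsigned char src[16];
    memset(src, 0xab, sizeof(src));

    /* calloc: every byte is zero, so a field we forget to set is at
     * least deterministic (0) and reading it is well defined. */
    txn_ctx_t *safe = calloc(1, sizeof(*safe));

    /* malloc: contents are indeterminate. Copying txn_id is fine, but
     * any field we do NOT write keeps garbage, which is how a "wrong
     * value used later" bug slips in after a mechanical calloc to
     * malloc conversion. */
    txn_ctx_t *fast = malloc(sizeof(*fast));
    if (!safe || !fast)
        return 1;
    memcpy(fast->txn_id, src, sizeof(src));

    printf("calloc'd flag: %d\n", safe->is_synctask); /* always 0 */
    /* Reading the uninitialized flag below is undefined on purpose:
     * it is exactly the bug this sketch demonstrates. */
    printf("malloc'd flag: %d\n", fast->is_synctask);

    free(safe);
    free(fast);
    return 0;
}

The copy into the malloc'd buffer is fine; the danger is every byte that is
not explicitly written afterwards, which is why such conversions need a
review of all later reads of the allocated object.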
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] requesting review available gluster* plugins in sos

2019-03-19 Thread Atin Mukherjee
From a glusterd perspective, a couple of enhancements I'd propose to be
added: (a) capture the get-state dump and make it part of sosreport. Of
late, we have seen the get-state dump being very helpful in debugging a few
cases, apart from its original purpose of providing a source of
cluster/volume information for tendrl; (b) capture the glusterd statedump.

Sanju - if you want to volunteer to send this enhancement, please grab it
:-) . Note that both of these dumps are generated in /var/run/gluster, so
please check whether we capture the /var/run/gluster content in sosreport
(which I doubt); if not, you can always have the dump captured in a
specific file, as you are already aware.

On Wed, Mar 20, 2019 at 7:25 AM Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Tue, Mar 19, 2019 at 8:30 PM Soumya Koduri  wrote:
> > On 3/19/19 9:49 AM, Sankarshan Mukhopadhyay wrote:
> > >  is (as might just be widely known)
> > > an extensible, portable, support data collection tool primarily aimed
> > > at Linux distributions and other UNIX-like operating systems.
> > >
> > > At present there are 2 plugins
> > > 
> > > and <
> https://github.com/sosreport/sos/blob/master/sos/plugins/gluster_block.py>
> > > I'd like to request that the maintainers do a quick review that this
> > > sufficiently covers topics to help diagnose issues.
> >
> > There is one plugin available for nfs-ganesha as well -
> > https://github.com/sosreport/sos/blob/master/sos/plugins/nfsganesha.py
> >
> > It needs a minor update. Sent a pull request for the same -
> > https://github.com/sosreport/sos/pull/1593
> >
>
> Thanks Soumya!
>
> Other Gluster maintainers - review and respond please.
>
> > Kindly review.
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] GlusterFS - 6.0RC - Test days (27th, 28th Feb)

2019-03-07 Thread Atin Mukherjee
I am not sure how BZ 1683815
<https://bugzilla.redhat.com/show_bug.cgi?id=1683815> can be a blocker at
RC. We have a fix ready, but to me it doesn't look like a blocker. Vijay -
any objections?

Also, the bugzilla dependencies of the bugs attached to release-6 are sort
of messed up. Most of the time I see a mainline bug along with its clones
attached to the tracker, which is unnecessary. This has happened because of
the default clone behaviour, but I request every bugzilla assignee to spend
a few additional seconds to establish the right dependency.

I have tried to correct a few of them and will do the rest by next Monday.
That'd help us filter out the unnecessary ones and find out how many actual
blockers we have.

On Tue, Mar 5, 2019 at 11:51 PM Shyam Ranganathan 
wrote:

> On 3/4/19 12:33 PM, Shyam Ranganathan wrote:
> > On 3/4/19 10:08 AM, Atin Mukherjee wrote:
> >>
> >>
> >> On Mon, 4 Mar 2019 at 20:33, Amar Tumballi Suryanarayan
> >> mailto:atumb...@redhat.com>> wrote:
> >>
> >> Thanks to those who participated.
> >>
> >> Update at present:
> >>
> >> We found 3 blocker bugs in upgrade scenarios, and hence have marked
> >> release
> >> as pending upon them. We will keep these lists updated about
> progress.
> >>
> >>
> >> I’d like to clarify that upgrade testing is blocked. So just fixing
> >> these test blocker(s) isn’t enough to call release-6 green. We need to
> >> continue and finish the rest of the upgrade tests once the respective
> >> bugs are fixed.
> >
> > Based on fixes expected by tomorrow for the upgrade fixes, we will build
> > an RC1 candidate on Wednesday (6-Mar) (tagging early Wed. Eastern TZ).
> > This RC can be used for further testing.
>
> There have been no backports for the upgrade failures, request folks
> working on the same to post a list of bugs that need to be fixed, to
> enable tracking the same. (also, ensure they are marked against the
> release-6 tracker [1])
>
> Also, we need to start writing out the upgrade guide for release-6, any
> volunteers for the same?
>
> Thanks,
> Shyam
>
> [1] Release-6 tracker bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0
>
-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] GlusterFS - 6.0RC - Test days (27th, 28th Feb)

2019-03-04 Thread Atin Mukherjee
On Mon, 4 Mar 2019 at 20:33, Amar Tumballi Suryanarayan 
wrote:

> Thanks to those who participated.
>
> Update at present:
>
> We found 3 blocker bugs in upgrade scenarios, and hence have marked release
> as pending upon them. We will keep these lists updated about progress.


I’d like to clarify that upgrade testing is blocked. So just fixing these
test blocker(s) isn’t enough to call release-6 green. We need to continue
and finish the rest of the upgrade tests once the respective bugs are fixed.


>
> -Amar
>
> On Mon, Feb 25, 2019 at 11:41 PM Amar Tumballi Suryanarayan <
> atumb...@redhat.com> wrote:
>
> > Hi all,
> >
> > We are calling out our users, and developers to contribute in validating
> > ‘glusterfs-6.0rc’ build in their usecase. Specially for the cases of
> > upgrade, stability, and performance.
> >
> > Some of the key highlights of the release are listed in release-notes
> > draft
> > <
> https://github.com/gluster/glusterfs/blob/release-6/doc/release-notes/6.0.md
> >.
> > Please note that there are some of the features which are being dropped
> out
> > of this release, and hence making sure your setup is not going to have an
> > issue is critical. Also the default lru-limit option in fuse mount for
> > Inodes should help to control the memory usage of client processes. All
> the
> > good reason to give it a shot in your test setup.
> >
> > If you are developer using gfapi interface to integrate with other
> > projects, you also have some signature changes, so please make sure your
> > project would work with latest release. Or even if you are using a
> project
> > which depends on gfapi, report the error with new RPMs (if any). We will
> > help fix it.
> >
> > As part of test days, we want to focus on testing the latest upcoming
> > release i.e. GlusterFS-6, and one or the other gluster volunteers would
> be
> > there on #gluster channel on freenode to assist the people. Some of the
> key
> > things we are looking as bug reports are:
> >
> >-
> >
> >See if upgrade from your current version to 6.0rc is smooth, and works
> >as documented.
> >- Report bugs in process, or in documentation if you find mismatch.
> >-
> >
> >Functionality is all as expected for your usecase.
> >- No issues with actual application you would run on production etc.
> >-
> >
> >Performance has not degraded in your usecase.
> >- While we have added some performance options to the code, not all of
> >   them are turned on, as they have to be done based on usecases.
> >   - Make sure the default setup is at least same as your current
> >   version
> >   - Try out few options mentioned in release notes (especially,
> >   --auto-invalidation=no) and see if it helps performance.
> >-
> >
> >While doing all the above, check below:
> >- see if the log files are making sense, and not flooding with some
> >   “for developer only” type of messages.
> >   - get ‘profile info’ output from old and now, and see if there is
> >   anything which is out of normal expectation. Check with us on the
> numbers.
> >   - get a ‘statedump’ when there are some issues. Try to make sense
> >   of it, and raise a bug if you don’t understand it completely.
> >
> >
> > <
> https://hackmd.io/YB60uRCMQRC90xhNt4r6gA?both#Process-expected-on-test-days
> >Process
> > expected on test days.
> >
> >-
> >
> >We have a tracker bug
> >[0]
> >- We will attach all the ‘blocker’ bugs to this bug.
> >-
> >
> >Use this link to report bugs, so that we have more metadata around
> >given bugzilla.
> >- Click Here
> >   <
> https://bugzilla.redhat.com/enter_bug.cgi?blocked=1672818&bug_severity=high&component=core&priority=high&product=GlusterFS&status_whiteboard=gluster-test-day&version=6
> >
> >   [1]
> >-
> >
> >The test cases which are to be tested are listed here in this sheet
> ><
> https://docs.google.com/spreadsheets/d/1AS-tDiJmAr9skK535MbLJGe_RfqDQ3j1abX1wtjwpL4/edit?usp=sharing
> >[2],
> >please add, update, and keep it up-to-date to reduce duplicate efforts

-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] test failure reports for last 30 days

2019-02-26 Thread Atin Mukherjee
[1] captures the test failure report for the last 30 days, and we'd need
volunteers/component owners to see why the number of failures is so high
against a few tests.

[1]
https://fstat.gluster.org/summary?start_date=2019-01-26&end_date=2019-02-25&job=all
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Regression health for release-5.next and release-6

2019-01-16 Thread Atin Mukherjee
On Tue, Jan 15, 2019 at 2:13 PM Atin Mukherjee 
wrote:

> Interesting. I’ll do a deep dive at it sometime this week.
>
> On Tue, 15 Jan 2019 at 14:05, Xavi Hernandez  wrote:
>
>> On Mon, Jan 14, 2019 at 11:08 AM Ashish Pandey 
>> wrote:
>>
>>>
>>> I downloaded logs of regression runs 1077 and 1073 and tried to
>>> investigate it.
>>> In both regression ec/bug-1236065.t is hanging on TEST 70  which is
>>> trying to get the online brick count
>>>
>>> I can see that in mount/bricks and glusterd logs it has not move forward
>>> after this test.
>>> glusterd.log  -
>>>
>>> [2019-01-06 16:27:51.346408]:++
>>> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 70 5 online_brick_count
>>> ++
>>> [2019-01-06 16:27:51.645014] I [MSGID: 106499]
>>> [glusterd-handler.c:4404:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume patchy
>>> [2019-01-06 16:27:51.646664] I [dict.c:2745:dict_get_str_boolean]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x4a6c3)
>>> [0x7f4c37fe06c3]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x43b3a)
>>> [0x7f4c37fd9b3a]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_str_boolean+0x170)
>>> [0x7f4c433d83fb] ) 0-dict: key nfs.disable, integer type asked, has string
>>> type [Invalid argument]
>>> [2019-01-06 16:27:51.647177] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick0.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647227] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick1.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647292] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick2.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647333] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick3.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647371] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick4.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647409] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/install/lib/libglusterfs.so.0(dict_get_strn+0x179)
>>> [0x7f4c433d7673] ) 0-dict: key brick5.rdma_port, string type asked, has
>>> integer type [Invalid argument]
>>> [2019-01-06 16:27:51.647447] I [dict.c:2361:dict_get_strn]
>>> (-->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0xffa32)
>>> [0x7f4c38095a32]
>>> -->/build/install/lib/glusterfs/6dev/xlator/mgmt/glusterd.so(+0x474ac)
>>> [0x7f4c37fdd4ac]
>>> -->/build/

Re: [Gluster-devel] Regression health for release-5.next and release-6

2019-01-15 Thread Atin Mukherjee
83767fbbb0, dict=dict@entry=0x0, 
>>> key=key@entry=0x7f83add06a28
>>> "trusted.ec.heal", xdata_in=xdata_in@entry=0x0,
>>> xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>> #2  0x7f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030960,
>>> child=, loc=0x7f83767fbbb0, full=) at
>>> ec-heald.c:161
>>> #3  0x7f83add0325b in ec_shd_full_heal (subvol=0x7f83a8010af0,
>>> entry=, parent=0x7f83767fbde0, data=0x7f83a8030960) at
>>> ec-heald.c:294
>>> #4  0x7f83bc930ac2 in syncop_ftw (subvol=0x7f83a8010af0,
>>> loc=loc@entry=0x7f83767fbde0, pid=pid@entry=-6, 
>>> data=data@entry=0x7f83a8030960,
>>> fn=fn@entry=0x7f83add03140 ) at syncop-utils.c:125
>>> #5  0x7f83add03534 in ec_shd_full_sweep 
>>> (healer=healer@entry=0x7f83a8030960,
>>> inode=) at ec-heald.c:311
>>> #6  0x7f83add0367b in ec_shd_full_healer (data=0x7f83a8030960) at
>>> ec-heald.c:372
>>> #7  0x7f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>> #8  0x7f83bafd634d in clone () from /usr/lib64/libc.so.6
>>> Thread 5 (Thread 0x7f8375ffb700 (LWP 2)):
>>> #0  0x7f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from
>>> /usr/lib64/libpthread.so.0
>>> #1  0x7f83bc910e5b in syncop_getxattr (subvol=,
>>> loc=loc@entry=0x7f8375ffabb0, dict=dict@entry=0x0, 
>>> key=key@entry=0x7f83add06a28
>>> "trusted.ec.heal", xdata_in=xdata_in@entry=0x0,
>>> xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>> #2  0x7f83add02f27 in ec_shd_selfheal (healer=0x7f83a80309d0,
>>> child=, loc=0x7f8375ffabb0, full=) at
>>> ec-heald.c:161
>>> #3  0x7f83add0325b in ec_shd_full_heal (subvol=0x7f83a80144d0,
>>> entry=, parent=0x7f8375ffade0, data=0x7f83a80309d0) at
>>> ec-heald.c:294
>>> #4  0x7f83bc930ac2 in syncop_ftw (subvol=0x7f83a80144d0,
>>> loc=loc@entry=0x7f8375ffade0, pid=pid@entry=-6, 
>>> data=data@entry=0x7f83a80309d0,
>>> fn=fn@entry=0x7f83add03140 ) at syncop-utils.c:125
>>> #5  0x7f83add03534 in ec_shd_full_sweep 
>>> (healer=healer@entry=0x7f83a80309d0,
>>> inode=) at ec-heald.c:311
>>> #6  0x7f83add0367b in ec_shd_full_healer (data=0x7f83a80309d0) at
>>> ec-heald.c:372
>>> #7  0x7f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>> #8  0x7f83bafd634d in clone () from /usr/lib64/libc.so.6
>>> Thread 4 (Thread 0x7f83757fa700 (LWP 25556)):
>>> #0  0x7f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from
>>> /usr/lib64/libpthread.so.0
>>> #1  0x7f83bc910e5b in syncop_getxattr (subvol=,
>>> loc=loc@entry=0x7f83757f9bb0, dict=dict@entry=0x0, 
>>> key=key@entry=0x7f83add06a28
>>> "trusted.ec.heal", xdata_in=xdata_in@entry=0x0,
>>> xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>> #2  0x7f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030a40,
>>> child=, loc=0x7f83757f9bb0, full=) at
>>> ec-heald.c:161
>>> #3  0x7f83add0325b in ec_shd_full_heal (subvol=0x7f83a8017eb0,
>>> entry=, parent=0x7f83757f9de0, data=0x7f83a8030a40) at
>>> ec-heald.c:294
>>> #4  0x7f83bc930ac2 in syncop_ftw (subvol=0x7f83a8017eb0,
>>> loc=loc@entry=0x7f83757f9de0, pid=pid@entry=-6, 
>>> data=data@entry=0x7f83a8030a40,
>>> fn=fn@entry=0x7f83add03140 ) at syncop-utils.c:125
>>> #5  0x7f83add03534 in ec_shd_full_sweep 
>>> (healer=healer@entry=0x7f83a8030a40,
>>> inode=) at ec-heald.c:311
>>> #6  0x7f83add0367b in ec_shd_full_healer (data=0x7f83a8030a40) at
>>> ec-heald.c:372
>>> #7  0x7f83bb709e25 in start_thread () from /usr/lib64/libpthread.so.0
>>> #8  0x7f83bafd634d in clone () from /usr/lib64/libc.so.6
>>> Thread 3 (Thread 0x7f8374ff9700 (LWP 25557)):
>>> #0  0x7f83bb70d945 in pthread_cond_wait@@GLIBC_2.3.2 () from
>>> /usr/lib64/libpthread.so.0
>>> #1  0x7f83bc910e5b in syncop_getxattr (subvol=,
>>> loc=loc@entry=0x7f8374ff8bb0, dict=dict@entry=0x0, 
>>> key=key@entry=0x7f83add06a28
>>> "trusted.ec.heal", xdata_in=xdata_in@entry=0x0,
>>> xdata_out=xdata_out@entry=0x0) at syncop.c:1680
>>> #2  0x7f83add02f27 in ec_shd_selfheal (healer=0x7f83a8030ab0,
>>> child=, loc=0x7f8374ff8bb0, full=) at
>>> ec-heald.c:161
>>> #3  0x7f83add0325b in ec_shd_full_heal (subvol=0x7f83a801b890,
>>> entry=, parent=0x7f8374ff8de0, data=0x7f83a8030ab0) at
>>> ec-heald.c:294
>>> #4  0x000

[Gluster-devel] GCS 0.5 release

2019-01-10 Thread Atin Mukherjee
Today, we are announcing the availability of GCS (Gluster Container
Storage) 0.5.

Highlights and updates since v0.4:


- GCS environment updated to kube 1.13
- CSI deployment moved to 1.0
- Integrated Anthill deployment
- Kube & etcd metrics added to prometheus
- Tuning of etcd to increase stability
- GD2 bug fixes from scale testing effort.


Included components:


- Glusterd2: https://github.com/gluster/glusterd2

- Gluster CSI driver: https://github.com/gluster/gluster-csi-driver

- Gluster-prometheus: https://github.com/gluster/gluster-prometheus

- Anthill - https://github.com/gluster/anthill/

- Gluster-Mixins - https://github.com/gluster/gluster-mixins/


For more details on the specific content of this release, please refer to [3].

If you are interested in contributing, please see [4] or contact the
gluster-devel mailing list. We’re always interested in any bugs that you
find, pull requests for new features and your feedback.

Regards,

Team GCS

[1] https://github.com/gluster/gcs/releases

[2] https://github.com/gluster/gcs/tree/master/deploy

[3] 
https://waffle.io/gluster/gcs?label=GCS%2F0.5 - search for ‘Done’ lane

[4] https://github.com/gluster/gcs 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Regression health for release-5.next and release-6

2019-01-10 Thread Atin Mukherjee
Mohit, Sanju - I request you to investigate the failures related to
glusterd and brick-mux and report back to the list.

On Thu, Jan 10, 2019 at 12:25 AM Shyam Ranganathan 
wrote:

> Hi,
>
> As part of branching preparation next week for release-6, please find
> test failures and respective test links here [1].
>
> The top tests that are failing/dumping-core are as below and need
> attention,
> - ec/bug-1236065.t
> - glusterd/add-brick-and-validate-replicated-volume-options.t
> - readdir-ahead/bug-1390050.t
> - glusterd/brick-mux-validation.t
> - bug-1432542-mpx-restart-crash.t
>
> Others of interest,
> - replicate/bug-1341650.t
>
> Please file a bug if needed against the test case and report the same
> here, in case a problem is already addressed, then do send back the
> patch details that addresses this issue as a response to this mail.
>
> Thanks,
> Shyam
>
> [1] Regression failures: https://hackmd.io/wsPgKjfJRWCP8ixHnYGqcA?view
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1060

2018-12-30 Thread Atin Mukherjee
tests/bugs/ec/bug-1236065.t is failing quite regularly in brick
multiplexing regression jobs. Request to get this fixed at the earliest.

-- Forwarded message -
From: 
Date: Mon, Dec 31, 2018 at 6:30 AM
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #1060
To: 


See <
https://build.gluster.org/job/regression-test-with-multiplex/1060/display/redirect
>

--
[...truncated 976.10 KB...]
./tests/basic/ec/ec-root-heal.t  -  9 second
./tests/basic/afr/arbiter-statfs.t  -  9 second
./tests/performance/open-behind.t  -  8 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  8 second
./tests/bugs/shard/bug-1468483.t  -  8 second
./tests/bugs/replicate/bug-1325792.t  -  8 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  8 second
./tests/bugs/quota/bug-1243798.t  -  8 second
./tests/bugs/posix/bug-1360679.t  -  8 second
./tests/bugs/md-cache/bug-1211863.t  -  8 second
./tests/bugs/fuse/bug-985074.t  -  8 second
./tests/bugs/fuse/bug-963678.t  -  8 second
./tests/bugs/ec/bug-1179050.t  -  8 second
./tests/bugs/distribute/bug-882278.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/changelog/bug-1208470.t  -  8 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  8
second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  8 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  8 second
./tests/basic/fop-sampling.t  -  8 second
./tests/basic/ec/ec-read-policy.t  -  8 second
./tests/basic/ec/ec-anonymous-fd.t  -  8 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  8 second
./tests/basic/afr/ta-shd.t  -  8 second
./tests/basic/afr/gfid-mismatch.t  -  8 second
./tests/gfid2path/block-mount-access.t  -  7 second
./tests/features/lock-migration/lkmigration-set-option.t  -  7 second
./tests/bugs/upcall/bug-1458127.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/quota/bug-1104692.t  -  7 second
./tests/bugs/nfs/bug-915280.t  -  7 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  7 second
./tests/bugs/io-stats/bug-1598548.t  -  7 second
./tests/bugs/io-cache/bug-858242.t  -  7 second
./tests/bugs/gfapi/bug-1630804/gfapi-bz1630804.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/core/bug-908146.t  -  7 second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/volume-status.t  -  7 second
./tests/basic/gfapi/upcall-cache-invalidate.t  -  7 second
./tests/basic/gfapi/glfs_xreaddirplus_r.t  -  7 second
./tests/basic/gfapi/gfapi-dup.t  -  7 second
./tests/basic/gfapi/anonymous_fd.t  -  7 second
./tests/basic/distribute/throttle-rebal.t  -  7 second
./tests/basic/distribute/file-create.t  -  7 second
./tests/basic/ctime/ctime-noatime.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/arbiter-remove-brick.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  6 second
./tests/features/flock_interrupt.t  -  6 second
./tests/bugs/upcall/bug-upcall-stat.t  -  6 second
./tests/bugs/upcall/bug-1369430.t  -  6 second
./tests/bugs/snapshot/bug-1064768.t  -  6 second
./tests/bugs/shard/bug-1342298.t  -  6 second
./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
./tests/bugs/replicate/bug-1561129-enospc.t  -  6 second
./tests/bugs/replicate/bug-1365455.t  -  6 second
./tests/bugs/replicate/bug-1101647.t  -  6 second
./tests/bugs/quota/bug-1287996.t  -  6 second
./tests/bugs/nfs/bug-877885.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
./tests/bugs/nfs/bug-1116503.t  -  6 second
./tests/bugs/io-cache/bug-read-hang.t  -  6 second
./tests/bugs/glusterfs/bug-902610.t  -  6 second
./tests/bugs/glusterfs/bug-861015-log.t  -  6 second
./tests/bugs/glusterfs/bug-848251.t  -  6 second
./tests/bugs/glusterd/bug-948729/bug-948729.t  -  6 second
./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  6 second
./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t  -  6 second
./tests/bugs/fuse/bug-1030208.t  -  6 second
./tests/bugs/distribute/bug-1368012.t  -  6 second
./tests/bugs/core/io-stats-1322825.t  -  6 second
./tests/bugs/core/bug-986429.t  -  6 second
./tests/bugs/core/bug-834465.t  -  6 second
./tests/bugs/cli/bug-982174.t  -  6 second
./tests/bugs/cli/bug-1022905.t  -  6 second
./tests/bugs/bug-1258069.t  -  6 second
./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t  -  6 second
./tests/bitrot/bug-1221914.t  -  6 second
./tests/basic/posix/zero-fill-enospace.t  -  6 second
./tests/basic/playground/template-xlator-sanity.t  -  6 second
./tests/basic/md-cache/bug-1317785.t  -  6 second
./tests/basic/gfapi/glfd-lkowner.t  -  6 second
./tests/basic/gfapi/bug-1241104.t  -  6 second
./tests/basic/ec/nfs.t  -  6 second
./tests/basic/ec

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4293

2018-12-30 Thread Atin Mukherjee
Can we please check the reason for these failures?

-- Forwarded message -
From: 
Date: Sat, 29 Dec 2018 at 23:48
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4293
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4293/display/redirect?page=changes
>

Changes:

[Amar Tumballi] mgmt/glusterd: fix clang warning

[Amar Tumballi] glusterd: NULL pointer dereferencing clang fix

--
[...truncated 978.49 KB...]
./tests/basic/afr/stale-file-lookup.t  -  9 second
./tests/basic/afr/afr-up.t  -  9 second
./tests/features/lock-migration/lkmigration-set-option.t  -  8 second
./tests/bugs/upcall/bug-1458127.t  -  8 second
./tests/bugs/snapshot/bug-1260848.t  -  8 second
./tests/bugs/snapshot/bug-1064768.t  -  8 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  8 second
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  8 second
./tests/bugs/replicate/bug-1325792.t  -  8 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  8 second
./tests/bugs/quota/bug-1243798.t  -  8 second
./tests/bugs/io-stats/bug-1598548.t  -  8 second
./tests/bugs/glusterfs-server/bug-904300.t  -  8 second
./tests/bugs/glusterfs/bug-861015-index.t  -  8 second
./tests/bugs/fuse/bug-985074.t  -  8 second
./tests/bugs/fuse/bug-963678.t  -  8 second
./tests/bugs/distribute/bug-961615.t  -  8 second
./tests/bugs/distribute/bug-882278.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  8 second
./tests/bitrot/br-stub.t  -  8 second
./tests/basic/gfapi/mandatory-lock-optimal.t  -  8 second
./tests/basic/fop-sampling.t  -  8 second
./tests/basic/ec/ec-anonymous-fd.t  -  8 second
./tests/basic/distribute/file-create.t  -  8 second
./tests/basic/afr/ta-shd.t  -  8 second
./tests/basic/afr/gfid-mismatch.t  -  8 second
./tests/basic/afr/arbiter-remove-brick.t  -  8 second
./tests/gfid2path/get-gfid-to-path.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-1365455.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/quota/bug-1104692.t  -  7 second
./tests/bugs/posix/bug-1034716.t  -  7 second
./tests/bugs/nfs/bug-915280.t  -  7 second
./tests/bugs/io-cache/bug-858242.t  -  7 second
./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
./tests/bugs/glusterd/bug-1242875-do-not-pass-volinfo-quota.t  -  7 second
./tests/bugs/gfapi/bug-1630804/gfapi-bz1630804.t  -  7 second
./tests/bugs/gfapi/bug-1447266/1460514.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/ec/bug-1179050.t  -  7 second
./tests/bugs/core/bug-986429.t  -  7 second
./tests/bugs/core/bug-908146.t  -  7 second
./tests/bugs/cli/bug-982174.t  -  7 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  7
second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/basic/volume-status.t  -  7 second
./tests/basic/gfapi/upcall-cache-invalidate.t  -  7 second
./tests/basic/gfapi/glfd-lkowner.t  -  7 second
./tests/basic/gfapi/gfapi-dup.t  -  7 second
./tests/basic/gfapi/bug-1241104.t  -  7 second
./tests/basic/ec/ec-read-policy.t  -  7 second
./tests/basic/distribute/throttle-rebal.t  -  7 second
./tests/basic/ctime/ctime-noatime.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/gfid2path/block-mount-access.t  -  6 second
./tests/features/readdir-ahead.t  -  6 second
./tests/features/flock_interrupt.t  -  6 second
./tests/bugs/upcall/bug-1369430.t  -  6 second
./tests/bugs/transport/bug-873367.t  -  6 second
./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
./tests/bugs/quota/bug-1287996.t  -  6 second
./tests/bugs/nfs/bug-847622.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
./tests/bugs/nfs/bug-1116503.t  -  6 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  6 second
./tests/bugs/md-cache/afr-stale-read.t  -  6 second
./tests/bugs/io-cache/bug-read-hang.t  -  6 second
./tests/bugs/glusterfs-server/bug-864222.t  -  6 second
./tests/bugs/glusterfs/bug-902610.t  -  6 second
./tests/bugs/glusterfs/bug-848251.t  -  6 second
./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  6 second
./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t  -  6 second
./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t
-  6 second
./tests/bugs/fuse/bug-1030208.t  -  6 second
./tests/bugs/distribute/bug-912564.t  -  6 second
./tests/bugs/distribute/bug-1368012.t  -  6 second
./tests/bugs/distribute/bug-1088231.t  -  6 second
./tests/bugs/core/bug-834465.t  -  6 second
./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  6 second
./tests/bugs/cli/bug-1022905.t  -  6 second
./tests/bugs/bug-1371806_1.t  -  6 second
./tests/bugs/bug-1258069.t  -  6 second
./tests/bugs/bitrot/bug-1229134-bitd-not-su

[Gluster-devel] Update on GCS 0.5 release

2018-12-24 Thread Atin Mukherjee
We've decided to delay the GCS 0.5 release by a few days (new date: 1st week
of Jan), considering that (a) most of the team members are out on holidays
and (b) some of the critical issues/PRs from [1] are yet to be addressed.

Regards,
GCS team

[1] https://waffle.io/gluster/gcs?label=GCS%2F0.5
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] GCS 0.4 release

2018-12-12 Thread Atin Mukherjee
Today, we are announcing the availability of GCS (Gluster Container
Storage) 0.4. The release was a bit delayed to address some of the critical
issues identified. This release brings in a good number of bug fixes along
with some key feature enhancements in GlusterD2. We’d request all of you to
try this out and provide feedback.

Highlights and updates since v0.3:

- Brick Multiplexing support in GD2

- Better distributed and resilient transaction engine in GD2

- replace brick, volume profile API support in GD2

- Fix for a critical memory leak in the net/rpc package of GD2

- Support of bytes as unit in volume size for volume create request in
csi-driver

- Self heal, snapshot brick count, volume profile metrics in
gluster-prometheus

- Initial SDK skeleton in anthill (gluster-operator)

Included components:

- Glusterd2: https://github.com/gluster/glusterd2

- Gluster CSI driver: https://github.com/gluster/gluster-csi-driver

- Gluster-prometheus: https://github.com/gluster/gluster-prometheus

- Anthill - https://github.com/gluster/anthill/

For more details on the specific content of this release please refer [3].

If you are interested in contributing, please see [4] or contact the
gluster-devel mailing list. We’re always interested in any bugs that you
find, pull requests for new features and your feedback.

Regards,

Team GCS

[1] https://github.com/gluster/gcs/releases

[2] https://github.com/gluster/gcs/tree/master/deploy

[3] https://waffle.io/gluster/gcs?label=GCS%2F0.4 - search for ‘Done’ lane

[4] https://github.com/gluster/gcs 
-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Shard test failing more commonly on master

2018-12-04 Thread Atin Mukherjee
 We can't afford to keep a bad test hanging for more than a day, since that
blocks other fixes from getting in (I see at least 4-5 more patches that
failed on the same test today). I thought we already had a rule to mark a
test as bad at the earliest in such situations. Not sure why we haven't done
that yet. In any case, I have marked this test as bad through
https://review.gluster.org/#/c/glusterfs/+/21800/ ; please review and merge.

On Tue, Dec 4, 2018 at 7:46 PM Shyam Ranganathan 
wrote:

> Test: ./tests/bugs/shard/zero-flag.t
>
> Runs:
>   - https://build.gluster.org/job/centos7-regression/3942/console
>   - https://build.gluster.org/job/centos7-regression/3941/console
>   - https://build.gluster.org/job/centos7-regression/3938/console
>
> Failures seem to occur at common points across the tests like so,
>
> 09:52:34 stat: missing operand
> 09:52:34 Try 'stat --help' for more information.
> 09:52:34 not ok 17 Got "" instead of "2097152", LINENUM:40
> 09:52:34 FAILED COMMAND: 2097152 echo
>
> 09:52:34 stat: cannot stat
> ‘/d/backends/patchy*/.shard/41fed5c6-636e-44d6-b6ed-068b941843cd.2’: No
> such file or directory
> 09:52:34 not ok 27 , LINENUM:64
> 09:52:34 FAILED COMMAND: stat
> /d/backends/patchy*/.shard/41fed5c6-636e-44d6-b6ed-068b941843cd.2
> 09:52:34 stat: missing operand
> 09:52:34 Try 'stat --help' for more information.
> 09:52:34 not ok 28 Got "" instead of "1048602", LINENUM:66
> 09:52:34 FAILED COMMAND: 1048602 echo
>
> Krutika, is this something you are already chasing down?
>
> Thanks,
> Shyam
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] GD2 & glusterfs smoke issue

2018-11-08 Thread Atin Mukherjee
On Thu, 8 Nov 2018 at 15:07, Yaniv Kaul  wrote:

>
>
> On Tue, Nov 6, 2018 at 11:34 AM Atin Mukherjee 
> wrote:
>
>> We have enabled GD2 smoke results as a mandatory vote in glusterfs smoke
>> since yesterday through BZ [1], however we just started seeing GD2 smoke
>> failing which means glusterfs smoke on all the patches will not go through
>> at this moment.. GD2 dev is currently working on it and trying to rectify
>> the same. We'll keep you posted once it's resolved.
>>
>
> Can we enable them, but not make them a mandatory vote, for the time being?
>

Yes, we reverted the change a couple of hours ago, so the vote is no longer
mandatory. One of the snapshot tests is failing spuriously and Rafi is
looking into it. Once this failure is addressed, we will monitor this for
another two weeks to see if we have 100% success, and then the vote will be
made mandatory again.


Y.
>
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1645776
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
> --
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Fwd: New Defects reported by Coverity Scan for gluster/glusterfs

2018-11-06 Thread Atin Mukherjee
new defects introduced in posix xlator.

-- Forwarded message -
From: 
Date: Tue, Nov 6, 2018 at 8:44 PM
Subject: New Defects reported by Coverity Scan for gluster/glusterfs

Hi,

Please find the latest report on new defect(s) introduced to
gluster/glusterfs found with Coverity Scan.

2 new defect(s) introduced to gluster/glusterfs found with Coverity Scan.
1 defect(s), reported by Coverity Scan earlier, were marked fixed in the
recent build analyzed by Coverity Scan.

New defect(s) Reported-by: Coverity Scan
Showing 2 of 2 defect(s)


** CID 1396581:  Program hangs  (LOCK)
/xlators/features/locks/src/posix.c: 2952 in pl_metalk()



*** CID 1396581:  Program hangs  (LOCK)
/xlators/features/locks/src/posix.c: 2952 in pl_metalk()
2946 gf_msg(this->name, GF_LOG_WARNING, EINVAL, 0,
2947"More than one meta-lock can not be granted on"
2948"the inode");
2949 ret = -1;
2950 }
2951 }
>>> CID 1396581:  Program hangs  (LOCK)
>>> "pthread_mutex_lock" locks "pl_inode->mutex" while it is locked.
2952 pthread_mutex_lock(&pl_inode->mutex);
2953
2954 if (ret == -1) {
2955 goto out;
2956 }
2957
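
For readers less familiar with this defect class: Coverity is pointing at a
non-recursive mutex being locked again by a thread that already holds it,
which blocks forever. A minimal, self-contained sketch of the pattern is
below; the names are hypothetical and this is not the actual pl_metalk()
code or its fix.

#include <pthread.h>

/* Hypothetical illustration of the "Program hangs (LOCK)" defect class:
 * re-locking a non-recursive mutex already held by the same thread. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Buggy shape flagged by Coverity: the mutex is still held when we try
 * to take it again, so the second lock call never returns. */
static void buggy(void)
{
    pthread_mutex_lock(&lock);
    counter++;
    pthread_mutex_lock(&lock);   /* hangs: already held by this thread */
    counter++;
    pthread_mutex_unlock(&lock);
    pthread_mutex_unlock(&lock);
}

/* Safe shape: one critical section, one lock/unlock pair. */
static void fixed(void)
{
    pthread_mutex_lock(&lock);
    counter += 2;
    pthread_mutex_unlock(&lock);
}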

** CID 1396580:  Program hangs  (LOCK)
/libglusterfs/src/iobuf.c: 370 in iobuf_pool_new()



*** CID 1396580:  Program hangs  (LOCK)
/libglusterfs/src/iobuf.c: 370 in iobuf_pool_new()
364 index = gf_iobuf_get_arena_index(page_size);
365 if (index == -1) {
366 gf_msg("iobuf", GF_LOG_ERROR, 0,
LG_MSG_PAGE_SIZE_EXCEEDED,
367"page_size (%zu) of iobufs in arena being
added is "
368"greater than max available",
369page_size);
>>> CID 1396580:  Program hangs  (LOCK)
>>> Returning without unlocking "iobuf_pool->mutex".
370 return NULL;
371 }
372
373 __iobuf_pool_add_arena(iobuf_pool, page_size,
num_pages, index);
374
375 arena_size += page_size * num_pages;



To view the defects in Coverity Scan visit,
https://u2389337.ct.sendgrid.net/wf/click?upn=08onrYu34A-2BWcWUl-2F-2BfV0V05UPxvVjWch-2Bd2MGckcRZBK54bFWohdObZ6wlkeK264nDC24cnLwH4MTOSDXRjQcO27-2F6DmQXPB4g4Mz-2BEJJ0-3D_MGdSxOtVesORpvKsy8XkEUz8gK23WuwInCh-2FVRcDCRF-2Fj7GTDPoxcf-2B4XMSxuE0IKzpwOxTNSoVvntYrckk7boSm-2BfDbr6brN0Go6MddV5Ve8QVPSgbNWXWvZ7l1kLdk6GNEGmAZfIuiGDDWWN8sx8sq3vuMKn14pdt5Dv916hJB1YBm6se3B-2B0HzsV8OiB1EdmLAFIpB4junx8QKmFhR-2FUt5daF2sDOaZ1cRkYuqrw-3D

___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] GD2 & glusterfs smoke issue

2018-11-06 Thread Atin Mukherjee
We have enabled GD2 smoke results as a mandatory vote in glusterfs smoke
since yesterday through BZ [1]; however, we have just started seeing the GD2
smoke failing, which means glusterfs smoke on all patches will not go through
at the moment. The GD2 devs are currently working on it and trying to rectify
the same. We'll keep you posted once it's resolved.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1645776
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Whats latest on Glusto + GD2 integration?

2018-11-04 Thread Atin Mukherjee
Thank you Rahul for the report. This does help to keep community up to date
on the effort being put up here and understand where the things stand. Some
comments inline.

On Sun, Nov 4, 2018 at 8:01 PM Rahul Hinduja  wrote:

> Hello,
>
> Over past few weeks, few folks are engaged in integrating gd2 with
> existing glusto infrastructure/cases. This email is an attempt to provide
> the high level view of the work that's done so far and next.
>
>
> *Whats Done.*
>
>- Libraries incorporated / under review:
>   - Gluster Base Class and setup.py file required to read config file
>   and install all the packages
>   - Exception and lib-utils file required for all basic test cases
>   - Common rest methods(Post, Get,  Delete), to handle rest api’s
>   - Peer management libraries
>   - Basic Volume management libraries
>   - Basic Snapshot libraries
>   - Self-heal libraries
>   - Glusterd init
>   - Mount operations
>   - Device operations
>
> *Note:* I request you all to provide review comments on the libraries
> that are submitted. Over this week, Akarsha and Vaibhavi will try to get
> the review comments incorporated and to get these libraries to closure.
>
>- Where is the repo?
>
> [1] https://review.gluster.org/#/q/project:glusto-libs
>
>- Are we able to consume gd1 cases into gd2?
>   - We tried POC to run glusterd and snapshot test cases (one-by-one)
>   via modified automation and libraries. Following are the highlights:
>  - We were able to run 20 gd1 cases out of which 8 passed and 12
>  failed.
>  - We were able to run 11 snapshot cases out of which 7 passed
>  and 4 failed.
>   - Reason for failures:
>  - Because of different volume options with gd1/gd2
>
> Just to clarify here, we have an open GD2 issue
https://github.com/gluster/glusterd2/issues/739 which is being worked on
and that should help us to achieve this backward compatibility.

>
>-
>  - Due to different error or output format between gd1/gd2
>
>
We need to move towards parsing error codes rather than error messages. I'm
aware that such infra was missing with GD1/CLI, but now that GD2 offers
specific error codes, all command failures need to be checked through
error/ret codes in GD2. I believe the libraries/tests need to be modified
accordingly to handle both GD1- and GD2-based failures.
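
To make the point concrete, here is a minimal sketch of why a numeric error
code is sturdier to assert on than message text, whose wording may change
between releases. The structure, field names and the code value are
hypothetical, not the GD2 API; it is only an illustration of the idea.

#include <string.h>

/* Hypothetical reply shape: the response carries both a stable numeric
 * code and a free-form, human-readable message. */
struct cmd_reply {
    int         err_code;   /* documented, stable across releases */
    const char *err_msg;    /* wording may change at any time */
};

#define ERR_VOL_NOT_FOUND 2001   /* hypothetical code value */

/* Fragile: breaks the moment the message wording changes. */
static int is_vol_missing_by_msg(const struct cmd_reply *r)
{
    return strstr(r->err_msg, "does not exist") != NULL;
}

/* Robust: compares against a stable code, independent of wording. */
static int is_vol_missing_by_code(const struct cmd_reply *r)
{
    return r->err_code == ERR_VOL_NOT_FOUND;
}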


>- For more detail which test cases is passed or failed and reasons for
>  the failures [2]
> - [2]
> 
> https://docs.google.com/spreadsheets/d/1O9JXQ2IgRIg5uZjCacybk3BMIjMmMeZsiv3-x_RTHWg/edit?usp=sharing
>
>
> *What's next?*
>
>- We have identified few gaps when we triggered glusterd and snapshot
>cases. Details in column C of  [2]. We are in the process of closing those
>gaps so that we don't have to hard-code or skip any functions in the test
>cases.
>- Develop additional/Modify existing libraries for the cases which got
>skipped.
>- Need to check on the volume options and error message or output
>format. This is being brought up in gd2 standup to freeze on the parity and
>rework at functional code level or automation code level.
>- I am aiming to provide the bi-weekly report on this integration work
>to the mailing list
>
>
>-
>
> *For more information/collaboration, please reach-out to:*
>
>- Shrivaibavi Raghaventhiran (sragh...@redhat.com)
>- Akarsha Rai (ak...@redhat.com)
>- Rahul Hinduja (rhind...@redhat.com)
>
> Regards,
> Rahul Hinduja
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Thin Arbiter Volume : Ready to Use/Trial

2018-10-29 Thread Atin Mukherjee
While the ground-up work is ready, we need to make sure that (a) thin arbiter
can be configured in a GCS environment and (b) a thin-arbiter volume can be
provisioned through the CSI driver.

On Mon, 29 Oct 2018 at 16:56, Ashish Pandey  wrote:

>
> Hi,
>
> We have completed and merged the patch for write transaction for
> thin-arbiter volume.
> With this, thin-arbiter volume are ready for trial. We have done some auto
> and manual testing for this volume type and it is working fine.
> A user document to setup thin-arbiter can be found here [1]. Creation and
> management of thin-arbiter volume can only be done using GD2.
>
> [1]
> https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/
>
> Please share you feedback/comments/ideas on this to improve it further.
>
> ---
> Ashish
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Update on GCS 0.2 release

2018-10-29 Thread Atin Mukherjee
The GCS 0.2 release is a bit delayed, and we expect to have it out by this
week. The primary reason is one of the issues filed under GCS, which was
highlighted as a critical issue at [1]. The team is actively working on this
issue to understand whether it has something to do with etcd-operator.
Once we figure out the problem and a possible workaround, we'll cut a
0.2 release.

[1]
https://lists.gluster.org/pipermail/gluster-devel/2018-October/055609.html
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Gluster Weekly Report : Static Analyser

2018-10-26 Thread Atin Mukherjee
On Fri, 26 Oct 2018 at 21:17, Sunny Kumar  wrote:

> Hello folks,
>
> The current status of static analyser is below:
>
> Coverity scan status:
> Last week we started from 145 and now it's 135 (26th Oct scan) and 3
> new defects got introduced. We fixed all 3 of them.
> Major contributors - Sunny (3 patches containing 6 fixes) and
> Bhumika(1 patch containing 2 fixes).


Can you pass me the complete list of contributors for this week please?


>
> Clang-scan status:
> Last week we started 92 and today its 90 (build #496).
> Contributors- Harpreet, Sheetal sent 1 patch each.
>
> If you want to contribute in fixing coverity and clang-scan fixes
> please follow these instruction:
> * for coverity scan fixes:
> https://lists.gluster.org/pipermail/gluster-devel/2018-August/055155.html
>  * for clang-scan:
> https://lists.gluster.org/pipermail/gluster-devel/2018-August/055338.html
>
>
> Regards,
> Sunny kumar
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Fwd: New Defects reported by Coverity Scan for gluster/glusterfs

2018-10-12 Thread Atin Mukherjee
Write behind related changes introduced new defects.

-- Forwarded message -
From: 
Date: Fri, 12 Oct 2018 at 20:43
Subject: New Defects reported by Coverity Scan for gluster/glusterfs
To: 


Hi,

Please find the latest report on new defect(s) introduced to
gluster/glusterfs found with Coverity Scan.

2 new defect(s) introduced to gluster/glusterfs found with Coverity Scan.
3 defect(s), reported by Coverity Scan earlier, were marked fixed in the
recent build analyzed by Coverity Scan.

New defect(s) Reported-by: Coverity Scan
Showing 2 of 2 defect(s)


** CID 1396102:  Null pointer dereferences  (NULL_RETURNS)
/xlators/performance/write-behind/src/write-behind.c: 2474 in
wb_mark_readdirp_start()



*** CID 1396102:  Null pointer dereferences  (NULL_RETURNS)
/xlators/performance/write-behind/src/write-behind.c: 2474 in
wb_mark_readdirp_start()
2468 wb_mark_readdirp_start(xlator_t *this, inode_t *directory)
2469 {
2470 wb_inode_t *wb_directory_inode = NULL;
2471
2472 wb_directory_inode = wb_inode_create(this, directory);
2473
>>> CID 1396102:  Null pointer dereferences  (NULL_RETURNS)
>>> Dereferencing a null pointer "wb_directory_inode".
2474 if (!wb_directory_inode->lock.spinlock)
2475 return;
2476
2477 LOCK(&wb_directory_inode->lock);
2478 {
2479 GF_ATOMIC_INC(wb_directory_inode->readdirps);

** CID 1396101:  Null pointer dereferences  (NULL_RETURNS)
/xlators/performance/write-behind/src/write-behind.c: 2494 in
wb_mark_readdirp_end()



*** CID 1396101:  Null pointer dereferences  (NULL_RETURNS)
/xlators/performance/write-behind/src/write-behind.c: 2494 in
wb_mark_readdirp_end()
2488 {
2489 wb_inode_t *wb_directory_inode = NULL, *wb_inode = NULL, *tmp
= NULL;
2490 int readdirps = 0;
2491
2492 wb_directory_inode = wb_inode_ctx_get(this, directory);
2493
>>> CID 1396101:  Null pointer dereferences  (NULL_RETURNS)
>>> Dereferencing a null pointer "wb_directory_inode".
2494 if (!wb_directory_inode->lock.spinlock)
2495 return;
2496
2497 LOCK(&wb_directory_inode->lock);
2498 {
2499 readdirps = GF_ATOMIC_DEC(wb_directory_inode->readdirps);



To view the defects in Coverity Scan visit,
https://u2389337.ct.sendgrid.net/wf/click?upn=08onrYu34A-2BWcWUl-2F-2BfV0V05UPxvVjWch-2Bd2MGckcRZBK54bFWohdObZ6wlkeK264nDC24cnLwH4MTOSDXRjQcO27-2F6DmQXPB4g4Mz-2BEJJ0-3D_MGdSxOtVesORpvKsy8XkEUz8gK23WuwInCh-2FVRcDCRGI3dzUd2Ukeqo7jOkDVtDwdofsVY7aGvZQg7zRE31MpIpZfuKb72GMUDqgUubcYrIu5oXcyupFTk-2BbhUXFdLHUSfe4AbOPNeG8BbDwGUW1v07zqQu8VKIaMFyP-2BoYbiYsfmt7-2FPg8uG5gutfCHZL61I0rptYdI3rhGJ6h55uDbGL4twf-2Fi-2F-2FuWXuVz4tE-2BiLw-3D


-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Missing option documentation (need inputs)

2018-10-10 Thread Atin Mukherjee
On Wed, 10 Oct 2018 at 20:30, Shyam Ranganathan  wrote:

> The following options were added post 4.1 and are part of 5.0 as the
> first release for the same. They were added in as part of bugs, and
> hence looking at github issues to track them as enhancements did not
> catch the same.
>
> We need to document it in the release notes (and also the gluster doc.
> site ideally), and hence I would like a some details on what to write
> for the same (or release notes commits) for them.
>
> Option: cluster.daemon-log-level
> Attention: @atin
> Review: https://review.gluster.org/c/glusterfs/+/20442


This option is meant to be used only on an extreme-need basis, which is why
it has been marked as GLOBAL_NO_DOC. So ideally it shouldn't be
documented.

Do we still want to capture it in the release notes?


>
> Option: ctime-invalidation
> Attention: @Du
> Review: https://review.gluster.org/c/glusterfs/+/20286
>
> Option: shard-lru-limit
> Attention: @krutika
> Review: https://review.gluster.org/c/glusterfs/+/20544
>
> Option: shard-deletion-rate
> Attention: @krutika
> Review: https://review.gluster.org/c/glusterfs/+/19970
>
> Please send in the required text ASAP, as we are almost towards the end
> of the release.
>
> Thanks,
> Shyam
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Nightly build status (week of 01 - 07 Oct, 2018)

2018-10-09 Thread Atin Mukherjee
On Wed, Oct 10, 2018 at 4:20 AM Shyam Ranganathan 
wrote:

> We have a set of 4 cores which seem to originate from 2 bugs as filed
> and referenced below.
>
> Bug 1: https://bugzilla.redhat.com/show_bug.cgi?id=1636570
> Cleanup sequence issues in posix xlator. Mohit/Xavi/Du/Pranith are we
> handling this as a part of addressing cleanup in brick mux, or should
> we? Instead of piece meal fixes?
>
> Bug 2: https://bugzilla.redhat.com/show_bug.cgi?id=1637743
> Initial analysis seems to point to glusterd starting the same brick
> instance twice (non-mux case). Request GlusterD folks to take a look.
>

Sanju or I will take a look at it and get back here at the earliest.
Currently we have some high-priority tasks on our plate to complete, so
please expect a bit of delay on this. From the initial scan, it seems to
match one of the issues reported through
https://bugzilla.redhat.com/show_bug.cgi?id=1574298 .


> 1) Release-5
>
> Link: ttps://build.gluster.org/job/nightly-release-5/
>
> Failures:
> a)
>
> https://build.gluster.org/job/regression-test-with-multiplex/886/consoleText
>   - Bug and RCA: https://bugzilla.redhat.com/show_bug.cgi?id=1636570
>
> 2) Master
>
> Link: https://build.gluster.org/job/nightly-master/
>
> Failures:
> a) Failed job line-coverage:
> https://build.gluster.org/job/line-coverage/530/consoleText
>   - Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1637743 (initial
> analysis)
>   - Core generated
>   - Test:
>
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>
> b) Failed job regression:
> https://build.gluster.org/job/regression-test-burn-in/4127/consoleText
>   - Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1637743 (initial
> analysis) (same as 2.a)
>   - Core generated
>   - Test: ./tests/bugs/glusterd/quorum-validation.t
>
> c) Failed job regression-with-mux:
>
> https://build.gluster.org/job/regression-test-with-multiplex/889/consoleText
>   - Bug and RCA: https://bugzilla.redhat.com/show_bug.cgi?id=1636570
> (same as 1.a)
>   - Core generated
>   - Test: ./tests/basic/ec/ec-5-2.t
>
> NOTE: All night-lies failed in distributed-regression tests as well, but
> as these are not yet stable not calling these out.
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] GCS 0.1 release!

2018-10-09 Thread Atin Mukherjee
== Overview

Today, we are announcing the availability of GCS (Gluster Container
Storage) 0.1. This initial release is designed to provide a platform for
community members to try out and provide feedback on the new Gluster
container storage stack. This new stack is a collaboration across a number
of repositories, currently including the main GCS repository [1], core
glusterfs [2], glusterd2 [3], and gluster-csi-driver [4].

== Getting started

The GCS repository provides a VM-based (Vagrant) environment that makes it
easy to install and take GCS for a test-drive. See
https://github.com/gluster/gcs/tree/master/deploy#local-cluster-using-vagrant
for a set of instructions to bring up a multi-node cluster with GCS
installed. The Ansible-based deploy scripts create a Kubernetes cluster
using kubespray, then deploy the GCS components. These playbooks can also
be used to bring up GCS on other Kubernetes clusters as well.

== Current features

This is the initial release of the GCS stack. It allows dynamic
provisioning of persistent volumes using the CSI interface. Supported
features include:

   - 1x3 (3-way replicated) volumes

   - GCS should be able to recover from restarts of any individual
     GCS-related pod. Since this is the initial version, bugs or feedback on
     improvements will be appreciated in the form of GitHub issues in the
     respective repos.


== Next steps

   - Adding e2e testing for nightly validation of the entire system

   - Adding gluster-prometheus for metrics; the work under this can be
     tracked at the gluster-prometheus repo [5]

   - Starting work on an operator to deploy and manage the stack through
     anthill [6]

   - Bi-weekly updates to the community on the progress made on GCS.


== GCS project management

   - GCS and the other associated repos are coordinated via waffle.io for
     planning and tracking deliverables over sprints.

   - Cross-repo coordination of milestones and sprints will be tracked
     through a common set of labels, prefixed with “GCS/”. For example, we
     already have labels defined for major milestones like ‘GCS/alpha1’ and
     ‘GCS/beta0’. Additional labels like 'GCS/0.2'/'GCS/0.3'/... will be
     created for each sprint/release so that the respective teams can tag
     planned deliverables in a common way.


== Collaboration opportunities

   - Improving install experience

   - Helping w/ E2E testing framework

   - Testing and opening bug reports


== Relationship to Heketi and glusterd (the legacy stack)

While GCS is shaping the future stack for Gluster in containers, the
traditional method for deploying container-based storage with Gluster (and
current GlusterD) and Heketi is still available, and it remains the
preferred method for production usage. To find out more about Heketi and
this production-ready stack, visit the gluster-kubernetes repo [7].

Regards,

Team GCS

[1] https://github.com/gluster/gcs

[2] https://github.com/gluster/glusterfs

[3] https://github.com/gluster/glusterd2

[4] https://github.com/gluster/gluster-csi-driver/

[5] https://github.com/gluster/gluster-prometheus

[6] https://github.com/gluster/anthill

[7] https://github.com/gluster/gluster-kubernetes
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Branched and further dates

2018-10-05 Thread Atin Mukherjee
On Fri, 5 Oct 2018 at 20:29, Shyam Ranganathan  wrote:

> On 10/04/2018 11:33 AM, Shyam Ranganathan wrote:
> > On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> >> RC1 would be around 24th of Sep. with final release tagging around 1st
> >> of Oct.
> >
> > RC1 now stands to be tagged tomorrow, and patches that are being
> > targeted for a back port include,
>
> We still are awaiting release notes (other than the bugs section) to be
> closed.
>
> There is one new bug that needs attention from the replicate team.
> https://bugzilla.redhat.com/show_bug.cgi?id=1636502
>
> The above looks important to me to be fixed before the release, @ravi or
> @pranith can you take a look?
>
> >
> > 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> > mux cases)
> >
> > @RaBhat working on this.
>
> Done
>
> >
> > 2) Py3 corrections in master
> >
> > @Kotresh are all changes made to master backported to release-5 (may not
> > be merged, but looking at if they are backported and ready for merge)?
>
> Done, release notes amend pending
>
> >
> > 3) Release notes review and updates with GD2 content pending
> >
> > @Kaushal/GD2 team can we get the updates as required?
> > https://review.gluster.org/c/glusterfs/+/21303
>
> Still awaiting this.


Kaushal added a comment to the patch providing the content this morning IST.
Are there any additional details you are looking for?


>
> >
> > 4) This bug [2] was filed when we released 4.0.
> >
> > The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> > missing and hence post-upgrade clients failing the mount). This is
> > possibly the last chance to fix it.
> >
> > Glusterd and protocol maintainers, can you chime in, if this bug needs
> > to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>
> Release notes to be corrected to call this out.
>
> >
> > The tracker bug [1] does not have any other blockers against it, hence
> > assuming we are not tracking/waiting on anything other than the set
> above.
> >
> > Thanks,
> > Shyam
> >
> > [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> > [2] Potential upgrade bug:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1540659
> > ___
> > maintainers mailing list
> > maintain...@gluster.org
> > https://lists.gluster.org/mailman/listinfo/maintainers
> >
> ___
> maintainers mailing list
> maintain...@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Atin Mukherjee
Deepshika,

Please keep us posted on if you see the particular glusterd test failing
again.  It’ll be great to see this nightly job green sooner than later :-) .

On Thu, 4 Oct 2018 at 15:07, Deepshikha Khandelwal 
wrote:

> On Thu, Oct 4, 2018 at 6:10 AM Sanju Rakonde  wrote:
> >
> >
> >
> > On Wed, Oct 3, 2018 at 3:26 PM Deepshikha Khandelwal <
> dkhan...@redhat.com> wrote:
> >>
> >> Hello folks,
> >>
> >> Distributed-regression job[1] is now a part of Gluster's
> >> nightly-master build pipeline. The following are the issues we have
> >> resolved since we started working on this:
> >>
> >> 1) Collecting gluster logs from servers.
> >> 2) Tests failed due to infra-related issues have been fixed.
> >> 3) Time taken to run regression testing reduced to ~50-60 minutes.
> >>
> >> To get time down to 40 minutes needs your help!
> >>
> >> Currently, there is a test that is failing:
> >>
> >> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
> >>
> >> This needs fixing first.
> >
> >
> > Where can I get the logs of this test case? In
> https://build.gluster.org/job/distributed-regression/264/console I see
> this test case is failed and re-attempted. But I couldn't find logs.
> There's a link in the end of console output where you can look for the
> logs of failed tests.
> We had a bug in the setup and the logs were not getting saved. We've
> fixed this and future jobs should have the logs at the log collector's
> link show up in the console output.
>
> >>
> >>
> >> There's a test that takes 14 minutes to complete -
> >> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
> >> 14 minutes is not something we can distribute. Can we look at how we
> >> can speed this up[2]? When this test fails, it is re-attempted,
> >> further increasing the time. This happens in the regular
> >> centos7-regression job as well.
> >>
> >> If you see any other issues, please file a bug[3].
> >>
> >> [1]: https://build.gluster.org/job/distributed-regression
> >> [2]: https://build.gluster.org/job/distributed-regression/264/console
> >> [3]:
> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs&component=project-infrastructure
> >>
> >> Thanks,
> >> Deepshikha Khandelwal
> >> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
> wrote:
> >> >>
> >> >>
> >> >>
> >> >>> There are currently a few known issues:
> >> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
> >> >>
> >> >>
> >> >> If I look at the activities involved with regression failures, this
> can wait.
> >> >
> >> >
> >> > Well, we can't debug the current failures without having the logs. So
> this has to be fixed first.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * A few tests fail due to infra-related issues like geo-rep tests.
> >> >>
> >> >>
> >> >> Please open bugs for this, so we can track them, and take it to
> closure.
> >> >
> >> >
> >> > These are failing due to infra reasons. Most likely subtle
> differences in the setup of these nodes vs our normal nodes. We'll only be
> able to debug them once we get the logs. I know the geo-rep ones are easy
> to fix. The playbook for setting up geo-rep correctly just didn't make it
> over to the playbook used for these images.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * Takes ~80 minutes with 7 distributed servers (targetting 60
> minutes)
> >> >>
> >> >>
> >> >> Time can change with more tests added, and also please plan to have
> number of server as 1 to n.
> >> >
> >> >
> >> > While the n is configurable, however it will be fixed to a single
> digit number for now. We will need to place *some* limitation somewhere or
> else we'll end up not being able to control our cloud bills.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * We've only tested plain regressions. ASAN and Valgrind are
> currently untested.
> >> >>
> >> >>
> >> >> Great to have it running not 'per patch', but as nightly, or weekly
> to start with.
> >> >
> >> >
> >> > This is currently not targeted until we phase out current regressions.
> >> >
> >> >>>
> >> >>>
> >> >>> Before bringing it into production, we'll run this job nightly and
> >> >>> watch it for a month to debug the other failures.
> >> >>>
> >> >>
> >> >> I would say, bring it to production sooner, say 2 weeks, and also
> plan to have the current regression as is with a special command like 'run
> regression in-one-machine' in gerrit (or something similar) with voting
> rights, so we can fall back to this method if something is broken in
> parallel testing.
> >> >>
> >> >> I have seen that regardless of amount of time we put some scripts in
> testing, the day we move to production, some thing would be broken. So, let
> that happen earlier than later, so it would help next release branching
> out. Don't want to be stuck for branching due to infra failures.
> >> >
> >> >
> >> > Having two regression jobs that can vote is going to cause more
> confusion than it's worth. There 

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Atin Mukherjee
On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
wrote:

> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> > RC1 would be around 24th of Sep. with final release tagging around 1st
> > of Oct.
>
> RC1 now stands to be tagged tomorrow, and patches that are being
> targeted for a back port include,
>
> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> mux cases)
>
> @RaBhat working on this.
>
> 2) Py3 corrections in master
>
> @Kotresh are all changes made to master backported to release-5 (may not
> be merged, but looking at if they are backported and ready for merge)?
>
> 3) Release notes review and updates with GD2 content pending
>
> @Kaushal/GD2 team can we get the updates as required?
> https://review.gluster.org/c/glusterfs/+/21303
>
> 4) This bug [2] was filed when we released 4.0.
>
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
>
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>

This is a bad bug to live with. OTOH, I do not have an immediate solution in
mind for how to make sure that these options, when reintroduced, are made
no-ops and in particular are disallowed from being tuned (without dirty
option-check hacks in the volume-set staging code). If we're to tag RC1
tomorrow, I wouldn't be able to take the risk of committing this change.

Can we instead have a note in our upgrade guide documenting that if you're
upgrading to 4.1 or a higher version, you should disable these options
before the upgrade to mitigate this?


> The tracker bug [1] does not have any other blockers against it, hence
> assuming we are not tracking/waiting on anything other than the set above.
>
> Thanks,
> Shyam
>
> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> [2] Potential upgrade bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Status update : Brick Mux threads reduction

2018-10-03 Thread Atin Mukherjee
I have rebased [1] and triggered brick-mux regression as we fixed one
genuine snapshot test failure in brick mux through
https://review.gluster.org/#/c/glusterfs/+/21314/ which got merged today.

On Thu, Oct 4, 2018 at 10:39 AM Poornima Gurusiddaiah 
wrote:

> Hi,
>
> For each brick, we create atleast 20+ threads, hence in a brick mux use
> case, where we load multiple bricks in the same process, there will 100s of
> threads resulting in perf issues, memory usage increase.
>
> IO-threads :  Make it global, to the process, and ref count the resource.
> patch [1], has failures in brick mux regression, likey not related to the
> patch, need to get it passed.
>
> Posix- threads : Janitor, Helper, Fsyncer, instead of using one thread per
> task, use synctask framework instead. In the future use thread pool in
> patch [2]. Patches are posted[1], fixing some regression failures.
>
> Posix, bitrot aio-thread : This thread cannot be replaced to just use
> synctask/thread pool as there cannot be a delay in recieving notifications
> and acting on it. Hence, create a global aio event receiver thread for the
> process. This is WIP and is not yet posted upstream.
>
> Threads in changelog/bitrot xlator Mohit posted a patch where default
> xlator does not need to start a thread if xlator is not enabled
> https://review.gluster.org/#/c/glusterfs/+/21304/ (it can save 6 thread
> per brick in default option)
>
> Pending: Create a build of these patches, run perf tests with these
> patches and analyze the same.
>
>
> [1] https://review.gluster.org/#/c/glusterfs/+/20761/
> [2] https://review.gluster.org/#/c/glusterfs/+/20636/
>
> Regards,
> Poornima
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Proposal to change Gerrit -> Bugzilla updates

2018-09-11 Thread Atin Mukherjee
On Mon, Sep 10, 2018 at 7:09 PM Shyam Ranganathan 
wrote:

> On 09/10/2018 08:37 AM, Nigel Babu wrote:
> > Hello folks,
> >
> > We now have review.gluster.org  as an
> > external tracker on Bugzilla. Our current automation when there is a
> > bugzilla attached to a patch is as follows:
> >
> > 1. When a new patchset has "Fixes: bz#1234" or "Updates: bz#1234", we
> > will post a comment to the bug with a link to the patch and change the
> > status to POST. 2. When the patchset is merged, if the commit said
> > "Fixes", we move the status to MODIFIED.
> >
> > I'd like to propose the following improvements:
> > 1. Add the Gerrit URL as an external tracker to the bug.
>
> My assumption here is that for each patch that mentions a BZ, an
> additional tracker would be added to the tracker list, right?
>
> Further assumption (as I have not used trackers before) is that this
> would reduce noise as comments in the bug itself, right?
>
> In the past we have reduced noise by not commenting on the bug (or
> github issue) every time the patch changes, so we get 2 comments per
> patch currently, with the above change we would just get one and that
> too as a terse external reference (see [1], based on my
> test/understanding).
>
> What we would lose is the commit details when the patch is merged in the
> BZ, as far as I can tell based on the changes below. These are useful
> and would like these to be retained in case they are not.
>

The commit message at the bugzilla has been extremely helpful; in fact, I
could refer to the commit details to understand what was fixed for the bug
when r.g.o was down in a couple of instances. So my vote would be to stick
with the same.


> > 2. When a patch is merged, only change state of the bug if needed. If
> > there is no state change, do not add an additional message. The external
> > tracker state should change reflecting the state of the review.
>
> I added a tracker to this bug [1], but not seeing the tracker state
> correctly reflected in BZ, is this work that needs to be done?
>
> > 3. Assign the bug to the committer. This has edge cases, but it's best
> > to at least handle the easy ones and then figure out edge cases later.
> > The experience is going to be better than what it is right now.
>

Assign the bug to the committer - when? Is it when the first patch set is
posted, or when the patch(es) are merged and the bug is moved to MODIFIED?


> Is the above a reference to just the "assigned to", or overall process?
> If overall can you elaborate a little more on why this would be better
> (I am not saying it is not, attempting to understand how you see it).
>
> >
> > Please provide feedback/comments by end of day Friday. I plan to add
> > this activity to the next Infra team sprint that starts on Monday (Sep
> 17).
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1619423
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] glusterd.log file - few observations

2018-09-09 Thread Atin Mukherjee
As highlighted in the last maintainers meeting, I'm seeing some log entries
in the glusterd log file which are (a) informative in one way, but can cause
excessive logging and potentially run a user out of space, and (b) in some
cases not really errors, or avoidable altogether.
Even though these logs are captured in the glusterd.log file, they don't
originate from glusterd APIs. So I request all devs to go through the list
below. I will eventually convert this to a BZ to track it better, so this is
a heads-up for now.

I believe that, as a practice when working on code changes, we need to look
at balancing out the log entries, i.e. (a) don't log everything and (b) log
only meaningful things.
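
To make point (b) concrete before the examples, here is a minimal,
self-contained sketch of gating chatty per-event messages behind a debug
level while keeping genuine failures at a higher severity. The logger here
is hypothetical and is not the gf_msg() API; the message strings are only
stand-ins for the entries listed below.

#include <stdio.h>

enum log_level { LOG_ERROR = 0, LOG_INFO, LOG_DEBUG };

/* Hypothetical logger: messages above the configured verbosity are
 * dropped, so per-request chatter only shows up when debugging is
 * explicitly requested. */
static enum log_level configured_level = LOG_INFO;

static void log_msg(enum log_level level, const char *msg)
{
    if (level > configured_level)
        return;
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    /* Per-operation chatter: demoted to DEBUG, invisible by default. */
    log_msg(LOG_DEBUG, "ran hook script S29CTDBsetup.sh");

    /* Genuine failure: always worth recording. */
    log_msg(LOG_ERROR, "failed to dispatch epoll handler");
    return 0;
}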

===
[2018-09-10 03:55:19.236387] I [dict.c:2838:dict_get_str_boolean]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_reconnect+0xc2) [0x7ff7a83d0452]
-->/usr/local/lib/glusterfs/4.2dev/rpc-transport/socket.so(+0x65b0)
[0x7ff7a06cf5b0]
-->/usr/local/lib/libglusterfs.so.0(dict_get_str_boolean+0xcf)
[0x7ff7a85fc58f] ) 0-dict: key transport.socket.ignore-enoent, integer type
asked, has string type [Invalid argument] <== *seen in volume start*

[2018-09-10 03:55:21.583508] I [run.c:241:runner_log]
(-->/usr/local/lib/glusterfs/4.2dev/xlator/mgmt/glusterd.so(+0xd166a)
[0x7ff7a34d766a]
-->/usr/local/lib/glusterfs/4.2dev/xlator/mgmt/glusterd.so(+0xd119c)
[0x7ff7a34d719c] -->/usr/local/lib/libglusterfs.so.0(runner_log+0x105)
[0x7ff7a8651805] ) 0-management: Ran script:
/var/lib/glusterd/hooks/1/start/post/S29CTDBsetup.sh --volname=test-vol1
--first=yes --version=1 --volume-op=start --gd-workdir=/var/lib/glusterd
<== *seen in volume start*

[2018-09-10 03:55:13.647675] E [MSGID: 101191]
[event-epoll.c:689:event_dispatch_epoll_worker] 0-epoll: Failed to dispatch
handler <== *seen in almost all the volume operations*


[2018-09-10 03:57:14.370861] E [MSGID: 106061]
[glusterd-utils.c:10584:glusterd_max_opversion_use_rsp_dict] 0-management:
Maximum supported op-version not set in destination dictionary
<=== *seen while running gluster v get all all*

[2018-09-10 03:58:24.307305] W [MSGID: 101095]
[xlator.c:181:xlator_volopt_dynload] 0-xlator:
/usr/local/lib/glusterfs/4.2dev/xlator/features/cloudsync.so: cannot open
shared object file: No such file or directory
[2018-09-10 03:58:24.307322] E [MSGID: 106434]
[glusterd-utils.c:13301:glusterd_get_value_for_vme_entry] 0-management:
xlator_volopt_dynload error (-1)
[2018-09-10 03:58:24.307336] W [MSGID: 101095]
[xlator.c:181:xlator_volopt_dynload] 0-xlator:
/usr/local/lib/glusterfs/4.2dev/xlator/features/cloudsync.so: cannot open
shared object file: No such file or directory
[2018-09-10 03:58:24.307340] E [MSGID: 106434]
[glusterd-utils.c:13301:glusterd_get_value_for_vme_entry] 0-management:
xlator_volopt_dynload error (-1)
[2018-09-10 03:58:24.307350] W [MSGID: 101095]
[xlator.c:181:xlator_volopt_dynload] 0-xlator:
/usr/local/lib/glusterfs/4.2dev/xlator/features/cloudsync.so: cannot open
shared object file: No such file or directory
[2018-09-10 03:58:24.307355] E [MSGID: 106434]
[glusterd-utils.c:13301:glusterd_get_value_for_vme_entry] 0-management:
xlator_volopt_dynload error (-1)
[2018-09-10 03:58:24.307365] W [MSGID: 101095]
[xlator.c:181:xlator_volopt_dynload] 0-xlator:
/usr/local/lib/glusterfs/4.2dev/xlator/features/cloudsync.so: cannot open
shared object file: No such file or directory
[2018-09-10 03:58:24.307369] E [MSGID: 106434]
[glusterd-utils.c:13301:glusterd_get_value_for_vme_entry] 0-management:
xlator_volopt_dynload error (-1)
[2018-09-10 03:58:24.307378] W [MSGID: 101095]
[xlator.c:181:xlator_volopt_dynload] 0-xlator:
/usr/local/lib/glusterfs/4.2dev/xlator/features/cloudsync.so: cannot open
shared object file: No such file or directory

<=== *seen while running gluster v get  all, cloudsync xlator
seems to be the culprit here*.


[2018-09-10 05:00:16.082968] I [MSGID: 101097]
[xlator.c:334:xlator_dynload_newway] 0-xlator: dlsym(xlator_api) on
/usr/local/lib/glusterfs/4.2dev/xlator/storage/posix.so: undefined symbol:
xlator_api. Fall back to old symbols
[2018-09-10 05:00:16.083388] I [MSGID: 101097]
[xlator.c:334:xlator_dynload_newway] 0-xlator: dlsym(xlator_api) on
/usr/local/lib/glusterfs/4.2dev/xlator/features/trash.so: undefined symbol:
xlator_api. Fall back to old symbols
[2018-09-10 05:00:16.084201] I [MSGID: 101097]
[xlator.c:334:xlator_dynload_newway] 0-xlator: dlsym(xlator_api) on
/usr/local/lib/glusterfs/4.2dev/xlator/features/changetimerecorder.so:
undefined symbol: xlator_api. Fall back to old symbols
[2018-09-10 05:00:16.084597] I [MSGID: 101097]
[xlator.c:334:xlator_dynload_newway] 0-xlator: dlsym(xlator_api) on
/usr/local/lib/glusterfs/4.2dev/xlator/features/changelog.so: undefined
symbol: xlator_api. Fall back to old symbols
[2018-09-10 05:00:16.084917] I [MSGID: 101097]
[xlator.c:334:xlator_dynload_newway] 0-xlator: dlsym(xlator_api) on
/usr/local/lib/glusterfs/4.2dev/xlator/features/bitrot-stub.so:

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4067

2018-08-17 Thread Atin Mukherjee
C7 nightly has a crash too.

-- Forwarded message -
From: 
Date: Sat, 18 Aug 2018 at 00:01
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4067
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4067/display/redirect?page=changes
>

Changes:

[Amar Tumballi] jbr : fix coverity issues in jbr

[Amar Tumballi] statedump : fix coverity issues

[Amar Tumballi] glusterd: coverity defects fix introduced by commit 1f3bfe7

[Amar Tumballi] features/acl: Fix a possible null dereference

[Amar Tumballi] meta : fix coverity in meta-helpers.c

[Amar Tumballi] nfs-server-mount : fix coverity issues in mount3.c

[Amar Tumballi] posix: FORWARD_NULL coverity fix

[Amar Tumballi] locks: FORWARD_NULL coverity fix

[Amar Tumballi] doc: Add details around xlator categories

[Amar Tumballi] features/changelog: Fix missing unlocks

--
[...truncated 981.94 KB...]
this = 0x7f6f2c007e50
priv = 0x7f6f2c0ab440
stub = 0x0
tmp = 0x0
list = {next = 0x7f6f0f7fde90, prev = 0x7f6f0f7fde90}
count = 0
do_fsync = true
__FUNCTION__ = "posix_fsyncer"
#3  0x7f6f3d561e25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x7f6f3cc26bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 10 (Thread 0x7f6f24233700 (LWP 28244)):
#0  0x7f6f3d565d42 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7f6f2a810737 in iot_worker (data=0x7f6f2c068f40) at <
https://build.gluster.org/job/regression-test-burn-in/ws/xlators/performance/io-threads/src/io-threads.c
>:195
conf = 0x7f6f2c068f40
this = 0x7f6f2c020020
stub = 0x0
sleep_till = {tv_sec = 1534527479, tv_nsec = 542650613}
ret = 0
pri = -1
bye = false
__FUNCTION__ = "iot_worker"
#2  0x7f6f3d561e25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7f6f3cc26bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 9 (Thread 0x7f6f3ea2a780 (LWP 28136)):
#0  0x7f6f3d562f97 in pthread_join () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x7f6f3e589159 in event_dispatch_epoll (event_pool=0x770c30) at <
https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/event-epoll.c
>:750
i = 1
t_id = 140115546580736
pollercount = 1
ret = 0
ev_data = 0x7bc6c0
thread_name = "epoll000\000\000"
__FUNCTION__ = "event_dispatch_epoll"
#2  0x7f6f3e546b4a in event_dispatch (event_pool=0x770c30) at <
https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/event.c
>:124
ret = -1
__FUNCTION__ = "event_dispatch"
#3  0x0040b539 in ?? ()
No symbol table info available.
#4  0x in ?? ()
No symbol table info available.

Thread 8 (Thread 0x7f6f35afe700 (LWP 28137)):
#0  0x7f6f3d568f3d in nanosleep () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x7f6f3e520ef4 in gf_timer_proc (data=0x7796f0) at <
https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/timer.c
>:202
now = 516473032615399
now_ts = {tv_sec = 516473, tv_nsec = 32615399}
reg = 0x7796f0
sleepts = {tv_sec = 1, tv_nsec = 0}
event = 0x779700
tmp = 0x779700
old_THIS = 0x7f6f3e81a2c0 
#2  0x7f6f3d561e25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7f6f3cc26bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 7 (Thread 0x7f6f24274700 (LWP 28202)):
#0  0x7f6f3d565d42 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7f6f2a810737 in iot_worker (data=0x7f6f2c068f40) at <
https://build.gluster.org/job/regression-test-burn-in/ws/xlators/performance/io-threads/src/io-threads.c
>:195
conf = 0x7f6f2c068f40
this = 0x7f6f2c020020
stub = 0x0
sleep_till = {tv_sec = 1534527479, tv_nsec = 542650613}
ret = 0
pri = -1
bye = false
__FUNCTION__ = "iot_worker"
#2  0x7f6f3d561e25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7f6f3cc26bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 6 (Thread 0x7f6f2647a700 (LWP 28150)):
#0  0x7f6f3cc1dc73 in select () from /lib64/libc.so.6
No symbol table info available.
#1  0x7f6f2b92aace in changelog_ev_dispatch (data=0x7f6f2c08c0a8) at <
https://build.gluster.org/job/regression-test-burn-in/ws/xlators/features/changelog/src/changelog-ev-handle.c
>:350
ret = 3
opaque = 0x0
this = 0x7f6f2c0116b0
c_clnt = 0x7f6f2c08c0a8
tv = {tv_sec = 0, tv_usec = 128007}
__FUNCTION__ 

[Gluster-devel] Fwd: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #831

2018-08-17 Thread Atin Mukherjee
This is the first nightly job failure since we reopened the master branch.
The crash seems to be from the fini() code path. Need investigation and RCA here.

-- Forwarded message -
From: 
Date: Fri, 17 Aug 2018 at 23:54
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #831
To: , 


See <
https://build.gluster.org/job/regression-test-with-multiplex/831/display/redirect?page=changes
>

Changes:

[Amar Tumballi] jbr : fix coverity issues in jbr

[Amar Tumballi] statedump : fix coverity issues

[Amar Tumballi] glusterd: coverity defects fix introduced by commit 1f3bfe7

[Amar Tumballi] features/acl: Fix a possible null dereference

[Amar Tumballi] meta : fix coverity in meta-helpers.c

[Amar Tumballi] nfs-server-mount : fix coverity issues in mount3.c

[Amar Tumballi] posix: FORWARD_NULL coverity fix

[Amar Tumballi] locks: FORWARD_NULL coverity fix

[Amar Tumballi] doc: Add details around xlator categories

[Amar Tumballi] features/changelog: Fix missing unlocks

--
[...truncated 1.01 MB...]
top = 0x0
victim = 0x0
trav_p = 0x0
count = 0
victim_found = false
ctx = 0x16cc010
__FUNCTION__ = "posix_health_check_thread_proc"
#3  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x7fba8d753bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 8 (Thread 0x7fba2b5f0700 (LWP 31062)):
#0  0x7fba8d71a56d in nanosleep () from /lib64/libc.so.6
No symbol table info available.
#1  0x7fba8d71a404 in sleep () from /lib64/libc.so.6
No symbol table info available.
#2  0x7fba816f2081 in posix_disk_space_check_thread_proc
(data=0x7fba505beff0) at <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/storage/posix/src/posix-helpers.c
>:2286
this = 0x7fba505beff0
priv = 0x7fba53fe9490
interval = 5
ret = 0
__FUNCTION__ = "posix_disk_space_check_thread_proc"
#3  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4  0x7fba8d753bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 7 (Thread 0x7fba703b3700 (LWP 31055)):
#0  0x7fba8e092d42 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7fba7b345737 in iot_worker (data=0x7fba53ba86b0) at <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/performance/io-threads/src/io-threads.c
>:195
conf = 0x7fba53ba86b0
this = 0x7fba53277620
stub = 0x0
sleep_till = {tv_sec = 1534518836, tv_nsec = 345143975}
ret = 0
pri = -1
bye = false
__FUNCTION__ = "iot_worker"
#2  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7fba8d753bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 6 (Thread 0x7fba704f5700 (LWP 31054)):
#0  0x7fba8e092995 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7fba7aae8308 in index_worker (data=0x7fba5327d8a0) at <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/index/src/index.c
>:218
priv = 0x7fba53b980f0
this = 0x7fba5327d8a0
stub = 0x0
bye = false
#2  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7fba8d753bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 5 (Thread 0x7fba1d4d5700 (LWP 30825)):
#0  0x7fba8e092995 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7fba8edf9da0 in rpcsvc_request_handler (arg=0x7fba7c046210) at <
https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-lib/src/rpcsvc.c
>:1983
program = 0x7fba7c046210
req = 0x0
actor = 0x0
done = false
ret = 0
__FUNCTION__ = "rpcsvc_request_handler"
#2  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7fba8d753bad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 4 (Thread 0x7fba1dcd6700 (LWP 30824)):
#0  0x7fba8e092995 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
No symbol table info available.
#1  0x7fba8edf9da0 in rpcsvc_request_handler (arg=0x7fba7c0465d0) at <
https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-lib/src/rpcsvc.c
>:1983
program = 0x7fba7c0465d0
req = 0x0
actor = 0x7fba7a449700 
done = false
ret = 0
__FUNCTION__ = "rpcsvc_request_handler"
#2  0x7fba8e08ee25 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x7fba8d753bad in clone () from /lib64/libc.so

Re: [Gluster-devel] Master branch is closed

2018-08-13 Thread Atin Mukherjee
Nigel,

Now that the master branch is reopened, can you please revoke the commit access
restrictions?

On Mon, 6 Aug 2018 at 09:12, Nigel Babu  wrote:

> Hello folks,
>
> Master branch is now closed. Only a few people have commit access now and
> it's to be exclusively used to merge fixes to make master stable again.
>
>
> --
> nigelb
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
- Atin (atinm)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] tests/basic/afr/sparse-file-self-heal.t - crash generated

2018-08-11 Thread Atin Mukherjee
https://build.gluster.org/job/regression-on-demand-multiplex/217/consoleFull

tests/basic/afr/sparse-file-self-heal.t crashed
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Out of regression builders

2018-08-11 Thread Atin Mukherjee
As both Shyam & I are running multiple flavours of manually triggered
regression jobs (lcov, centos-7, brick-mux) on top of
https://review.gluster.org/#/c/glusterfs/+/20637/ , we'd need to occupy
most of the builders.

I have currently run out of builders to trigger some of the runs and have
observed one of them occupied by a patch that doesn't come under the
stabilization bucket. While there's no harm in keeping your patch up to date
with the regressions, the most critical task right now is to get the upstream
regression suites back to green asap, so I have no choice but to kill such
jobs. However, I will add a note to the respective patches before killing
them.

Inconvenience regretted.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Fri, August 9th)

2018-08-10 Thread Atin Mukherjee
I saw the same behaviour for
https://build.gluster.org/job/regression-on-demand-full-run/47/consoleFull
as well. In both cases the common pattern is that a test was retried but the
job overall succeeded. Is this a bug which got introduced recently? At the
moment, this is blocking us from debugging any test which has been retried
while the job overall succeeded.

*01:54:20* Archiving artifacts
*01:54:21* ‘glusterfs-logs.tgz’ doesn’t match anything
*01:54:21* No artifacts found that match the file pattern "glusterfs-logs.tgz". Configuration error?
*01:54:21* Finished: SUCCESS
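
One plausible explanation, assuming the regression wrapper only produces the
log tarball when the overall run fails (this is an assumption about the job
setup, not something I have verified), is a post-run step along these lines:

    # Hypothetical sketch of the job wrapper; if the tarball is only created
    # on overall failure, a run that retried a test but finally passed leaves
    # nothing for the "Archiving artifacts" step to match.
    ./run-tests.sh ; RET=$?
    if [ $RET -ne 0 ]; then
        tar -czf "$WORKSPACE/glusterfs-logs.tgz" /var/log/glusterfs
    fi
    exit $RET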



On Sat, Aug 11, 2018 at 9:40 AM Ravishankar N 
wrote:

>
>
> On 08/11/2018 07:29 AM, Shyam Ranganathan wrote:
> > ./tests/bugs/replicate/bug-1408712.t (one retry)
> I'll take a look at this. But it looks like archiving the artifacts
> (logs) for this run
> (
> https://build.gluster.org/job/regression-on-demand-full-run/44/consoleFull)
>
> was a failure.
> Thanks,
> Ravi
> ___
> maintainers mailing list
> maintain...@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] tests/bugs/core/multiplex-limit-issue-151.t timed out

2018-08-10 Thread Atin Mukherjee
https://build.gluster.org/job/line-coverage/455/consoleFull

1 test failed:
tests/bugs/core/multiplex-limit-issue-151.t (timed out)

The last job https://build.gluster.org/job/line-coverage/454/consoleFull
took only 21 secs, so we're nowhere near breaching the timeout threshold.
Possibly a hang?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Thu, August 09th)

2018-08-10 Thread Atin Mukherjee
Pranith,

https://review.gluster.org/c/glusterfs/+/20685 seems to have caused
multiple failed runs of
https://review.gluster.org/c/glusterfs/+/20637/8 as per yesterday's report.
Did you get a chance to look at it?

On Fri, Aug 10, 2018 at 1:03 PM Pranith Kumar Karampuri 
wrote:

>
>
> On Fri, Aug 10, 2018 at 6:34 AM Shyam Ranganathan 
> wrote:
>
>> Today's test results are updated in the spreadsheet in sheet named "Run
>> patch set 8".
>>
>> I took in patch https://review.gluster.org/c/glusterfs/+/20685 which
>> caused quite a few failures, so not updating new failures as issue yet.
>>
>> Please look at the failures for tests that were retried and passed, as
>> the logs for the initial runs should be preserved from this run onward.
>>
>> Otherwise nothing else to report on the run status, if you are averse to
>> spreadsheets look at this comment in gerrit [1].
>>
>> Shyam
>>
>> [1] Patch set 8 run status:
>>
>> https://review.gluster.org/c/glusterfs/+/20637/8#message-54de30fa384fd02b0426d9db6d07fad4eeefcf08
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > Deserves a new beginning, threads on the other mail have gone deep
>> enough.
>> >
>> > NOTE: (5) below needs your attention, rest is just process and data on
>> > how to find failures.
>> >
>> > 1) We are running the tests using the patch [2].
>> >
>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>> > Failures" use a search to find a failing test and the corresponding run
>> > that it failed in.
>> >
>> > 3) Patches that are fixing issues can be found here [1], if you think
>> > you have a patch out there, that is not in this list, shout out.
>> >
>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>> > your name against the test, and also update other details as needed (as
>> > comments, as edit rights to the sheet are restricted).
>> >
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> > attention)
>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> > (Atin)
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>> >
>> > Here are some newer failures, but mostly one-off failures except cores
>> > in ec-5-2.t. All of the following need attention as these are new.
>> >
>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> > ./tests/basic/stats-dump.t
>> > ./tests/bugs/bug-1110262.t
>> >
>> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>> > ./tests/basic/ec/ec-data-heal.t
>> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>>
>
> Sent https://review.gluster.org/c/glusterfs/+/20697 for the test above.
>
>
>> >
>> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>> > ./tests/basic/ec/ec-5-2.t
>> >
>> > 6) Tests that are addressed or are not occurring anymore are,
>> >
>> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> > ./tests/bitrot/bug-1373520.t
>> > ./tests/bugs/distribute/bug-1117851.t
>> > ./tests/bugs/glusterd/quorum-validation.t
>> > ./tests/bugs/distribute/bug-1042725.t
>> >
>> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>> > ./tests/bugs/quota/bug-1293601.t
>> > ./tests/bugs/bug-1368312.t
>> > ./tests/bugs/distribute/bug-1122443.t
>> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>> >
>> > Shyam (and Atin)
>> >
>> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
>> >> Health on master as of the last nightly run [4] is still the same.
>> >>
>> >> Potential patches that rectify the situation (as in [1]) are bunched in
>> >> a patch [2] that Atin and myself have put through several regressions
>> >> (mux, normal and line coverage) and these have also not passed.
>> >>
>> >> Till we rectify the situation we are locking down master branch commit
>> >> rights to the following people, Amar, Atin, Shyam, Vijay.
>> >>
>> >> The intention is to stabilize master and not add more patches that may
>> >> destabilize it.
>> >>
>> >> Test cases that are tracked as failures and need action are present
>> here
>> >> [3].
>> >>

[Gluster-devel] tests/bugs/glusterd/quorum-validation.t ==> glusterfsd core

2018-08-08 Thread Atin Mukherjee
See https://build.gluster.org/job/line-coverage/435/consoleFull . The core file
can be extracted from [1].

The core seems to be coming from the changelog xlator. Please note that line-cov
doesn't run with brick mux enabled.

[1]
http://builder100.cloud.gluster.org/archived_builds/build-install-line-coverage-435.tar.bz2
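
For anyone picking this up, a rough way to inspect such an archived core (the
layout inside the tarball and the exact binary/core paths are assumptions, so
adjust after extracting):

    # Fetch and unpack the archived build + core referenced in [1], then load
    # the core in gdb and dump all thread backtraces.
    wget http://builder100.cloud.gluster.org/archived_builds/build-install-line-coverage-435.tar.bz2
    tar -xjf build-install-line-coverage-435.tar.bz2
    gdb <path-to-extracted-glusterfsd> <path-to-extracted-core>
    # inside gdb: thread apply all bt full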
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status (Wed, August 08th)

2018-08-08 Thread Atin Mukherjee
On Thu, 9 Aug 2018 at 06:34, Shyam Ranganathan  wrote:

> Today's patch set 7 [1], included fixes provided till last evening IST,
> and its runs can be seen here [2] (yay! we can link to comments in
> gerrit now).
>
> New failures: (added to the spreadsheet)
> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
> ./tests/bugs/quick-read/bug-846240.t
>
> Older tests that had not recurred, but failed today: (moved up in the
> spreadsheet)
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>
> Other issues;
> Test ./tests/basic/ec/ec-5-2.t core dumped again




> Few geo-rep failures, Kotresh should have more logs to look at with
> these runs
> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again


>
> Atin/Amar, we may need to merge some of the patches that have proven to
> be holding up and fixing issues today, so that we do not leave
> everything to the last. Check and move them along or lmk.


Ack. I’ll be merging those patches.


>
> Shyam
>
> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
> [2] Runs against patch set 7 and its status (incomplete as some runs
> have not completed):
>
> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
> (also updated in the spreadsheet)
>
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > Deserves a new beginning, threads on the other mail have gone deep
> enough.
> >
> > NOTE: (5) below needs your attention, rest is just process and data on
> > how to find failures.
> >
> > 1) We are running the tests using the patch [2].
> >
> > 2) Run details are extracted into a separate sheet in [3] named "Run
> > Failures" use a search to find a failing test and the corresponding run
> > that it failed in.
> >
> > 3) Patches that are fixing issues can be found here [1], if you think
> > you have a patch out there, that is not in this list, shout out.
> >
> > 4) If you own up a test case failure, update the spreadsheet [3] with
> > your name against the test, and also update other details as needed (as
> > comments, as edit rights to the sheet are restricted).
> >
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> > attention)
> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> > (Atin)
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> > ./tests/basic/ec/ec-1468261.t (needs attention)
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
> >
> > Here are some newer failures, but mostly one-off failures except cores
> > in ec-5-2.t. All of the following need attention as these are new.
> >
> > ./tests/00-geo-rep/00-georep-verify-setup.t
> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> > ./tests/basic/stats-dump.t
> > ./tests/bugs/bug-1110262.t
> >
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> > ./tests/basic/ec/ec-data-heal.t
> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
> >
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> > ./tests/basic/ec/ec-5-2.t
> >
> > 6) Tests that are addressed or are not occurring anymore are,
> >
> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> > ./tests/bitrot/bug-1373520.t
> > ./tests/bugs/distribute/bug-1117851.t
> > ./tests/bugs/glusterd/quorum-validation.t
> > ./tests/bugs/distribute/bug-1042725.t
> >
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> > ./tests/bugs/quota/bug-1293601.t
> > ./tests/bugs/bug-1368312.t
> > ./tests/bugs/distribute/bug-1122443.t
> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> >
> > Shyam (and Atin)
> >
> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the situation we are locking down master branch commit
> >> rights to the following people, Amar, Atin, Shyam, Vijay.
> >>
> >> The intention is to stabilize master and not a

Re: [Gluster-devel] [Gluster-Maintainers] Master branch lock down status

2018-08-07 Thread Atin Mukherjee
On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan 
wrote:

> Deserves a new beginning, threads on the other mail have gone deep enough.
>
> NOTE: (5) below needs your attention, rest is just process and data on
> how to find failures.
>
> 1) We are running the tests using the patch [2].
>
> 2) Run details are extracted into a separate sheet in [3] named "Run
> Failures" use a search to find a failing test and the corresponding run
> that it failed in.
>
> 3) Patches that are fixing issues can be found here [1], if you think
> you have a patch out there, that is not in this list, shout out.
>
> 4) If you own up a test case failure, update the spreadsheet [3] with
> your name against the test, and also update other details as needed (as
> comments, as edit rights to the sheet are restricted).
>
> 5) Current test failures
> We still have the following tests failing and some without any RCA or
> attention, (If something is incorrect, write back).
>
> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> attention)
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Atin)
>

This one is fixed through https://review.gluster.org/20651, as I see no
failures of this test in the latest report from patch set 6.

./tests/bugs/ec/bug-1236065.t (Ashish)
> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> ./tests/basic/ec/ec-1468261.t (needs attention)
> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>
> Here are some newer failures, but mostly one-off failures except cores
> in ec-5-2.t. All of the following need attention as these are new.
>
> ./tests/00-geo-rep/00-georep-verify-setup.t
> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> ./tests/basic/stats-dump.t
> ./tests/bugs/bug-1110262.t
>
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>

This failed because of https://review.gluster.org/20584. I believe there's
some timing issue introduced by this patch. As I highlighted in a comment on
https://review.gluster.org/#/c/20637, I'd request you to revert
this change and include https://review.gluster.org/20658

./tests/basic/ec/ec-data-heal.t
> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> ./tests/basic/ec/ec-5-2.t
>
> 6) Tests that are addressed or are not occurring anymore are,
>
> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bitrot/bug-1373520.t
> ./tests/bugs/distribute/bug-1117851.t
> ./tests/bugs/glusterd/quorum-validation.t
> ./tests/bugs/distribute/bug-1042725.t
>
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/quota/bug-1293601.t
> ./tests/bugs/bug-1368312.t
> ./tests/bugs/distribute/bug-1122443.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>
> Shyam (and Atin)
>
> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> > Health on master as of the last nightly run [4] is still the same.
> >
> > Potential patches that rectify the situation (as in [1]) are bunched in
> > a patch [2] that Atin and myself have put through several regressions
> > (mux, normal and line coverage) and these have also not passed.
> >
> > Till we rectify the situation we are locking down master branch commit
> > rights to the following people, Amar, Atin, Shyam, Vijay.
> >
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> > Test cases that are tracked as failures and need action are present here
> > [3].
> >
> > @Nigel, request you to apply the commit rights change as you see this
> > mail and let the list know regarding the same as well.
> >
> > Thanks,
> > Shyam
> >
> > [1] Patches that address regression failures:
> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >
> > [2] Bunched up patch against which regressions were run:
> > https://review.gluster.org/#/c/20637
> >
> > [3] Failing tests list:
> >
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> >
> > [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/
> ___
> maintainers mailing list
> maintain...@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t

2018-08-07 Thread Atin Mukherjee
+Mohit

Requesting Mohit for help.

On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan  wrote:

> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>
> Ashish/Atin, the above test failed in run:
>
> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>
> The above run is based on patchset 4 of
> https://review.gluster.org/#/c/20637/4
>
> The logs look as below, and as Ashish is unable to reproduce this, and
> all failures are on line 78 with a heal outstanding of 105, looks like
> this run may provide some possibilities on narrowing it down.
>
> The problem seems to be glustershd not connecting to one of the bricks
> that is restarted, and hence failing to heal that brick. This also looks
> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
>
> ==
> Test times from: cat ./glusterd.log | grep TEST
> [2018-08-06 20:56:28.177386]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script
> --wignore volume heal patchy full ++
> [2018-08-06 20:56:28.767209]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count
> patchy ++
> [2018-08-06 20:57:48.957136]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o
> 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o
> ++
> ==
> Repeated connection failure to client-3 in glustershd.log:
> [2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig]
> 0-patchy-client-3: changing port to 49152 (from 0)
> [2018-08-06 20:56:30.222738] W [MSGID: 114043]
> [client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed
> to set the volume [Resource temporarily unavailable]
> [2018-08-06 20:56:30.222788] W [MSGID: 114007]
> [client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed
> to get 'process-uuid' from reply dict [Invalid argument]
> [2018-08-06 20:56:30.222813] E [MSGID: 114044]
> [client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3:
> SETVOLUME on remote-host failed: cleanup flag is set for xlator.  Try
> again later [Resource tempor
> arily unavailable]
> [2018-08-06 20:56:30.222845] I [MSGID: 114051]
> [client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3:
> sending CHILD_CONNECTING event
> [2018-08-06 20:56:30.222919] I [MSGID: 114018]
> [client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from
> patchy-client-3. Client process will keep trying to connect to glusterd
> until brick's port is
>  available
> ==
> Repeated connection messages close to above retries in
> d-backends-patchy0.log:
> [2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict:
> key 'trusted.ec.version' is would not be sent on wire in future [Invalid
> argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and
>  [2018-08-06 20:56:37.933084]
> [2018-08-06 20:56:38.530067] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.540555] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.552494] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-2-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.571671] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy4: allowed = "*", received 
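
For reference, what line 78 of bug-1236065.t effectively waits for is the
pending-heal count of the volume dropping to zero. A standalone approximation
of that check (the volume name and heal command are the ones from the log
above; the timeout value here is illustrative, not the harness's HEAL_TIMEOUT):

    #!/bin/bash
    # Poll the summed "Number of entries" across bricks until it reaches 0,
    # mimicking the check at line 78 of the test. With one brick never
    # reconnecting to glustershd, this stays at 105 and eventually times out.
    vol=patchy
    timeout=80

    pending() {
        gluster volume heal "$vol" info | awk '/Number of entries/ {sum += $NF} END {print sum + 0}'
    }

    gluster --mode=script --wignore volume heal "$vol" full

    for ((i = 0; i < timeout; i++)); do
        [ "$(pending)" -eq 0 ] && exit 0
        sleep 1
    done
    echo "pending heal count still $(pending) after ${timeout}s" >&2
    exit 1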

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-05 Thread Atin Mukherjee
On Mon, 6 Aug 2018 at 06:09, Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Mon, Aug 6, 2018 at 5:17 AM, Amye Scavarda  wrote:
> >
> >
> > On Sun, Aug 5, 2018 at 3:24 PM Shyam Ranganathan 
> > wrote:
> >>
> >> On 07/31/2018 07:16 AM, Shyam Ranganathan wrote:
> >> > On 07/30/2018 03:21 PM, Shyam Ranganathan wrote:
> >> >> On 07/24/2018 03:12 PM, Shyam Ranganathan wrote:
> >> >>> 1) master branch health checks (weekly, till branching)
> >> >>>   - Expect every Monday a status update on various tests runs
> >> >> See https://build.gluster.org/job/nightly-master/ for a report on
> >> >> various nightly and periodic jobs on master.
> >> > Thinking aloud, we may have to stop merges to master to get these test
> >> > failures addressed at the earliest and to continue maintaining them
> >> > GREEN for the health of the branch.
> >> >
> >> > I would give the above a week, before we lockdown the branch to fix
> the
> >> > failures.
> >> >
> >> > Let's try and get line-coverage and nightly regression tests addressed
> >> > this week (leaving mux-regression open), and if addressed not lock the
> >> > branch down.
> >> >
> >>
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the situation we are locking down master branch commit
> >> rights to the following people, Amar, Atin, Shyam, Vijay.
> >>
> >> The intention is to stabilize master and not add more patches that may
> >> destabilize it.
> >>
> >> Test cases that are tracked as failures and need action are present here
> >> [3].
> >>
> >> @Nigel, request you to apply the commit rights change as you see this
> >> mail and let the list know regarding the same as well.
> >>
> >> Thanks,
> >> Shyam
> >>
> >> [1] Patches that address regression failures:
> >> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >>
> >> [2] Bunched up patch against which regressions were run:
> >> https://review.gluster.org/#/c/20637
> >>
> >> [3] Failing tests list:
> >>
> >>
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> >>
> >> [4] Nightly run dashboard:
> https://build.gluster.org/job/nightly-master/
>
> >
> > Locking master is fine, this seems like there's been ample notice and
> > conversation.
> > Do we have test criteria to indicate when we're unlocking master? X
> amount
> > of tests passing, Y amount of bugs?
>
> The "till we rectify" might just include 3 days of the entire set of
> tests passing - thinking out loud here.


3 days = 3 nightly regressions isn’t enough, as most failures are spurious
in nature (IMHO). What Shyam and I are doing is retriggering various
regressions on top of patch [2]. We’re looking for at least 10 iterations
to go through without any test retries or failures.


>
>
> --
> sankarshan mukhopadhyay
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] validation-server-quorum.t crash

2018-08-04 Thread Atin Mukherjee
The patch [1] addresses the $Subject and needs to get into master to
address the frequent failures. Requesting your reviews.

[1] https://review.gluster.org/#/c/20584/
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Atin Mukherjee
New addition - tests/basic/volume.t - failed at least twice with an shd core.

One such ref - https://build.gluster.org/job/centos7-regression/2058/console


On Thu, Aug 2, 2018 at 6:28 PM Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Thu, Aug 2, 2018 at 5:48 PM, Kotresh Hiremath Ravishankar
>  wrote:
> > I am facing a different issue on the softserve machines. The fuse mount itself
> is
> > failing.
> > I tried day before yesterday to debug geo-rep failures. I discussed with
> > Raghu,
> > but could not root cause it. So none of the tests were passing. It
> happened
> > on
> > both machine instances I tried.
> >
>
> Ugh! -infra team should have an issue to work with and resolve this.
>
>
> --
> sankarshan mukhopadhyay
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Atin Mukherjee
On Thu, Aug 2, 2018 at 4:37 PM Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

>
>
> On Thu, Aug 2, 2018 at 3:49 PM, Xavi Hernandez 
> wrote:
>
>> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee 
>>> wrote:
>>>
>>>> I just went through the nightly regression report of brick mux runs and
>>>> here's what I can summarize.
>>>>
>>>>
>>>> =
>>>> Fails only with brick-mux
>>>>
>>>> =
>>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after
>>>> 400 secs. Refer
>>>> https://fstat.gluster.org/failure/209?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all,
>>>> specifically the latest report
>>>> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText
>>>> . Wasn't timing out as frequently as it was till 12 July. But since 27
>>>> July, it has timed out twice. Beginning to believe commit
>>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
>>>> secs isn't sufficient enough (Mohit?)
>>>>
>>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>>> (Ref -
>>>> https://build.gluster.org/job/regression-test-with-multiplex/814/console)
>>>> -  Test fails only in brick-mux mode, AI on Atin to look at and get back.
>>>>
>>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
>>>> - Seems like failed just twice in last 30 days as per
>>>> https://fstat.gluster.org/failure/251?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all.
>>>> Need help from AFR team.
>>>>
>>>> tests/bugs/quota/bug-1293601.t (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/812/console)
>>>> - Hasn't failed after 26 July and earlier it was failing regularly. Did we
>>>> fix this test through any patch (Mohit?)
>>>>
>>>> tests/bitrot/bug-1373520.t - (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/811/console)
>>>> - Hasn't failed after 27 July and earlier it was failing regularly. Did we
>>>> fix this test through any patch (Mohit?)
>>>>
>>>
>>> I see this has failed in day before yesterday's regression run as well
>>> (and I could reproduce it locally with brick mux enabled). The test fails
>>> in healing a file within a particular time period.
>>>
>>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55
>>> *15:55:19* FAILED COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>>
>>> Need EC dev's help here.
>>>
>>
>> I'm not sure where the problem is exactly. I've seen that when the test
>> fails, self-heal is attempting to heal the file, but when the file is
>> accessed, an Input/Output error is returned, aborting heal. I've checked
>> that a heal is attempted every time the file is accessed, but it fails
>> always. This error seems to come from bit-rot stub xlator.
>>
>> When in this situation, if I stop and start the volume, self-heal
>> immediately heals the files. It seems like an stale state that is kept by
>> the stub xlator, preventing the file from being healed.
>>
>> Adding bit-rot maintainers for help on this one.
>>
>
> Bitrot-stub marks the file as corrupted in inode_ctx. But when the file
> and its hardlink are deleted from that brick and a lookup is done
> on the file, it cleans up the marker on getting ENOENT. This is part of
> recovery steps, and only md-cache is disabled during the process.
> Is there any other perf xlators that needs to be disabled for this
> scenario to expect a lookup/revalidate on the brick where
> the back end file is deleted?
>

But the same test doesn't fail when brick multiplexing is not enabled. Do we
know why?


>
>> Xavi
>>
>>
>>
>>>
>>>> tests/bugs/glusterd/remove-brick-testcase

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-01 Thread Atin Mukherjee
On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee  wrote:

> I just went through the nightly regression report of brick mux runs and
> here's what I can summarize.
>
>
> =
> Fails only with brick-mux
>
> =
> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after 400
> secs. Refer
> https://fstat.gluster.org/failure/209?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all,
> specifically the latest report
> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText .
> Wasn't timing out as frequently as it was till 12 July. But since 27 July,
> it has timed out twice. Beginning to believe commit
> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
> secs isn't sufficient enough (Mohit?)
>
> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Ref -
> https://build.gluster.org/job/regression-test-with-multiplex/814/console)
> -  Test fails only in brick-mux mode, AI on Atin to look at and get back.
>
> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
> - Seems like failed just twice in last 30 days as per
> https://fstat.gluster.org/failure/251?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all.
> Need help from AFR team.
>
> tests/bugs/quota/bug-1293601.t (
> https://build.gluster.org/job/regression-test-with-multiplex/812/console)
> - Hasn't failed after 26 July and earlier it was failing regularly. Did we
> fix this test through any patch (Mohit?)
>
> tests/bitrot/bug-1373520.t - (
> https://build.gluster.org/job/regression-test-with-multiplex/811/console)
> - Hasn't failed after 27 July and earlier it was failing regularly. Did we
> fix this test through any patch (Mohit?)
>

I see this has failed in the day before yesterday's regression run as well (and
I could reproduce it locally with brick mux enabled). The test fails to
heal a file within a particular time period.

*15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55
*15:55:19* FAILED COMMAND: 512 path_size /d/backends/patchy5/FILE1

Need EC dev's help here.
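
For anyone reproducing this outside the harness, the failing check boils down
to comparing the on-brick size of the healed file. A rough standalone
equivalent (the brick path and file name are the ones printed above;
path_size is assumed to be the usual stat-based helper from tests/volume.rc):

    #!/bin/bash
    # Approximation of the check behind 'Got "0" instead of "512"': the healed
    # copy on the fifth brick should reach 512 bytes once self-heal completes.
    path_size() {
        stat -c %s "$1" 2>/dev/null || echo 0
    }

    expected=512
    file=/d/backends/patchy5/FILE1

    actual=$(path_size "$file")
    if [ "$actual" != "$expected" ]; then
        echo "not ok: Got \"$actual\" instead of \"$expected\" for $file" >&2
        exit 1
    fi
    echo "ok: $file is $expected bytes"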


> tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core,
> not sure if related to brick mux or not, so not sure if brick mux is
> culprit here or not. Ref -
> https://build.gluster.org/job/regression-test-with-multiplex/806/console
> . Seems to be a glustershd crash. Need help from AFR folks.
>
>
> =
> Fails for non-brick mux case too
>
> =
> tests/bugs/distribute/bug-1122443.t 0 Seems to be failing at my setup very
> often, with out brick mux as well. Refer
> https://build.gluster.org/job/regression-test-burn-in/4050/consoleText .
> There's an email in gluster-devel and a BZ 1610240 for the same.
>
> tests/bugs/bug-1368312.t - Seems to be recent failures (
> https://build.gluster.org/job/regression-test-with-multiplex/815/console)
> - seems to be a new failure, however seen this for a non-brick-mux case too
> - https://build.gluster.org/job/regression-test-burn-in/4039/consoleText
> . Need some eyes from AFR folks.
>
> tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to brick
> mux, have seen this failing at multiple default regression runs. Refer
> https://fstat.gluster.org/failure/392?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
> . We need help from geo-rep dev to root cause this earlier than later
>
> tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
> mux, have seen this failing at multiple default regression runs. Refer
> https://fstat.gluster.org/failure/393?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
> . We need help from geo-rep dev to root cause this earlier than later
>
> tests/bugs/glusterd/validating-server-quorum.t (
> https://build.gluster.org/job/regression-test-with-multiplex/810/console)
> - Fails for non-brick-mux cases too,
> https://fstat.gluster.org/failure/580?state=2&start_date=2018-06-30&end_date

Re: [Gluster-devel] tests/bugs/distribute/bug-1122443.t - spurious failure

2018-08-01 Thread Atin Mukherjee
On Thu, 2 Aug 2018 at 07:05, Susant Palai  wrote:

> Will have a look at it and update.
>

There’s already a patch from Mohit for this.


> Susant
>
> On Wed, 1 Aug 2018, 18:58 Krutika Dhananjay,  wrote:
>
>> Same here - https://build.gluster.org/job/centos7-regression/2024/console
>>
>> -Krutika
>>
>> On Sun, Jul 29, 2018 at 1:53 PM, Atin Mukherjee 
>> wrote:
>>
>>> tests/bugs/distribute/bug-1122443.t fails on my setup (3 out of 5 times)
>>> running with the master branch. As far as I know, I've not seen this test
>>> failing earlier. Looks like some recent change has caused it. One such
>>> instance is https://build.gluster.org/job/centos7-regression/1955/ .
>>>
>>> Request the component owners to take a look at it.
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
--Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-07-31 Thread Atin Mukherjee
I just went through the nightly regression report of brick mux runs and
here's what I can summarize.

=
Fails only with brick-mux
=
tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after 400
secs. Refer
https://fstat.gluster.org/failure/209?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all,
specifically the latest report
https://build.gluster.org/job/regression-test-burn-in/4051/consoleText .
Wasn't timing out as frequently as it was till 12 July. But since 27 July,
it has timed out twice. Beginning to believe commit
9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
secs isn't sufficient (Mohit?)

tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t (Ref
- https://build.gluster.org/job/regression-test-with-multiplex/814/console)
-  Test fails only in brick-mux mode, AI on Atin to look at and get back.

tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
https://build.gluster.org/job/regression-test-with-multiplex/813/console) -
Seems like failed just twice in last 30 days as per
https://fstat.gluster.org/failure/251?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all.
Need help from AFR team.

tests/bugs/quota/bug-1293601.t (
https://build.gluster.org/job/regression-test-with-multiplex/812/console) -
Hasn't failed after 26 July and earlier it was failing regularly. Did we
fix this test through any patch (Mohit?)

tests/bitrot/bug-1373520.t - (
https://build.gluster.org/job/regression-test-with-multiplex/811/console)
- Hasn't failed after 27 July and earlier it was failing regularly. Did we
fix this test through any patch (Mohit?)

tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core; not
sure whether brick mux is the culprit here or not. Ref -
https://build.gluster.org/job/regression-test-with-multiplex/806/console .
Seems to be a glustershd crash. Need help from AFR folks.

=
Fails for non-brick mux case too
=
tests/bugs/distribute/bug-1122443.t - Seems to be failing on my setup very
often, without brick mux as well. Refer
https://build.gluster.org/job/regression-test-burn-in/4050/consoleText .
There's an email in gluster-devel and a BZ 1610240 for the same.

tests/bugs/bug-1368312.t - Seems to be a recent failure (
https://build.gluster.org/job/regression-test-with-multiplex/815/console) -
however, I've seen this for a non-brick-mux case too -
https://build.gluster.org/job/regression-test-burn-in/4039/consoleText .
Need some eyes from AFR folks.

tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to brick
mux, have seen this failing at multiple default regression runs. Refer
https://fstat.gluster.org/failure/392?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
. We need help from the geo-rep devs to root cause this sooner rather than later

tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
mux, have seen this failing at multiple default regression runs. Refer
https://fstat.gluster.org/failure/393?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
. We need help from the geo-rep devs to root cause this sooner rather than later

tests/bugs/glusterd/validating-server-quorum.t (
https://build.gluster.org/job/regression-test-with-multiplex/810/console) -
Fails for non-brick-mux cases too,
https://fstat.gluster.org/failure/580?state=2&start_date=2018-06-30&end_date=2018-07-31&branch=all
. Atin has a patch, https://review.gluster.org/20584, which resolves it, but
the patch is failing regression for a different, unrelated test.

tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
(Ref -
https://build.gluster.org/job/regression-test-with-multiplex/809/console) -
fails for the non-brick-mux case too -
https://build.gluster.org/job/regression-test-burn-in/4049/consoleText -
Need some eyes from AFR folks.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
