Re: [Gluster-Maintainers] [Gluster-devel] Modifying gluster's logging mechanism

2019-11-21 Thread Atin Mukherjee
This is definitely a good start. In fact, the experiment you have done, which
indicates a 20% improvement in run-time performance without the logger, puts
this work firmly in the 'worth a try' category. The only thing we need to be
mindful of here is preserving the ordering of the logs, either through a tool
or by the logger itself taking care of it.

On Thu, 21 Nov 2019 at 18:34, Barak Sason Rofman 
wrote:

> Hello Gluster community,
>
> My name is Barak and I joined RH gluster development in August.
> Shortly after my arrival, I identified a potential problem with
> gluster’s logging mechanism and I’d like to bring the matter up for
> discussion.
>
> The general concept of the current mechanism is that every worker thread
> that needs to log a message has to contend for a mutex which guards the log
> file, write the message, flush the data and then release the mutex.
> I see two design / implementation problems with that mechanism:
>
>    1. The mutex that guards the log file is likely under constant
>    contention.
>
>    2. Each worker thread performs the I/O itself, thus slowing down its
>    "real" work.
>
>
> Initial tests, done by *removing logging from the regression testing,
> show an improvement of about 20% in run time*. This indicates we’re
> taking a pretty heavy performance hit just because of the logging activity.
>
> In addition to these problems, the logging module is due for an upgrade:
>
>    1. There are dozens of APIs in the logger, many of them deprecated -
>    this makes it very hard for new developers to keep evolving the project.
>
>    2. One of the key points for Gluster-X, presented in October at
>    Bangalore, is the switch to structured logging all across gluster.
>
>
> Given these points, I believe we’re in a position that allows us to
> upgrade the logging mechanism by both switching to structured logging
> across the project AND replacing the logging system itself, thus “killing
> two birds with one stone”.
>
> Moreover, if the upgrade is successful, the new logger mechanism might be
> adopted by other teams in Red Hat, which would lead to uniform logging
> across different products.
>
> I’d like to propose a logging utility I’ve been working on for the past
> few weeks.
> This project is still a work in progress (and much work still needs to be
> done on it), but I’d like to bring the matter up now so that, if the
> community wants to advance on this front, we can collaborate and shape the
> logger to best suit the community’s needs.
>
> An overview of the system:
>
> The logger provides several (number and size are user-defined)
> pre-allocated buffers which threads can 'register' to and receive a private
> buffer. In addition, a single, shared buffer is also pre-allocated (size is
> user-defined). The number of buffers and their size will be modifiable at
> runtime (not yet implemented).
>
> Worker threads write messages in one of 3 ways, described next, and an
> internal logger thread constantly iterates over the existing buffers and
> drains the data to the log file.
>
> As all buffers are allocated at the initialization stage, no special
> treatment is needed for "out of memory" cases.
>
> The following writing levels exist:
>
>    1. Level 1 - Lockless writing: Lockless writing is achieved by assigning
>    each thread a private ring buffer. A worker thread writes to that buffer
>    and the logger thread drains it into the log file.
>
> In case the private ring buffer is full and not yet drained, or in case
> the worker thread has not registered for a private buffer, we fall back to
> the following writing methods:
>
>    2. Level 2 - Shared buffer writing: The worker thread writes its
>    data into a buffer that is shared across all threads. This is done in a
>    synchronized manner.
>
> In case the private ring buffer is full and not yet drained AND the shared
> ring buffer is full and not yet drained, or in case the worker thread has
> not registered for a private buffer, we fall back to the last writing
> method:
>
>    3. Level 3 - Direct write: This is the slowest form of writing - the
>    worker thread writes directly to the log file.
>
> The idea behind this utility is to reduce as much as possible the impact
> of logging on runtime. Part of this reduction comes at the cost of having
> to parse and reorganize the messages in the log files using a dedicated
> tool (yet to be implemented) as there is no guarantee on the order of
> logged messages.
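
To make the three-level fall-through concrete, here is a minimal,
self-contained sketch in C. It is not taken from the Lockless_Logger code
base - the names, sizes and ring layout are illustrative assumptions, the
drain (logger) thread is omitted, and a real lockless ring would need atomic
operations/memory barriers rather than the plain loads and stores shown here.

/* Minimal sketch of the three writing levels described above (assumptions,
 * not the Lockless_Logger API). */
#include <pthread.h>
#include <stdio.h>

#define RING_SLOTS 64
#define MSG_LEN    256

struct ring {                      /* single-producer/single-consumer ring */
    char     msgs[RING_SLOTS][MSG_LEN];
    unsigned head;                 /* advanced by the producing worker  */
    unsigned tail;                 /* advanced by the draining logger   */
};

static __thread struct ring *my_ring;    /* set when a thread 'registers' */
static struct ring shared_ring;          /* level 2: shared buffer        */
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t file_lock   = PTHREAD_MUTEX_INITIALIZER;
static FILE *log_fp;

static int ring_put(struct ring *r, const char *msg)
{
    unsigned next = (r->head + 1) % RING_SLOTS;
    if (next == r->tail)           /* full and not yet drained */
        return -1;
    snprintf(r->msgs[r->head], MSG_LEN, "%s", msg);
    r->head = next;                /* publish the slot */
    return 0;
}

static void log_msg(const char *msg)
{
    /* Level 1 - lockless write into this thread's private ring, if any. */
    if (my_ring && ring_put(my_ring, msg) == 0)
        return;

    /* Level 2 - synchronized write into the shared ring. */
    pthread_mutex_lock(&shared_lock);
    int rc = ring_put(&shared_ring, msg);
    pthread_mutex_unlock(&shared_lock);
    if (rc == 0)
        return;

    /* Level 3 - direct (slowest) write to the log file. */
    pthread_mutex_lock(&file_lock);
    fprintf(log_fp, "%s\n", msg);
    fflush(log_fp);
    pthread_mutex_unlock(&file_lock);
}

int main(void)
{
    log_fp = stderr;               /* stand-in for the real log file */
    log_msg("unregistered thread, lands in the shared ring");
    return 0;
}

One way the ordering concern raised above could be addressed is to stamp every
record with a global sequence number at write time, so the offline tool can
merge-sort the drained output.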
>
> The full logger project is hosted on:
> https://github.com/BarakSason/Lockless_Logger
>
> For project documentation visit:
> https://baraksason.github.io/Lockless_Logger/
>
> I thank you all for reading through my suggestion and I’m looking forward
> to your feedback,
> --
> *Barak Sason Rofman*
>
> Gluster Storage Development
>
> Red Hat Israel 
>
> 34 Jerusalem rd. Ra'anana, 43501
>
> 

Re: [Gluster-Maintainers] Proposal to make Mohit and Sanju as cli/glusterd maintainer

2019-10-22 Thread Atin Mukherjee
On Tue, Oct 22, 2019 at 1:40 PM Niels de Vos  wrote:

> On Tue, Oct 22, 2019 at 01:22:33PM +0530, Atin Mukherjee wrote:
> > I’d like to propose Mohit Agrawal & Sanju Rakonde to be made CLI/Glusterd
> > maintainers. If there’s no objection on this proposal can we please get
> > this done by end of this month?
>
> +1 from me.
>
> Please send a patch for the MAINTAINERS file and have the current
> maintainers approve that together with Mohit and Sanju for their
> acceptance.
>

https://review.gluster.org/#/c/glusterfs/+/23601/


> In case they are not aware yet, have them go through
>
> https://docs.gluster.org/en/latest/Contributors-Guide/Guidelines-For-Maintainers/
> and subscribe to this list (should probably be part of that doc).
>
> Thanks,
> Niels
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Proposal to make Mohit and Sanju as cli/glusterd maintainer

2019-10-22 Thread Atin Mukherjee
I’d like to propose Mohit Agrawal & Sanju Rakonde to be made CLI/Glusterd
maintainers. If there’s no objection on this proposal can we please get
this done by end of this month?
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-burn-in #4710

2019-08-26 Thread Atin Mukherjee
For the last few days I have been trying to understand the nightly failures we
were seeing even after addressing the port-already-in-use issue. So here's the
analysis:

From the console output of
https://build.gluster.org/job/regression-test-burn-in/4710/consoleFull



19:51:56 Started by upstream project "nightly-master" build number 843
19:51:56 originally caused by:
19:51:56   Started by timer
19:51:56 Running as SYSTEM
19:51:57 Building remotely on builder209.aws.gluster.org (centos7) in
workspace /home/jenkins/root/workspace/regression-test-burn-in
19:51:58 No credentials specified
19:51:58  > git rev-parse --is-inside-work-tree # timeout=10
19:51:58 Fetching changes from the remote Git repository
19:51:58  > git config remote.origin.url git://review.gluster.org/glusterfs.git # timeout=10
19:51:58 Fetching upstream changes from git://review.gluster.org/glusterfs.git
19:51:58  > git --version # timeout=10
19:51:58  > git fetch --tags --progress git://review.gluster.org/glusterfs.git refs/heads/master # timeout=10
19:52:01  > git rev-parse origin/master^{commit} # timeout=10
19:52:01 Checking out Revision a31fad885c30cbc1bea652349c7d52bac1414c08 (origin/master)
19:52:01  > git config core.sparsecheckout # timeout=10
19:52:01  > git checkout -f a31fad885c30cbc1bea652349c7d52bac1414c08 # timeout=10
19:52:02 Commit message: "tests: heal-info add --xml option for more coverage"
19:52:02  > git rev-list --no-walk a31fad885c30cbc1bea652349c7d52bac1414c08 # timeout=10
19:52:02 [regression-test-burn-in] $ /bin/bash /tmp/jenkins7274529097702336737.sh
19:52:02 Start time Mon Aug 26 14:22:02 UTC 2019




The latest commit which the job picked up as part of the git checkout is
quite old, and hence we continue to see in the latest nightly runs the same
failures that have already been addressed by commit c370c70:

commit c370c70f77079339e2cfb7f284f3a2fb13fd2f97
Author: Mohit Agrawal 
Date:   Tue Aug 13 18:45:43 2019 +0530

rpc: glusterd start is failed and throwing an error Address already in
use

Problem: Some of the .t are failed due to bind is throwing
 an error EADDRINUSE

Solution: After killing all gluster processes .t is trying
  to start glusterd but somehow if kernel has not cleaned
  up resources(socket) then glusterd startup is failed due to
  bind system call failure.To avoid the issue retries to call
  bind 10 times to execute system call succesfully

Change-Id: Ia5fd6b788f7b211c1508c1b7304fc08a32266629
Fixes: bz#1743020
Signed-off-by: Mohit Agrawal 
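
As a side note, the retry logic the commit message describes boils down to
something like the following self-contained sketch. This is not the actual
glusterfs rpc code; the retry count, sleep interval and port are illustrative
assumptions.

/* Sketch of retrying bind() while the kernel still holds the old socket. */
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int bind_with_retry(int sock, struct sockaddr_in *addr, int retries)
{
    for (int i = 0; i < retries; i++) {
        if (bind(sock, (struct sockaddr *)addr, sizeof(*addr)) == 0)
            return 0;                 /* bound successfully */
        if (errno != EADDRINUSE)
            break;                    /* some other error: give up */
        /* kernel may not have released the old socket yet; wait and retry */
        sleep(1);
    }
    return -1;
}

int main(void)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(24007),  /* illustrative */
                                .sin_addr.s_addr = htonl(INADDR_ANY) };

    if (bind_with_retry(sock, &addr, 10) != 0)
        fprintf(stderr, "bind failed after retries: %s\n", strerror(errno));
    close(sock);
    return 0;
}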

So the (puzzling) question is - why are we picking up an old commit?

In my local setup when I run the following command I do see the latest
commit id being picked up:

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git
rev-parse origin/master^{commit} # timeout=10
7926992e65d0a07fdc784a6e45740306d9b4a9f2

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git show
7926992e65d0a07fdc784a6e45740306d9b4a9f2
commit 7926992e65d0a07fdc784a6e45740306d9b4a9f2 (origin/master,
origin/HEAD, master)
Author: Sanju Rakonde 
Date:   Mon Aug 26 12:38:40 2019 +0530

glusterd: Unused value coverity fix

CID: 1288765
updates: bz#789278

Change-Id: Ie6b01f81339769f44d82fd7c32ad0ed1a697c69c
Signed-off-by: Sanju Rakonde 



-- Forwarded message -
From: 
Date: Mon, Aug 26, 2019 at 11:32 PM
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-burn-in #4710
To: 


See <
https://build.gluster.org/job/regression-test-burn-in/4710/display/redirect>

--
[...truncated 4.18 MB...]
./tests/features/lock-migration/lkmigration-set-option.t  -  7 second
./tests/bugs/upcall/bug-1458127.t  -  7 second
./tests/bugs/transport/bug-873367.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/replicate/bug-986905.t  -  7 second
./tests/bugs/replicate/bug-921231.t  -  7 second
./tests/bugs/replicate/bug-1132102.t  -  7 second
./tests/bugs/replicate/bug-1037501.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/posix/bug-1122028.t  -  7 second
./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
./tests/bugs/fuse/bug-983477.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/distribute/bug-1086228.t  -  7 second
./tests/bugs/cli/bug-1087487.t  -  7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/ctime/ctime-noatime.t  -  7 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  7 second
./tests/basic/afr/ta.t  -  7 second
./tests/basic/afr/ta-shd.t  -  7 second
./tests/basic/afr/root-squash-self-heal.t  -  7 second
./tests/basic/afr/granular-esh/add-brick.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  

Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #4649

2019-07-19 Thread Atin Mukherjee
Seems like a shd crash. Could you check if this is due to shd multiplexing?

On Fri, Jul 19, 2019 at 4:04 AM  wrote:

> See <
> https://build.gluster.org/job/regression-test-burn-in/4649/display/redirect
> >
>
> --
> [...truncated 3.99 MB...]
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> .
> [New LWP 24956]
> [New LWP 24948]
> [New LWP 24952]
> [New LWP 24954]
> [New LWP 24978]
> [New LWP 24955]
> [New LWP 24949]
> [New LWP 24950]
> [New LWP 24951]
> [New LWP 24953]
> Core was generated by `/build/install/sbin/glusterfs -s 127.1.1.3
> --volfile-id shd/patchy -p /d/backen'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x7fc0c8f3e70a in ?? ()
> exe = '/build/install/sbin/glusterfs -s 127.1.1.3 --volfile-id shd/patchy
> -p /d/backen'
> ++ gdb -ex 'core-file /glfs_epoll001-24948.core' -ex 'set pagination off'
> -ex 'info proc exe' -ex q
> ++ cut -d ''\''' -f2
> ++ cut -d ' ' -f1
> ++ tail -1
> + executable_name=/build/install/sbin/glusterfs
> ++ which /build/install/sbin/glusterfs
> + executable_path=/build/install/sbin/glusterfs
> + set +x
>
> =
>   Start printing backtrace
>  program name : /build/install/sbin/glusterfs
>  corefile : /glfs_epoll001-24948.core
> =
>
> warning: core file may not match specified executable file.
> [New LWP 24956]
> [New LWP 24948]
> [New LWP 24952]
> [New LWP 24954]
> [New LWP 24978]
> [New LWP 24955]
> [New LWP 24949]
> [New LWP 24950]
> [New LWP 24951]
> [New LWP 24953]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/build/install/sbin/glusterfs -s 127.1.1.3
> --volfile-id shd/patchy -p /d/backen'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x7fc0c8f3e70a in gf_log_flush_extra_msgs (ctx=0xa5aaa0, new=0) at
> <
> https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/logging.c
> >:1670
> 1670        list_for_each_entry_safe(iter, tmp, &ctx->log.lru_queue,
> msg_list)
>
> Thread 10 (Thread 0x7fc0be503700 (LWP 24953)):
> #0  0x7fc0c7d6cd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> No symbol table info available.
> #1  0x7fc0c8f8fa8a in syncenv_task (proc=0xa5c040) at <
> https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/syncop.c
> >:517
> env = 0xa5bc80
> task = 0x0
> sleep_till = {tv_sec = 1563484608, tv_nsec = 0}
> ret = 0
> #2  0x7fc0c8f8fc7f in syncenv_processor (thdata=0xa5c040) at <
> https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/syncop.c
> >:584
> env = 0xa5bc80
> proc = 0xa5c040
> task = 0x0
> #3  0x7fc0c7d68dd5 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #4  0x7fc0c763002d in clone () from /lib64/libc.so.6
> No symbol table info available.
>
> Thread 9 (Thread 0x7fc0bf505700 (LWP 24951)):
> #0  0x7fc0c75f6fad in nanosleep () from /lib64/libc.so.6
> No symbol table info available.
> #1  0x7fc0c75f6e44 in sleep () from /lib64/libc.so.6
> No symbol table info available.
> #2  0x7fc0c8f76c45 in pool_sweeper (arg=0x0) at <
> https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/mem-pool.c
> >:446
> state = {death_row = {next = 0x0, prev = 0x0}, cold_lists = {0x0
> }, n_cold_lists = 0}
> pool_list = 0x0
> next_pl = 0x0
> pt_pool = 0x0
> i = 0
> poisoned = false
> #3  0x7fc0c7d68dd5 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #4  0x7fc0c763002d in clone () from /lib64/libc.so.6
> No symbol table info available.
>
> Thread 8 (Thread 0x7fc0bfd06700 (LWP 24950)):
> #0  0x7fc0c7d70361 in sigwait () from /lib64/libpthread.so.0
> No symbol table info available.
> #1  0x0040b070 in ?? ()
> No symbol table info available.
> #2  0x in ?? ()
> No symbol table info available.
>
> Thread 7 (Thread 0x7fc0c0507700 (LWP 24949)):
> #0  0x7fc0c7d6cd12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib64/libpthread.so.0
> No symbol table info available.
> #1  0x7fc0c8f4ef5e in gf_timer_proc (data=0xa5aaa0) at <
> https://build.gluster.org/job/regression-test-burn-in/ws/libglusterfs/src/timer.c
> >:140
> now = {tv_sec = 904856, tv_nsec = 987463007}
> reg = 0xa5aaa0
> event = 0xaa4720
> tmp = 0x0
> old_THIS = 0x0
> #2  0x7fc0c7d68dd5 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #3  0x7fc0c763002d in clone () from 

Re: [Gluster-Maintainers] Requesting reviews [Re: [Gluster-devel] Release 7 Branch Created]

2019-07-15 Thread Atin Mukherjee
Please ensure:
1. The commit message has an explanation of the motive behind this change.
2. I always feel more confident if a patch has passed regression before the
review kicks off. Can you please ensure that the verified flag is put up?

On Mon, Jul 15, 2019 at 5:27 PM Jiffin Tony Thottan 
wrote:

> Hi,
>
> The "Add Ganesha HA bits back to glusterfs code repo"[1] is targeted for
> glusterfs-7. Requesting maintainers to review below two patches
>
> [1]
> https://review.gluster.org/#/q/topic:ref-663+(status:open+OR+status:merged)
>
> Regards,
>
> Jiffin
>
> On 15/07/19 5:23 PM, Jiffin Thottan wrote:
> >
> > - Original Message -
> > From: "Rinku Kothiya" 
> > To: maintainers@gluster.org, gluster-de...@gluster.org, "Shyam
> Ranganathan" 
> > Sent: Wednesday, July 3, 2019 10:30:58 AM
> > Subject: [Gluster-devel] Release 7 Branch Created
> >
> > Hi Team,
> >
> > Release 7 branch has been created in upstream.
> >
> > ## Schedule
> >
> > Currently, working backwards on the schedule, here's what we
> have:
> > - Announcement: Week of Aug 4th, 2019
> > - GA tagging: Aug-02-2019
> > - RC1: On demand before GA
> > - RC0: July-03-2019
> > - Late features cut-off: Week of June-24th, 2018
> > - Branching (feature cutoff date): June-17-2018
> >
> > Regards
> > Rinku
> >
> > ___
> >
> > Community Meeting Calendar:
> >
> > APAC Schedule -
> > Every 2nd and 4th Tuesday at 11:30 AM IST
> > Bridge: https://bluejeans.com/836554017
> >
> > NA/EMEA Schedule -
> > Every 1st and 3rd Tuesday at 01:00 PM EDT
> > Bridge: https://bluejeans.com/486278655
> >
> > Gluster-devel mailing list
> > gluster-de...@gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
> >
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359

2019-06-10 Thread Atin Mukherjee
On Fri, Jun 7, 2019 at 10:07 AM Amar Tumballi Suryanarayan <
atumb...@redhat.com> wrote:

> Got time to test subdir-mount.t failing in brick-mux scenario.
>
> I noticed some issues, where I need further help from glusterd team.
>
> subdir-mount.t expects the 'hook' script to run after add-brick to make sure
> the required subdirectories are healed and present in the new bricks. This
> is important as the subdir mount expects the subdirs to exist for a
> successful mount.
>
> But in the case of a brick-mux setup, I see that in some cases (6/10), the
> hook script (add-brick/post-hook/S13-create-subdir-mount.sh) started getting
> executed 20 seconds after the add-brick command finished. Due to this,
> the mount which we execute after add-brick failed.
>
> My question is: what is making the post hook script run so late?
>

It's not only the add-brick post hook. Given that post hook scripts are async
in nature, I see that the respective hook scripts of the create/start/set
volume operations also executed quite late, which is very surprising unless
some thread has been stuck for quite a while. Unfortunately for both Mohit
and me, the issue isn't reproducible locally. Mohit will give it a try on the
softserve infra, but at this point in time there's no conclusive evidence;
the analysis continues.

Amar - would it be possible for you to do a git blame given you can
reproduce this? May 31 nightly (
https://build.gluster.org/job/regression-test-with-multiplex/1359/) is when
this test started failing.


> I can recreate the issues locally on my laptop too.
>
>
> On Sat, Jun 1, 2019 at 4:55 PM Atin Mukherjee  wrote:
>
>> subdir-mount.t has started failing in brick mux regression nightly. This
>> needs to be fixed.
>>
>> Raghavendra - did we manage to get any further clue on uss.t failure?
>>
>> -- Forwarded message -
>> From: 
>> Date: Fri, 31 May 2019 at 23:34
>> Subject: [Gluster-Maintainers] Build failed in Jenkins:
>> regression-test-with-multiplex #1359
>> To: , , ,
>> , 
>>
>>
>> See <
>> https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes
>> >
>>
>> Changes:
>>
>> [atin] glusterd: add an op-version check
>>
>> [atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper
>> function
>>
>> [atin] glusterd/svc: Stop stale process using the glusterd_proc_stop
>>
>> [Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs
>>
>> [Kotresh H R] tests/geo-rep: Add EC volume test case
>>
>> [Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock
>>
>> [Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send
>> reconfigure
>>
>> [Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep
>>
>> [atin] glusterd: Optimize code to copy dictionary in handshake code path
>>
>> --
>> [...truncated 3.18 MB...]
>> ./tests/basic/afr/stale-file-lookup.t  -  9 second
>> ./tests/basic/afr/granular-esh/replace-brick.t  -  9 second
>> ./tests/basic/afr/granular-esh/add-brick.t  -  9 second
>> ./tests/basic/afr/gfid-mismatch.t  -  9 second
>> ./tests/performance/open-behind.t  -  8 second
>> ./tests/features/ssl-authz.t  -  8 second
>> ./tests/features/readdir-ahead.t  -  8 second
>> ./tests/bugs/upcall/bug-1458127.t  -  8 second
>> ./tests/bugs/transport/bug-873367.t  -  8 second
>> ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  8 second
>> ./tests/bugs/replicate/bug-1132102.t  -  8 second
>> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
>> -  8 second
>> ./tests/bugs/quota/bug-1104692.t  -  8 second
>> ./tests/bugs/posix/bug-1360679.t  -  8 second
>> ./tests/bugs/posix/bug-1122028.t  -  8 second
>> ./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
>> ./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
>> ./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
>> ./tests/bugs/glusterd/bug-1696046.t  -  8 second
>> ./tests/bugs/fuse/bug-983477.t  -  8 second
>> ./tests/bugs/ec/bug-1227869.t  -  8 second
>> ./tests/bugs/distribute/bug-1088231.t  -  8 second
>> ./tests/bugs/distribute/bug-1086228.t  -  8 second
>> ./tests/bugs/cli/bug-1087487.t  -  8 second
>> ./tests/bugs/cli/bug-1022905.t  -  8 second
>> ./tests/bugs/bug-1258069.t  -  8 second
>> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
>> -  8 second
>> ./tests/basic/xlator-pass-through-sanity.t  -  8 second
>> ./tests/basic/quota-nfs.t  -

[Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1357

2019-06-01 Thread Atin Mukherjee
Rafi - tests/bugs/glusterd/serialize-shd-manager-glusterd-restart.t seems to
be failing often. Can you please investigate the reason for this spurious
failure?

-- Forwarded message -
From: 
Date: Thu, 30 May 2019 at 23:22
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #1357
To: , 


See <
https://build.gluster.org/job/regression-test-with-multiplex/1357/display/redirect?page=changes
>

Changes:

[Xavi Hernandez] tests: add tests for different signal handling

[Xavi Hernandez] marker: remove some unused functions

[Xavi Hernandez] glusterd: coverity fix

--
[...truncated 2.92 MB...]
./tests/basic/ec/ec-root-heal.t  -  9 second
./tests/basic/afr/ta-write-on-bad-brick.t  -  9 second
./tests/basic/afr/ta.t  -  9 second
./tests/basic/afr/gfid-mismatch.t  -  9 second
./tests/performance/open-behind.t  -  8 second
./tests/features/ssl-authz.t  -  8 second
./tests/features/readdir-ahead.t  -  8 second
./tests/features/lock-migration/lkmigration-set-option.t  -  8 second
./tests/bugs/replicate/bug-921231.t  -  8 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/posix/bug-990028.t  -  8 second
./tests/bugs/posix/bug-1360679.t  -  8 second
./tests/bugs/nfs/bug-915280.t  -  8 second
./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
./tests/bugs/glusterfs/bug-872923.t  -  8 second
./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
./tests/bugs/glusterd/bug-1696046.t  -  8 second
./tests/bugs/distribute/bug-1088231.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/bug-1258069.t  -  8 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  8
second
./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
-  8 second
./tests/basic/quota-nfs.t  -  8 second
./tests/basic/ec/statedump.t  -  8 second
./tests/basic/ctime/ctime-noatime.t  -  8 second
./tests/basic/afr/ta-shd.t  -  8 second
./tests/basic/afr/arbiter-remove-brick.t  -  8 second
./tests/line-coverage/cli-peer-and-volume-operations.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  7 second
./tests/gfid2path/block-mount-access.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  7 second
./tests/bugs/transport/bug-873367.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/snapshot/bug-1064768.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  7 second
./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  7 second
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/replicate/bug-1101647.t  -  7 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  7 second
./tests/bugs/quota/bug-1104692.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/posix/bug-1122028.t  -  7 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  7 second
./tests/bugs/glusterfs/bug-848251.t  -  7 second
./tests/bugs/ec/bug-1227869.t  -  7 second
./tests/bugs/distribute/bug-884597.t  -  7 second
./tests/bugs/distribute/bug-1122443.t  -  7 second
./tests/bugs/changelog/bug-1208470.t  -  7 second
./tests/bugs/bug-1371806_2.t  -  7 second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
./tests/bitrot/bug-1221914.t  -  7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/xlator-pass-through-sanity.t  -  7 second
./tests/basic/trace.t  -  7 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  7 second
./tests/basic/distribute/file-create.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/bugs/shard/bug-1342298.t  -  6 second
./tests/bugs/shard/bug-1272986.t  -  6 second
./tests/bugs/shard/bug-1259651.t  -  6 second
./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
./tests/bugs/replicate/bug-1325792.t  -  6 second
./tests/bugs/readdir-ahead/bug-1670253-consistent-metadata.t  -  6 second
./tests/bugs/quota/bug-1243798.t  -  6 second
./tests/bugs/protocol/bug-1321578.t  -  6 second
./tests/bugs/posix/bug-765380.t  -  6 second
./tests/bugs/nfs/bug-877885.t  -  6 second
./tests/bugs/nfs/bug-847622.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
./tests/bugs/md-cache/bug-1211863_unlink.t  -  6 second
./tests/bugs/io-stats/bug-1598548.t  -  6 second
./tests/bugs/io-cache/bug-858242.t  -  6 second
./tests/bugs/glusterfs/bug-893378.t  -  6 second
./tests/bugs/glusterfs/bug-856455.t  -  6 second

[Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #1359

2019-06-01 Thread Atin Mukherjee
subdir-mount.t has started failing in brick mux regression nightly. This
needs to be fixed.

Raghavendra - did we manage to get any further clue on uss.t failure?

-- Forwarded message -
From: 
Date: Fri, 31 May 2019 at 23:34
Subject: [Gluster-Maintainers] Build failed in Jenkins:
regression-test-with-multiplex #1359
To: , , , <
amukh...@redhat.com>, 


See <
https://build.gluster.org/job/regression-test-with-multiplex/1359/display/redirect?page=changes
>

Changes:

[atin] glusterd: add an op-version check

[atin] glusterd/svc: glusterd_svcs_stop should call individual wrapper
function

[atin] glusterd/svc: Stop stale process using the glusterd_proc_stop

[Amar Tumballi] lcov: more coverage to shard, old-protocol, sdfs

[Kotresh H R] tests/geo-rep: Add EC volume test case

[Amar Tumballi] glusterfsd/cleanup: Protect graph object under a lock

[Mohammed Rafi KC] glusterd/shd: Optimize the glustershd manager to send
reconfigure

[Kotresh H R] tests/geo-rep: Add tests to cover glusterd geo-rep

[atin] glusterd: Optimize code to copy dictionary in handshake code path

--
[...truncated 3.18 MB...]
./tests/basic/afr/stale-file-lookup.t  -  9 second
./tests/basic/afr/granular-esh/replace-brick.t  -  9 second
./tests/basic/afr/granular-esh/add-brick.t  -  9 second
./tests/basic/afr/gfid-mismatch.t  -  9 second
./tests/performance/open-behind.t  -  8 second
./tests/features/ssl-authz.t  -  8 second
./tests/features/readdir-ahead.t  -  8 second
./tests/bugs/upcall/bug-1458127.t  -  8 second
./tests/bugs/transport/bug-873367.t  -  8 second
./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  8 second
./tests/bugs/replicate/bug-1132102.t  -  8 second
./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
-  8 second
./tests/bugs/quota/bug-1104692.t  -  8 second
./tests/bugs/posix/bug-1360679.t  -  8 second
./tests/bugs/posix/bug-1122028.t  -  8 second
./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  8 second
./tests/bugs/glusterfs/bug-861015-log.t  -  8 second
./tests/bugs/glusterd/sync-post-glusterd-restart.t  -  8 second
./tests/bugs/glusterd/bug-1696046.t  -  8 second
./tests/bugs/fuse/bug-983477.t  -  8 second
./tests/bugs/ec/bug-1227869.t  -  8 second
./tests/bugs/distribute/bug-1088231.t  -  8 second
./tests/bugs/distribute/bug-1086228.t  -  8 second
./tests/bugs/cli/bug-1087487.t  -  8 second
./tests/bugs/cli/bug-1022905.t  -  8 second
./tests/bugs/bug-1258069.t  -  8 second
./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
-  8 second
./tests/basic/xlator-pass-through-sanity.t  -  8 second
./tests/basic/quota-nfs.t  -  8 second
./tests/basic/glusterd/arbiter-volume.t  -  8 second
./tests/basic/ctime/ctime-noatime.t  -  8 second
./tests/line-coverage/cli-peer-and-volume-operations.t  -  7 second
./tests/gfid2path/get-gfid-to-path.t  -  7 second
./tests/bugs/upcall/bug-1369430.t  -  7 second
./tests/bugs/snapshot/bug-1260848.t  -  7 second
./tests/bugs/shard/shard-inode-refcount-test.t  -  7 second
./tests/bugs/shard/bug-1258334.t  -  7 second
./tests/bugs/replicate/bug-767585-gfid.t  -  7 second
./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  7 second
./tests/bugs/replicate/bug-1250170-fsync.t  -  7 second
./tests/bugs/posix/bug-1175711.t  -  7 second
./tests/bugs/nfs/bug-915280.t  -  7 second
./tests/bugs/md-cache/setxattr-prepoststat.t  -  7 second
./tests/bugs/md-cache/bug-1211863_unlink.t  -  7 second
./tests/bugs/glusterfs/bug-848251.t  -  7 second
./tests/bugs/distribute/bug-1122443.t  -  7 second
./tests/bugs/changelog/bug-1208470.t  -  7 second
./tests/bugs/bug-1702299.t  -  7 second
./tests/bugs/bug-1371806_2.t  -  7 second
./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  7
second
./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
7 second
./tests/bitrot/br-stub.t  -  7 second
./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
./tests/basic/gfapi/libgfapi-fini-hang.t  -  7 second
./tests/basic/fencing/fencing-crash-conistency.t  -  7 second
./tests/basic/distribute/file-create.t  -  7 second
./tests/basic/afr/tarissue.t  -  7 second
./tests/basic/afr/gfid-heal.t  -  7 second
./tests/bugs/snapshot/bug-1178079.t  -  6 second
./tests/bugs/snapshot/bug-1064768.t  -  6 second
./tests/bugs/shard/bug-1342298.t  -  6 second
./tests/bugs/shard/bug-1259651.t  -  6 second
./tests/bugs/replicate/bug-1686568-send-truncate-on-arbiter-from-shd.t  -
6 second
./tests/bugs/replicate/bug-1626994-info-split-brain.t  -  6 second
./tests/bugs/replicate/bug-1325792.t  -  6 second
./tests/bugs/replicate/bug-1101647.t  -  6 second
./tests/bugs/quota/bug-1243798.t  -  6 second
./tests/bugs/protocol/bug-1321578.t  -  6 second
./tests/bugs/nfs/bug-877885.t  -  6 second
./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second

Re: [Gluster-Maintainers] BZ updates

2019-04-23 Thread Atin Mukherjee
Absolutely agree and I definitely think this would help going forward.

On Wed, Apr 24, 2019 at 8:45 AM Nithya Balachandran 
wrote:

> All,
>
> When working on a bug, please ensure that you update the BZ with any
> relevant information as well as the RCA. I have seen several BZs in the
> past which report crashes, however they do not have a bt or RCA captured.
> Having this information in the BZ makes it much easier to see if a newly
> reported issue has already been fixed.
>
> I propose that maintainers merge patches only if the BZs are updated with
> required information. It will take some time to make this a habit but it
> will pay off in the end.
>
> Regards,
> Nithya
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Wed, Apr 17, 2019 at 12:33 AM Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Tue, Apr 16, 2019 at 10:27 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
>>> wrote:
>>>
>>>> Status: Tagging pending
>>>>
>>>> Waiting on patches:
>>>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>>>   https://review.gluster.org/c/glusterfs/+/22579
>>>
>>>
>>> The regression doesn't pass for the mainline patch. I believe master is
>>> broken now. With latest master sdfs-sanity.t always fail. We either need to
>>> fix it or mark it as bad test.
>>>
>>
>> commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
>> Author: Pranith Kumar K 
>> Date:   Mon Apr 1 11:14:56 2019 +0530
>>
>>features/locks: error-out {inode,entry}lk fops with all-zero lk-owner
>>
>>Problem:
>>Sometimes we find that developers forget to assign lk-owner for an
>>inodelk/entrylk/lk before writing code to wind these fops. locks
>>xlator at the moment allows this operation. This leads to multiple
>>threads in the same client being able to get locks on the inode
>>because lk-owner is same and transport is same. So isolation
>>with locks can't be achieved.
>>
>>Fix:
>>Disallow locks with lk-owner zero.
>>
>>fixes bz#1624701
>>Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
>>Signed-off-by: Pranith Kumar K 
>>
>> With the above commit sdfs-sanity.t started failing. But when I looked at
>> the last regression vote at
>> https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw
>> it voted back positive but the bell rang when I saw the overall regression
>> took less than 2 hours and when I opened the regression link I saw the test
>> actually failed but still this job voted back +1 at gerrit.
>>
>> *Deepshika* - *This is a bad CI bug we have now and have to be addressed
>> at earliest. Please take a look at
>> https://build.gluster.org/job/centos7-regression/5568/consoleFull
>> <https://build.gluster.org/job/centos7-regression/5568/consoleFull> and
>> investigate why the regression vote wasn't negative.*
>>
>> Pranith - I request you to investigate on the sdfs-sanity.t failure
>> because of this patch.
>>
>
> sdfs is supposed to serialize entry fops by doing entrylk, but all the
> locks are being done with all-zero lk-owner. In essence sdfs doesn't
> achieve its goal of mutual exclusion when conflicting operations are
> executed by same client because two locks on same entry with same
> all-zero-owner will get locks. The patch which lead to sdfs-sanity.t
> failure treats inodelk/entrylk/lk fops with all-zero lk-owner as Invalid
> request to prevent these kinds of bugs. So it exposed the bug in sdfs. I
> sent a fix for sdfs @ https://review.gluster.org/#/c/glusterfs/+/22582
>

Since this patch hasn't passed regression, and now that I see
tests/bugs/replicate/bug-1386188-sbrain-fav-child.t hanging and timing out
in the latest nightly regression runs because of the above commit (tested
locally and confirmed), I still request that we first revert this commit,
get master back to stable, and then put back the required fixes.
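
To illustrate the lk-owner point Pranith makes above: the locks translator
decides ownership by (lk-owner, client), so two requests from the same client
that both carry the default all-zero owner never conflict with each other.
The sketch below is a self-contained toy model of that check, not gluster
code; all names are illustrative.

/* Toy model: why all-zero lk-owners from the same client defeat sdfs-style
 * serialization. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct lk_owner { unsigned char data[8]; };

struct lock_request {
    struct lk_owner owner;
    const char     *client_id;   /* stand-in for the transport */
};

/* A conflict exists only if the incoming request comes from a different
 * owner or a different client. */
static bool conflicts(const struct lock_request *held,
                      const struct lock_request *incoming)
{
    return memcmp(&held->owner, &incoming->owner, sizeof(held->owner)) != 0 ||
           strcmp(held->client_id, incoming->client_id) != 0;
}

int main(void)
{
    /* Two independent fops from the same client that forgot to set an
     * lk-owner: both carry the default all-zero owner. */
    struct lock_request a = { .owner = { {0} }, .client_id = "client-1" };
    struct lock_request b = { .owner = { {0} }, .client_id = "client-1" };

    printf("second lock blocked? %s\n", conflicts(&a, &b) ? "yes" : "no");
    /* Prints "no": both locks are granted, so the fops are not serialized.
     * Rejecting all-zero owners (as the patch above does) surfaces the bug. */
    return 0;
}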


>
>> *@Maintainers - Please open up every regression link to see the actual
>> status of the job and don't blindly trust on the +1 vote back at gerrit
>> till this is addressed.*
>>
>> As per the policy, I'm going to revert this commit, watch out for the
>> patch. I request this to be directly pushed with out waiting for the
>> regression vote as we had done before in such breakage. Amar/Shyam - I
>> believe you have this permission?
>>
>
>>
>>> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
>>> tests/basic/sdfs-sanity.t ..
>>> 1..7
>>> ok 1, LINENUM:8
>>> ok 2, LINENUM:9
>>> ok 3, LINENUM:11
>>> ok 4, LINENUM:12
>>> ok 5, LINENUM:13
>>> ok 6, LINENUM:16
>>> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
>>> argument
>>> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
>>> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
>>> not ok 7 , LINENUM:20
>>> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
>>> Failed 1/7 subtests
>>>
>>> Tes

Re: [Gluster-Maintainers] [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Tue, Apr 16, 2019 at 10:26 PM Atin Mukherjee  wrote:

>
>
> On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee 
> wrote:
>
>>
>>
>> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
>> wrote:
>>
>>> Status: Tagging pending
>>>
>>> Waiting on patches:
>>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>>   https://review.gluster.org/c/glusterfs/+/22579
>>
>>
>> The regression doesn't pass for the mainline patch. I believe master is
>> broken now. With latest master sdfs-sanity.t always fail. We either need to
>> fix it or mark it as bad test.
>>
>
> commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
> Author: Pranith Kumar K 
> Date:   Mon Apr 1 11:14:56 2019 +0530
>
>features/locks: error-out {inode,entry}lk fops with all-zero lk-owner
>
>Problem:
>Sometimes we find that developers forget to assign lk-owner for an
>inodelk/entrylk/lk before writing code to wind these fops. locks
>xlator at the moment allows this operation. This leads to multiple
>threads in the same client being able to get locks on the inode
>because lk-owner is same and transport is same. So isolation
>with locks can't be achieved.
>
>Fix:
>Disallow locks with lk-owner zero.
>
>fixes bz#1624701
>Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
>Signed-off-by: Pranith Kumar K 
>
> With the above commit sdfs-sanity.t started failing. But when I looked at
> the last regression vote at
> https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw
> it voted back positive but the bell rang when I saw the overall regression
> took less than 2 hours and when I opened the regression link I saw the test
> actually failed but still this job voted back +1 at gerrit.
>
> *Deepshika* - *This is a bad CI bug we have now and have to be addressed
> at earliest. Please take a look at
> https://build.gluster.org/job/centos7-regression/5568/consoleFull
> <https://build.gluster.org/job/centos7-regression/5568/consoleFull> and
> investigate why the regression vote wasn't negative.*
>
> Pranith - I request you to investigate on the sdfs-sanity.t failure
> because of this patch.
>
> *@Maintainers - Please open up every regression link to see the actual
> status of the job and don't blindly trust on the +1 vote back at gerrit
> till this is addressed.*
>
> As per the policy, I'm going to revert this commit, watch out for the
> patch.
>

https://review.gluster.org/#/c/glusterfs/+/22581/
Please review and merge it.

Also, since we're already close to 23:00 in the IST timezone, I need help from
folks in other timezones in getting
https://review.gluster.org/#/c/glusterfs/+/22578/ rebased and marked
verified +1 once the above fix is merged. This is a blocker for
glusterfs-6.1, as otherwise ctime feature option tuning isn't honoured.

> I request this to be directly pushed with out waiting for the regression
> vote as we had done before in such breakage. Amar/Shyam - I believe you
> have this permission?
>
>
>> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
>> tests/basic/sdfs-sanity.t ..
>> 1..7
>> ok 1, LINENUM:8
>> ok 2, LINENUM:9
>> ok 3, LINENUM:11
>> ok 4, LINENUM:12
>> ok 5, LINENUM:13
>> ok 6, LINENUM:16
>> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
>> argument
>> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
>> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
>> not ok 7 , LINENUM:20
>> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
>> Failed 1/7 subtests
>>
>> Test Summary Report
>> ---
>> tests/basic/sdfs-sanity.t (Wstat: 0 Tests: 7 Failed: 1)
>>   Failed test:  7
>> Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  0.58 cusr
>> 0.67 csys =  1.27 CPU)
>> Result: FAIL
>>
>>
>>>
>>> Following patches will not be taken in if CentOS regression does not
>>> pass by tomorrow morning Eastern TZ,
>>> (Pranith/KingLongMee) - cluster-syncop: avoid duplicate unlock of
>>> inodelk/entrylk
>>>   https://review.gluster.org/c/glusterfs/+/22385
>>> (Aravinda) - geo-rep: IPv6 support
>>>   https://review.gluster.org/c/glusterfs/+/22488
>>> (Aravinda) - geo-rep: fix integer config validation
>>>   https://review.gluster.org/c/glusterfs/+/22489
>>>
>>> Tracker bug status:
>>> (Ravi) - Bug 1693155 - Excessive AFR messages from gluster showing in
>>> RHGSWA.
>>>   All patches are merged, but none of 

Re: [Gluster-Maintainers] [Gluster-devel] Release 6.1: Expected tagging on April 10th

2019-04-16 Thread Atin Mukherjee
On Tue, Apr 16, 2019 at 9:19 PM Atin Mukherjee  wrote:

>
>
> On Tue, Apr 16, 2019 at 7:24 PM Shyam Ranganathan 
> wrote:
>
>> Status: Tagging pending
>>
>> Waiting on patches:
>> (Kotresh/Atin) - glusterd: fix loading ctime in client graph logic
>>   https://review.gluster.org/c/glusterfs/+/22579
>
>
> The regression doesn't pass for the mainline patch. I believe master is
> broken now. With latest master sdfs-sanity.t always fail. We either need to
> fix it or mark it as bad test.
>

commit 3883887427a7f2dc458a9773e05f7c8ce8e62301 (HEAD)
Author: Pranith Kumar K 
Date:   Mon Apr 1 11:14:56 2019 +0530

   features/locks: error-out {inode,entry}lk fops with all-zero lk-owner

   Problem:
   Sometimes we find that developers forget to assign lk-owner for an
   inodelk/entrylk/lk before writing code to wind these fops. locks
   xlator at the moment allows this operation. This leads to multiple
   threads in the same client being able to get locks on the inode
   because lk-owner is same and transport is same. So isolation
   with locks can't be achieved.

   Fix:
   Disallow locks with lk-owner zero.

   fixes bz#1624701
   Change-Id: I1c816280cffd150ebb392e3dcd4d21007cdd767f
   Signed-off-by: Pranith Kumar K 

With the above commit, sdfs-sanity.t started failing. But when I looked at
the last regression vote at
https://build.gluster.org/job/centos7-regression/5568/consoleFull I saw it
had voted back positive. Alarm bells rang when I saw that the overall
regression took less than 2 hours, and when I opened the regression link I
saw the test had actually failed, yet this job still voted +1 back at gerrit.

*Deepshika* - *This is a bad CI bug we have now and have to be addressed at
earliest. Please take a look at
https://build.gluster.org/job/centos7-regression/5568/consoleFull
<https://build.gluster.org/job/centos7-regression/5568/consoleFull> and
investigate why the regression vote wasn't negative.*

Pranith - I request you to investigate the sdfs-sanity.t failure caused by
this patch.

*@Maintainers - Please open up every regression link to see the actual
status of the job and don't blindly trust on the +1 vote back at gerrit
till this is addressed.*

As per the policy, I'm going to revert this commit; watch out for the
patch. I request that this be pushed directly without waiting for the
regression vote, as we have done before for such breakage. Amar/Shyam - I
believe you have this permission?


> root@a5f81bd447c2:/home/glusterfs# prove -vf tests/basic/sdfs-sanity.t
> tests/basic/sdfs-sanity.t ..
> 1..7
> ok 1, LINENUM:8
> ok 2, LINENUM:9
> ok 3, LINENUM:11
> ok 4, LINENUM:12
> ok 5, LINENUM:13
> ok 6, LINENUM:16
> mkdir: cannot create directory ‘/mnt/glusterfs/1/coverage’: Invalid
> argument
> stat: cannot stat '/mnt/glusterfs/1/coverage/dir': Invalid argument
> tests/basic/rpc-coverage.sh: line 61: test: ==: unary operator expected
> not ok 7 , LINENUM:20
> FAILED COMMAND: tests/basic/rpc-coverage.sh /mnt/glusterfs/1
> Failed 1/7 subtests
>
> Test Summary Report
> ---
> tests/basic/sdfs-sanity.t (Wstat: 0 Tests: 7 Failed: 1)
>   Failed test:  7
> Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  0.58 cusr
> 0.67 csys =  1.27 CPU)
> Result: FAIL
>
>
>>
>> Following patches will not be taken in if CentOS regression does not
>> pass by tomorrow morning Eastern TZ,
>> (Pranith/KingLongMee) - cluster-syncop: avoid duplicate unlock of
>> inodelk/entrylk
>>   https://review.gluster.org/c/glusterfs/+/22385
>> (Aravinda) - geo-rep: IPv6 support
>>   https://review.gluster.org/c/glusterfs/+/22488
>> (Aravinda) - geo-rep: fix integer config validation
>>   https://review.gluster.org/c/glusterfs/+/22489
>>
>> Tracker bug status:
>> (Ravi) - Bug 1693155 - Excessive AFR messages from gluster showing in
>> RHGSWA.
>>   All patches are merged, but none of the patches adds the "Fixes"
>> keyword, assume this is an oversight and that the bug is fixed in this
>> release.
>>
>> (Atin) - Bug 1698131 - multiple glusterfsd processes being launched for
>> the same brick, causing transport endpoint not connected
>>   No work has occurred post logs upload to bug, restart of bircks and
>> possibly glusterd is the existing workaround when the bug is hit. Moving
>> this out of the tracker for 6.1.
>>
>> (Xavi) - Bug 1699917 - I/O error on writes to a disperse volume when
>> replace-brick is executed
>>   Very recent bug (15th April), does not seem to have any critical data
>> corruption or service availability issues, planning on not waiting for
>> the fix in 6.1
>>
>> - Shyam
>> On 4/6/19 4:38 AM, Atin Mukherjee wrote:
>> > Hi Mohit,
>> >
>> > https:

[Gluster-Maintainers] Backporting important fixes in release branches

2019-04-02 Thread Atin Mukherjee
Of late, my observation has been that we're failing to backport
critical/important fixes into the release branches, and we only
course-correct when users discover the problems, which isn't a great
experience. I request all developers and maintainers to pay some attention to
(a) deciding which patches from mainline should be backported to which
release branches, and (b) doing so right away once the patches are merged in
the mainline branch instead of waiting to do them later.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1240

2019-04-01 Thread Atin Mukherjee
On Mon, 1 Apr 2019 at 23:00,  wrote:

> See <
> https://build.gluster.org/job/regression-test-with-multiplex/1240/display/redirect?page=changes
> >
>
> Changes:
>
> [Amar Tumballi] mgmt/shd: Implement multiplexing in self heal daemon
>
> [Amar Tumballi] tests: add statedump to playground
>
> --
> [...truncated 1.07 MB...]
> ./tests/bugs/nfs/bug-915280.t  -  7 second
> ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  7 second
> ./tests/bugs/glusterfs-server/bug-904300.t  -  7 second
> ./tests/bugs/glusterfs/bug-879494.t  -  7 second
> ./tests/bugs/glusterfs/bug-872923.t  -  7 second
> ./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
> ./tests/bugs/fuse/bug-963678.t  -  7 second
> ./tests/bugs/ec/bug-1179050.t  -  7 second
> ./tests/bugs/distribute/bug-1122443.t  -  7 second
> ./tests/bugs/core/bug-949242.t  -  7 second
> ./tests/bugs/changelog/bug-1208470.t  -  7 second
> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
> -  7 second
> ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
> ./tests/bitrot/br-stub.t  -  7 second
> ./tests/basic/xlator-pass-through-sanity.t  -  7 second
> ./tests/basic/quota-nfs.t  -  7 second
> ./tests/basic/inode-quota-enforcing.t  -  7 second
> ./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
> ./tests/basic/ec/ec-read-policy.t  -  7 second
> ./tests/basic/distribute/file-create.t  -  7 second
> ./tests/basic/ctime/ctime-noatime.t  -  7 second
> ./tests/basic/afr/tarissue.t  -  7 second
> ./tests/basic/afr/gfid-mismatch.t  -  7 second
> ./tests/gfid2path/block-mount-access.t  -  6 second
> ./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
> ./tests/bugs/replicate/bug-1498570-client-iot-graph-check.t  -  6 second
> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  6 second
> ./tests/bugs/replicate/bug-1365455.t  -  6 second
> ./tests/bugs/replicate/bug-1250170-fsync.t  -  6 second
> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
> -  6 second
> ./tests/bugs/posix/bug-990028.t  -  6 second
> ./tests/bugs/md-cache/setxattr-prepoststat.t  -  6 second
> ./tests/bugs/io-cache/bug-858242.t  -  6 second
> ./tests/bugs/gfapi/bug-1630804/gfapi-bz1630804.t  -  6 second
> ./tests/bugs/fuse/bug-985074.t  -  6 second
> ./tests/bugs/ec/bug-1227869.t  -  6 second
> ./tests/bugs/distribute/bug-884597.t  -  6 second
> ./tests/bugs/distribute/bug-882278.t  -  6 second
> ./tests/bugs/distribute/bug-1088231.t  -  6 second
> ./tests/bugs/core/bug-986429.t  -  6 second
> ./tests/bugs/core/bug-908146.t  -  6 second
> ./tests/bugs/bug-1258069.t  -  6 second
> ./tests/basic/volume-status.t  -  6 second
> ./tests/basic/posix/zero-fill-enospace.t  -  6 second
> ./tests/basic/playground/template-xlator-sanity.t  -  6 second
> ./tests/basic/gfapi/glfs_xreaddirplus_r.t  -  6 second
> ./tests/basic/gfapi/glfd-lkowner.t  -  6 second
> ./tests/basic/gfapi/bug-1241104.t  -  6 second
> ./tests/basic/fencing/fencing-crash-conistency.t  -  6 second
> ./tests/basic/ec/ec-internal-xattrs.t  -  6 second
> ./tests/basic/ec/ec-fallocate.t  -  6 second
> ./tests/basic/ctime/ctime-glfs-init.t  -  6 second
> ./tests/basic/afr/gfid-heal.t  -  6 second
> ./tests/basic/afr/arbiter-remove-brick.t  -  6 second
> ./tests/bugs/upcall/bug-1369430.t  -  5 second
> ./tests/bugs/snapshot/bug-1178079.t  -  5 second
> ./tests/bugs/shard/bug-1468483.t  -  5 second
> ./tests/bugs/shard/bug-1342298.t  -  5 second
> ./tests/bugs/shard/bug-1259651.t  -  5 second
> ./tests/bugs/shard/bug-1258334.t  -  5 second
> ./tests/bugs/replicate/bug-880898.t  -  5 second
> ./tests/bugs/replicate/bug-1101647.t  -  5 second
> ./tests/bugs/quota/bug-1104692.t  -  5 second
> ./tests/bugs/nfs/bug-877885.t  -  5 second
> ./tests/bugs/md-cache/afr-stale-read.t  -  5 second
> ./tests/bugs/io-stats/bug-1598548.t  -  5 second
> ./tests/bugs/io-cache/bug-read-hang.t  -  5 second
> ./tests/bugs/glusterfs-server/bug-873549.t  -  5 second
> ./tests/bugs/glusterfs/bug-902610.t  -  5 second
> ./tests/bugs/glusterfs/bug-895235.t  -  5 second
> ./tests/bugs/glusterfs/bug-848251.t  -  5 second
> ./tests/bugs/glusterd/quorum-value-check.t  -  5 second
> ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t
> -  5 second
> ./tests/bugs/core/bug-834465.t  -  5 second
> ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  5 second
> ./tests/bugs/cli/bug-1022905.t  -  5 second
> ./tests/bugs/bug-1371806_2.t  -  5 second
> ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t  -  5 second
> ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t  -  5
> second
> ./tests/bugs/access-control/bug-1051896.t  -  5 second
> ./tests/bitrot/bug-1221914.t  -  5 second
> ./tests/basic/glusterd/arbiter-volume.t  -  5 second
> ./tests/basic/gfapi/upcall-cache-invalidate.t  -  5 second
> ./tests/basic/gfapi/gfapi-dup.t  -  5 second
> ./tests/basic/gfapi/anonymous_fd.t  -  5 

Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #1239

2019-04-01 Thread Atin Mukherjee
On Mon, 1 Apr 2019 at 03:40,  wrote:

> See <
> https://build.gluster.org/job/regression-test-with-multiplex/1239/display/redirect
> >
>
> --
> [...truncated 1.04 MB...]
> ./tests/bugs/core/bug-949242.t  -  7 second
> ./tests/bugs/changelog/bug-1208470.t  -  7 second
> ./tests/bugs/bug-1258069.t  -  7 second
> ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  7 second
> ./tests/bugs/access-control/bug-958691.t  -  7 second
> ./tests/basic/xlator-pass-through-sanity.t  -  7 second
> ./tests/basic/quota-nfs.t  -  7 second
> ./tests/basic/pgfid-feat.t  -  7 second
> ./tests/basic/inode-quota-enforcing.t  -  7 second
> ./tests/basic/glusterd/arbiter-volume-probe.t  -  7 second
> ./tests/basic/ec/ec-anonymous-fd.t  -  7 second
> ./tests/basic/afr/tarissue.t  -  7 second
> ./tests/basic/afr/arbiter-remove-brick.t  -  7 second
> ./tests/gfid2path/block-mount-access.t  -  6 second
> ./tests/bugs/upcall/bug-1458127.t  -  6 second
> ./tests/bugs/snapshot/bug-1064768.t  -  6 second
> ./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
> ./tests/bugs/replicate/bug-1365455.t  -  6 second
> ./tests/bugs/replicate/bug-1132102.t  -  6 second
> ./tests/bugs/quota/bug-1243798.t  -  6 second
> ./tests/bugs/posix/bug-990028.t  -  6 second
> ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  6 second
> ./tests/bugs/io-cache/bug-read-hang.t  -  6 second
> ./tests/bugs/io-cache/bug-858242.t  -  6 second
> ./tests/bugs/glusterfs-server/bug-904300.t  -  6 second
> ./tests/bugs/glusterfs/bug-861015-log.t  -  6 second
> ./tests/bugs/glusterd/quorum-value-check.t  -  6 second
> ./tests/bugs/gfapi/bug-1630804/gfapi-bz1630804.t  -  6 second
> ./tests/bugs/fuse/bug-985074.t  -  6 second
> ./tests/bugs/distribute/bug-884597.t  -  6 second
> ./tests/bugs/distribute/bug-882278.t  -  6 second
> ./tests/bugs/distribute/bug-1088231.t  -  6 second
> ./tests/bugs/core/bug-986429.t  -  6 second
> ./tests/bugs/core/bug-908146.t  -  6 second
> ./tests/bugs/bug-1371806_2.t  -  6 second
> ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t  -  6 second
> ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
> 6 second
> ./tests/bitrot/br-stub.t  -  6 second
> ./tests/basic/volume-status.t  -  6 second
> ./tests/basic/playground/template-xlator-sanity.t  -  6 second
> ./tests/basic/glusterd/arbiter-volume.t  -  6 second
> ./tests/basic/gfapi/upcall-cache-invalidate.t  -  6 second
> ./tests/basic/gfapi/glfd-lkowner.t  -  6 second
> ./tests/basic/gfapi/bug-1241104.t  -  6 second
> ./tests/basic/fencing/fencing-crash-conistency.t  -  6 second
> ./tests/basic/ec/ec-read-policy.t  -  6 second
> ./tests/basic/ec/ec-internal-xattrs.t  -  6 second
> ./tests/basic/ec/ec-fallocate.t  -  6 second
> ./tests/basic/distribute/file-create.t  -  6 second
> ./tests/basic/ctime/ctime-noatime.t  -  6 second
> ./tests/basic/ctime/ctime-glfs-init.t  -  6 second
> ./tests/basic/afr/heal-info.t  -  6 second
> ./tests/basic/afr/gfid-mismatch.t  -  6 second
> ./tests/basic/afr/gfid-heal.t  -  6 second
> ./tests/bugs/upcall/bug-upcall-stat.t  -  5 second
> ./tests/bugs/upcall/bug-1369430.t  -  5 second
> ./tests/bugs/snapshot/bug-1178079.t  -  5 second
> ./tests/bugs/shard/bug-1468483.t  -  5 second
> ./tests/bugs/shard/bug-1342298.t  -  5 second
> ./tests/bugs/shard/bug-1258334.t  -  5 second
> ./tests/bugs/shard/bug-1256580.t  -  5 second
> ./tests/bugs/replicate/bug-976800.t  -  5 second
> ./tests/bugs/replicate/bug-1250170-fsync.t  -  5 second
> ./tests/bugs/replicate/bug-1101647.t  -  5 second
> ./tests/bugs/quota/bug-1104692.t  -  5 second
> ./tests/bugs/posix/bug-1034716.t  -  5 second
> ./tests/bugs/nfs/bug-877885.t  -  5 second
> ./tests/bugs/nfs/bug-1116503.t  -  5 second
> ./tests/bugs/io-stats/bug-1598548.t  -  5 second
> ./tests/bugs/glusterfs-server/bug-873549.t  -  5 second
> ./tests/bugs/glusterfs/bug-902610.t  -  5 second
> ./tests/bugs/glusterfs/bug-848251.t  -  5 second
> ./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  5 second
> ./tests/bugs/fuse/bug-1030208.t  -  5 second
> ./tests/bugs/core/bug-834465.t  -  5 second
> ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  5 second
> ./tests/bugs/cli/bug-982174.t  -  5 second
> ./tests/bugs/cli/bug-1022905.t  -  5 second
> ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t  -  5 second
> ./tests/bitrot/bug-1221914.t  -  5 second
> ./tests/basic/posix/zero-fill-enospace.t  -  5 second
> ./tests/basic/hardlink-limit.t  -  5 second
> ./tests/basic/gfapi/glfs_xreaddirplus_r.t  -  5 second
> ./tests/basic/gfapi/gfapi-dup.t  -  5 second
> ./tests/basic/gfapi/anonymous_fd.t  -  5 second
> ./tests/basic/ec/nfs.t  -  5 second
> ./tests/basic/ec/dht-rename.t  -  5 second
> ./tests/basic/distribute/throttle-rebal.t  -  5 second
> ./tests/basic/afr/afr-read-hash-mode.t  -  5 second
> ./tests/performance/quick-read.t  -  4 second
> ./tests/features/readdir-ahead.t  -  4 second
> 

[Gluster-Maintainers] GF_CALLOC to GF_MALLOC conversion - is it safe?

2019-03-21 Thread Atin Mukherjee
All,

In the last few releases of glusterfs, with stability as a primary theme of
the releases, there have been lots of changes done around code optimization
with an expectation that such changes will help gluster provide better
performance. While many of these changes do help, of late we have started
seeing some adverse effects of them, one especially being the calloc to
malloc conversions. While I do understand that malloc eliminates the extra
memset overhead which calloc bears, with recent compilers and kernels having
strong built-in optimizations I am not sure whether that makes any
significant difference. But, as I mentioned earlier, if this isn't done
carefully it can certainly introduce a lot of bugs, and I'm writing this
email to share one such experience.

Sanju & I spent the last two days trying to figure out why
https://review.gluster.org/#/c/glusterfs/+/22388/ wasn't working on Sanju's
system while the same fix ran without problems in my gluster containers.
After spending a significant amount of time, what we figured out is that a
malloc call [1] (which was a calloc earlier) is the culprit here. As you can
see, in this function we allocate txn_id and copy event->txn_id into it
through gf_uuid_copy(). But when we stepped through it in gdb, txn_id was
not an exact copy of event->txn_id and contained junk values, which caused
glusterd_clear_txn_opinfo to be invoked with a wrong txn_id later on, so the
leaks which the fix was originally intended to remove remained exactly as
before.

This was quite painful to debug and we had to spend some time to figure
this out. Considering we have converted many such calls in the past, I'd
urge that we review all such conversions and see if there are any side
effects. Otherwise we might end up running into many potential memory
related bugs later on. OTOH, going forward I'd request all patch
owners/maintainers to pay special attention to these conversions and verify
that they are really beneficial and error free. IMO, the general guideline
should be: for bigger buffers, malloc makes better sense but has to be done
carefully; for smaller sizes, we stick to calloc.
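
To make the failure mode concrete, here is a minimal, self-contained sketch
of the generic class of bug such a conversion can introduce. It uses plain
libc calloc/malloc and a purely hypothetical struct (this is not the actual
glusterd txn code); gluster's GF_CALLOC/GF_MALLOC wrappers have the same
zero-initialized vs. uninitialized semantics, so the same reasoning applies
there.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object; any struct whose fields are assumed elsewhere in
 * the code base to start out as zero/NULL will do. */
struct txn {
    unsigned char id[16]; /* uuid-like blob */
    char *opinfo;         /* lazily allocated elsewhere */
};

int main(void)
{
    /* Before the conversion: calloc guarantees opinfo == NULL and id is
     * all zeros. */
    struct txn *safe = calloc(1, sizeof(*safe));

    /* After a blind conversion: every field holds indeterminate junk. */
    struct txn *risky = malloc(sizeof(*risky));

    if (!safe || !risky) {
        free(safe);
        free(risky);
        return 1;
    }

    /* A check like this is only valid for the calloc'd object; with
     * malloc, the uninitialized pointer may be non-NULL garbage and can
     * later get dereferenced or freed by mistake. */
    if (!safe->opinfo)
        safe->opinfo = strdup("allocated on demand");

    /* A malloc conversion is only safe if every field is explicitly
     * initialized before its first use: */
    memset(risky->id, 0, sizeof(risky->id));
    risky->opinfo = NULL;

    printf("safe->opinfo = %s\n", safe->opinfo ? safe->opinfo : "(null)");

    free(safe->opinfo);
    free(safe);
    free(risky);
    return 0;
}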

What do others think about it?

[1]
https://github.com/gluster/glusterfs/blob/master/xlators/mgmt/glusterd/src/glusterd-op-sm.c#L5681
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [gluster-packaging] glusterfs-6.0rc1 released

2019-03-13 Thread Atin Mukherjee
If you were on rc0 and upgraded to rc1, then you are hitting BZ 1684029 I
believe. Can you please upgrade all the nodes to rc1, bump up the
op-version to 6 (if not already done) and then restart glusterd
services to see if the peer rejection goes away?

On Thu, Mar 14, 2019 at 7:51 AM Guillaume Pavese <
guillaume.pav...@interactiv-group.com> wrote:

> putting us...@gluster.org in the loop
>
> Guillaume Pavese
> Ingénieur Système et Réseau
> Interactiv-Group
>
>
> On Thu, Mar 14, 2019 at 11:04 AM Guillaume Pavese <
> guillaume.pav...@interactiv-group.com> wrote:
>
>> Hi, I am testing gluster6-rc1 on a replica 3 oVirt cluster (engine full
>> replica 3 and 2 other volume replica + arbiter). They were on Gluster6-rc0.
>> I upgraded one host that was having the "0-epoll: Failed to dispatch
>> handler" bug for one of its volumes, but now all three volumes are down!
>> "gluster peer status" now shows its 2 other peers as connected but
>> rejected. Should I upgrade the other nodes? They are still on Gluster6-rc0
>>
>>
>> Guillaume Pavese
>> Ingénieur Système et Réseau
>> Interactiv-Group
>>
>>
>> On Wed, Mar 13, 2019 at 6:38 PM Niels de Vos  wrote:
>>
>>> On Wed, Mar 13, 2019 at 02:24:44AM +, jenk...@build.gluster.org
>>> wrote:
>>> > SRC:
>>> https://build.gluster.org/job/release-new/81/artifact/glusterfs-6.0rc1.tar.gz
>>> > HASH:
>>> https://build.gluster.org/job/release-new/81/artifact/glusterfs-6.0rc1.sha512sum
>>>
>>> Packages from the CentOS Storage SIG will become available shortly in
>>> the testing repository. Please use these packages to enable the repo and
>>> install the glusterfs components in a 2nd step.
>>>
>>> el7:
>>> https://cbs.centos.org/kojifiles/work/tasks/3263/723263/centos-release-gluster6-0.9-1.el7.centos.noarch.rpm
>>> el6:
>>> https://cbs.centos.org/kojifiles/work/tasks/3265/723265/centos-release-gluster6-0.9-1.el6.centos.noarch.rpm
>>>
>>> Once installed, the testing repo is enabled. Everything should be
>>> available.
>>>
>>> It is highly appreciated to let me know some results of the testing!
>>>
>>> Thanks,
>>> Niels
>>> ___
>>> packaging mailing list
>>> packag...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/packaging
>>>
>> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] GlusterFS - 6.0RC - Test days (27th, 28th Feb)

2019-03-07 Thread Atin Mukherjee
I am not sure how BZ 1683815
<https://bugzilla.redhat.com/show_bug.cgi?id=1683815> can be a blocker at
RC. We have a fix ready, but to me it doesn't look like a blocker. Vijay -
any objections?

Also, the bugzilla dependency of all bugs attached to the release-6 tracker
is sort of messed up. Most of the time I see a mainline bug, along with its
clones, attached to the tracker, which is unnecessary. This has happened
because of the default clone behaviour, but I request every bugzilla
assignee to spend a few additional seconds to establish the right
dependency.

I have tried to correct a few of them and will do the rest by next Monday.
That'd help us filter out the unnecessary ones and get to know how many
actual blockers we have.

On Tue, Mar 5, 2019 at 11:51 PM Shyam Ranganathan 
wrote:

> On 3/4/19 12:33 PM, Shyam Ranganathan wrote:
> > On 3/4/19 10:08 AM, Atin Mukherjee wrote:
> >>
> >>
> >> On Mon, 4 Mar 2019 at 20:33, Amar Tumballi Suryanarayan
> >> mailto:atumb...@redhat.com>> wrote:
> >>
> >> Thanks to those who participated.
> >>
> >> Update at present:
> >>
> >> We found 3 blocker bugs in upgrade scenarios, and hence have marked
> >> release
> >> as pending upon them. We will keep these lists updated about
> progress.
> >>
> >>
> >> I’d like to clarify that upgrade testing is blocked. So just fixing
> >> these test blocker(s) isn’t enough to call release-6 green. We need to
> >> continue and finish the rest of the upgrade tests once the respective
> >> bugs are fixed.
> >
> > Based on fixes expected by tomorrow for the upgrade fixes, we will build
> > an RC1 candidate on Wednesday (6-Mar) (tagging early Wed. Eastern TZ).
> > This RC can be used for further testing.
>
> There have been no backports for the upgrade failures, request folks
> working on the same to post a list of bugs that need to be fixed, to
> enable tracking the same. (also, ensure they are marked against the
> release-6 tracker [1])
>
> Also, we need to start writing out the upgrade guide for release-6, any
> volunteers for the same?
>
> Thanks,
> Shyam
>
> [1] Release-6 tracker bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] GlusterFS - 6.0RC - Test days (27th, 28th Feb)

2019-03-04 Thread Atin Mukherjee
On Mon, 4 Mar 2019 at 20:33, Amar Tumballi Suryanarayan 
wrote:

> Thanks to those who participated.
>
> Update at present:
>
> We found 3 blocker bugs in upgrade scenarios, and hence have marked release
> as pending upon them. We will keep these lists updated about progress.


I’d like to clarify that upgrade testing is blocked. So just fixing these
test blocker(s) isn’t enough to call release-6 green. We need to continue
and finish the rest of the upgrade tests once the respective bugs are fixed.


>
> -Amar
>
> On Mon, Feb 25, 2019 at 11:41 PM Amar Tumballi Suryanarayan <
> atumb...@redhat.com> wrote:
>
> > Hi all,
> >
> > We are calling out our users, and developers to contribute in validating
> > ‘glusterfs-6.0rc’ build in their usecase. Specially for the cases of
> > upgrade, stability, and performance.
> >
> > Some of the key highlights of the release are listed in release-notes
> > draft
> > <
> https://github.com/gluster/glusterfs/blob/release-6/doc/release-notes/6.0.md
> >.
> > Please note that there are some of the features which are being dropped
> out
> > of this release, and hence making sure your setup is not going to have an
> > issue is critical. Also the default lru-limit option in fuse mount for
> > Inodes should help to control the memory usage of client processes. All
> the
> > good reason to give it a shot in your test setup.
> >
> > If you are developer using gfapi interface to integrate with other
> > projects, you also have some signature changes, so please make sure your
> > project would work with latest release. Or even if you are using a
> project
> > which depends on gfapi, report the error with new RPMs (if any). We will
> > help fix it.
> >
> > As part of test days, we want to focus on testing the latest upcoming
> > release i.e. GlusterFS-6, and one or the other gluster volunteers would
> be
> > there on #gluster channel on freenode to assist the people. Some of the
> key
> > things we are looking as bug reports are:
> >
> >-
> >
> >See if upgrade from your current version to 6.0rc is smooth, and works
> >as documented.
> >- Report bugs in process, or in documentation if you find mismatch.
> >-
> >
> >Functionality is all as expected for your usecase.
> >- No issues with actual application you would run on production etc.
> >-
> >
> >Performance has not degraded in your usecase.
> >- While we have added some performance options to the code, not all of
> >   them are turned on, as they have to be done based on usecases.
> >   - Make sure the default setup is at least same as your current
> >   version
> >   - Try out few options mentioned in release notes (especially,
> >   --auto-invalidation=no) and see if it helps performance.
> >-
> >
> >While doing all the above, check below:
> >- see if the log files are making sense, and not flooding with some
> >   “for developer only” type of messages.
> >   - get ‘profile info’ output from old and now, and see if there is
> >   anything which is out of normal expectation. Check with us on the
> numbers.
> >   - get a ‘statedump’ when there are some issues. Try to make sense
> >   of it, and raise a bug if you don’t understand it completely.
> >
> >
> > <
> https://hackmd.io/YB60uRCMQRC90xhNt4r6gA?both#Process-expected-on-test-days
> >Process
> > expected on test days.
> >
> >-
> >
> >We have a tracker bug
> >[0]
> >- We will attach all the ‘blocker’ bugs to this bug.
> >-
> >
> >Use this link to report bugs, so that we have more metadata around
> >given bugzilla.
> >- Click Here
> >   <
> https://bugzilla.redhat.com/enter_bug.cgi?blocked=1672818_severity=high=core=high=GlusterFS_whiteboard=gluster-test-day=6
> >
> >   [1]
> >-
> >
> >The test cases which are to be tested are listed here in this sheet
> ><
> https://docs.google.com/spreadsheets/d/1AS-tDiJmAr9skK535MbLJGe_RfqDQ3j1abX1wtjwpL4/edit?usp=sharing
> >[2],
> >please add, update, and keep it up-to-date to reduce duplicate efforts

-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Release 5.4: Earlier than Mar-10th

2019-02-26 Thread Atin Mukherjee
Milind - response inline.

Shyam - I'm not sure if you're tracking this, but at this point any blocker
bug attached to 5.4 should also have a clone for release-6 attached to the
release-6 tracker, otherwise we'll regress!

On Tue, Feb 26, 2019 at 5:00 PM Milind Changire  wrote:

> On Fri, Feb 8, 2019 at 8:26 PM Shyam Ranganathan 
> wrote:
>
>> Hi,
>>
>> There have been several crashes and issues reported by users on the
>> latest 5.3 release. Around 10 patches have been merged since 5.3 and we
>> have a few more addressing the blockers against 5.4 [1].
>>
>> The original date was March 10th, but we would like to accelerate a 5.4
>> release to earlier this month, once all critical blockers are fixed.
>>
>> Hence there are 3 asks in this mail,
>>
>> 1) Packaging team, would it be possible to accommodate an 5.4 release
>> around mid-next week (RC0 for rel-6 is also on the same week)? Assuming
>> we get the required fixes by then.
>>
>> 2) Maintainers, what other issues need to be tracked as blockers? Please
>> add them to [1].
>>
>> 3) Current blocker status reads as follows:
>> - Bug 1651246 - Failed to dispatch handler
>>   - This shows 2 patches that are merged, but there is no patch that
>> claims this is "Fixed" hence bug is still in POST state. What other
>> fixes are we expecting on this?
>>   - @Milind request you to update the status
>>
> Bug 1651246 has been addressed.
> Patch has been merged on master as well as release-5 branches.
> Above patch addresses logging issue only.
>

Isn't this applicable to the release-6 branch as well? I don't find
https://review.gluster.org/#/c/glusterfs/+/1/ in the release-6 branch, which
means we're going to regress this in release 6 if it isn't backported and
marked as a blocker for release 6.


>
>> - Bug 1671556 - glusterfs FUSE client crashing every few days with
>> 'Failed to dispatch handler'
>>   - Awaiting fixes for identified issues
>>   - @Nithya what would be the target date?
>>
>> - Bug 1671603 - flooding of "dict is NULL" logging & crash of client
>> process
>>   - Awaiting a fix, what is the potential target date for the same?
>>   - We also need the bug assigned to a person
>>
> Bug 1671603 has been addressed.
> Patch has been posted on master and merged on release-5 branches.
>

Are you sure the links are correct? The patch posted against the release-5
branch appears to be abandoned. And just like above, the same question
applies for the release-6 branch: I don't see a patch there.

Above patch addresses logging issue only.
>
>
>> Thanks,
>> Shyam
>>
>> [1] Release 5.4 tracker:
>> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.4
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> https://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>
> --
> Milind
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Bug state change proposal based on the conversation on bz 1630368

2018-11-06 Thread Atin Mukherjee
On Tue, 6 Nov 2018 at 19:57, Shyam Ranganathan  wrote:

> On 11/06/2018 09:20 AM, Atin Mukherjee wrote:
> >
> >
> > On Tue, Nov 6, 2018 at 7:16 PM Shyam Ranganathan <srang...@redhat.com> wrote:
> >
> > On 11/05/2018 07:00 PM, Atin Mukherjee wrote:
> > > Bit late to this, but I’m in favour of the proposal.
> > >
> > > The script change should only consider transitioning the bug
> > status from
> > > POST to CLOSED NEXTRELEASE on master branch only. What’d be also
> ideal
> > > is to update the fixed in version in which this patch will land.
> >
> > 2 things, based on my response to this thread,
> >
> > - Script will change this bug state for all branches, not just
> master. I
> > do not see a reason to keep master special.
> >
> > - When moving the state to NEXTRELEASE I would not want to put in a
> > fixed in version yet, as that may change/morph, instead it would be
> > added (as it is now) when the release is made and the bug changed to
> > CURRENTRELEASE.
> >
> >
> > I can buy in the point of having the other branches also follow the same
> > rule of bug status moving to NEXTRELEASE from POST (considering we're
> > fine to run a script during the release of mass moving them to
> > CURRENTRELEASE) but not having the fixed in version in the bugs which
> > are with mainline branch may raise a question/concern on what exact
> > version this bug is being addressed at? Or is it that the post release
> > bug movement script also considers all the bugs fixed in the master
> > branch as well?
>
> Here is the way I see it,
> - If you find a bug on master and want to know if it is
> present/applicable for a release, you chase it's clone against the release
> - The state of the cloned bug against the release, tells you if is is
> CURRENTRELEASE/NEXTRELEASE/or what not.
>
> So referring to the bug on master, to determine state on which
> release(s) it is fixed in is not the way to find fixed state.


Question: With this workflow, what happens when a bug is filed & fixed only
on master and the fix reaches the next release as part of the branch-out?
How would a user understand which release version the fix is in if we don't
have a fixed-in version?


>
> As a result,
> - A bug on master with NEXTRELEASE means next major release of master.
>
> - A Bug on a release branch with NEXTRELEASE means, next major/minor
> release of the branch.
>
> >
> >
> > In all, the only change is the already existing script moving a bug
> from
> > POST to CLOSED-NEXTRELEASE instead of MODIFIED.
> >
> > >
> > > On Mon, 5 Nov 2018 at 21:39, Yaniv Kaul <yk...@redhat.com> wrote:
> > >
> > >
> > >
> > > On Mon, Nov 5, 2018 at 5:05 PM Sankarshan Mukhopadhyay
> > > <sankarshan.mukhopadh...@gmail.com> wrote:
> > >
> > > On Mon, Nov 5, 2018 at 8:14 PM Yaniv Kaul <yk...@redhat.com> wrote:
> > > > On Mon, Nov 5, 2018 at 4:28 PM Niels de Vos <nde...@redhat.com> wrote:
> > > >>
> > > >> On Mon, Nov 05, 2018 at 05:31:26PM +0530, Pranith Kumar
> > > Karampuri wrote:
> > > >> > hi,
> > > >> > When we create a bz on master and clone it to the
> > next
> > > release(In my
> > > >> > case it was release-5.0), after that release happens
> > can we
> > > close the bz on
> > > >> > master with CLOSED NEXTRELEASE?
> > > >
> > > >
> > > > Since no one is going to verify it (right now, but I'm
> > hopeful
> > > this will change in the future!), no point in keeping it
> open.
> > > > You could keep it open and move it along the process,
> > and then
> > > close it properly when you rel

Re: [Gluster-Maintainers] Bug state change proposal based on the conversation on bz 1630368

2018-11-06 Thread Atin Mukherjee
On Tue, Nov 6, 2018 at 7:16 PM Shyam Ranganathan 
wrote:

> On 11/05/2018 07:00 PM, Atin Mukherjee wrote:
> > Bit late to this, but I’m in favour of the proposal.
> >
> > The script change should only consider transitioning the bug status from
> > POST to CLOSED NEXTRELEASE on master branch only. What’d be also ideal
> > is to update the fixed in version in which this patch will land.
>
> 2 things, based on my response to this thread,
>
> - Script will change this bug state for all branches, not just master. I
> do not see a reason to keep master special.
>
> - When moving the state to NEXTRELEASE I would not want to put in a
> fixed in version yet, as that may change/morph, instead it would be
> added (as it is now) when the release is made and the bug changed to
> CURRENTRELEASE.
>

I can buy into the point of having the other branches also follow the same
rule of moving bug status from POST to NEXTRELEASE (considering we're fine
with running a script during the release that mass-moves them to
CURRENTRELEASE). But not having the fixed-in version in the bugs on the
mainline branch may raise a question/concern about which exact version the
bug is being addressed in. Or does the post-release bug movement script also
consider all the bugs fixed on the master branch?


> In all, the only change is the already existing script moving a bug from
> POST to CLOSED-NEXTRELEASE instead of MODIFIED.
>
> >
> > On Mon, 5 Nov 2018 at 21:39, Yaniv Kaul <yk...@redhat.com> wrote:
> >
> >
> >
> > On Mon, Nov 5, 2018 at 5:05 PM Sankarshan Mukhopadhyay
> > <sankarshan.mukhopadh...@gmail.com> wrote:
> >
> > On Mon, Nov 5, 2018 at 8:14 PM Yaniv Kaul <yk...@redhat.com> wrote:
> > > On Mon, Nov 5, 2018 at 4:28 PM Niels de Vos <nde...@redhat.com> wrote:
> > >>
> > >> On Mon, Nov 05, 2018 at 05:31:26PM +0530, Pranith Kumar
> > Karampuri wrote:
> > >> > hi,
> > >> > When we create a bz on master and clone it to the next
> > release(In my
> > >> > case it was release-5.0), after that release happens can we
> > close the bz on
> > >> > master with CLOSED NEXTRELEASE?
> > >
> > >
> > > Since no one is going to verify it (right now, but I'm hopeful
> > this will change in the future!), no point in keeping it open.
> > > You could keep it open and move it along the process, and then
> > close it properly when you release the next release.
> > > It's kinda pointless if no one's going to do anything with it
> > between MODIFIED to CLOSED.
> > > I mean - assuming you move it to ON_QA - who's going to do the
> > verification?
> > >
> > > In oVirt, QE actually verifies upstream bugs, so there is
> > value. They are also all appear in the release notes, with their
> > status and so on.
> >
> > The Glusto framework is intended to accomplish this end, is it
> not?
> >
> >
> > If the developer / QE engineer developed a test case for that BZ -
> > that would be amazing!
> > Y.
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > https://lists.gluster.org/mailman/listinfo/maintainers
> >
> > --
> > - Atin (atinm)
> >
> >
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > https://lists.gluster.org/mailman/listinfo/maintainers
> >
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Bug state change proposal based on the conversation on bz 1630368

2018-11-05 Thread Atin Mukherjee
Bit late to this, but I'm in favour of the proposal.

The script change should consider transitioning the bug status from POST to
CLOSED NEXTRELEASE on the master branch only. What'd also be ideal is to
update the fixed-in version to the release in which the patch will land.

On Mon, 5 Nov 2018 at 21:39, Yaniv Kaul  wrote:

>
>
> On Mon, Nov 5, 2018 at 5:05 PM Sankarshan Mukhopadhyay <
> sankarshan.mukhopadh...@gmail.com> wrote:
>
>> On Mon, Nov 5, 2018 at 8:14 PM Yaniv Kaul  wrote:
>> > On Mon, Nov 5, 2018 at 4:28 PM Niels de Vos  wrote:
>> >>
>> >> On Mon, Nov 05, 2018 at 05:31:26PM +0530, Pranith Kumar Karampuri
>> wrote:
>> >> > hi,
>> >> > When we create a bz on master and clone it to the next
>> release(In my
>> >> > case it was release-5.0), after that release happens can we close
>> the bz on
>> >> > master with CLOSED NEXTRELEASE?
>> >
>> >
>> > Since no one is going to verify it (right now, but I'm hopeful this
>> will change in the future!), no point in keeping it open.
>> > You could keep it open and move it along the process, and then close it
>> properly when you release the next release.
>> > It's kinda pointless if no one's going to do anything with it between
>> MODIFIED to CLOSED.
>> > I mean - assuming you move it to ON_QA - who's going to do the
>> verification?
>> >
>> > In oVirt, QE actually verifies upstream bugs, so there is value. They
>> are also all appear in the release notes, with their status and so on.
>>
>> The Glusto framework is intended to accomplish this end, is it not?
>>
>
> If the developer / QE engineer developed a test case for that BZ - that
> would be amazing!
> Y.
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Release 5: Missing option documentation (need inputs)

2018-10-10 Thread Atin Mukherjee
On Wed, 10 Oct 2018 at 20:30, Shyam Ranganathan  wrote:

> The following options were added post 4.1 and are part of 5.0 as the
> first release for the same. They were added in as part of bugs, and
> hence looking at github issues to track them as enhancements did not
> catch the same.
>
> We need to document it in the release notes (and also the gluster doc.
> site ideally), and hence I would like a some details on what to write
> for the same (or release notes commits) for them.
>
> Option: cluster.daemon-log-level
> Attention: @atin
> Review: https://review.gluster.org/c/glusterfs/+/20442


This option is meant to be used only on an extreme-need basis, which is why
it has been marked as GLOBAL_NO_DOC. So ideally this shouldn't be
documented.

Do we still want to capture it in the release notes?


>
> Option: ctime-invalidation
> Attention: @Du
> Review: https://review.gluster.org/c/glusterfs/+/20286
>
> Option: shard-lru-limit
> Attention: @krutika
> Review: https://review.gluster.org/c/glusterfs/+/20544
>
> Option: shard-deletion-rate
> Attention: @krutika
> Review: https://review.gluster.org/c/glusterfs/+/19970
>
> Please send in the required text ASAP, as we are almost towards the end
> of the release.
>
> Thanks,
> Shyam
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 5: Branched and further dates

2018-10-05 Thread Atin Mukherjee
On Fri, 5 Oct 2018 at 20:29, Shyam Ranganathan  wrote:

> On 10/04/2018 11:33 AM, Shyam Ranganathan wrote:
> > On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> >> RC1 would be around 24th of Sep. with final release tagging around 1st
> >> of Oct.
> >
> > RC1 now stands to be tagged tomorrow, and patches that are being
> > targeted for a back port include,
>
> We still are awaiting release notes (other than the bugs section) to be
> closed.
>
> There is one new bug that needs attention from the replicate team.
> https://bugzilla.redhat.com/show_bug.cgi?id=1636502
>
> The above looks important to me to be fixed before the release, @ravi or
> @pranith can you take a look?
>
> >
> > 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> > mux cases)
> >
> > @RaBhat working on this.
>
> Done
>
> >
> > 2) Py3 corrections in master
> >
> > @Kotresh are all changes made to master backported to release-5 (may not
> > be merged, but looking at if they are backported and ready for merge)?
>
> Done, release notes amend pending
>
> >
> > 3) Release notes review and updates with GD2 content pending
> >
> > @Kaushal/GD2 team can we get the updates as required?
> > https://review.gluster.org/c/glusterfs/+/21303
>
> Still awaiting this.


Kaushal has added a comment to the patch providing the content this morning
IST. Are there any additional details you are looking for?


>
> >
> > 4) This bug [2] was filed when we released 4.0.
> >
> > The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> > missing and hence post-upgrade clients failing the mount). This is
> > possibly the last chance to fix it.
> >
> > Glusterd and protocol maintainers, can you chime in, if this bug needs
> > to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>
> Release notes to be corrected to call this out.
>
> >
> > The tracker bug [1] does not have any other blockers against it, hence
> > assuming we are not tracking/waiting on anything other than the set
> above.
> >
> > Thanks,
> > Shyam
> >
> > [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> > [2] Potential upgrade bug:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1540659
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > https://lists.gluster.org/mailman/listinfo/maintainers
> >
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Atin Mukherjee
On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
wrote:

> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> > RC1 would be around 24th of Sep. with final release tagging around 1st
> > of Oct.
>
> RC1 now stands to be tagged tomorrow, and patches that are being
> targeted for a back port include,
>
> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> mux cases)
>
> @RaBhat working on this.
>
> 2) Py3 corrections in master
>
> @Kotresh are all changes made to master backported to release-5 (may not
> be merged, but looking at if they are backported and ready for merge)?
>
> 3) Release notes review and updates with GD2 content pending
>
> @Kaushal/GD2 team can we get the updates as required?
> https://review.gluster.org/c/glusterfs/+/21303
>
> 4) This bug [2] was filed when we released 4.0.
>
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
>
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>

This is a bad bug to live with. OTOH, I do not have an immediate solution
in mind for how to make sure these options, when reintroduced, are made
no-ops, and especially how to disallow tuning them (without dirty
option-check hacks in the volume-set staging code). If we're to tag RC1
tomorrow, I wouldn't be able to take the risk of committing this change.

Can we instead have a note in our upgrade guide documenting that if you're
upgrading to 4.1 or a higher version, you should disable these options
before the upgrade to mitigate this?
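
For illustration only, here is a minimal sketch of the kind of staging-time
guard I am calling a dirty hack above. All names here (noop_options,
is_noop_option, stage_volume_set, the option key) are hypothetical and not
actual glusterd symbols; a real change would have to hook into glusterd's
option table and its volume-set staging path instead.

#include <stdio.h>
#include <string.h>

/* Hypothetical list of reintroduced option keys that must stay no-ops. */
static const char *noop_options[] = {
    "features.example-removed-option",
    NULL,
};

static int
is_noop_option(const char *key)
{
    for (int i = 0; noop_options[i] != NULL; i++) {
        if (strcmp(key, noop_options[i]) == 0)
            return 1;
    }
    return 0;
}

/* A staging-time check would then reject any attempt to tune them. */
int
stage_volume_set(const char *key, const char *value)
{
    if (is_noop_option(key)) {
        fprintf(stderr,
                "option %s is retained only for compatibility and cannot "
                "be tuned (value %s rejected)\n", key, value);
        return -1;
    }
    return 0; /* continue with normal staging */
}

int main(void)
{
    /* Example: this set request would be rejected at staging time. */
    return stage_volume_set("features.example-removed-option", "on") ? 1 : 0;
}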


> The tracker bug [1] does not have any other blockers against it, hence
> assuming we are not tracking/waiting on anything other than the set above.
>
> Thanks,
> Shyam
>
> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> [2] Potential upgrade bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [for discussion] suggestions around improvements in bug triage workflow

2018-09-27 Thread Atin Mukherjee
On Thu, 27 Sep 2018 at 20:37, Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> The origin of this conversation is a bit of a hall-way discussion with
> Shyam. The actual matter should be familiar to maintainers. For what
> it is worth, it was also mentioned at the recent Community meeting.
>
> As the current workflows go, once a release is made generally
> available, a large swathe of bugs against an EOLd release are
> automatically closed citing that "the release is EOLd and if the bug
> is still reproducible on later releases, please reopen against those".
> However, there is perhaps a better way to handle this:


I will play a devil's advocate role here; there are a couple of questions
we additionally need to ask ourselves:

- Why are we getting into such a state where so many bugs, primarily the
ones which haven't got development's attention, get auto-closed due to EOL?
- Doesn't this indicate we're actually piling up our backlog with (probably)
genuine defects and not taking enough action?

Bugzilla triage needs to become a habit for individuals so that new bugs
get attention; once that happens, this will no longer be a problem.

However, for now I think this workflow sounds like the right measure, at
least to ensure we don't close down a genuine defect.


>
> [0] clone the bug into master so that it continues to be part of a
> valid bug backlog
>
> [1] validate per release that the circumstances described by the bug
> are actually resolved and hence CLOSED CURRENTRELEASE them
>
> I am posting here for discussion around this as well as being able to
> identify whether tooling/automation can be used to handle some of
> this.
>
>
>
> --
> sankarshan mukhopadhyay
> 
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Lock down period merge process

2018-09-27 Thread Atin Mukherjee
On Thu, 27 Sep 2018 at 18:27, Pranith Kumar Karampuri 
wrote:

>
>
> On Thu, Sep 27, 2018 at 5:27 PM Atin Mukherjee 
> wrote:
>
>> tests/bugs//xxx.t failing can’t always mean there’s a bug in
>> component Y.
>>
>
> I agree.
>
>
>> It could be anywhere till we root cause the problem.
>>
>
> Some one needs to step in to find out what the root cause is. I agree that
> for a component like glusterd bugs in other components can easily lead to
> failures. How do we make sure that someone takes a look at it?
>
>
>> Now does this mean we block commit rights for component Y till we have
>> the root cause?
>>
>
> It was a way of making it someone's priority. If you have another way to
> make it someone's priority that is better than this, please suggest and we
> can have a discussion around it and agree on it :-).
>

This is what I can think of:

1. Component peers/maintainers take a first triage of the test failure, do
the initial debugging, and (a) point to the component which needs further
debugging or (b) seek help on the gluster-devel ML for additional insight
into identifying the problem and narrowing it down to a component.
2. If it's (1a) then we already know the component and the owner. If it's
(1b), at this juncture it's all maintainers' responsibility to ensure the
email is well understood and that, based on the available details, ownership
is picked up by the respective maintainers. Multiple maintainers might also
need to be involved, which is why I see this as a group effort rather than
an individual one.


>
>
>> That doesn’t make much sense right? This is one of the reasons in such
>> case we need to work as a group, figure out the problem and fix it, till
>> then locking down the entire repo for further commits look a better option
>> (IMHO).
>>
>
> Let us dig deeper into what happens when we work as a group, in general it
> will be one person who will take the lead and get help. Is there a way to
> find that person without locking down whole master? If there is, we may
> never have to get to a place where we lock down master completely. We may
> not even have to lock down components. Suggestions are welcome.
>
>
>> On Thu, 27 Sep 2018 at 14:04, Nigel Babu  wrote:
>>
>>> We know maintainers of the components which are leading to repeated
>>>> failures in that component and we just need to do the same thing we did to
>>>> remove commit access for the maintainer of the component instead of all of
>>>> the people. So in that sense it is not good faith and can be enforced.
>>>>
>>>
>>> Pranith, I believe the difference of opinion is because you're looking
>>> at this problem in terms of "who" rather than "what". We do not care about
>>> *who* broke master. Removing commit access from a component owner doesn't
>>> stop someone else from landing a patch will create a failure in the same
>>> component or even a different component. We cannot stop patches from
>>> landing because it touches a specific component. And even if we could, our
>>> components are not entirely independent of each other. There could still be
>>> failures. This is a common scenario and it happened the last time we had to
>>> close master. Let me further re-emphasize our goals:
>>>
>>> * When master is broken, every team member's energy needs to be focused
>>> on getting master to green. Who broke the build isn't a concern as much as
>>> *the build is broken*. This is not a situation to punish specific people.
>>> * If we allow other commits to land, we run the risk of someone else
>>> breaking master with a different patch. Now we have two failures to debug
>>> and fix.
>>> ___
>>> maintainers mailing list
>>> maintainers@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/maintainers
>>>
>> --
>> - Atin (atinm)
>>
>
>
> --
> Pranith
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Maintainer's meeting minutes: 17th Sept, 2018

2018-09-18 Thread Atin Mukherjee
On Wed, Sep 19, 2018 at 9:39 AM Sankarshan Mukhopadhyay <
sankarshan.mukhopadh...@gmail.com> wrote:

> On Wed, Sep 19, 2018 at 9:34 AM Amar Tumballi  wrote:
>
> [snip]
>
> > GCS Status:
> >
> > Deploy scripts mostly working fine: https://github.com/kshlm/gd2-k8s
> >
> > Few more patches required in GD2.
> >
> > Where to send the PR ?
> >
> > Plan is to send it to GCS, as the plan is to get more issues and
> discussions, along with documentation on that repo.
> >
>
> What is the best way for a peripheral participant/audience to be aware
> of GCS? Is it going to be in the monthly report?
>

The plan is to keep the community up to date about progress on GCS with a
report at least every other week, if not weekly.


> --
> sankarshan mukhopadhyay
> 
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #857

2018-09-13 Thread Atin Mukherjee
Nigel - please apply
https://review.gluster.org/#/c/glusterfs/+/21126/2 to get this corrected.

On Thu, 13 Sep 2018 at 13:28, Amar Tumballi  wrote:

> This looks to be an issue with clang-format changes...
>
> On Wed 12 Sep, 2018, 7:58 PM ,  wrote:
>
>> See <
>> https://build.gluster.org/job/regression-test-with-multiplex/857/display/redirect?page=changes
>> >
>>
>> Changes:
>>
>> [Vijay Bellur] doc: add coding-standard and commit-msg link in README
>>
>> [Amar Tumballi] dht: Use snprintf instead of strncpy
>>
>> [Amar Tumballi] doc: make developer-index.md as README
>>
>> [Nigel Babu] clang-format: add the config file
>>
>> [Nigel Babu] Land clang-format changes
>>
>> [Nigel Babu] Land part 2 of clang-format changes
>>
>> [Amar Tumballi] template files: revert clang
>>
>> --
>> [...truncated 188.30 KB...]
>> in a given directory, LIBDIR, you must either use libtool, and
>> specify the full pathname of the library, or use the `-LLIBDIR'
>> flag during linking and do at least one of the following:
>>- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
>>  during execution
>>- add LIBDIR to the `LD_RUN_PATH' environment variable
>>  during linking
>>- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
>>- have your system administrator add LIBDIR to `/etc/ld.so.conf'
>>
>> See any operating system documentation about shared libraries for
>> more information, such as the ld(1) and ld.so(8) manual pages.
>> --
>> make[5]: Nothing to be done for `install-exec-am'.
>> make[5]: Nothing to be done for `install-data-am'.
>> Making install in utime
>> Making install in src
>> /usr/bin/python2 <
>> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-gen-fops-h.py>
>> <
>> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-autogen-fops-tmpl.h>
>> > utime-autogen-fops.h
>> make --no-print-directory install-am
>>   CC   utime-helpers.lo
>>   CC   utime.lo
>> /usr/bin/python2 <
>> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-gen-fops-c.py>
>> <
>> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-autogen-fops-tmpl.c>
>> > utime-autogen-fops.c
>>   CC   utime-autogen-fops.lo
>>   CCLD utime.la
>> make[6]: Nothing to be done for `install-exec-am'.
>>  /usr/bin/mkdir -p '/build/install/lib/glusterfs/4.2dev/xlator/features'
>>  /bin/sh ../../../../libtool   --mode=install /usr/bin/install -c
>> utime.la '/build/install/lib/glusterfs/4.2dev/xlator/features'
>> libtool: install: warning: relinking `utime.la'
>> libtool: install: (cd /build/scratch/xlators/features/utime/src; /bin/sh
>> /build/scratch/libtool  --silent --tag CC --mode=relink gcc -Wall -g -O2 -g
>> -rdynamic -O0 -DDEBUG -Wformat -Werror=format-security
>> -Werror=implicit-function-declaration -Wall -Werror -Wno-cpp -module
>> -avoid-version -export-symbols <
>> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/xlator.sym>
>> -Wl,--no-undefined -o utime.la -rpath
>> /build/install/lib/glusterfs/4.2dev/xlator/features utime-helpers.lo
>> utime.lo utime-autogen-fops.lo ../../../../libglusterfs/src/
>> libglusterfs.la -lrt -ldl -lpthread -lcrypto )
>> libtool: install: /usr/bin/install -c .libs/utime.soT
>> /build/install/lib/glusterfs/4.2dev/xlator/features/utime.so
>> libtool: install: /usr/bin/install -c .libs/utime.lai
>> /build/install/lib/glusterfs/4.2dev/xlator/features/utime.la
>> libtool: finish:
>> PATH="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/local/bin:/usr/bin:/build/install/sbin:/build/install/bin:/build/install/sbin:/build/install/bin:/sbin"
>> ldconfig -n /build/install/lib/glusterfs/4.2dev/xlator/features
>> --
>> Libraries have been installed in:
>>/build/install/lib/glusterfs/4.2dev/xlator/features
>>
>> If you ever happen to want to link against installed libraries
>> in a given directory, LIBDIR, you must either use libtool, and
>> specify the full pathname of the library, or use the `-LLIBDIR'
>> flag during linking and do at least one of the following:
>>- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
>>  during execution
>>- add LIBDIR to the `LD_RUN_PATH' environment variable
>>  during linking
>>- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
>>- have your system administrator add LIBDIR to `/etc/ld.so.conf'
>>
>> See any operating system documentation about shared libraries for
>> more information, such as the ld(1) and ld.so(8) manual pages.
>> --
>> make[5]: Nothing to be done for `install-exec-am'.
>> make[5]: Nothing to be done for 

Re: [Gluster-Maintainers] Build failed in Jenkins: experimental-periodic #441

2018-09-09 Thread Atin Mukherjee
Nigel pointed out to me that this is on the experimental branch, where the
fix isn't present yet. So this was an oversight on my end.

On Mon, Sep 10, 2018 at 9:47 AM Karthik Subrahmanya 
wrote:

> Hey Atin,
>
> Yes I fixed this recently. Will check why it is failing.
>
> Regards,
> Karthik
>
> On Mon, Sep 10, 2018 at 8:07 AM Atin Mukherjee 
> wrote:
>
>> I believe we fixed this test recently. Is this failure something new?
>>
>> On Sun, 9 Sep 2018 at 23:37,  wrote:
>>
>>> See <
>>> https://build.gluster.org/job/experimental-periodic/441/display/redirect
>>> >
>>>
>>> --
>>> [...truncated 962.15 KB...]
>>> ./tests/bugs/geo-replication/bug-877293.t  -  9 second
>>> ./tests/bugs/fuse/bug-963678.t  -  9 second
>>> ./tests/bugs/distribute/bug-884597.t  -  9 second
>>> ./tests/bugs/distribute/bug-882278.t  -  9 second
>>> ./tests/bugs/cli/bug-1030580.t  -  9 second
>>> ./tests/bugs/bug-1371806_2.t  -  9 second
>>> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
>>> -  9 second
>>> ./tests/basic/xlator-pass-through-sanity.t  -  9 second
>>> ./tests/basic/quota-nfs.t  -  9 second
>>> ./tests/basic/inode-quota-enforcing.t  -  9 second
>>> ./tests/basic/ec/ec-up.t  -  9 second
>>> ./tests/basic/afr/afr-up.t  -  9 second
>>> ./tests/performance/open-behind.t  -  8 second
>>> ./tests/features/lock-migration/lkmigration-set-option.t  -  8 second
>>> ./tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t  -  8 second
>>> ./tests/bugs/shard/bug-1468483.t  -  8 second
>>> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
>>> -  8 second
>>> ./tests/bugs/posix/bug-990028.t  -  8 second
>>> ./tests/bugs/io-cache/bug-858242.t  -  8 second
>>> ./tests/bugs/glusterfs-server/bug-904300.t  -  8 second
>>> ./tests/bugs/glusterfs/bug-872923.t  -  8 second
>>> ./tests/bugs/glusterd/bug-1242875-do-not-pass-volinfo-quota.t  -  8
>>> second
>>> ./tests/bugs/fuse/bug-985074.t  -  8 second
>>> ./tests/bugs/ec/bug-1179050.t  -  8 second
>>> ./tests/bugs/distribute/bug-1368012.t  -  8 second
>>> ./tests/bugs/cli/bug-1087487.t  -  8 second
>>> ./tests/bugs/changelog/bug-1208470.t  -  8 second
>>> ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  8 second
>>> ./tests/basic/volume-status.t  -  8 second
>>> ./tests/basic/tier/ctr-rename-overwrite.t  -  8 second
>>> ./tests/basic/glusterd/arbiter-volume-probe.t  -  8 second
>>> ./tests/basic/fop-sampling.t  -  8 second
>>> ./tests/basic/ec/ec-anonymous-fd.t  -  8 second
>>> ./tests/basic/afr/gfid-mismatch.t  -  8 second
>>> ./tests/basic/afr/arbiter-remove-brick.t  -  8 second
>>> ./tests/gfid2path/get-gfid-to-path.t  -  7 second
>>> ./tests/gfid2path/block-mount-access.t  -  7 second
>>> ./tests/bugs/upcall/bug-1458127.t  -  7 second
>>> ./tests/bugs/snapshot/bug-1260848.t  -  7 second
>>> ./tests/bugs/snapshot/bug-1064768.t  -  7 second
>>> ./tests/bugs/replicate/bug-1561129-enospc.t  -  7 second
>>> ./tests/bugs/replicate/bug-1365455.t  -  7 second
>>> ./tests/bugs/replicate/bug-1101647.t  -  7 second
>>> ./tests/bugs/quota/bug-1243798.t  -  7 second
>>> ./tests/bugs/nfs/bug-915280.t  -  7 second
>>> ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  7 second
>>> ./tests/bugs/io-stats/bug-1598548.t  -  7 second
>>> ./tests/bugs/io-cache/bug-read-hang.t  -  7 second
>>> ./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
>>> ./tests/bugs/gfapi/bug-1447266/1460514.t  -  7 second
>>> ./tests/bugs/fuse/bug-1030208.t  -  7 second
>>> ./tests/bugs/ec/bug-1227869.t  -  7 second
>>> ./tests/bugs/core/bug-986429.t  -  7 second
>>> ./tests/bugs/core/bug-908146.t  -  7 second
>>> ./tests/bugs/cli/bug-1022905.t  -  7 second
>>> ./tests/bugs/bug-1258069.t  -  7 second
>>> ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  7
>>> second
>>> ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t
>>> -  7 second
>>> ./tests/basic/gfapi/mandatory-lock-optimal.t  -  7 second
>>> ./tests/basic/ec/ec-read-policy.t  -  7 second
>>> ./tests/basic/distribute/throttle-rebal.t  -  7 second
>>> ./tests/basic/afr/tarissue.t  -  7 second
>>> ./tests/features/readdir-ahead.t  -  6 second
>>> ./tests/bugs/upcall/bug-upcall-stat.t  -  6 second
&

Re: [Gluster-Maintainers] Build failed in Jenkins: experimental-periodic #441

2018-09-09 Thread Atin Mukherjee
I believe we fixed this test recently. Is this failure something new?

On Sun, 9 Sep 2018 at 23:37,  wrote:

> See <
> https://build.gluster.org/job/experimental-periodic/441/display/redirect>
>
> --
> [...truncated 962.15 KB...]
> ./tests/bugs/geo-replication/bug-877293.t  -  9 second
> ./tests/bugs/fuse/bug-963678.t  -  9 second
> ./tests/bugs/distribute/bug-884597.t  -  9 second
> ./tests/bugs/distribute/bug-882278.t  -  9 second
> ./tests/bugs/cli/bug-1030580.t  -  9 second
> ./tests/bugs/bug-1371806_2.t  -  9 second
> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
> -  9 second
> ./tests/basic/xlator-pass-through-sanity.t  -  9 second
> ./tests/basic/quota-nfs.t  -  9 second
> ./tests/basic/inode-quota-enforcing.t  -  9 second
> ./tests/basic/ec/ec-up.t  -  9 second
> ./tests/basic/afr/afr-up.t  -  9 second
> ./tests/performance/open-behind.t  -  8 second
> ./tests/features/lock-migration/lkmigration-set-option.t  -  8 second
> ./tests/bugs/tier/bug-1205545-CTR-and-trash-integration.t  -  8 second
> ./tests/bugs/shard/bug-1468483.t  -  8 second
> ./tests/bugs/quota/bug-1250582-volume-reset-should-not-remove-quota-quota-deem-statfs.t
> -  8 second
> ./tests/bugs/posix/bug-990028.t  -  8 second
> ./tests/bugs/io-cache/bug-858242.t  -  8 second
> ./tests/bugs/glusterfs-server/bug-904300.t  -  8 second
> ./tests/bugs/glusterfs/bug-872923.t  -  8 second
> ./tests/bugs/glusterd/bug-1242875-do-not-pass-volinfo-quota.t  -  8 second
> ./tests/bugs/fuse/bug-985074.t  -  8 second
> ./tests/bugs/ec/bug-1179050.t  -  8 second
> ./tests/bugs/distribute/bug-1368012.t  -  8 second
> ./tests/bugs/cli/bug-1087487.t  -  8 second
> ./tests/bugs/changelog/bug-1208470.t  -  8 second
> ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  8 second
> ./tests/basic/volume-status.t  -  8 second
> ./tests/basic/tier/ctr-rename-overwrite.t  -  8 second
> ./tests/basic/glusterd/arbiter-volume-probe.t  -  8 second
> ./tests/basic/fop-sampling.t  -  8 second
> ./tests/basic/ec/ec-anonymous-fd.t  -  8 second
> ./tests/basic/afr/gfid-mismatch.t  -  8 second
> ./tests/basic/afr/arbiter-remove-brick.t  -  8 second
> ./tests/gfid2path/get-gfid-to-path.t  -  7 second
> ./tests/gfid2path/block-mount-access.t  -  7 second
> ./tests/bugs/upcall/bug-1458127.t  -  7 second
> ./tests/bugs/snapshot/bug-1260848.t  -  7 second
> ./tests/bugs/snapshot/bug-1064768.t  -  7 second
> ./tests/bugs/replicate/bug-1561129-enospc.t  -  7 second
> ./tests/bugs/replicate/bug-1365455.t  -  7 second
> ./tests/bugs/replicate/bug-1101647.t  -  7 second
> ./tests/bugs/quota/bug-1243798.t  -  7 second
> ./tests/bugs/nfs/bug-915280.t  -  7 second
> ./tests/bugs/nfs/bug-1143880-fix-gNFSd-auth-crash.t  -  7 second
> ./tests/bugs/io-stats/bug-1598548.t  -  7 second
> ./tests/bugs/io-cache/bug-read-hang.t  -  7 second
> ./tests/bugs/glusterfs/bug-861015-log.t  -  7 second
> ./tests/bugs/gfapi/bug-1447266/1460514.t  -  7 second
> ./tests/bugs/fuse/bug-1030208.t  -  7 second
> ./tests/bugs/ec/bug-1227869.t  -  7 second
> ./tests/bugs/core/bug-986429.t  -  7 second
> ./tests/bugs/core/bug-908146.t  -  7 second
> ./tests/bugs/cli/bug-1022905.t  -  7 second
> ./tests/bugs/bug-1258069.t  -  7 second
> ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -  7
> second
> ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t  -
> 7 second
> ./tests/basic/gfapi/mandatory-lock-optimal.t  -  7 second
> ./tests/basic/ec/ec-read-policy.t  -  7 second
> ./tests/basic/distribute/throttle-rebal.t  -  7 second
> ./tests/basic/afr/tarissue.t  -  7 second
> ./tests/features/readdir-ahead.t  -  6 second
> ./tests/bugs/upcall/bug-upcall-stat.t  -  6 second
> ./tests/bugs/transport/bug-873367.t  -  6 second
> ./tests/bugs/shard/bug-1258334.t  -  6 second
> ./tests/bugs/replicate/bug-767585-gfid.t  -  6 second
> ./tests/bugs/replicate/bug-1250170-fsync.t  -  6 second
> ./tests/bugs/quota/bug-1287996.t  -  6 second
> ./tests/bugs/quota/bug-1104692.t  -  6 second
> ./tests/bugs/posix/bug-1034716.t  -  6 second
> ./tests/bugs/nfs/bug-877885.t  -  6 second
> ./tests/bugs/nfs/bug-1116503.t  -  6 second
> ./tests/bugs/glusterfs/bug-848251.t  -  6 second
> ./tests/bugs/glusterd/bug-948729/bug-948729-mode-script.t  -  6 second
> ./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  6 second
> ./tests/bugs/glusterd/bug-1482906-peer-file-blank-line.t  -  6 second
> ./tests/bugs/glusterd/bug-1091935-brick-order-check-from-cli-to-glusterd.t
> -  6 second
> ./tests/bugs/distribute/bug-1088231.t  -  6 second
> ./tests/bugs/core/bug-1168803-snapd-option-validation-fix.t  -  6 second
> ./tests/bugs/cli/bug-982174.t  -  6 second
> ./tests/bugs/bitrot/bug-1229134-bitd-not-support-vol-set.t  -  6 second
> ./tests/bugs/bitrot/bug-1210684-scrub-pause-resume-error-handling.t  -  6
> second
> ./tests/bitrot/br-stub.t  -  6 second
> ./tests/basic/md-cache/bug-1317785.t  -  6 second
> 

[Gluster-Maintainers] Fwd: Build failed in Jenkins: regression-test-with-multiplex #854

2018-09-09 Thread Atin Mukherjee
Hi Nigel,

Seems like we need to tweak the patch we apply on top of master to kick off
the brick-mux run. Would you take care of it? Please reach out to me or
Mohit if you need help.

-- Forwarded message -
From: 
Date: Sun, 9 Sep 2018 at 19:54
Subject: Build failed in Jenkins: regression-test-with-multiplex #854
To: , , , <yk...@redhat.com>, , , <srako...@redhat.com>


See <
https://build.gluster.org/job/regression-test-with-multiplex/854/display/redirect?page=changes
>

Changes:

[Sanju Rakonde] glusterd: avoid using glusterd's working directory as a
brick

[atin] Some (mgmt) xlators: use

--
[...truncated 188.76 KB...]
   - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
 during execution
   - add LIBDIR to the `LD_RUN_PATH' environment variable
 during linking
   - use the `-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
--
make[5]: Nothing to be done for `install-exec-am'.
make[5]: Nothing to be done for `install-data-am'.
Making install in utime
Making install in src
/usr/bin/python2 <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-gen-fops-h.py>
<
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-autogen-fops-tmpl.h>
> utime-autogen-fops.h
make --no-print-directory install-am
  CC   utime-helpers.lo
  CC   utime.lo
/usr/bin/python2 <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-gen-fops-c.py>
<
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/features/utime/src/utime-autogen-fops-tmpl.c>
> utime-autogen-fops.c
  CC   utime-autogen-fops.lo
  CCLD utime.la
make[6]: Nothing to be done for `install-exec-am'.
 /usr/bin/mkdir -p '/build/install/lib/glusterfs/4.2dev/xlator/features'
 /bin/sh ../../../../libtool   --mode=install /usr/bin/install -c   utime.la
'/build/install/lib/glusterfs/4.2dev/xlator/features'
libtool: install: warning: relinking `utime.la'
libtool: install: (cd /build/scratch/xlators/features/utime/src; /bin/sh
/build/scratch/libtool  --silent --tag CC --mode=relink gcc -Wall -g -O2 -g
-rdynamic -O0 -DDEBUG -Wformat -Werror=format-security
-Werror=implicit-function-declaration -Wall -Werror -Wno-cpp -module
-avoid-version -export-symbols <
https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/xlator.sym>
-Wl,--no-undefined -o utime.la -rpath
/build/install/lib/glusterfs/4.2dev/xlator/features utime-helpers.lo
utime.lo utime-autogen-fops.lo ../../../../libglusterfs/src/libglusterfs.la
-lrt -ldl -lpthread -lcrypto )
libtool: install: /usr/bin/install -c .libs/utime.soT
/build/install/lib/glusterfs/4.2dev/xlator/features/utime.so
libtool: install: /usr/bin/install -c .libs/utime.lai
/build/install/lib/glusterfs/4.2dev/xlator/features/utime.la
libtool: finish:
PATH="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin:/usr/local/bin:/usr/bin:/build/install/sbin:/build/install/bin:/build/install/sbin:/build/install/bin:/sbin"
ldconfig -n /build/install/lib/glusterfs/4.2dev/xlator/features
--
Libraries have been installed in:
   /build/install/lib/glusterfs/4.2dev/xlator/features

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
 during execution
   - add LIBDIR to the `LD_RUN_PATH' environment variable
 during linking
   - use the `-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
--
make[5]: Nothing to be done for `install-exec-am'.
make[5]: Nothing to be done for `install-data-am'.
make[4]: Nothing to be done for `install-exec-am'.
make[4]: Nothing to be done for `install-data-am'.
Making install in encryption
Making install in rot-13
Making install in src
  CC   rot-13.lo
  CCLD rot-13.la
make[5]: Nothing to be done for `install-exec-am'.
 /usr/bin/mkdir -p '/build/install/lib/glusterfs/4.2dev/xlator/encryption'
 /bin/sh ../../../../libtool   --mode=install /usr/bin/install -c
rot-13.la '/build/install/lib/glusterfs/4.2dev/xlator/encryption'
libtool: install: warning: relinking `rot-13.la'
libtool: install: (cd 

[Gluster-Maintainers] Out of regression builders

2018-08-11 Thread Atin Mukherjee
As both Shyam & I are running multiple flavours of manually triggered
regression jobs (lcov, centos-7, brick-mux) on top of
https://review.gluster.org/#/c/glusterfs/+/20637/ , we'd need to occupy
most of the builders.

I have currently run out of builders to trigger some of the runs, and have
observed one of the patches occupying a builder even though the patch doesn't
come under the stabilization bucket. While there's no harm in keeping your
patch up to date with regression runs, the most critical task right now is to
get the upstream regression suites back to green asap, so I have no choice but
to kill such jobs. However, I will add a note to the respective patches before
killing them.

Inconvenience regretted.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Master branch lock down status (Fri, August 9th)

2018-08-11 Thread Atin Mukherjee
I saw the same behaviour for
https://build.gluster.org/job/regression-on-demand-full-run/47/consoleFull
as well. In both cases the common pattern is that a test was retried but the
job succeeded overall. Is this a bug that got introduced recently? At the
moment, this is blocking us from debugging any test that was retried but whose
job succeeded overall.

*01:54:20* Archiving artifacts*01:54:21* ‘glusterfs-logs.tgz’ doesn’t
match anything*01:54:21* No artifacts found that match the file
pattern "glusterfs-logs.tgz". Configuration error?*01:54:21* Finished:
SUCCESS
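
For context, a minimal sketch of what the job's log-collection step likely
looks like (the script name, paths and conditional are assumptions for
illustration, not the actual Jenkins job configuration). If the tarball is
produced only when the overall run fails, a run whose retried tests eventually
pass leaves nothing for the "Archiving artifacts" step to pick up:

    ./run-tests.sh "$@"
    RET=$?
    if [ "$RET" -ne 0 ]; then
        # only a failed run produces the tarball that Jenkins tries to archive
        tar -czf "$WORKSPACE/glusterfs-logs.tgz" /var/log/glusterfs
    fi
    exit $RET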



On Sat, Aug 11, 2018 at 9:40 AM Ravishankar N 
wrote:

>
>
> On 08/11/2018 07:29 AM, Shyam Ranganathan wrote:
> > ./tests/bugs/replicate/bug-1408712.t (one retry)
> I'll take a look at this. But it looks like archiving the artifacts
> (logs) for this run
> (
> https://build.gluster.org/job/regression-on-demand-full-run/44/consoleFull)
>
> was a failure.
> Thanks,
> Ravi
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Master branch lock down status (Thu, August 09th)

2018-08-10 Thread Atin Mukherjee
Pranith,

https://review.gluster.org/c/glusterfs/+/20685 seems to have caused multiple
failed runs of https://review.gluster.org/c/glusterfs/+/20637/8 in yesterday's
report. Did you get a chance to look at it?

On Fri, Aug 10, 2018 at 1:03 PM Pranith Kumar Karampuri 
wrote:

>
>
> On Fri, Aug 10, 2018 at 6:34 AM Shyam Ranganathan 
> wrote:
>
>> Today's test results are updated in the spreadsheet in sheet named "Run
>> patch set 8".
>>
>> I took in patch https://review.gluster.org/c/glusterfs/+/20685 which
>> caused quite a few failures, so not updating new failures as issue yet.
>>
>> Please look at the failures for tests that were retried and passed, as
>> the logs for the initial runs should be preserved from this run onward.
>>
>> Otherwise nothing else to report on the run status, if you are averse to
>> spreadsheets look at this comment in gerrit [1].
>>
>> Shyam
>>
>> [1] Patch set 8 run status:
>>
>> https://review.gluster.org/c/glusterfs/+/20637/8#message-54de30fa384fd02b0426d9db6d07fad4eeefcf08
>> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
>> > Deserves a new beginning, threads on the other mail have gone deep
>> enough.
>> >
>> > NOTE: (5) below needs your attention, rest is just process and data on
>> > how to find failures.
>> >
>> > 1) We are running the tests using the patch [2].
>> >
>> > 2) Run details are extracted into a separate sheet in [3] named "Run
>> > Failures" use a search to find a failing test and the corresponding run
>> > that it failed in.
>> >
>> > 3) Patches that are fixing issues can be found here [1], if you think
>> > you have a patch out there, that is not in this list, shout out.
>> >
>> > 4) If you own up a test case failure, update the spreadsheet [3] with
>> > your name against the test, and also update other details as needed (as
>> > comments, as edit rights to the sheet are restricted).
>> >
>> > 5) Current test failures
>> > We still have the following tests failing and some without any RCA or
>> > attention, (If something is incorrect, write back).
>> >
>> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
>> > attention)
>> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
>> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>> > (Atin)
>> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
>> > ./tests/basic/ec/ec-1468261.t (needs attention)
>> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
>> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
>> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
>> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
>> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
>> >
>> > Here are some newer failures, but mostly one-off failures except cores
>> > in ec-5-2.t. All of the following need attention as these are new.
>> >
>> > ./tests/00-geo-rep/00-georep-verify-setup.t
>> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
>> > ./tests/basic/stats-dump.t
>> > ./tests/bugs/bug-1110262.t
>> >
>> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>> > ./tests/basic/ec/ec-data-heal.t
>> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>>
>
> Sent https://review.gluster.org/c/glusterfs/+/20697 for the test above.
>
>
>> >
>> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
>> > ./tests/basic/ec/ec-5-2.t
>> >
>> > 6) Tests that are addressed or are not occurring anymore are,
>> >
>> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
>> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
>> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
>> > ./tests/bitrot/bug-1373520.t
>> > ./tests/bugs/distribute/bug-1117851.t
>> > ./tests/bugs/glusterd/quorum-validation.t
>> > ./tests/bugs/distribute/bug-1042725.t
>> >
>> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
>> > ./tests/bugs/quota/bug-1293601.t
>> > ./tests/bugs/bug-1368312.t
>> > ./tests/bugs/distribute/bug-1122443.t
>> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>> >
>> > Shyam (and Atin)
>> >
>> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
>> >> Health on master as of the last nightly run [4] is still the same.
>> >>
>> >> Potential patches that rectify the situation (as in [1]) are bunched in
>> >> a patch [2] that Atin and myself have put through several regressions
>> >> (mux, normal and line coverage) and these have also not passed.
>> >>
>> >> Till we rectify the situation we are locking down master branch commit
>> >> rights to the following people, Amar, Atin, Shyam, Vijay.
>> >>
>> >> The intention is to stabilize master and not add more patches that may
>> >> destabilize it.
>> >>
>> >> Test cases that are tracked as failures and need action are present
>> here
>> >> [3].
>> 

Re: [Gluster-Maintainers] [Gluster-devel] Master branch lock down status (Wed, August 08th)

2018-08-08 Thread Atin Mukherjee
On Thu, 9 Aug 2018 at 06:34, Shyam Ranganathan  wrote:

> Today's patch set 7 [1], included fixes provided till last evening IST,
> and its runs can be seen here [2] (yay! we can link to comments in
> gerrit now).
>
> New failures: (added to the spreadsheet)
> ./tests/bugs/protocol/bug-808400-repl.t (core dumped)
> ./tests/bugs/quick-read/bug-846240.t
>
> Older tests that had not recurred, but failed today: (moved up in the
> spreadsheet)
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
>
> Other issues;
> Test ./tests/basic/ec/ec-5-2.t core dumped again




> Few geo-rep failures, Kotresh should have more logs to look at with
> these runs
> Test ./tests/bugs/glusterd/quorum-validation.t dumped core again


>
> Atin/Amar, we may need to merge some of the patches that have proven to
> be holding up and fixing issues today, so that we do not leave
> everything to the last. Check and move them along or lmk.


Ack. I’ll be merging those patches.


>
> Shyam
>
> [1] Patch set 7: https://review.gluster.org/c/glusterfs/+/20637/7
> [2] Runs against patch set 7 and its status (incomplete as some runs
> have not completed):
>
> https://review.gluster.org/c/glusterfs/+/20637/7#message-37bc68ce6f2157f2947da6fd03b361ab1b0d1a77
> (also updated in the spreadsheet)
>
> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > Deserves a new beginning, threads on the other mail have gone deep
> enough.
> >
> > NOTE: (5) below needs your attention, rest is just process and data on
> > how to find failures.
> >
> > 1) We are running the tests using the patch [2].
> >
> > 2) Run details are extracted into a separate sheet in [3] named "Run
> > Failures" use a search to find a failing test and the corresponding run
> > that it failed in.
> >
> > 3) Patches that are fixing issues can be found here [1], if you think
> > you have a patch out there, that is not in this list, shout out.
> >
> > 4) If you own up a test case failure, update the spreadsheet [3] with
> > your name against the test, and also update other details as needed (as
> > comments, as edit rights to the sheet are restricted).
> >
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> > attention)
> > ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> > ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> > (Atin)
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
> > ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> > ./tests/basic/ec/ec-1468261.t (needs attention)
> > ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> > ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> > ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> > ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> > ./tests/bugs/replicate/bug-1363721.t (Ravi)
> >
> > Here are some newer failures, but mostly one-off failures except cores
> > in ec-5-2.t. All of the following need attention as these are new.
> >
> > ./tests/00-geo-rep/00-georep-verify-setup.t
> > ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> > ./tests/basic/stats-dump.t
> > ./tests/bugs/bug-1110262.t
> >
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
> > ./tests/basic/ec/ec-data-heal.t
> > ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
> >
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> > ./tests/basic/ec/ec-5-2.t
> >
> > 6) Tests that are addressed or are not occurring anymore are,
> >
> > ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> > ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> > ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> > ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> > ./tests/bitrot/bug-1373520.t
> > ./tests/bugs/distribute/bug-1117851.t
> > ./tests/bugs/glusterd/quorum-validation.t
> > ./tests/bugs/distribute/bug-1042725.t
> >
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> > ./tests/bugs/quota/bug-1293601.t
> > ./tests/bugs/bug-1368312.t
> > ./tests/bugs/distribute/bug-1122443.t
> > ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
> >
> > Shyam (and Atin)
> >
> > On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> >> Health on master as of the last nightly run [4] is still the same.
> >>
> >> Potential patches that rectify the situation (as in [1]) are bunched in
> >> a patch [2] that Atin and myself have put through several regressions
> >> (mux, normal and line coverage) and these have also not passed.
> >>
> >> Till we rectify the situation we are locking down master branch commit
> >> rights to the following people, Amar, Atin, Shyam, Vijay.
> >>
> >> The intention is to stabilize master and not 

Re: [Gluster-Maintainers] Master branch lock down status

2018-08-08 Thread Atin Mukherjee
On Wed, Aug 8, 2018 at 5:08 AM Shyam Ranganathan 
wrote:

> Deserves a new beginning, threads on the other mail have gone deep enough.
>
> NOTE: (5) below needs your attention, rest is just process and data on
> how to find failures.
>
> 1) We are running the tests using the patch [2].
>
> 2) Run details are extracted into a separate sheet in [3] named "Run
> Failures" use a search to find a failing test and the corresponding run
> that it failed in.
>
> 3) Patches that are fixing issues can be found here [1], if you think
> you have a patch out there, that is not in this list, shout out.
>
> 4) If you own up a test case failure, update the spreadsheet [3] with
> your name against the test, and also update other details as needed (as
> comments, as edit rights to the sheet are restricted).
>
> 5) Current test failures
> We still have the following tests failing and some without any RCA or
> attention, (If something is incorrect, write back).
>
> ./tests/bugs/replicate/bug-1290965-detect-bitrotten-objects.t (needs
> attention)
> ./tests/00-geo-rep/georep-basic-dr-tarssh.t (Kotresh)
> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Atin)
>

This one is fixed through https://review.gluster.org/20651, as I see no
failures for this test in the latest report from patch set 6.

./tests/bugs/ec/bug-1236065.t (Ashish)
> ./tests/00-geo-rep/georep-basic-dr-rsync.t (Kotresh)
> ./tests/basic/ec/ec-1468261.t (needs attention)
> ./tests/basic/afr/add-brick-self-heal.t (needs attention)
> ./tests/basic/afr/granular-esh/replace-brick.t (needs attention)
> ./tests/bugs/core/multiplex-limit-issue-151.t (needs attention)
> ./tests/bugs/glusterd/validating-server-quorum.t (Atin)
> ./tests/bugs/replicate/bug-1363721.t (Ravi)
>
> Here are some newer failures, but mostly one-off failures except cores
> in ec-5-2.t. All of the following need attention as these are new.
>
> ./tests/00-geo-rep/00-georep-verify-setup.t
> ./tests/basic/afr/gfid-mismatch-resolution-with-fav-child-policy.t
> ./tests/basic/stats-dump.t
> ./tests/bugs/bug-1110262.t
>
> ./tests/bugs/glusterd/mgmt-handshake-and-volume-sync-post-glusterd-restart.t
>

This failed because of https://review.gluster.org/20584. I believe there's
some timing issue introduced by this patch. As I highlighted in a comment on
https://review.gluster.org/#/c/20637, I'd request you to revert this change
and include https://review.gluster.org/20658.

./tests/basic/ec/ec-data-heal.t
> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t
>
> ./tests/bugs/snapshot/bug-1482023-snpashot-issue-with-other-processes-accessing-mounted-path.t
> ./tests/basic/ec/ec-5-2.t
>
> 6) Tests that are addressed or are not occurring anymore are,
>
> ./tests/bugs/glusterd/rebalance-operations-in-single-node.t
> ./tests/bugs/index/bug-1559004-EMLINK-handling.t
> ./tests/bugs/replicate/bug-1386188-sbrain-fav-child.t
> ./tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t
> ./tests/bitrot/bug-1373520.t
> ./tests/bugs/distribute/bug-1117851.t
> ./tests/bugs/glusterd/quorum-validation.t
> ./tests/bugs/distribute/bug-1042725.t
>
> ./tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
> ./tests/bugs/quota/bug-1293601.t
> ./tests/bugs/bug-1368312.t
> ./tests/bugs/distribute/bug-1122443.t
> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>
> Shyam (and Atin)
>
> On 08/05/2018 06:24 PM, Shyam Ranganathan wrote:
> > Health on master as of the last nightly run [4] is still the same.
> >
> > Potential patches that rectify the situation (as in [1]) are bunched in
> > a patch [2] that Atin and myself have put through several regressions
> > (mux, normal and line coverage) and these have also not passed.
> >
> > Till we rectify the situation we are locking down master branch commit
> > rights to the following people, Amar, Atin, Shyam, Vijay.
> >
> > The intention is to stabilize master and not add more patches that may
> > destabilize it.
> >
> > Test cases that are tracked as failures and need action are present here
> > [3].
> >
> > @Nigel, request you to apply the commit rights change as you see this
> > mail and let the list know regarding the same as well.
> >
> > Thanks,
> > Shyam
> >
> > [1] Patches that address regression failures:
> > https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
> >
> > [2] Bunched up patch against which regressions were run:
> > https://review.gluster.org/#/c/20637
> >
> > [3] Failing tests list:
> >
> https://docs.google.com/spreadsheets/d/1IF9GhpKah4bto19RQLr0y_Kkw26E_-crKALHSaSjZMQ/edit?usp=sharing
> >
> > [4] Nightly run dashboard: https://build.gluster.org/job/nightly-master/
> ___
> maintainers mailing list
> maintainers@gluster.org
> https://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Test: ./tests/bugs/ec/bug-1236065.t

2018-08-07 Thread Atin Mukherjee
+Mohit

Requesting Mohit for help.

On Wed, 8 Aug 2018 at 06:53, Shyam Ranganathan  wrote:

> On 08/07/2018 07:37 PM, Shyam Ranganathan wrote:
> > 5) Current test failures
> > We still have the following tests failing and some without any RCA or
> > attention, (If something is incorrect, write back).
> >
> > ./tests/bugs/ec/bug-1236065.t (Ashish)
>
> Ashish/Atin, the above test failed in run:
>
> https://build.gluster.org/job/regression-on-demand-multiplex/172/consoleFull
>
> The above run is based on patchset 4 of
> https://review.gluster.org/#/c/20637/4
>
> The logs look as below, and as Ashish is unable to reproduce this, and
> all failures are on line 78 with a heal outstanding of 105, looks like
> this run may provide some possibilities on narrowing it down.
>
> The problem seems to be glustershd not connecting to one of the bricks
> that is restarted, and hence failing to heal that brick. This also looks
> like what Ravi RCAd for the test: ./tests/bugs/replicate/bug-1363721.t
>
> ==
> Test times from: cat ./glusterd.log | grep TEST
> [2018-08-06 20:56:28.177386]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 77 gluster --mode=script
> --wignore volume heal patchy full ++
> [2018-08-06 20:56:28.767209]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 78 ^0$ get_pending_heal_count
> patchy ++
> [2018-08-06 20:57:48.957136]:++
> G_LOG:./tests/bugs/ec/bug-1236065.t: TEST: 80 rm -f 0.o 10.o 11.o 12.o
> 13.o 14.o 15.o 16.o 17.o 18.o 19.o 1.o 2.o 3.o 4.o 5.o 6.o 7.o 8.o 9.o
> ++
> ==
> Repeated connection failure to client-3 in glustershd.log:
> [2018-08-06 20:56:30.218482] I [rpc-clnt.c:2087:rpc_clnt_reconfig]
> 0-patchy-client-3: changing port to 49152 (from 0)
> [2018-08-06 20:56:30.222738] W [MSGID: 114043]
> [client-handshake.c:1061:client_setvolume_cbk] 0-patchy-client-3: failed
> to set the volume [Resource temporarily unavailable]
> [2018-08-06 20:56:30.222788] W [MSGID: 114007]
> [client-handshake.c:1090:client_setvolume_cbk] 0-patchy-client-3: failed
> to get 'process-uuid' from reply dict [Invalid argument]
> [2018-08-06 20:56:30.222813] E [MSGID: 114044]
> [client-handshake.c:1096:client_setvolume_cbk] 0-patchy-client-3:
> SETVOLUME on remote-host failed: cleanup flag is set for xlator.  Try
> again later [Resource tempor
> arily unavailable]
> [2018-08-06 20:56:30.222845] I [MSGID: 114051]
> [client-handshake.c:1201:client_setvolume_cbk] 0-patchy-client-3:
> sending CHILD_CONNECTING event
> [2018-08-06 20:56:30.222919] I [MSGID: 114018]
> [client.c:2255:client_rpc_notify] 0-patchy-client-3: disconnected from
> patchy-client-3. Client process will keep trying to connect to glusterd
> until brick's port is
>  available
> ==
> Repeated connection messages close to above retries in
> d-backends-patchy0.log:
> [2018-08-06 20:56:38.530009] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy0: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.530044] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> The message "I [MSGID: 101016] [glusterfs3.h:739:dict_to_xdr] 0-dict:
> key 'trusted.ec.version' is would not be sent on wire in future [Invalid
> argument]" repeated 6 times between [2018-08-06 20:56:37.931040] and
>  [2018-08-06 20:56:37.933084]
> [2018-08-06 20:56:38.530067] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-0-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.540499] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy1: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.540533] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.540555] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-1-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.552442] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy2: allowed = "*", received addr = "127.0.0.1"
> [2018-08-06 20:56:38.552472] I [login.c:111:gf_auth] 0-auth/login:
> allowed user names: 756f302a-66eb-4cc0-8f91-797183312f05
> [2018-08-06 20:56:38.552494] I [MSGID: 115029]
> [server-handshake.c:786:server_setvolume] 0-patchy-server: accepted
> client from
>
> CTX_ID:cb3b4fed-62a4-4ad5-8b92-97838c651b22-GRAPH_ID:0-PID:10506-HOST:builder104.clo
> ud.gluster.org-PC_NAME:patchy-client-2-RECON_NO:-0 (version: 4.2dev)
> [2018-08-06 20:56:38.571671] I [addr.c:55:compare_addr_and_update]
> 0-/d/backends/patchy4: allowed = "*", received 

Re: [Gluster-Maintainers] [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-02 Thread Atin Mukherjee
On Thu, Aug 2, 2018 at 4:37 PM Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

>
>
> On Thu, Aug 2, 2018 at 3:49 PM, Xavi Hernandez 
> wrote:
>
>> On Thu, Aug 2, 2018 at 6:14 AM Atin Mukherjee 
>> wrote:
>>
>>>
>>>
>>> On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee 
>>> wrote:
>>>
>>>> I just went through the nightly regression report of brick mux runs and
>>>> here's what I can summarize.
>>>>
>>>>
>>>> =
>>>> Fails only with brick-mux
>>>>
>>>> =
>>>> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after
>>>> 400 secs. Refer
>>>> https://fstat.gluster.org/failure/209?state=2_date=2018-06-30_date=2018-07-31=all,
>>>> specifically the latest report
>>>> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText
>>>> . Wasn't timing out as frequently as it was till 12 July. But since 27
>>>> July, it has timed out twice. Beginning to believe commit
>>>> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
>>>> secs isn't sufficient enough (Mohit?)
>>>>
>>>> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>>>> (Ref -
>>>> https://build.gluster.org/job/regression-test-with-multiplex/814/console)
>>>> -  Test fails only in brick-mux mode, AI on Atin to look at and get back.
>>>>
>>>> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
>>>> - Seems like failed just twice in last 30 days as per
>>>> https://fstat.gluster.org/failure/251?state=2_date=2018-06-30_date=2018-07-31=all.
>>>> Need help from AFR team.
>>>>
>>>> tests/bugs/quota/bug-1293601.t (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/812/console)
>>>> - Hasn't failed after 26 July and earlier it was failing regularly. Did we
>>>> fix this test through any patch (Mohit?)
>>>>
>>>> tests/bitrot/bug-1373520.t - (
>>>> https://build.gluster.org/job/regression-test-with-multiplex/811/console)
>>>> - Hasn't failed after 27 July and earlier it was failing regularly. Did we
>>>> fix this test through any patch (Mohit?)
>>>>
>>>
>>> I see this has failed in day before yesterday's regression run as well
>>> (and I could reproduce it locally with brick mux enabled). The test fails
>>> in healing a file within a particular time period.
>>>
>>> *15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55*15:55:19* FAILED 
>>> COMMAND: 512 path_size /d/backends/patchy5/FILE1
>>>
>>> Need EC dev's help here.
>>>
>>
>> I'm not sure where the problem is exactly. I've seen that when the test
>> fails, self-heal is attempting to heal the file, but when the file is
>> accessed, an Input/Output error is returned, aborting heal. I've checked
>> that a heal is attempted every time the file is accessed, but it fails
>> always. This error seems to come from bit-rot stub xlator.
>>
>> When in this situation, if I stop and start the volume, self-heal
>> immediately heals the files. It seems like a stale state that is kept by
>> the stub xlator, preventing the file from being healed.
>>
>> Adding bit-rot maintainers for help on this one.
>>
>
> Bitrot-stub marks the file as corrupted in inode_ctx. But when the file
> and its hardlink are deleted from that brick and a lookup is done
> on the file, it cleans up the marker on getting ENOENT. This is part of
> recovery steps, and only md-cache is disabled during the process.
> Is there any other perf xlators that needs to be disabled for this
> scenario to expect a lookup/revalidate on the brick where
> the back end file is deleted?
>

But the same test doesn't fail when brick multiplexing is not enabled. Do we
know why?


>
>> Xavi
>>
>>
>>
>>>
>>>> tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core,
>>>> not sure if related to brick mux or not, so not sure if brick mux is

Re: [Gluster-Maintainers] [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-08-01 Thread Atin Mukherjee
On Tue, Jul 31, 2018 at 10:11 PM Atin Mukherjee  wrote:

> I just went through the nightly regression report of brick mux runs and
> here's what I can summarize.
>
>
> =
> Fails only with brick-mux
>
> =
> tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after 400
> secs. Refer
> https://fstat.gluster.org/failure/209?state=2_date=2018-06-30_date=2018-07-31=all,
> specifically the latest report
> https://build.gluster.org/job/regression-test-burn-in/4051/consoleText .
> Wasn't timing out as frequently as it was till 12 July. But since 27 July,
> it has timed out twice. Beginning to believe commit
> 9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
> secs isn't sufficient enough (Mohit?)
>
> tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
> (Ref -
> https://build.gluster.org/job/regression-test-with-multiplex/814/console)
> -  Test fails only in brick-mux mode, AI on Atin to look at and get back.
>
> tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
> https://build.gluster.org/job/regression-test-with-multiplex/813/console)
> - Seems like failed just twice in last 30 days as per
> https://fstat.gluster.org/failure/251?state=2_date=2018-06-30_date=2018-07-31=all.
> Need help from AFR team.
>
> tests/bugs/quota/bug-1293601.t (
> https://build.gluster.org/job/regression-test-with-multiplex/812/console)
> - Hasn't failed after 26 July and earlier it was failing regularly. Did we
> fix this test through any patch (Mohit?)
>
> tests/bitrot/bug-1373520.t - (
> https://build.gluster.org/job/regression-test-with-multiplex/811/console)
> - Hasn't failed after 27 July and earlier it was failing regularly. Did we
> fix this test through any patch (Mohit?)
>

I see this has failed in the day before yesterday's regression run as well
(and I could reproduce it locally with brick mux enabled). The test fails to
heal a file within the expected time period.

*15:55:19* not ok 25 Got "0" instead of "512", LINENUM:55*15:55:19*
FAILED COMMAND: 512 path_size /d/backends/patchy5/FILE1

Need EC dev's help here.
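
For context, a hedged sketch of what that failing check amounts to (the helper
and the timeout below are assumptions for illustration, not the verbatim
contents of tests/bitrot/bug-1373520.t):

    # the test effectively waits for the brick copy to grow back to 512 bytes
    path_size() { stat -c %s "$1" 2>/dev/null; }   # assumed helper: file size in bytes
    timeout=120                                    # assumed heal window in seconds
    while [ "$(path_size /d/backends/patchy5/FILE1)" != "512" ] && [ "$timeout" -gt 0 ]; do
        sleep 1
        timeout=$((timeout - 1))
    done
    # "Got 0 instead of 512" means the brick copy never reached 512 bytes in
    # time, i.e. self-heal did not repair the file within the window.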


> tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core,
> not sure if related to brick mux or not, so not sure if brick mux is
> culprit here or not. Ref -
> https://build.gluster.org/job/regression-test-with-multiplex/806/console
> . Seems to be a glustershd crash. Need help from AFR folks.
>
>
> =
> Fails for non-brick mux case too
>
> =
> tests/bugs/distribute/bug-1122443.t 0 Seems to be failing at my setup very
> often, with out brick mux as well. Refer
> https://build.gluster.org/job/regression-test-burn-in/4050/consoleText .
> There's an email in gluster-devel and a BZ 1610240 for the same.
>
> tests/bugs/bug-1368312.t - Seems to be recent failures (
> https://build.gluster.org/job/regression-test-with-multiplex/815/console)
> - seems to be a new failure, however seen this for a non-brick-mux case too
> - https://build.gluster.org/job/regression-test-burn-in/4039/consoleText
> . Need some eyes from AFR folks.
>
> tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to brick
> mux, have seen this failing at multiple default regression runs. Refer
> https://fstat.gluster.org/failure/392?state=2_date=2018-06-30_date=2018-07-31=all
> . We need help from geo-rep dev to root cause this earlier than later
>
> tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
> mux, have seen this failing at multiple default regression runs. Refer
> https://fstat.gluster.org/failure/393?state=2_date=2018-06-30_date=2018-07-31=all
> . We need help from geo-rep dev to root cause this earlier than later
>
> tests/bugs/glusterd/validating-server-quorum.t (
> https://build.gluster.org/job/regression-test-with-multiplex/810/console)
> - Fails for non-brick-mux cases too,
> https://fstat.gluster.org/failure/580?state=2_date=2018-06-30_date=2018-07-31=all
> .  Atin has a patch https://review.gluster.org/20584 which resolves it
> but patch is failing regression for a different test which is unrelated.
>
> tests

Re: [Gluster-Maintainers] [Gluster-devel] Release 5: Master branch health report (Week of 30th July)

2018-07-31 Thread Atin Mukherjee
I just went through the nightly regression report of brick mux runs and
here's what I can summarize.

=
Fails only with brick-mux
=
tests/bugs/core/bug-1432542-mpx-restart-crash.t - Times out even after 400
secs. Refer
https://fstat.gluster.org/failure/209?state=2_date=2018-06-30_date=2018-07-31=all,
specifically the latest report
https://build.gluster.org/job/regression-test-burn-in/4051/consoleText .
Wasn't timing out as frequently as it was till 12 July. But since 27 July,
it has timed out twice. Beginning to believe commit
9400b6f2c8aa219a493961e0ab9770b7f12e80d2 has added the delay and now 400
secs isn't sufficient (Mohit?)

tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t (Ref
- https://build.gluster.org/job/regression-test-with-multiplex/814/console)
-  Test fails only in brick-mux mode, AI on Atin to look at and get back.

tests/bugs/replicate/bug-1433571-undo-pending-only-on-up-bricks.t (
https://build.gluster.org/job/regression-test-with-multiplex/813/console) -
Seems to have failed just twice in the last 30 days as per
https://fstat.gluster.org/failure/251?state=2_date=2018-06-30_date=2018-07-31=all.
Need help from AFR team.

tests/bugs/quota/bug-1293601.t (
https://build.gluster.org/job/regression-test-with-multiplex/812/console) -
Hasn't failed after 26 July and earlier it was failing regularly. Did we
fix this test through any patch (Mohit?)

tests/bitrot/bug-1373520.t - (
https://build.gluster.org/job/regression-test-with-multiplex/811/console)
- Hasn't failed after 27 July and earlier it was failing regularly. Did we
fix this test through any patch (Mohit?)

tests/bugs/glusterd/remove-brick-testcases.t - Failed once with a core; not
sure whether brick mux is the culprit here or not. Ref -
https://build.gluster.org/job/regression-test-with-multiplex/806/console .
Seems to be a glustershd crash. Need help from AFR folks.

=
Fails for non-brick mux case too
=
tests/bugs/distribute/bug-1122443.t - Seems to be failing on my setup very
often, without brick mux as well. Refer
https://build.gluster.org/job/regression-test-burn-in/4050/consoleText .
There's an email in gluster-devel and a BZ 1610240 for the same.

tests/bugs/bug-1368312.t - Seems to be a recent failure (
https://build.gluster.org/job/regression-test-with-multiplex/815/console) -
however, this has been seen for a non-brick-mux case too -
https://build.gluster.org/job/regression-test-burn-in/4039/consoleText .
Need some eyes from AFR folks.

tests/00-geo-rep/georep-basic-dr-tarssh.t - this isn't specific to brick
mux; it has been seen failing in multiple default regression runs. Refer
https://fstat.gluster.org/failure/392?state=2_date=2018-06-30_date=2018-07-31=all
. We need help from the geo-rep devs to root-cause this sooner rather than later

tests/00-geo-rep/georep-basic-dr-rsync.t - this isn't specific to brick
mux; it has been seen failing in multiple default regression runs. Refer
https://fstat.gluster.org/failure/393?state=2_date=2018-06-30_date=2018-07-31=all
. We need help from the geo-rep devs to root-cause this sooner rather than later

tests/bugs/glusterd/validating-server-quorum.t (
https://build.gluster.org/job/regression-test-with-multiplex/810/console) -
Fails for non-brick-mux cases too,
https://fstat.gluster.org/failure/580?state=2_date=2018-06-30_date=2018-07-31=all
. Atin has a patch https://review.gluster.org/20584 which resolves it, but the
patch is failing regression for a different, unrelated test.

tests/bugs/replicate/bug-1586020-mark-dirty-for-entry-txn-on-quorum-failure.t
(Ref -
https://build.gluster.org/job/regression-test-with-multiplex/809/console) -
fails for the non-brick-mux case too -
https://build.gluster.org/job/regression-test-burn-in/4049/consoleText -
Need some eyes from AFR folks.
___
maintainers mailing list
maintainers@gluster.org
https://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-with-multiplex #786

2018-07-02 Thread Atin Mukherjee
+Mohit

Is this a new crash? I’ve not seen multiplex regressions dumping core in
recent times.
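
For anyone picking this up, a generic way to inspect such a core is something
like the following (the binary and core paths are assumptions; the process
that dumped core may be glusterfsd, glusterfs or glustershd depending on the
run):

    gdb /build/install/sbin/glusterfsd /path/to/core \
        -ex 'set pagination off' \
        -ex 'thread apply all bt full' \
        -ex quit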

On Sat, 30 Jun 2018 at 00:25,  wrote:

> See <
> https://build.gluster.org/job/regression-test-with-multiplex/786/display/redirect?page=changes
> >
>
> Changes:
>
> [Varsha Rao] xlators/features/barrier: Fix RESOURCE_LEAK
>
> [Niels de Vos] extras/group : add database workload profile
>
> [Amar Tumballi] xlators/meta: Fix resource_leak
>
> [Raghavendra G] cluster/dht: Do not try to use up the readdirp buffer
>
> --
> [...truncated 2.63 MB...]
> arguments = {{gp_offset = 0, fp_offset = 0, overflow_arg_area =
> 0x7f9a3247eb60, reg_save_area = 0x7f9a18144a18}}
> msg = 0x0
> ctx = 0xe1e010
> host = 0x0
> hints = {ai_flags = 0, ai_family = 0, ai_socktype = 0, ai_protocol
> = 0, ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
> result = 0x0
> #12 0x7f9a3241a394 in server_rpc_notify (rpc=0x7f9a340449a0,
> xl=0x7f9a3402fd40, event=RPCSVC_EVENT_DISCONNECT, data=0x7f9a1ebd99c0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/xlators/protocol/server/src/server.c
> >:511
> detached = true
> this = 0x7f9a3402fd40
> trans = 0x7f9a1ebd99c0
> conf = 0x7f9a34037320
> client = 0xff6830
> auth_path = 0x7f9a34032210 "/d/backends/vol01/brick0"
> ret = 0
> victim_found = false
> xlator_name = 0x0
> ctx = 0xe1e010
> top = 0x0
> trav_p = 0x0
> travxl = 0x0
> xprtrefcount = 0
> tmp = 0x0
> __FUNCTION__ = "server_rpc_notify"
> #13 0x7f9a46ff597f in rpcsvc_handle_disconnect (svc=0x7f9a340449a0,
> trans=0x7f9a1ebd99c0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-lib/src/rpcsvc.c
> >:772
> event = RPCSVC_EVENT_DISCONNECT
> wrappers = 0x7f9a18da62c0
> wrapper = 0x7f9a34044a30
> ret = -1
> i = 0
> wrapper_count = 1
> listener = 0x0
> #14 0x7f9a46ff5afc in rpcsvc_notify (trans=0x7f9a1ebd99c0,
> mydata=0x7f9a340449a0, event=RPC_TRANSPORT_DISCONNECT, data=0x7f9a1ebd99c0)
> at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-lib/src/rpcsvc.c
> >:810
> ret = -1
> msg = 0x0
> new_trans = 0x0
> svc = 0x7f9a340449a0
> listener = 0x0
> __FUNCTION__ = "rpcsvc_notify"
> #15 0x7f9a46ffb74b in rpc_transport_notify (this=0x7f9a1ebd99c0,
> event=RPC_TRANSPORT_DISCONNECT, data=0x7f9a1ebd99c0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-lib/src/rpc-transport.c
> >:537
> ret = -1
> __FUNCTION__ = "rpc_transport_notify"
> #16 0x7f9a3be07ffb in socket_event_poll_err (this=0x7f9a1ebd99c0,
> gen=1, idx=140) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-transport/socket/src/socket.c
> >:1209
> priv = 0x7f9a1ebd9f20
> socket_closed = true
> __FUNCTION__ = "socket_event_poll_err"
> #17 0x7f9a3be0d5ad in socket_event_handler (fd=372, idx=140, gen=1,
> data=0x7f9a1ebd99c0, poll_in=1, poll_out=0, poll_err=0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/rpc/rpc-transport/socket/src/socket.c
> >:2627
> this = 0x7f9a1ebd99c0
> priv = 0x7f9a1ebd9f20
> ret = -1
> ctx = 0xe1e010
> socket_closed = false
> notify_handled = true
> __FUNCTION__ = "socket_event_handler"
> #18 0x7f9a472b1834 in event_dispatch_epoll_handler
> (event_pool=0xe55c30, event=0x7f9713f0aea0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/libglusterfs/src/event-epoll.c
> >:587
> ev_data = 0x7f9713f0aea4
> slot = 0xe8a340
> handler = 0x7f9a3be0d278 
> data = 0x7f9a1ebd99c0
> idx = 140
> gen = 1
> ret = -1
> fd = 372
> handled_error_previously = false
> __FUNCTION__ = "event_dispatch_epoll_handler"
> #19 0x7f9a472b1b27 in event_dispatch_epoll_worker
> (data=0x7f972d6cb8d0) at <
> https://build.gluster.org/job/regression-test-with-multiplex/ws/libglusterfs/src/event-epoll.c
> >:663
> event = {events = 1, data = {ptr = 0x1008c, fd = 140, u32 =
> 140, u64 = 4294967436}}
> ret = 1
> ev_data = 0x7f972d6cb8d0
> event_pool = 0xe55c30
> myindex = 106
> timetodie = 0
> __FUNCTION__ = "event_dispatch_epoll_worker"
> #20 0x7f9a4628ce25 in start_thread () from /lib64/libpthread.so.0
> No symbol table info available.
> #21 0x7f9a45951bad in clone () from /lib64/libc.so.6
> No symbol table info available.
>
> Thread 1 (Thread 0x7f97caf1a700 (LWP 24553)):
> #0  0x7f9a32b037fc in quota_lookup (frame=0x7f9a2c02f778,
> this=0x7f99f45e3ad0, loc=0x7f97caf198d0, xattr_req=0x0) at <
> 

Re: [Gluster-Maintainers] Maintainer's meeting series

2018-06-19 Thread Atin Mukherjee
I have always had the current slot conflicting with one of the meetings which
I can't skip. OTOH, I second Nigel's proposal of alternating timeslots to make
it more convenient for folks from different time zones to attend and to avoid
conflicting meetings, at least on every other occurrence. I know that this
model was tried earlier with the weekly community meeting and there wasn't
much improvement w.r.t. participation, but it doesn't harm us either.


On Tue, Jun 19, 2018 at 4:18 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Tue, 19 Jun 2018, 3:38 pm Nigel Babu,  wrote:
>
>> I propose that we alternate times for every other meeting so that we can
>> accommodate people across all timezones. We're never going to find one
>> single timezone that works for everyone. The next best compromise that I've
>> seen projects make is to have the edge timezones take a compromise every
>> other meeting.
>>
>
> I second this.
>
>
>> On Tue, Jun 19, 2018 at 2:36 PM Amar Tumballi 
>> wrote:
>>
>>> Hi All,
>>>
>>> On the fun side, it seems that other than 2 people, not many people have
>>> noticed the end of recurring maintainer's meetings, on Wednesdays.
>>>
>>> Overall, there were 20+ maintainers meeting in last 1 year, and in those
>>> meetings, we tried to keep all the discussion open (shared agenda before
>>> for everyone to make a point, and shared meeting minutes with even the BJ
>>> download link). This also helped us to take certain decisions which
>>> otherwise would have taken long time to achieve, or even help with some
>>> release related discussions, helping us to keep the release timelines sane.
>>>
>>> I propose to get the biweekly maintainer's meeting back to life, and
>>> this time, considering some requests from previous thread, would like to
>>> keep it on Monday 9AM EST (would recommend to keep it 9AM EDT as-well). Or
>>> else Thursday 10AM EST ? I know it wouldn't be great time for many
>>> maintainers in India, but considering we now have presence from US West
>>> Coast to India... I guess these times are the one we can consider.
>>>
>>> Avoiding Tuesday/Wednesday slots mainly because major sponsor for
>>> project, Red Hat's members would be busy with multiple meetings during that
>>> time.
>>>
>>> Happy to hear the thoughts, and comments.
>>>
>>> Regards,
>>> Amar
>>> --
>>> Amar Tumballi (amarts)
>>> ___
>>> maintainers mailing list
>>> maintainers@gluster.org
>>> http://lists.gluster.org/mailman/listinfo/maintainers
>>>
>>
>>
>> --
>> nigelb
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] glusterfs-3.12.7 released

2018-03-20 Thread Atin Mukherjee
On Wed, Mar 21, 2018 at 12:18 AM, Shyam Ranganathan 
wrote:

> On 03/20/2018 01:10 PM, Jiffin Thottan wrote:
> > Hi Shyam,
> >
> > Actually I planned to do the release on March 8th(posted the release
> note on that day). But it didn't happen.
> > I didn't merge any patches post sending the release note(blocker bug had
> some merge conflict issue on that so I skipped AFAIR).
> > I performed 3.12.7 tagging yesterday and ran the build job today.
> >
> > Can u please provide a suggestion here ? Do I need to perform a 3.12.7-1
> for the blocker bug ?
>
> I see that the bug is marked against the tracker, but is not a
> regression or an issue that is serious enough that it cannot wait for
> the next minor release.
>
> Copied Atin to the mail, who opened that issue for his comments. If he
> agrees, let's get this moving and get the fix into the next minor release.
>
>
Even though it's not a regression but a day-1 bug with brick multiplexing,
the issue is severe enough that it should be fixed *asap*. The scenario: in a
multi-node cluster with brick multiplexing enabled, if one node is down while
some volume operations are performed, then when that node comes back the
brick processes fail to come up.

>
> > --
> > Regards,
> > Jiffin
> >
> >
> >
> >
> > - Original Message -
> > From: "Shyam Ranganathan" 
> > To: jenk...@build.gluster.org, packag...@gluster.org,
> maintainers@gluster.org
> > Sent: Tuesday, March 20, 2018 9:06:57 PM
> > Subject: Re: [Gluster-Maintainers] glusterfs-3.12.7 released
> >
> > On 03/20/2018 11:19 AM, jenk...@build.gluster.org wrote:
> >> SRC: https://build.gluster.org/job/release-new/47/artifact/
> glusterfs-3.12.7.tar.gz
> >> HASH: https://build.gluster.org/job/release-new/47/artifact/
> glusterfs-3.12.7.sha512sum
> >>
> >> This release is made off jenkins-release-47
> >
> > Jiffin, there are about 6 patches ready in the 3.12 queue, that are not
> > merged for this release, why?
> > https://review.gluster.org/#/projects/glusterfs,dashboards/
> dashboard:3-12-dashboard
> >
> > The tracker bug for 3.12.7 calls out
> > https://bugzilla.redhat.com/show_bug.cgi?id=1543708 as a blocker, and
> > has a patch, which is not merged.
> >
> > Was this some test packaging job?
> >
> >
> >
> >
> >>
> >>
> >>
> >> ___
> >> maintainers mailing list
> >> maintainers@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/maintainers
> >>
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > http://lists.gluster.org/mailman/listinfo/maintainers
> >
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Proposal to change the version numbers of Gluster project

2018-03-14 Thread Atin Mukherjee
On Thu, Mar 15, 2018 at 9:45 AM, Vijay Bellur <vbel...@redhat.com> wrote:

>
>
> On Wed, Mar 14, 2018 at 5:40 PM, Shyam Ranganathan <srang...@redhat.com>
> wrote:
>
>> On 03/14/2018 07:04 PM, Joe Julian wrote:
>> >
>> >
>> > On 03/14/2018 02:25 PM, Vijay Bellur wrote:
>> >>
>> >>
>> >> On Tue, Mar 13, 2018 at 4:25 AM, Kaleb S. KEITHLEY
>> >> <kkeit...@redhat.com <mailto:kkeit...@redhat.com>> wrote:
>> >>
>> >> On 03/12/2018 02:32 PM, Shyam Ranganathan wrote:
>> >> > On 03/12/2018 10:34 AM, Atin Mukherjee wrote:
>> >> >>   *
>> >> >>
>> >> >> After 4.1, we want to move to either continuous
>> >> numbering (like
>> >> >> Fedora), or time based (like ubuntu etc) release
>> >> numbers. Which
>> >> >> is the model we pick is not yet finalized. Happy to
>> >> hear opinions.
>> >> >>
>> >> >>
>> >> >> Not sure how the time based release numbers would make more
>> >> sense than
>> >> >> the one which Fedora follows. But before I comment further on
>> >> this I
>> >> >> need to first get a clarity on how the op-versions will be
>> >> managed. I'm
>> >> >> assuming once we're at GlusterFS 4.1, post that the releases
>> >> will be
>> >> >> numbered as GlusterFS5, GlusterFS6 ... So from that
>> >> perspective, are we
>> >> >> going to stick to our current numbering scheme of op-version
>> >> where for
>> >> >> GlusterFS5 the op-version will be 5?
>> >> >
>> >> > Say, yes.
>> >> >
>> >> > The question is why tie the op-version to the release number?
>> That
>> >> > mental model needs to break IMO.
>> >> >
>> >> > With current options like,
>> >> > https://docs.gluster.org/en/latest/Upgrade-Guide/op_version/
>> >> <https://docs.gluster.org/en/latest/Upgrade-Guide/op_version/> it
>> is
>> >> > easier to determine the op-version of the cluster and what it
>> >> should be,
>> >> > and hence this need not be tied to the gluster release version.
>> >> >
>> >> > Thoughts?
>> >>
>> >> I'm okay with that, but——
>> >>
>> >> Just to play the Devil's Advocate, having an op-version that bears
>> >> some
>> >> resemblance to the _version_ number may make it easy/easier to
>> >> determine
>> >> what the op-version ought to be.
>> >>
>> >> We aren't going to run out of numbers, so there's no reason to be
>> >> "efficient" here. Let's try to make it easy. (Easy to not make a
>> >> mistake.)
>> >>
>> >> My 2¢
>> >>
>> >>
>> >> +1 to the overall release cadence change proposal and what Kaleb
>> >> mentions here.
>> >>
>> >> Tying op-versions to release numbers seems like an easier approach
>> >> than others & one to which we are accustomed to. What are the benefits
>> >> of breaking this model?
>> >>
>> > There is a bit of confusion among the user base when a release happens
>> > but the op-version doesn't have a commensurate bump. People ask why they
>> > can't set the op-version to match the gluster release version they have
>> > installed. If it was completely disconnected from the release version,
>> > that might be a great enough mental disconnect that the expectation
>> > could go away which would actually cause less confusion.
>>
>> Above is the reason I state it as well (the breaking of the mental model
>> around this), why tie it together when it is not totally related. I also
>> agree that, the notion is present that it is tied together and hence
>> related, but it may serve us better to break it.
>>
>>
>
> I see your perspective. Another related reason for not introducing an
> op-version bump in a new release would be that there are no incompatible
> features introduced (in the new release). Hence it makes sense to preserve
> the older op-version.
>
> To make everyone's lives simpler, would it be useful to introduce a
> command that provides the max op-version to release number mapping? The
> output of the command could look like:
>
> op-version X: 3.7.0 to 3.7.11
> op-version Y: 3.7.12 to x.y.z
>

We have already introduced an option called cluster.max-op-version: one can
run a command like "gluster v get all cluster.max-op-version" to determine the
highest op-version the cluster can be bumped up to. IMO, this saves users from
having to check the documentation for which op-version X a given x.y.z release
should be bumped up to. Isn't that sufficient for this requirement?
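
A quick usage sketch (the op-version value below is illustrative only, not
tied to any particular release):

    # check the highest op-version this cluster can be bumped up to
    gluster volume get all cluster.max-op-version

    # once all nodes are upgraded, bump the cluster op-version to that value
    gluster volume set all cluster.op-version 40100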


>
> and so on.
>
> Thanks,
> Vijay
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Proposal to change the version numbers of Gluster project

2018-03-12 Thread Atin Mukherjee
On Mon, Mar 12, 2018 at 5:51 PM, Amar Tumballi  wrote:

> Hi all,
>
> Below is the proposal which most of us in maintainers list have agreed
> upon. Sharing it here so we come to conclusion quickly, and move on :-)
>
> ---
>
> Until now, Gluster project’s releases followed x.y.z model, where x is
> indicating a major revision, and y a minor, and z as a patched release.
> Read more on this model at wikipedia
> 
>
> As we are announcing the release and availability of Gluster 4.0[.0] now,
> it is a good time to reconsider our version numbering.
>
> What
> is the need to reconsider version number now?
>
> The major and minor version numbering is a good strategy for projects
> which would bring incompatibility between major versions.
>
> For Gluster, as it is a filesystem, and one of the major reason people use
> this project is because of ‘High Availability’, we can never think of
> breaking compatibility between releases. So, regardless of any major
> version changes, the filesystem should continue to work from a given mount
> point.
>
> *NOTE*: We are not saying there will be no issues ever for clients at
> all, but users will have enough time to plan, based on called out
> incompatibilities, and hence adapt to the new changes in an application
> maintenance window.
>
> Also, this allows us to bring features and changes frequently, and not
> wait for the major version number change to make a release.
> So, what next?
>
> There are multiple changes we are proposing.
>
>-
>
>As announced earlier 4.0 will be STM, and it will be the last STM.
>-
>
>As we had already announced, 4.1 will be our LTM (Long Term
>Maintenance) release. This will release 3 months from 4.0 (June, 2018 end)
>-
>
>After 4.1, we want to move to either continuous numbering (like
>Fedora), or time based (like ubuntu etc) release numbers. Which is the
>model we pick is not yet finalized. Happy to hear opinions.
>
>
Not sure how time-based release numbers would make more sense than the
continuous numbering which Fedora follows. But before I comment further on
this, I first need to get clarity on how the op-versions will be managed. I'm
assuming that once we're at GlusterFS 4.1, subsequent releases will be
numbered GlusterFS5, GlusterFS6 ... So from that perspective, are we going to
stick to our current op-version numbering scheme, where for GlusterFS5 the
op-version will be 5?


>
>-
>
>There will be no more STM releases for early access, still to mature
>features. We will either use the experimental branch, or tag a feature
>in a release as experimental. Everything core to the operation of Gluster,
>will remain stable and will only improve from release to release.
>
> *NOTE:* Exact mechanisms for tagging something experimental vs stable are
> being evolved. Further, what this means for a user is also being evolved
> and will be put out for discussion soon.
>
>-
>
>Considering we had 6 months release cycle for LTM releases, and 3
>months for branching, we want to fall back to 4 months release cycle for
>different versions, so we will cut down on number of backports, and
>supported versions from which we can upgrade to latest. Also users will
>benefit from more releases which are going to be supported long term.
>-
>
>Every release will be maintained for 1 year as earlier
>    - Monthly bug fixes per maintained release would be made available (as
>   before) (update releases)
>   - Post the first 3 or 4 months, for monthly bug fix update
>   releases, the cycle will change to bi-monthly (once in 2 months) or
>   expedited as necessary
>
> ---
>
> Happy to hear your opinion.
>
>
> Regards,
> Amar
>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Meeting minutes (7th March)

2018-03-07 Thread Atin Mukherjee
On Thu, 8 Mar 2018 at 11:43, Kaushal M  wrote:

> On Thu, Mar 8, 2018 at 10:21 AM, Amar Tumballi 
> wrote:
> > Meeting date: 03/07/2018 (March 3rd, 2018. 19:30IST, 14:00UTC, 09:00EST)
> >
> > BJ Link
> >
> > Bridge: https://bluejeans.com/205933580
> > Download : https://bluejeans.com/s/mOGb7
> >
> > Attendance
> >
> > [Sorry Note] : Atin (conflicting meeting), Michael Adam, Amye, Niels de
> Vos,
> > Amar, Nigel, Jeff, Shyam, Kaleb, Kotresh
> >
> > Agenda
> >
> > AI from previous meeting:
> >
> > Email on version numbers: Still pending - Amar/Shyam
> >
> > Planning to do this by Friday (9th March)
> >
> > can we run regression suite with GlusterD2
> >
> > OK with failures, but can we run?
> > Nigel to run tests and give outputs
>
> Apologies for not attending this meeting.
>
> I can help get this up and running.
>
> But, I also wanted to setup a smoke job to run GD2 CI against glusterfs
> patches.
> This will help us catch changes that adversely affect GD2, in
> particular changes to the option_t and xlator_api_t structs.
> Will not be a particularly long test to run. On average the current
> GD2 centos-ci jobs finish in under 4 minutes.
> I expect that building glusterfs will add about 5 minutes more.
> This job should be simple enough to get setup, and I'd like it if can
> set this up first.


+1, this is definitely needed going forward.


>
> >
> > Line coverage tests:
> >
> > SIGKILL was sent to processes, so the output was not proper.
> > Patch available, Nigel to test with the patch and give output before
> > merging.
> > [Nigel] what happens with GD2 ?
> >
> > [Shyam] https://github.com/gojp/goreportcard
> > [Shyam] (what I know)
> > https://goreportcard.com/report/github.com/gluster/glusterd2
> >
> > Gluster 4.0 is tagged:
> >
> > Retrospect meeting: Can this be google form?
> >
> > It usually is, let me find and paste the older one:
> >
> > 3.10 retro:
> >
> http://lists.gluster.org/pipermail/gluster-users/2017-February/030127.html
> > 3.11 retro: https://www.gluster.org/3-11-retrospectives/
> >
> > [Nigel] Can we do it a less of form, and keep it more generic?
> > [Shyam] Thats what mostly the form tries to do. Prefer meeting & Form
> >
> > Gluster Infra team is testing the distributed testing framework
> contributed
> > from FB
> >
> > [Nigel] Any issues, would like to collaborate
> > [Jeff] Happy to collaborate, let me know.
> >
> > Call out for features on 4-next
> >
> > should the next release be LTM and 4.1 and then pick the version number
> > change proposal later.
> >
> > Bugzilla Automation:
> >
> > Planning to test it out next week.
> > AI: send the email first, and target to take the patches before next
> > maintainers meeting.
> >
> > Round Table
> >
> > [Kaleb] space is tight on download.gluster.org
> > * may we delete, e.g. purpleidea files? experimental (freebsd stuff from
> > 2014)?
> > * any way to get more space?
> > * [Nigel] Should be possible to do it, file a bug
> > * AI: Kaleb to file a bug
> > *
> >
> > yesterday I noticed that some files (…/3.12/3.12.2/Debian/…) were not
> owned
> > by root:root. They were rsync_aide:rsync_aide. Was there an aborted rsync
> > job or something that left them like that?
> >
> > most glusterfs 4.0 packages are on download.g.o now. Starting on gd2
> > packages now.
> >
> > el7 packages are on buildroot if someone (shyam?) wants to get a head
> start
> > on testing them
> >
> > [Nigel] Testing IPv6 (with IPv4 on too), only 4 tests are consistently
> > failing. Need to look at it.
> >
> >
> >
> > --
> > Amar Tumballi (amarts)
> >
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > http://lists.gluster.org/mailman/listinfo/maintainers
> >
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
--Atin
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.10.11: Planned for the 28th of Feb, 2018

2018-02-19 Thread Atin Mukherjee
On Tue, Feb 20, 2018 at 7:26 AM, Atin Mukherjee <amukh...@redhat.com> wrote:

>
>
> On Mon, Feb 19, 2018 at 7:46 PM, Shyam Ranganathan <srang...@redhat.com>
> wrote:
>
>> On 01/30/2018 02:14 PM, Shyam Ranganathan wrote:
>> > Hi,
>> >
>> > As release 3.10.10 is tagged and off to packaging, here are the needed
>> > details for 3.10.11
>> >
>> > Release date: 28th Feb, 2018
>>
>> Checking the 3.10 review backlog, here are a couple of concerns,
>>
>> 1) Reviews,
>>   - https://review.gluster.org/#/c/19081/
>>   - https://review.gluster.org/#/c/19082/
>>
>> Have been submitted since Jan 9th, but are not passing regressions or
>> smoke. I rebased them again today, but they still fail.
>>
>> @Du, can you look at these, as you have backported them.
>>
>> 2) @Hari, https://review.gluster.org/#/c/19553/ seems to be failing
>> consistently in the test, ./tests/bugs/posix/bug-990028.t please take a
>> look, as this patch is important from an upgrade perspective.
>>
>
> atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git log
> tests/bugs/posix/bug-990028.t
> commit 858fae39936e5aee5ea4e3816a10ba310d04cf61
> Author: Amar Tumballi <ama...@redhat.com>
> Date:   Mon Nov 27 23:56:50 2017 +0530
>
> tests: mark currently failing regression tests as known issues
>
> Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
> BUG: 1517961
> Signed-off-by: Amar Tumballi <ama...@redhat.com>
>
> FWIW, this test is marked as bad in master.  Should we mark it bad in
> release-3.12 as well?
>

s/release-3.12/release-3.10/g


>
>
>> > Tracker bug for blockers:
>> > https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.10.11
>> >
>> > Shyam
>> > ___
>> > Gluster-devel mailing list
>> > gluster-de...@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/gluster-devel
>> >
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.10.11: Planned for the 28th of Feb, 2018

2018-02-19 Thread Atin Mukherjee
On Mon, Feb 19, 2018 at 7:46 PM, Shyam Ranganathan 
wrote:

> On 01/30/2018 02:14 PM, Shyam Ranganathan wrote:
> > Hi,
> >
> > As release 3.10.10 is tagged and off to packaging, here are the needed
> > details for 3.10.11
> >
> > Release date: 28th Feb, 2018
>
> Checking the 3.10 review backlog, here are a couple of concerns,
>
> 1) Reviews,
>   - https://review.gluster.org/#/c/19081/
>   - https://review.gluster.org/#/c/19082/
>
> Have been submitted since Jan 9th, but are not passing regressions or
> smoke. I rebased them again today, but they still fail.
>
> @Du, can you look at these, as you have backported them.
>
> 2) @Hari, https://review.gluster.org/#/c/19553/ seems to be failing
> consistently in the test, ./tests/bugs/posix/bug-990028.t please take a
> look, as this patch is important from an upgrade perspective.
>

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ git log
tests/bugs/posix/bug-990028.t
commit 858fae39936e5aee5ea4e3816a10ba310d04cf61
Author: Amar Tumballi 
Date:   Mon Nov 27 23:56:50 2017 +0530

tests: mark currently failing regression tests as known issues

Change-Id: If6c36dc6c395730dfb17b5b4df6f24629d904926
BUG: 1517961
Signed-off-by: Amar Tumballi 

FWIW, this test is marked as bad in master.  Should we mark it bad in
release-3.12 as well?


> > Tracker bug for blockers:
> > https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.10.11
> >
> > Shyam
> > ___
> > Gluster-devel mailing list
> > gluster-de...@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Red Hat Users Contact List

2018-02-09 Thread Atin Mukherjee
Please ignore this email. I overlooked this while approving the other build
failure related emails.

@Amye - Is this the right approach, where the admin(s) of this ML have to
actually approve all the emails from the queue? This is quite tedious and
mistakes like this can happen.

On Sat, Feb 3, 2018 at 1:06 AM, daisy connor  wrote:

>
>
>
>
> Hi,
>
>
>
> Would you like to acquire the email list of clients or companies using
> *Redhat?*
>
>
>
> We offer you with the most sorted technology database for Redhat. You can
> choose they list depending on your requirement.
>
>
>
> *Some of the technology list from Fortune Reuters has been mentioned
> below:* Redhat Cloud Computing users list, Redhat Middleware users list,
> Redhat operating systems users list, Redhat storage users list, Redhat
> virtualization user list, Ubuntu, Suse, Linux, VMware, Solaris, Oracle
> Linux, MSPs, CSPs, Sis, MSSPs, xSPs, ISPs, ISVs Cloud PBX, Hosting PBX,
> VARs, VADs and more…..
>
>
>
> *Please fill in the details below of your target market*:
>
>
>
> *Target Industry: __, Target title: __, Geography:
> . *
>
>
>
> Kindly let us know your interest to provide you with detailed information
> for the same.
>
>
> Regards,
> *Daisy Connor*
> Marketing Executive
>
>   To Opt Out, please respond “Leave Out” in the
> Subject line.
>
>
>
>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 4.0: Branched

2018-01-26 Thread Atin Mukherjee
On Fri, Jan 26, 2018 at 5:11 PM, Raghavendra G 
wrote:

>
>
> On Fri, Jan 26, 2018 at 4:49 PM, Raghavendra Gowdappa  > wrote:
>
>>
>>
>> - Original Message -
>> > From: "Shyam Ranganathan" 
>> > To: "Gluster Devel" , "GlusterFS
>> Maintainers" 
>> > Sent: Thursday, January 25, 2018 9:49:51 PM
>> > Subject: Re: [Gluster-Maintainers] [Gluster-devel] Release 4.0: Branched
>> >
>> > On 01/23/2018 03:17 PM, Shyam Ranganathan wrote:
>> > > 4.0 release has been branched!
>> > >
>> > > I will follow this up with a more detailed schedule for the release,
>> and
>> > > also the granted feature backport exceptions that we are waiting.
>> > >
>> > > Feature backports would need to make it in by this weekend, so that we
>> > > can tag RC0 by the end of the month.
>> >
>> > Backports need to be ready for merge on or before Jan, 29th 2018 3:00 PM
>> > Eastern TZ.
>> >
>> > Features that requested and hence are granted backport exceptions are as
>> > follows,
>> >
>> > 1) Dentry fop serializer xlator on brick stack
>> > https://github.com/gluster/glusterfs/issues/397
>> >
>> > @Du please backport the same to the 4.0 branch as the patch in master is
>> > merged.
>>
>> Sure.
>>
>
> https://review.gluster.org/#/c/19340/1
> But this might fail smoke as the associated bug is not against the 4.0
> branch. Blocked on 4.0 version tag in bugzilla.
>

I think you can use the same github issue id and don't need a bug here
since it's a feature?


>
>> >
>> > 2) Leases support on GlusterFS
>> > https://github.com/gluster/glusterfs/issues/350
>> >
>> > @Jiffin and @ndevos, there is one patch pending against master,
>> > https://review.gluster.org/#/c/18785/ please do the needful and
>> backport
>> > this to the 4.0 branch.
>> >
>> > 3) Data corruption in write ordering of rebalance and application writes
>> > https://github.com/gluster/glusterfs/issues/308
>> >
>> > @susant, @du if we can conclude on the strategy here, please backport as
>> > needed.
>>
>> https://review.gluster.org/#/c/19207/
>> Review comments need to be addressed and centos regressions are failing.
>>
>> https://review.gluster.org/#/c/19202/
>> There are some suggestions on the patch. If others agree they are valid,
>> this patch can be considered redundant with the approach of #19207. However,
>> as I've mentioned in the comments there are some tradeoffs too. So, waiting
>> for responses to my comments. If nobody responds in the time period given,
>> we can merge the patch and susant will have to backport to the 4.0 branch.
>>
>> >
>> > 4) Couple of patches that are tracked for a backport are,
>> > https://review.gluster.org/#/c/19223/
>> > https://review.gluster.org/#/c/19267/ (prep for ctime changes in later
>> > releases)
>> >
>> > Other features discussed are not in scope for a backports to 4.0.
>> >
>> > If you asked for one and do not see it in this list, shout out!
>> >
>> > >
>> > > Only exception could be: https://review.gluster.org/#/c/19223/
>> > >
>> > > Thanks,
>> > > Shyam
>> > > ___
>> > > Gluster-devel mailing list
>> > > gluster-de...@gluster.org
>> > > http://lists.gluster.org/mailman/listinfo/gluster-devel
>> > >
>> > ___
>> > maintainers mailing list
>> > maintainers@gluster.org
>> > http://lists.gluster.org/mailman/listinfo/maintainers
>> >
>> ___
>> Gluster-devel mailing list
>> gluster-de...@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> Raghavendra G
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 4.0: Branched

2018-01-25 Thread Atin Mukherjee
Shyam,

We need to have 4.0 version created in bugzilla for GlusterFS which is
currently missing. I have a patch to backport into this branch.

On Wed, Jan 24, 2018 at 1:47 AM, Shyam Ranganathan 
wrote:

> 4.0 release has been branched!
>
> I will follow this up with a more detailed schedule for the release, and
> also the granted feature backport exceptions that we are waiting.
>
> Feature backports would need to make it in by this weekend, so that we
> can tag RC0 by the end of the month.
>
> Only exception could be: https://review.gluster.org/#/c/19223/
>
> Thanks,
> Shyam
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Mailinglist admins wanted

2018-01-12 Thread Atin Mukherjee
I can take that up if that helps..

On Fri, 12 Jan 2018 at 16:36, Niels de Vos  wrote:

> Hi,
>
> It seems that during the last weeks neither Vijay, Jeff (with incorrect
> email address) nor I had time to review/approve/reject emails sent to
> this list. Adding one or two additional moderators would be good, who's
> volunteering for that?
>
> Thanks,
> Niels
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Maintainers' meeting Agenda (10th Jan, 2018)

2018-01-11 Thread Atin Mukherjee
On Wed, Jan 10, 2018 at 8:59 PM, Nithya Balachandran 
wrote:

>
>
> On 10 January 2018 at 20:36, Jeff Darcy  wrote:
>
>> FYI, the crash that Kaleb mentioned looks like this, so I think Nithya
>> was right that it's a bug in shutdown. Better than memory corruption, which
>> is what I tend to think of when I hear about random crashes.
>>
>
> I was actually thinking of a different crash. Someone from the Glusterd
> team should take a look and see if this is a known crash.
>

This is not known to me at least. Even though the crash is from the glusterd
binary, we need some epoll and rpc experts to debug this. And yes, from the
logs it does look like this happened when cleanup_and_exit() was triggered
in glusterd.

t a a bt (thread apply all bt) says:

(gdb) t a a bt

Thread 8 (LWP 529):
#0  0x7ff603d2773c in _int_free () from ./lib64/libc.so.6
#1  0x7ff6040ade1d in CRYPTO_free () from ./usr/lib64/libcrypto.so.10
#2  0x7ff6041293e9 in lh_free () from ./usr/lib64/libcrypto.so.10
#3  0x7ff60412bca0 in ?? () from ./usr/lib64/libcrypto.so.10
#4  0x7ff5f81dc129 in fini_openssl_mt ()
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-transport/socket/src/socket.c:4081
#5  0x7ff5f81cf9ff in __do_global_dtors_aux () from
./build/install/lib/glusterfs/4.0dev1/rpc-transport/socket.so
#6  0x in ?? ()

Thread 7 (LWP 530):
#0  0x7ff603d5bc4d in nanosleep () from ./lib64/libc.so.6
#1  0x7ff603d5bac0 in sleep () from ./lib64/libc.so.6
#2  0x7ff605185ab8 in pool_sweeper (arg=0x0)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/mem-pool.c:470
#3  0x7ff60442faa1 in start_thread () from ./lib64/libpthread.so.0
#4  0x7ff603d97bcd in clone () from ./lib64/libc.so.6

Thread 6 (LWP 560):
#0  0x7ff60443368c in pthread_cond_wait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x7ff5f9c3a3b6 in hooks_worker (args=0x1d0f470)
at
/home/jenkins/root/workspace/centos6-regression/xlators/mgmt/glusterd/src/glusterd-hooks.c:528
#2  0x7ff60442faa1 in start_thread () from ./lib64/libpthread.so.0
#3  0x7ff603d97bcd in clone () from ./lib64/libc.so.6

Thread 5 (LWP 532):
#0  0x7ff604433a5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x7ff60519d075 in syncenv_task (proc=0x1d07070)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:603
#2  0x7ff60519d317 in syncenv_processor (thdata=0x1d07070)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:695
#3  0x7ff60442faa1 in start_thread () from ./lib64/libpthread.so.0
#4  0x7ff603d97bcd in clone () from ./lib64/libc.so.6

Thread 4 (LWP 531):
#0  0x7ff604433a5e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x7ff60519d075 in syncenv_task (proc=0x1d06cb0)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:603
#2  0x7ff60519d317 in syncenv_processor (thdata=0x1d06cb0)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/syncop.c:695
#3  0x7ff60442faa1 in start_thread () from ./lib64/libpthread.so.0
#4  0x7ff603d97bcd in clone () from ./lib64/libc.so.6

Thread 3 (LWP 528):
#0  0x7ff60443700d in nanosleep () from ./lib64/libpthread.so.0
#1  0x7ff60515da5b in gf_timer_proc (data=0x1d05b60)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/timer.c:201
#2  0x7ff60442faa1 in start_thread () from ./lib64/libpthread.so.0
#3  0x7ff603d97bcd in clone () from ./lib64/libc.so.6

Thread 2 (LWP 527):
#0  0x7ff6044302fd in pthread_join () from ./lib64/libpthread.so.0
#1  0x7ff6051c5641 in event_dispatch_epoll (event_pool=0x1cfdbc0)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:742
#2  0x7ff6051841e6 in event_dispatch (event_pool=0x1cfdbc0)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event.c:124
#3  0x0040b1da in main (argc=11, argv=0x7fffbe19b1d8)
at
/home/jenkins/root/workspace/centos6-regression/glusterfsd/src/glusterfsd.c:2672

Thread 1 (LWP 562):
#0  0x7ff603ce1495 in raise () from ./lib64/libc.so.6
#1  0x7ff603ce2c75 in abort () from ./lib64/libc.so.6
#2  0x7ff603cda60e in __assert_fail_base () from ./lib64/libc.so.6
#3  0x7ff603cda6d0 in __assert_fail () from ./lib64/libc.so.6
#4  0x7ff6051c4a58 in event_unregister_epoll_common
(event_pool=0x1cfdbc0, fd=7, idx=-1, do_close=1)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:409
#5  0x7ff6051c4c02 in event_unregister_close_epoll
(event_pool=0x1cfdbc0, fd=7, idx_hint=-1)
at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:453
#6  0x7ff605184096 in event_unregister_close (event_pool=0x1cfdbc0,
fd=7, idx=-1)
at

Re: [Gluster-Maintainers] Maintainers meeting Agenda: Dec 13th

2017-12-12 Thread Atin Mukherjee
On Tue, Dec 12, 2017 at 5:15 PM, Amar Tumballi  wrote:

> This is going to be a longer meeting if we want to discuss everything
> here, so please consider going through this before and add your points
> (with name) in the meeting notes. See you all tomorrow.
>
> Meeting date: 12/13/2017 (Dec 13th, 19:30IST, 14:00UTC, 09:00EST)
> BJ
> Link
>
>- Bridge: https://bluejeans.com/205933580
>- Download: 
>
>
> 
> Attendance
>
>- [Sorry Note] - 
>
>
> 
> Agenda
>
>-
>
>Any AI from previous meeting?
>-
>
>Process Automation proposal
>- [WHY]
>  - We should have processes to help fast track the project’s
>  progress
>  - Any new contributor should find the steps non-confusing
>  - If it is not enforced in the process, no guidelines would be
>  enforced in practice
>  - If any developer is ‘spending extra time’ to follow the
>  process, it is not a good sign for the project
>   - [HOW]
>
>
> This
> is for everyone entering from github:
>
> For bugs
>
>- There is a one-liner in github issues by default (at the top) saying
>your bugs go to bugzilla.
>   - [One time activity] Change the current github issues default
>   messaging to just give a one line suggestion, instead of every detail.
>- If they still go ahead and create it, anyone triaging the issues
>marks it as ‘Type:Bug’
>   - [Manual] This human intervention is expected in any ‘automation’. We
>   can do it as part of bug triage too.
>- Upon adding ‘Type:Bug’ tag, a bug is automatically created in
>bugzilla. Issue gets closed with URL to bugzilla ID, asking creator to
>refer bugzilla for further updates.
>   - [Automatic] Needs jenkins job (or other github automations)
>
> For questions
>
>- There is a one-liner in github issues by default (at the top - 1)
>saying your questions go to the mailing list.
>   - [One time activity] Change the current github issues default
>   messaging to just give a one line suggestion, about the mailing list.
>- If they still go ahead and create it, anyone triaging the issues
>marks it as ‘Question’
>   - [Manual] This human intervention is expected in any ‘automation’. We
>   can do it as part of bug triage too.
>- Upon adding ‘Question’ tag, the question gets posted to mailing
>list, with creator in Cc, the archive URL gets posted to github, and the
>issue gets closed.
>   - [Automatic] Needs jenkins job (or other github automations)
>
> For features
>
>- Clearly ask the questions (ie, these are part of gluster specs)
>
>   Ask about monitoring
>   Ask about events
>   Ask about test cases
>   Ask about supporting / debugging
>   Ask about path from alpha to beta to GA for the feature.
>   Ask for contact person
>   Ask about release-notes
>   Usecase / impact areas
>
>
I expect the design doc to also be part of this checklist, and a patch can
only be 'SpecApproved' if its corresponding design doc is already approved
and merged. There might be some features where all of the above may not be
applicable. So is it the maintainer's or the owner's responsibility to tick
the respective check box or mark them as N/A?

>
>-
>
>Once user answers all these questions, provide ‘SpecApproved’ flag.
>
>
So this flag will be visible in the gerrit UI only when a patch has a
respective github issue id, and the submit button will stay disabled till
this flag is +1ed?


>-
>   - Only maintainers are allowed to provide this flag.
>-
>
>Ask developer to provide documentation. (Can be part of initial spec,
>if not can be followup question automatically posted after ‘specApproved’
>flag).
>-
>
>If provided give ‘DocApproved’ flag.
>
>
As I mentioned earlier, I'd think that we don't need this flag as it can be
part of the overall spec-list check.


>-
>   - Again, only maintainers are allowed to provide this flag.
>-
>
>For every patch in glusterfs project, (as part of smoke), run a test
>to see if a patch is for the feature, if yes (ie, a github issue is
>present), check if ‘SpecApproved’ and ‘DocApproved’ is present, and only
>then a feature gets +1 vote.
>- Expectation is every patch posted is either a bug fix or a feature.
>-
>
>Now Architects are allowed to revert a patch which violates the process by
>either not having a github issue or bug-id, or by using a bug-id to get a
>feature in, etc.
>- It is fine to revert a patch where SpecApproved and DocApproved is
>   given by the author of the patch, and 

Re: [Gluster-Maintainers] Release 3.13: Release notes (Please read and contribute)

2017-11-27 Thread Atin Mukherjee
On Tue, Nov 21, 2017 at 1:41 AM, Shyam Ranganathan 
wrote:

> Hi,
>
> 3.13 RC0 is around the corner (possibly tomorrow). Towards this and the
> final 3.13.0 release, I was compiling the features that are a part of 3.13
> and also attempted to write out the release notes for the same [1].
>
> Some features have data and others do not (either in the commit message or
> in the github issue) and it is increasingly difficult to write the release
> notes by myself.
>
> So I am calling out folks who have committed the following features, to
> provide release notes as a patch to [1] to aid closing this activity out.
>
> Please refer older release notes, for what data goes into the respective
> sections [2]. Also, please provide CLI examples where required and/or
> command outputs when required.
>
> 1) Addition of summary option to the heal info CLI (@karthik-us)
> 2) Support for max-port range in glusterd.vol (@atin)
>

https://review.gluster.org/18867 posted.

3) Prevention of other processes accessing the mounted brick snapshots
> (@sunnykumar)
> 4) Ability to reserve backend storage space (@amarts)
> 5) List all the connected clients for a brick and also exported
> bricks/snapshots from each brick process (@harigowtham)
> 6) Improved write performance with Disperse xlator, by introducing
> parallel writes to file (@pranith/@xavi)
> 7) Disperse xlator now supports discard operations (@sunil)
> 8) Included details about memory pools in statedumps (@nixpanic)
> 9) Gluster APIs added to register callback functions for upcalls (@soumya)
> 10) Gluster API added with a glfs_mem_header for exported memory
> (@nixpanic)
> 11) Provided a new xlator to delay fops, to aid slow brick response
> simulation and debugging (@pranith)
>
> Thanks,
> Shyam
>
> [1] gerrit link to release-notes: https://review.gluster.org/#/c/18815/
>
> [2] Release 3.12.0 notes for reference: https://github.com/gluster/glu
> sterfs/blob/release-3.12/doc/release-notes/3.12.0.md
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Tests failing on master

2017-11-17 Thread Atin Mukherjee
On Fri, Nov 17, 2017 at 11:36 AM, Ravishankar N 
wrote:

>
>
> On 11/17/2017 11:25 AM, Nithya Balachandran wrote:
>
>
>
> On 16 November 2017 at 18:41, Ravishankar N 
> wrote:
>
>> Hi,
>>
>> In yesterday's maintainers' meeting, I said there are many tests that are
>> failing on the master branch on my laptop / VMs. It turns out many of them
>> were due to setup issues: some due to obvious errors on my part like not
>> having dbench installed, not enabling gnfs etc., and some due to 'extras'
>> not being installed when I do a source compile, though I'm not sure why.
>>
>> So there are only 2 tests  now which fail on master, and they don't seem
>> to be due to setup issues and are not marked as bad tests either:
>>
>> 1.tests/bugs/cli/bug-1169302.t
>>  Failed test:  14
>> 2.tests/bugs/distribute/bug-1247563.t (Wstat: 0 Tests: 12 Failed: 2)
>>   Failed tests:  11-12
>>
>
+Poornima

>
> Please file a bug for this one. And many thanks for the system .
>
> Thanks for checking Nithya. Filed 1514329 on DHT and 1514331 on infra.
> Regards
> Ravi
>
>
> Regards,
> Nithya
>
>>
>> Request the respective maintainers/peers to take a look. If they are
>> indeed failing because of problems in the test itself, then I will probably
>> file a bug on infra to investigate why they are passing on the jenkins
>> slaves.
>>
>> Thanks,
>> Ravi
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
>
>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.13: (STM release) Details

2017-10-31 Thread Atin Mukherjee
On Wed, 1 Nov 2017 at 00:14, Atin Mukherjee <amukh...@redhat.com> wrote:

>
> On Tue, 31 Oct 2017 at 18:34, Shyam Ranganathan <srang...@redhat.com>
> wrote:
>
>> On 10/31/2017 08:11 AM, Karthik Subrahmanya wrote:
>> > Hey Shyam,
>> >
>> > Can we also have the heal info summary feature [1], which is merged
>> > upstream [2].
>> > I haven't raised an issue for this yet, I can do that by tomorrow and I
>> > need to write a doc for that.
>>
>> Thanks for bringing this to my notice, it would have been missed out as
>> a feature otherwise.
>>
>> I do see that the commit's start goes way back into 2015, and it was
>> rekindled in Sep 2017 (by you), because I was initially thinking why
>> this did not have an issue reference to begin with.
>>
>> Please raise a github issue for the same with the appropriate details
>> and I can take care of the remaining process there for you.
>>
>> @maintainers on the patch review, please ensure that we have a github
>> reference for features, else there is a lot we will miss for the same!
>
>
> This was a miss from my end where I should have checked the corresponding
> issue id in the patch. Apologies!
>

Can we have a job that checks whether a bugzilla which has FutureFeature as a
keyword is used upstream against a patch, and fails such jobs, so that misses
like this cannot happen? RFEs can then be closed using github references?
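
Something along these lines could work as the check (a rough sketch only; it
assumes the job already has the BUG id parsed out of the commit message, and
uses the public bugzilla.redhat.com REST endpoint):

# Hypothetical smoke-job helper: fail if the referenced bug is an RFE
# (carries the FutureFeature keyword), so features must use a github issue.
import sys
import requests

def bug_has_futurefeature(bug_id):
    resp = requests.get(
        "https://bugzilla.redhat.com/rest/bug/%s" % bug_id,
        params={"include_fields": "keywords"})
    resp.raise_for_status()
    bug = resp.json()["bugs"][0]
    return "FutureFeature" in bug.get("keywords", [])

if __name__ == "__main__":
    bug_id = sys.argv[1]   # BUG id extracted from the commit message
    if bug_has_futurefeature(bug_id):
        print("BUG %s has the FutureFeature keyword; use a github issue "
              "reference for features instead." % bug_id)
        sys.exit(1)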


>
>>
>> >
>> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1261463
>> > [2] https://review.gluster.org/#/c/12154/
>> >
>> >
>> > Thanks & Regards,
>> > Karthik
>>
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
> --
> - Atin (atinm)
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.13: (STM release) Details

2017-10-31 Thread Atin Mukherjee
On Tue, 31 Oct 2017 at 18:34, Shyam Ranganathan  wrote:

> On 10/31/2017 08:11 AM, Karthik Subrahmanya wrote:
> > Hey Shyam,
> >
> > Can we also have the heal info summary feature [1], which is merged
> > upstream [2].
> > I haven't raised an issue for this yet, I can do that by tomorrow and I
> > need to write a doc for that.
>
> Thanks for bringing this to my notice, it would have been missed out as
> a feature otherwise.
>
> I do see that the commit's start goes way back into 2015, and it was
> rekindled in Sep 2017 (by you), because I was initially thinking why
> this did not have an issue reference to begin with.
>
> Please raise a github issue for the same with the appropriate details
> and I can take care of the remaining process there for you.
>
> @maintainers on the patch review, please ensure that we have a github
> reference for features, else there is a lot we will miss for the same!


This was a miss from my end where I should have checked the corresponding
issue id in the patch. Apologies!


>
> >
> > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1261463
> > [2] https://review.gluster.org/#/c/12154/
> >
> >
> > Thanks & Regards,
> > Karthik
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] RFC: Suggestions on patches taking more than 10 tries

2017-10-31 Thread Atin Mukherjee
Honestly I have been doing this for some time based on the criticality of
the patches and of course with an agreement with the original author of the
patches. Another factor we need to consider here is patches where the
comments were available and haven't been addressed for a significant
period. We should also consider such patches based on their
importance and refresh them.

On Tue, Oct 31, 2017 at 4:21 PM, Jeff Darcy  wrote:

>
>
>
> On Tue, Oct 31, 2017, at 01:01 AM, Amar Tumballi wrote:
>
> In this case, I suggest maintainers can send a message to the author, and send
> an updated patch with their suggestion (making sure '--author' is set
> to the original author). This can save both the effort of review, and also the
> heartburn of someone not understanding the comments properly.
>
>
> I would really like it if we could get to the point where maintainers (or
> others) could feel comfortable updating other contributors' patches,
> because it really would improve our development velocity.  I've done it
> very sparingly, usually only for patches that the author seemed to have
> given up on, because there is a risk of people being offended.  It can feel
> like someone else is trying to take control of - or even credit for - one's
> own work.  To avoid this, I think we need to do two things:
>
> (1) Thoroughly document how to update someone else's patch while retaining
> proper credit for their work, and how to accept such an update into one's
> own local repository.  This addresses the technical/logistical issue.
>
> (2) Recognize new contributors as such and automatically (or at least
> semi-automatically) send them email explaining our expectations and
> standards for review etiquette - including this, but other things as well.
> This addresses the cultural issue.
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] glusterfs-3.12.1 released

2017-09-12 Thread Atin Mukherjee
On Tue, 12 Sep 2017 at 16:20, Shyam Ranganathan <srang...@redhat.com> wrote:

> On 09/11/2017 11:48 PM, Amye Scavarda wrote:
> > On Mon, Sep 11, 2017 at 8:35 PM, Amar Tumballi <atumb...@redhat.com>
> wrote:
> >>
> >>
> >> On 12-Sep-2017 6:47 AM, "Atin Mukherjee" <amukh...@redhat.com> wrote:
> >>
> >> Can someone please explain what's the reason for doing 3.12.1 so early,
> just
> >> 5 days after 3.12?
>
> May I understand, why the concern?


Honestly, I missed the fact that as per our release schedule 3.12.z is
scheduled for the 10th of every month, so no real objection here; what has
been done here is as per the process.

On the other side of it, after relooking at the .z updates schedule I was just
debating with myself whether we have to be absolutely strict on the dates,
especially when a new LTS release was pushed out to the users just a few days
back and we come out with a .1 update with a very limited number of bug fixes.
Probably a balance of the timeline and the volume of bug fixes is something
to be relooked at?


>
> >>
> >>
> >> As per the schedule? It's 10 days since release.
> >>
> >> -Amar
> >
> > Said another way, we have a scheduled maintenance day of the 10th for
> > this particular release:
> > https://www.gluster.org/release-schedule/
> >
> > Happy to welcome changes to this that make sense, because yes, it
> > periodically collides like this!
>
> Minor releases are bug fix releases, so things available as bug fixes
> will be pushed out. So if we want this changed, there needs to be
> reasoning around it as well (just saying).
>
> > -- amye
> >
> >>
> >> On Mon, 11 Sep 2017 at 22:58, Gluster Build System
> >> <jenk...@build.gluster.org> wrote:
> >>>
> >>>
> >>>
> >>> SRC:
> >>>
> http://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-3.12.1.tar.gz
> >>>
> >>> This release is made off v3.12.1
> >>>
> >>> -- Gluster Build System
> >>> ___
> >>> maintainers mailing list
> >>> maintainers@gluster.org
> >>> http://lists.gluster.org/mailman/listinfo/maintainers
> >>
> >> --
> >> - Atin (atinm)
> >>
> >> ___
> >> maintainers mailing list
> >> maintainers@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/maintainers
> >>
> >>
> >>
> >> ___
> >> maintainers mailing list
> >> maintainers@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/maintainers
> >>
> >
> >
> >
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Changing Submit Type on review.gluster.org

2017-09-07 Thread Atin Mukherjee
On Thu, Sep 7, 2017 at 11:50 AM, Nigel Babu  wrote:

> Hello folks,
>
> A few times, we've merged dependent patches out of order because the Submit
> type[1] did not block us from doing so. The last few times we've talked
> about
> this, we didn't actually take a strong decision either way. In yesterday's
> maintainers meeting, we agreed to change the Submit type to
> Rebase-If-Necessary. This change will happen on 18th September 2017.
>

One basic question (rather a clarification) here. If indeed a rebase is
necessary for a patch which was posted some time back and a regression had
passed at that time, with this change will a (centos) regression job be
re-triggered, and only once we have a positive vote will the patch make it
into the repo?


> What this means:
> * No more metadata flags added by Gerrit. There will only be a Change-Id,
>   Signed-off-by, and BUG (if you've added it). Gerrit itself will not add
> any
>   metadata.
> * If you push a patch on top of another patch, the Submit button will
> either be
>   grayed out because the dependent patches cannot be merged or they will be
>   submitted in the correct order in one go.
>
> Some of the concerns that have been raised:
> Q: With the Reviewed-on flag gone, how do we keep track of changesets
>(especially backports)?
> A: The Change-Id will get you all the data directly on Gerrit. As long you
>retain the Change-Id, Gerrit will get you the matching changesets.
>
> Q: Will who-wrote-what continue to work?
> A: As far as I can see, it continues to work. I ran the script against
>build-jobs repo and it works correctly. Additionally, we'll be setting
> up an
>instance of Gerrit Stats[2] to provide more detailed stats.
>
> Q: Can we have some of the metadata if not all?
> Q: Why can't we have the metadata if we change the submit type?
> A: There's no good answer to this other than, this is how Gerrit works and
>I can neither change it nor control it.
>
> [1]: https://review.gluster.org/Documentation/intro-project-
> owner.html#submit-type
> [2]: http://gerritstats-demo.firebaseapp.com/
>
> --
> nigelb
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Reminder: Maintainers' meeting tomorrow (Sept 6th, 2017)

2017-09-05 Thread Atin Mukherjee
On Tue, 5 Sep 2017 at 20:23, Nithya Balachandran 
wrote:

> There is a rather interesting tech talk happening at the same time
> tomorrow. Can this meeting be postponed?
>

I also vote for the same. Rather, I have highlighted earlier that this slot
conflicts with one of the important meetings at my company, which some of the
other maintainers from Red Hat are also part of. So I request a change
in the recurring slot.


> thanks,
> Nithya
>
> On 5 September 2017 at 11:02, Amar Tumballi  wrote:
>
>> All,
>>
>> Updated the agenda @
>> https://hackmd.io/MYTgzADARgplCGBaA7DArMxAWAZvATIiFhJgCYarBQCMCYIQA===?both
>>
>> Feel free to add your topic, or updates in the above hackmd page. Even if
>> you are not going to be present, updating your comments in draft would help.
>>
>> 
>> Meeting date: 09/06/2017 (Sept 06th)
>> BJ
>> Link
>>
>>- Bridge: https://bluejeans.com/205933580
>>
>>
>> 
>> Attendance
>>
>>- [Sorry Note]: vbellur
>>- 
>>
>>
>> 
>> Agenda
>>
>>-
>>
>>Action Items pending from last week
>>- Amar to send changes required for Protocol changes - DONE
>>   - Nigel & Jiffin to figure out more about ASan builds - ?
>>   - Blog posts:
>>  - Amye’s note to list
>>  
>> 
>>  - Not much action seen here
>>  - How to increase content/traction?
>>   -
>>
>>Gluster 4.0 - Status check
>>- GD2 (must have)
>>   - Protocol changes (should have)
>>   - Error code changes (good to have)
>>   - Monitoring (good to have)
>>   - RIO (optional)
>>-
>>
>>Meeting timings
>>- Some conflicts with the current time since last month, as RedHat
>>   moved its program call to around the same time, which many of the
>>   maintainers are also part of.
>>   - What other times are possible?
>>   - We can continue to use this for another 2 months till October,
>>   but after that surely need a change which doesn’t conflict.
>>   - For future, we recommend maintainers to consider this slot
>>   before agreeing for a recurring meeting at your company, so we can keep
>>   this slot static.
>>-
>>
>>Any more discussion required on review/patch etiquette?
>>- Jeff’s email
>>   
>> 
>>   - Amar’s reply
>>   
>> 
>>   - Any more comments? Suggestions for improvements?
>>-
>>
>>Summit - October 27th and 28th
>>- Gluster 4.0 - Release leads, life cycle, EOL
>>   - Discuss: What are the things we need next to call it Gluster 5.0
>>   ? Time bound? Feature Bound?
>>-
>>
>>Round Table (Check with every member, whats’ cooking in their domain)
>>
>> 
>> See you there tomorrow.
>>
>> --
>> Amar Tumballi (amarts)
>>
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://lists.gluster.org/mailman/listinfo/maintainers
>>
>>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.12: Status of features (Require responses!)

2017-07-31 Thread Atin Mukherjee
As part of the get-state enhancement efforts, primarily for the requirements
coming from the tendrl project, Samikshan is working on the patch to get the
geo-rep session details included in it. This is the only patch which is
pending atm.

@Samikshan - can we please put the patch up for review by tomorrow so that
we can get it in and its backport into 3.12 before RC0.

On Mon, 31 Jul 2017 at 21:38, Shyam Ranganathan  wrote:

> Hi,
>
> Here is an updated status:
>
> RC0 tagging date moved to 2nd August (awaiting 3 more feature backports
> before tagging RC0 and also some release notes for delivered features).
>
> This is a quick link for patches awaiting reviews and closure [2]
>
> Features awaiting backports (as these have enough activity on gerrit and
> look close to being done):
>
> 1) DISCARD support with EC (@sunilheggodu)
>- https://github.com/gluster/glusterfs/issues/254
>
> 2) allow users to enable use of localtime instead of UTC for log
> entries (@kalebskeithley)
>- https://github.com/gluster/glusterfs/issues/272
>
> 3) provide sub-directory mount option in fuse, for a given volume (@amarts)
>- https://github.com/gluster/glusterfs/issues/175
>
> Further, the following Tier related patches are awaiting a closure and a
> backport, request glusterd contributors attention on these,
> - https://review.gluster.org/15740
> - https://review.gluster.org/15503
>
> All other features mentioned below are either done or moved out of 3.12
> release, check the release lane [1] for details.
>
> Thanks,
> Shyam
> [1] See release 3.12 project lane:
> https://github.com/gluster/glusterfs/projects/1
>
> [2] Patches awaiting reviews and closure:
> https://review.gluster.org/#/q/starredby:srangana%2540redhat.com
>
> On 07/21/2017 04:06 PM, Shyam wrote:
> > Hi,
> >
> > Prepare for a lengthy mail, but needed for the 3.12 release branching,
> > so here is a key to aid the impatient,
> >
> > Key:
> > 1) If you asked for an exception to a feature (meaning delayed backport
> > to 3.12 branch post branching for the release) see "Section 1"
> >- Handy list of nick's that maybe interested in this:
> >  - @pranithk, @sunilheggodu, @aspandey, @amarts, @kalebskeithley,
> > @kshlm (IPv6), @jdarcy (Halo Hybrid)
> >
> > 2) If you have/had a feature targeted for 3.12 and have some code posted
> > against the same, look at "Section 2" AND we want to hear back from you!
> >- Handy list of nick's that should be interested in this:
> >  - @csabahenk, @nixpanic, @aravindavk, @amarts, @kotreshhr,
> > @soumyakoduri
> >
> > 3) If you have/had a feature targeted for 3.12 and have posted no code
> > against the same yet, see "Section 3", your feature is being dropped
> > from the release.
> >- Handy list of nick's that maybe interested in this:
> >  - @sanoj-unnikrishnan, @aravindavk, @kotreshhr, @amarts, @jdarcy,
> > @avra (people who filed the issue)
> >
> > 4) Finally, if you do not have any features for the release pending,
> > please help others out reviewing what is still pending, here [1] is a
> > quick link to those reviews.
> >
> > Sections:
> >
> > **Section 1:**
> > Exceptions granted to the following features: (Total: 8)
> > Reasons:
> >- Called out in the mail sent for noting exceptions and feature
> > status for 3.12
> >- Awaiting final changes/decision from a few Facebook patches
> >
> > Issue list:
> > - Implement an xlator to delay fops
> >- https://github.com/gluster/glusterfs/issues/257
> >
> > - Implement parallel writes feature on EC volume
> >- https://github.com/gluster/glusterfs/issues/251
> >
> > - DISCARD support with EC
> >- https://github.com/gluster/glusterfs/issues/254
> >
> > - Cache last stripe of an EC volume while write is going on
> >- https://github.com/gluster/glusterfs/issues/256
> >
> > - gfid-path by default
> >- https://github.com/gluster/glusterfs/issues/139
> >
> > - allow users to enable use of localtime instead of UTC for log entries
> >- https://github.com/gluster/glusterfs/issues/272
> >
> > - Halo translator: Hybrid mode
> >- https://github.com/gluster/glusterfs/issues/217
> >
> > - [RFE] Improve IPv6 support in GlusterFS
> >- https://github.com/gluster/glusterfs/issues/192
> >
> > **Section 2:**
> > Issues needing some further clarity: (Total: 6)
> > Reason:
> >- There are issues here, for which code is already merged (or
> > submitted) and issue is still open. This is the right state for an issue
> > to be in this stage of the release, as documentation or release-notes
> > would possibly be still pending, which will finally close the issue (or
> > rather mark it fixed)
> >- BUT, without a call out from the contributors that required code is
> > already merged in, it is difficult to assess if the issue should qualify
> > for the release
> >
> > Issue list:
> > - [RFE] libfuse rebase to latest?
> >- https://github.com/gluster/glusterfs/issues/153
> >- @csabahenk is this all done?
> >

Re: [Gluster-Maintainers] 4.0 discussions: Need time slots

2017-07-31 Thread Atin Mukherjee
On Mon, Jul 31, 2017 at 12:36 PM, Amar Tumballi  wrote:

> Hi All,
>
> It will be great to have everyone's participation for 4.0 discussions,
> considering it would be significant decision for the project. Hence, having
> a meeting slot which doesn't conflict with majority would be a great .
>
> Below are the time slots I am thinking for 4.0 discussions.
>
> Tuesday (1st Aug) - 8:30pm-9:30pm IST (11am - 12pm EDT).
> Wednesday (2nd Aug) - 5pm - 6pm IST (7:30am - 8:30am EDT)
> Wednesday (2nd Aug) - 7:30pm - 8:30pm IST (10am - 11am EDT)
> Friday (4th Aug) - 5pm - 7pm IST (7:30am - 9:30am EDT)
>

You meant 5pm - 6pm IST right? I only have this slot free.


>
> Please respond to this email, so I can create the meeting slot.
>
> For now, I will create a calendar invite for Wednesday 5pm-6pm IST.
>
> Regards,
> Amar
>
>
> --
> Amar Tumballi (amarts)
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Update on experimental branch (was Re: Jenkins build is back to normal : experimental-periodic #21)

2017-07-19 Thread Atin Mukherjee
On Wed, Jul 19, 2017 at 11:36 AM, Amar Tumballi  wrote:

> Experimental branch is back to successful build now. So I think its a good
> time to update everyone about it.
>
> Currently the branch has changes for:
>
>
>1. Added new on wire XDR changes.
>   - All is working as expected with newer protocol.
>   - So, if there is a change required for 4.0, I encourage everyone
>   to propose it and send a patch so we can test it out.
>2. Changes in STACK_WIND/UNWIND to measure count and latency.
>3. A separate way to dump metrics
>- Similar to statedump, but only metrics, ie, a key and a value (as
>   int/float) as entries in the file.
>   - Some more improvements planned to provide functions in xlator_t
>   itself for all translator to dump their private metrics if any.
>   - I got a project going to draw these metrics in graphite/grafana @
>   https://github.com/amarts/glustermetrics
>   
>   4. Subdir mount
>   - Still waiting for authentication handling in
>   xlator/protocol/auth/addr to send it to master.
>   - Will take CLI changes after handling auth, thought for now is,
>   same auth.allow volume set option can be reused in early state.
>   5. New discover fop.
>   - Currently it's implemented at xlator_t, defaults and all the places.
>   - Saw major regressions with DHT / AFR by using it.
>   - Hence reverted the consumption of the discover fop for now.
>6. Global Inode table.
>   - Most of the work was done and changes were present mainly in
>   inode/fd files of libglusterfs.
>   - Saw issues with regression tests, and some behaviour at the
>   moment, and have reverted the patch.
>
>
> Also it has one patch from Hari on tier add-brick/remove-brick.
>
>
> Considering I don't have more traffic on the branch yet, I am planning to
> branch out from master and rebase all these changes on master again, to
> stick with the 3-month branch-out timeline.
>

I believe rebasing these changes to master will be done post 3.12
branching, right?

Regards,
> Amar
>
> On Wed, Jul 19, 2017 at 11:07 AM,  wrote:
>
>> See > lay/redirect?page=changes>
>>
>>
>
>
> --
> Amar Tumballi (amarts)
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Backport for "Add back socket for polling of events immediately..."

2017-05-28 Thread Atin Mukherjee
On Sun, May 28, 2017 at 1:48 PM, Niels de Vos  wrote:

> On Fri, May 26, 2017 at 12:25:42PM -0400, Shyam wrote:
> > Or this one: https://review.gluster.org/15036
> >
> > This is backported to 3.8/10 and 3.11 and considering the size and
> impact of
> > the change, I wanted to be sure that we are going to accept this across
> all
> > 3 releases?
> >
> > @Du, would like your thoughts on this.
> >
> > @niels, @kaushal, @talur, as release owners, could you weigh in as well
> > please.
> >
> > I am thinking that we get this into 3.11.1 if there is agreement, and
> not in
> > 3.11.0 as we are finalizing the release in 3 days, and this change looks
> > big, to get in at this time.
>

Given 3.11 is going to be a new release, I'd recommend getting this fix in
(if we have time). https://review.gluster.org/#/c/17402/ is dependent on
this one.

>
> > Further the change is actually an enhancement, and provides performance
> > benefits, so it is valid as a change itself, but I feel it is too late to
> > add to the current 3.11 release.
>
> Indeed, and mostly we do not merge enhancements that are non-trivial to
> stable branches. Each change that we backport introduces the chance on
> regressions for users with their unknown (and possibly awkward)
> workloads.
>
> The patch itself looks ok, but it is difficult to predict how the change
> affects current deployments. I prefer to be conservative and not have
> this merged in 3.8, at least for now. Are there any statistics on how
> performance is affected by this change? Having features like this only
> in newer versions might also convince users to upgrade sooner, 3.8 will
> only be supported until 3.12 (or 4.0) gets released, which is approx. 3
> months from now according to our schedule.
>
> Niels
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] [Gluster-users] Don't allow data loss via add-brick (was Re: Add single server)

2017-05-03 Thread Atin Mukherjee
On Wed, May 3, 2017 at 3:41 PM, Raghavendra Talur  wrote:

> On Tue, May 2, 2017 at 8:46 PM, Nithya Balachandran 
> wrote:
> >
> >
> > On 2 May 2017 at 16:59, Shyam  wrote:
> >>
> >> Talur,
> >>
> >> Please wait for this fix before releasing 3.10.2.
> >>
> >> We will take in the change to either prevent add-brick in
> >> sharded+distrbuted volumes, or throw a warning and force the use of
> --force
> >> to execute this.
>
> Agreed, I have filed bug and marked as blocker for 3.10.2.
> https://bugzilla.redhat.com/show_bug.cgi?id=1447608
>
>
> >>
> > IIUC, the problem is less the add brick operation and more the
> > rebalance/fix-layout. It is those that need to be prevented (as someone
> > could trigger those without an add-brick).
>
> Yes, that problem seems to be with fix-layout/rebalance and not add-brick.
> However, depending on how users have arranged their dir structure, an
> add-brick without a fix-layout might be useless for them.
>
> I also had a look at the code to see if I can do the cli/glusterd
> change myself. However, sharding is enabled just as a xlator and not
> added to glusterd_volinfo_t.
> If someone from dht team could work with glusterd team here it would
> fix the issue faster.
>
> Action item on Nithya/Atin to assign bug 1447608 to someone. I will
> wait for the fix for 3.10.2.
>

Fix is up @ https://review.gluster.org/#/c/17160/ . The only thing which
we'd need to decide (and are debating on) is whether we should bypass this
validation with 'rebalance start force' or not. What do others think?


> Thanks,
> Raghavendra Talur
>
> >
> > Nithya
> >>
> >> Let's get a bug going, and not wait for someone to report it in
> bugzilla,
> >> and also mark it as blocking 3.10.2 release tracker bug.
> >>
> >> Thanks,
> >> Shyam
> >>
> >> On 05/02/2017 06:20 AM, Pranith Kumar Karampuri wrote:
> >>>
> >>>
> >>>
> >>> On Tue, May 2, 2017 at 9:16 AM, Pranith Kumar Karampuri
> >>> > wrote:
> >>>
> >>> Yeah it is a good idea. I asked him to raise a bug and we can move
> >>> forward with it.
> >>>
> >>>
> >>> +Raghavendra/Nitya who can help with the fix.
> >>>
> >>>
> >>>
> >>> On Mon, May 1, 2017 at 9:07 PM, Joe Julian  >>> > wrote:
> >>>
> >>>
> >>> On 04/30/2017 01:13 AM, lemonni...@ulrar.net
> >>>  wrote:
> >>>
> >>> So I was a little bit lucky. If I had all the hardware
> >>> part, probably I
> >>> would be fired after causing data loss by using
> >>> software marked as stable
> >>>
> >>> Yes, we lost our data last year to this bug, and it wasn't
> a
> >>> test cluster.
> >>> We still hear from it from our clients to this day.
> >>>
> >>> It is known that this feature is causing data loss and
> >>> there is no evidence or
> >>> warning in the official docs.
> >>>
> >>> I was (I believe) the first one to run into the bug, it
> >>> happens and I knew it
> >>> was a risk when installing gluster.
> >>> But since then I didn't see any warnings anywhere except
> >>> here, I agree
> >>> with you that it should be mentioned in big bold letters
> on
> >>> the site.
> >>>
> >>> Might even be worth adding a warning directly on the cli
> >>> when trying to
> >>> add bricks if sharding is enabled, to make sure no-one will
> >>> destroy a
> >>> whole cluster for a known bug.
> >>>
> >>>
> >>> I absolutely agree - or, just disable the ability to add-brick
> >>> with sharding enabled. Losing data should never be allowed.
> >>> ___
> >>> Gluster-devel mailing list
> >>> gluster-de...@gluster.org 
> >>> http://lists.gluster.org/mailman/listinfo/gluster-devel
> >>> 
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Pranith
> >>>
> >>> ___
> >>> Gluster-users mailing list
> >>> gluster-us...@gluster.org 
> >>> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>> 
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Pranith
> >>>
> >>>
> >>> ___
> >>> Gluster-devel mailing list
> >>> gluster-de...@gluster.org
> >>> http://lists.gluster.org/mailman/listinfo/gluster-devel
> >>>
> >> ___
> >> Gluster-devel mailing list
> >> gluster-de...@gluster.org
> >> 

Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #2932

2017-04-10 Thread Atin Mukherjee
bug-1421590-brick-mux-reuse-ports.t seems to be a bad test to me and here
is my reasoning:

This test tries to check if the ports are reused or not. When a volume is
restarted, by the time glusterd tries to allocate a new port to one of
the brick processes of the volume there is no guarantee that the older port
will be allocated, given the kernel might take some extra time to free up
the port within this time frame. From https://build.gluster.org/job/
regression-test-burn-in/2932/console we can clearly see that post restart
of the volume, glusterd allocated ports 49153 & 49155 for brick1 & brick2
respectively, but the test was expecting the ports to match 49155
& 49156, which were allocated before the volume was restarted.

@Jeff - Is there any specific reason we want to keep this test running?
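
As a generic illustration of why an immediate port-reuse expectation is shaky
(this is not the glusterd pmap logic itself, just the underlying kernel
behaviour; the port number below is arbitrary):

# A port that was in use moments ago is not guaranteed to be immediately
# re-bindable, because the old connection can linger in TIME_WAIT/FIN_WAIT.
import socket

ADDR = ("127.0.0.1", 49352)

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(ADDR)
srv.listen(1)
cli = socket.create_connection(ADDR)
conn, _ = srv.accept()

# The listening side closes the established connection first, so its end of
# the connection lingers for a while after everything is closed.
conn.close()
srv.close()
cli.close()

retry = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    retry.bind(ADDR)          # typically fails right away with EADDRINUSE
    print("port was immediately reusable")
except OSError as err:
    print("port not immediately reusable:", err)
finally:
    retry.close()

If we do keep the test, something like the test framework's EXPECT_WITHIN
(instead of an immediate EXPECT) would at least tolerate the delay, though it
still would not guarantee that the same port is handed back to the same brick.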


On Sat, Apr 8, 2017 at 8:12 AM, Atin Mukherjee <amukh...@redhat.com> wrote:

>
> On Sat, 8 Apr 2017 at 08:06, <jenk...@build.gluster.org> wrote:
>
>> See <http://build.gluster.org/job/regression-test-burn-in/2932/d
>> isplay/redirect>
>>
>> --
>> [...truncated 12020 lines...]
>> ok 5, LINENUM:32
>> ok 6, LINENUM:33
>> ok 7, LINENUM:35
>> not ok 8 , LINENUM:37
>> FAILED COMMAND: gluster --mode=script --wignore volume stop patchy
>> ok 9, LINENUM:38
>> not ok 10 , LINENUM:40
>> FAILED COMMAND: gluster --mode=script --wignore volume start patchy
>> not ok 11 Got "" instead of "49152", LINENUM:42
>> FAILED COMMAND: 49152 get_nth_brick_port_for_volume patchy 1
>> not ok 12 , LINENUM:47
>> FAILED COMMAND: gluster --mode=script --wignore volume stop patchy
>> not ok 13 , LINENUM:48
>> FAILED COMMAND: gluster --mode=script --wignore volume start patchy
>> not ok 14 Got "" instead of "49152", LINENUM:50
>> FAILED COMMAND: 49152 get_nth_brick_port_for_volume patchy 1
>> not ok 15 Got "" instead of "get_nth_brick_port_for_volume", LINENUM:51
>> FAILED COMMAND: get_nth_brick_port_for_volume patchy 2
>> not ok 16 , LINENUM:53
>> FAILED COMMAND: gluster --mode=script --wignore volume stop patchy
>> ok 17, LINENUM:55
>> not ok 18 , LINENUM:57
>> FAILED COMMAND: gluster --mode=script --wignore volume start patchy
>> not ok 19 Got "" instead of "49152", LINENUM:59
>> FAILED COMMAND: 49152 get_nth_brick_port_for_volume patchy 1
>> volume set: success
>> Failed 10/19 subtests
>>
>> Test Summary Report
>> ---
>> ./tests/bugs/core/bug-1421590-brick-mux-reuse-ports.t (Wstat: 0 Tests:
>> 19 Failed: 10)
>>   Failed tests:  8, 10-16, 18-19
>> Files=1, Tests=19, 249 wallclock secs ( 0.03 usr  0.01 sys + 13.40 cusr
>> 3.36 csys = 16.80 CPU)
>> Result: FAIL
>> End of test ./tests/bugs/core/bug-1421590-brick-mux-reuse-ports.t
>
>
> Something is wrong with this test; I have seen it failing in many
> regression-test burn-ins. I'll take a look at it.
>
>
>> 
>> 
>>
>>
>> Run complete
>> 
>> 
>> Number of tests found: 199
>> Number of tests selected for run based on pattern: 199
>> Number of tests skipped as they were marked bad:   7
>> Number of tests skipped because of known_issues:   4
>> Number of tests that were run: 188
>>
>> 1 test(s) failed
>> ./tests/bugs/core/bug-1421590-brick-mux-reuse-ports.t
>>
>> 0 test(s) generated core
>>
>>
>> Tests ordered by time taken, slowest to fastest:
>> 
>> 
>> ./tests/basic/ec/ec-12-4.t  -  336 second
>> ./tests/basic/ec/ec-7-3.t  -  199 second
>> ./tests/basic/ec/ec-6-2.t  -  178 second
>> ./tests/basic/ec/self-heal.t  -  158 second
>> ./tests/basic/afr/split-brain-favorite-child-policy.t  -  151 second
>> ./tests/basic/ec/ec-5-2.t  -  150 second
>> ./tests/basic/ec/ec-5-1.t  -  150 second
>> ./tests/basic/afr/entry-self-heal.t  -  150 second
>> ./tests/basic/afr/self-heal.t  -  137 second
>> ./tests/bugs/core/bug-1421590-brick-mux-reuse-ports.t  -  131 second
>> ./tests/basic/tier/legacy-many.t  -  127 second
>> ./tests/basic/tier/tier.t  -  126 second
>> ./tests/basic/ec/ec-4-1.t  -  123 second
>> ./tests/basic/ec/ec-optimistic-changelog.t  -  111 second
>> ./tests/basic/afr/self-heald.t  -  109 second
>> ./tests/basic/ec/ec-3-1.t  -  95 second
>>

Re: [Gluster-Maintainers] [Gluster-devel] Release 3.10.1: Scheduled for the 30th of March

2017-03-30 Thread Atin Mukherjee
On Wed, Mar 29, 2017 at 10:50 PM, Shyam <srang...@redhat.com> wrote:

> On 03/27/2017 12:59 PM, Shyam wrote:
>
>> Hi,
>>
>> It's time to prepare the 3.10.1 release, which falls on the 30th of each
>> month, and hence would be Mar-30th-2017 this time around.
>>
>> We have one blocker issue for the release, which is [1] "auth failure
>> after upgrade to GlusterFS 3.10", that we are tracking using the release
>> tracker bug [2]. @Atin, can we have this fixed in a day or 2, or does it
>> look like we may slip beyond that?
>>
>
> This looks almost complete, I assume that in the next 24h we should be
> able to have this backported and merged into 3.10.1.
>
> This means we will tag 3.10.1 in all probability tomorrow and packages for
> various distributions will follow.
>
>
Master patch is merged now. I've a backport
https://review.gluster.org/#/c/16967 ready for review.


>
>> This mail is to call out the following,
>>
>> 1) Are there any pending *blocker* bugs that need to be tracked for
>> 3.10.1? If so mark them against the provided tracker [2] as blockers for
>> the release, or at the very least post them as a response to this mail
>>
>
> I have not heard of any other issue (other than the rebalance+shard case,
> for which root cause is still in progress). So I will assume nothing else
> blocks the minor update.
>
>
>> 2) Pending reviews in the 3.10 dashboard will be part of the release,
>> *iff* they pass regressions and have the review votes, so use the
>> dashboard [3] to check on the status of your patches to 3.10 and get
>> these going
>>
>> 3) I have made checks on what went into 3.8 post 3.10 release and if
>> these fixes are included in 3.10 branch, the status on this is *green*
>> as all fixes ported to 3.8, are ported to 3.10 as well
>>
>
> This is still green.
>
>
>> 4) First cut of the release notes are posted here [4], if there are any
>> specific call outs for 3.10 beyond bugs, please update the review, or
>> leave a comment in the review, for me to pick it up
>>
>> Thanks,
>> Shyam
>>
>> [1] Pending blocker bug for 3.10.1:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1429117
>>
>> [2] Release bug tracker:
>> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.10.1
>>
>> [3] 3.10 review dashboard:
>> https://review.gluster.org/#/projects/glusterfs,dashboards/d
>> ashboard:3-10-dashboard
>>
>>
>> [4] Release notes WIP: https://review.gluster.org/16957
>> ___
>> Gluster-devel mailing list
>> gluster-de...@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>



-- 

Atin Mukherjee

Associate Manager, RHGS Development

Red Hat

<https://www.redhat.com>

amukh...@redhat.com | M: +919739491377
<http://redhatemailsignature-marketing.itos.redhat.com/>
IRC: atinm, twitter: @mukherjee_atin
<https://red.ht/sig>
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Cannot attend today's maintainer's syncup meeting

2017-03-08 Thread Atin Mukherjee
I am skipping this too; I am not keeping well. Sorry for the late notice.

On Wed, Mar 8, 2017 at 7:29 PM, Raghavendra Gowdappa 
wrote:

> All,
>
> Got some personal work and will not be able to attend today's meeting.
> Will sync up later with meeting minutes.
>
> regards,
> Raghavendra
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>



-- 

~ Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] Release 3.10 spurious(?) regression failures in the past week

2017-02-21 Thread Atin Mukherjee
On Tue, Feb 21, 2017 at 9:47 PM, Shyam  wrote:

> Update from week of: (2017-02-13 to 2017-02-21)
>
> This week we have 3 problems from fstat to report as follows,
>
> 1) ./tests/features/lock_revocation.t
> - *Pranith*, request you take a look at this
> - This seems to be hanging on CentOS runs causing *aborted* test runs
> - Some of these test runs are,
>   - https://build.gluster.org/job/centos6-regression/3256/console
>   - https://build.gluster.org/job/centos6-regression/3196/console
>   - https://build.gluster.org/job/centos6-regression/3196/console
>
> 2) tests/basic/quota-anon-fd-nfs.t
> - This had one spurious failure in 3.10
> - I think it is because of not checking if NFS mount is available (which
> is anyway a good check to have in the test to avoid spurious failures)
> - I have filed and posted a fix for the same,
>   - Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425515
>   - Possible Fix: https://review.gluster.org/16701
>
> 3) ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-
> connection-issue.t
> - *Milind/Hari*, request you take a look at this
> - This seems to have about 8 failures in the last week on master and
> release-3.10
> - The failure seems to stem from tier.rc:function rebalance_run_time (line
> 133)?
> - Logs follow,
>
> 
>   02:36:38 [10:36:38] Running tests in file ./tests/bugs/glusterd/bug-1303
> 028-Rebalance-glusterd-rpc-connection-issue.t
>   02:36:45 No volumes present
>   02:37:36 Tiering Migration Functionality: patchy: failed: Tier daemon is
> not running on volume patchy
>   02:37:36 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 +  * 60
> + : syntax error: operand expected (error token is "* 3600 +  * 60 + ")
>   02:37:36 
> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t:
> line 23: [: : integer expression expected
>   02:37:41 Tiering Migration Functionality: patchy: failed: Tier daemon is
> not running on volume patchy
>   02:37:41 ./tests/bugs/glusterd/../../tier.rc: line 133: * 3600 +  * 60
> + : syntax error: operand expected (error token is "* 3600 +  * 60 + ")
>   02:37:41 
> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t:
> line 23: [: : integer expression expected
>   02:37:41 
> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t:
> line 23: [: -: integer expression expected
>   02:37:41 
> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
> ..
>   ...
>   02:37:41 ok 14, LINENUM:69
>   02:37:41 not ok 15 Got "1" instead of "0", LINENUM:70
>   02:37:41 FAILED COMMAND: 0 tier_daemon_check
>   02:37:41 not ok 16 Got "1" instead of "0", LINENUM:72
>   02:37:41 FAILED COMMAND: 0 non_zero_check
>   02:37:41 not ok 17 Got "1" instead of "0", LINENUM:75
>   02:37:41 FAILED COMMAND: 0 non_zero_check
>   02:37:41 not ok 18 Got "1" instead of "0", LINENUM:77
>   02:37:41 FAILED COMMAND: 0 non_zero_check -
>   02:37:41 Failed 4/18 subtests
> 
>


http://lists.gluster.org/pipermail/gluster-devel/2017-February/052137.html

Hari did mention that he has identified the issue and will be sending a
patch soon.
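
As a side note, the tier.rc error quoted above ("* 3600 +  * 60 + : syntax
error: operand expected") is the usual shell pitfall of expanding empty
variables inside an arithmetic expression. A minimal reproduction (the
variable names here are assumed, not taken from tier.rc):

  hr=""; min=""; sec=""    # e.g. when the tier daemon is down and the status output is empty
  echo $(( $hr * 3600 + $min * 60 + $sec ))
  # bash: * 3600 +  * 60 + : syntax error: operand expected

Using defaults such as $(( ${hr:-0} * 3600 + ${min:-0} * 60 + ${sec:-0} )),
or validating the status output before doing the arithmetic, avoids the error.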


> Shyam
>
>
> On 02/15/2017 09:25 AM, Shyam wrote:
>
>> Update from week of: (2017-02-06 to 2017-02-13)
>>
>> No major failures to report this week, things look fine from a
>> regression suite failure stats perspective.
>>
>> Do we have any updates on the older cores? Specifically,
>>   - https://build.gluster.org/job/centos6-regression/3046/consoleText
>> (./tests/basic/tier/tier.t -- tier rebalance)
>>   - https://build.gluster.org/job/centos6-regression/2963/consoleFull
>> (./tests/basic/volume-snapshot.t -- glusterd)
>>
>> Shyam
>>
>> On 02/06/2017 02:21 PM, Shyam wrote:
>>
>>> Update from week of: (2017-01-30 to 2017-02-06)
>>>
>>> Failure stats and actions:
>>>
>>> 1) ./tests/basic/tier/tier.t
>>> Core dump needs attention
>>> https://build.gluster.org/job/centos6-regression/3046/consoleText
>>>
>>> Looks like the tier rebalance process has crashed (see below for the
>>> stack details)
>>>
>>> 2) ./tests/basic/ec/ec-background-heals.t
>>> Marked as bad in master, not in release-3.10. May cause unwanted
>>> failures in 3.10 and as a result marked this as bad in 3.10 as well.
>>>
>>> Commit: https://review.gluster.org/16549
>>>
>>> 3) ./tests/bitrot/bug-1373520.t
>>> Marked as bad in master, not in release-3.10. May cause unwanted
>>> failures in 3.10 and as a result marked this as bad in 3.10 as well.
>>>
>>> Commit: https://review.gluster.org/16549
>>>
>>> Thanks,
>>> Shyam
>>>
>>> On 01/30/2017 03:00 PM, Shyam wrote:
>>>
 Hi,

 The following is a list of spurious(?) regression failures in the 3.10
 branch last week (from fstat.gluster.org).

 Request component owner or other devs to take a look at the failures,
 and weed out real issues.

 Regression failures 3.10:

 Summary:
 1) https://build.gluster.org/job/centos6-regression/2960/consoleFull
   

Re: [Gluster-Maintainers] Move out of bugzilla to github issues --> for everything...

2017-02-17 Thread Atin Mukherjee
On Thu, 16 Feb 2017 at 06:36, Shyam  wrote:

> On 02/15/2017 04:27 PM, Amye Scavarda wrote:
> > On Wed, Feb 8, 2017 at 11:04 AM, Shyam  > > wrote:
> >
> > Hi,
> >
> > In today's maintainers meeting I wanted to introduce what it would
> > take us to move away from bugzilla to github. The details are in [1].
> >
> > Further to this, below is a mail detailing the plan(s) and attention
> > needed to make this happen. Further I am setting up a maintainer
> > meet on Friday 10th Feb, 2017, to discuss this plan and also discuss,
> > - Focus areas: ownership and responsibility
> > - Backlog population into github
> >
> > Request that maintainers attend this, as without a quorum we
> > *cannot* make this move. If you are unable to attend, the please let
> > us know any feedback on these plans that we need to consider.
> >
> > Calendar and plans for moving away from bugzilla to gihub:
> > 1) Arrive maintainer consensus on the move
> >   - 15th Feb, 2017
> >   - This would require understanding [1] and figuring out if all
> > requirements are considered
> >   - We will be discussing [1] in detail on the coming Friday.
> >
> > 2) Announce plans to the larger development and user community for
> > consensus
> >   - Close consensus by 22nd Feb, 2017
> >
> > 3) (request and) Work with Infra folks for worker ant like
> > integration to github instead of bugzilla
> >   - Date TBD (done in parallel from the beginning)
> >
> > <>
> >
> > 4) Announce migration plans to larger community, calling out a 2
> > week window, after which bugzilla will be closed (available for
> > historical reasons), and gerrit will also not accept bug IDs for
> changes
> >   - 27th Feb, 2017
> >
> > 5) Close bugzilla and update gerrit as needed
> >   - 10th Feb, 2017 weekend
> >
> > 6) Go live on the weekend specified in (5)
> >
> > Shyam
> > [1] http://bit.ly/2kIoFJf
> > ___
> > maintainers mailing list
> > maintainers@gluster.org 
> > http://lists.gluster.org/mailman/listinfo/maintainers
> > 
> >
> >
> >
> > I don't see a ton of followup around this issue here, and we should have
> > this conversation in -devel as well. I feel like it's gotten lost in
> > Winter Conference Season.
> >
> > That being said, here's my take: this seems hasty and doing this in a
> > few weeks seems like asking for trouble.
> >
> > Bugzilla has a number of features that we're currently using, and some
> > that we're really going to need as a project. Being able to have a
> > feature that whines at you if you haven't touched an issue in some time?
> > Helpful.
>
> So, BZ features we (myself and a few others) considered most of what we
> use (clones, release tracker, keywords etc.). Yes, there are folks who
> may have setup whines and other such, but we will lose that.
>
> We need people to speak up on what they may lose or what they want, so
> that we can evaluate it.


Personally, I am not in favour of moving to a 100% GitHub model over
Bugzilla, as I can form almost any query I need out of Bugzilla. That gives me
better tracking ability, especially as a maintainer. Unless and until I can do
the same granular things with GitHub that I do with Bugzilla, I am not
convinced (prove me wrong; I do admit that I am a Bugzilla expert but not a
GitHub one!).

I think moving to a GitHub model for maintaining releases and the status of
important features is a good step, but we should continue to use Bugzilla when
it comes to bug fixes.


>
> >
> > Bugzilla also has some features that we, as a project, are going to
> > need, even if we don't think we need them right now.
>
> Bugzilla is much more than github issues will ever be, the question is
> can we live with github, possibly not the other way around.
>
> I would say we (as in maintainers) agreed to living with github, now we
> need to realize the limitations (if any).
>
> >
> > How does Github help a project with something like a zero-day issue that
> > needs to be fixed but can't be public?
> > Or other security issues?
>
> Does a secur...@gluster.org like list help here? People who are
> reporting security vulnerabilities are also responsible not to make it
> public (I think), so reaching out to a mailing list that is more
> strictly controlled may help here?
>
> How is this handled in Redhat Bugzilla for upstream/opensource projects?
> That may help arriving at solutions to the problem.
>
> > Does the current github workflow make sense? What other features are
> > they likely to add in the future? What happens if they take it away?
> > We're at their mercy. (Which is fine if we decide that, but we need to
> > choose.)
>
> Does the current workflow make sense? Only others need to answer 

Re: [Gluster-Maintainers] Stop sending patches for and merging patches on release-3.7

2017-02-01 Thread Atin Mukherjee
Thanks Shyam!

On Thu, 2 Feb 2017 at 06:45, Shyam <srang...@redhat.com> wrote:

> On 02/01/2017 12:14 PM, Atin Mukherjee wrote:
> >
> > On Wed, 1 Feb 2017 at 17:06, Kaushal M <kshlms...@gmail.com
> > <mailto:kshlms...@gmail.com>> wrote:
> >
> > Hi all,
> >
> > GlusterFS-3.7.20 is intended to be the final release for release-3.7.
> > 3.7 enters EOL with the expected release of 3.10 in about 2 weeks.
> >
> >
> > Am I missing any key points here? With 3.10 coming in, 3.9 goes EOL
> > considering it is short term and then don't we need to maintain 3
> > releases at a point? In that case 3.7 still needs to be active with 3.8
> > & 3.10, no?
>
> No, we maintain the last 2 LTMs, so in this case that is 3.8 and 3.10
> (when 3.10 releases).
>
> Intermediate STMs are maintained for the 3 months till the next LTM
> comes along to replace it.
>
> Reference: https://www.gluster.org/community/release-schedule/
>
> >
> >
> >
> > Once 3.10 is released I'll be closing any open bugs on 3.7 and
> > abandoning any patches on review.
> >
> > So as the subject says, developers please stop sending changes to
> > release-3.7, and maintainers please don't merge any more changes onto
> > release-3.7.
> >
> > ~kaushal
> > ___
> > maintainers mailing list
> > maintainers@gluster.org <mailto:maintainers@gluster.org>
> > http://lists.gluster.org/mailman/listinfo/maintainers
> >
> > --
> > - Atin (atinm)
> >
> >
> > ___
> > maintainers mailing list
> > maintainers@gluster.org
> > http://lists.gluster.org/mailman/listinfo/maintainers
> >
>
-- 
- Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://lists.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Requesting patch to be considered for 3.10 release

2017-01-06 Thread Atin Mukherjee
This would definitely be a great addition, looking at past user requests and
bugs. I have already acked the patch and am awaiting Pranith's review.

@Samikshan - Can we file a GitHub issue for the same (with a spec), and then,
if it's approved, put it in the release-3.10 lane?



On Fri, Jan 6, 2017 at 4:34 PM, Samikshan Bairagya 
wrote:

> Hi all,
>
>
> This patch, http://review.gluster.org/#/c/16303/ adds details on maximum
> supported op-version for clients to the volume status  output.
> This might be a useful change to have as part of the upcoming 3.10 release
> and I would like to request the respective maintainers to consider the
> same. https://bugzilla.redhat.com/show_bug.cgi?id=1409078 is the
> corresponding BZ for this change.
>
> Thanks and Regards,
>
> Samikshan
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>



-- 

~ Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Release 3.10: Feature page status (and call for reviews)

2016-12-19 Thread Atin Mukherjee
On Mon, Dec 19, 2016 at 5:48 PM, Shyam  wrote:

> A) We need reviewers/maintainers to complete the reviews of the feature
> pages that are under review, and get them to an accepted state.
>
> B) We need the missing feature pages added, so that those can go through
> the required reviews.
>
> Just a reminder on the timelines, we have branching set to: 17th Jan, 2017
> (at which point code for the feature needs to be in merged state)
>
> Following are the list of features, and their corresponding feature pages,
> and its review status.
>
> Accepted state:
> 1) Disable creation of trash directory by default
> https://github.com/gluster/glusterfs-specs/blob/master/accep
> ted/Trash-Improvements.md
>
> 2) SELinux support for Gluster Volumes
> https://github.com/gluster/glusterfs-specs/blob/master/accep
> ted/SELinux-client-support.md
>
> Under review: (Reviewers please take a look)
> 1) Implement Parallel readdirp in dht
> http://review.gluster.org/#/c/16090/
>
> 2) Estimate how long it will take for a rebalance operation to complete
> http://review.gluster.org/#/c/16119/
>
> 3) Add brick multiplexing
> https://github.com/gluster/glusterfs-specs/blob/master/under
> _review/multiplexing.md
> - I think this can be moved to accepted, Jeff request a patch for
> the same
>
> 4) Introduce force option for Snapshot Restore
> http://review.gluster.org/#/c/16097/
>
> 5) Support to get maximum op-version supported in a heterogeneous cluster
> http://review.gluster.org/#/c/16118/


This is reviewed, and once Samikshan addresses the comments it should be OK
to accept. We are also exploring the possibility of accumulating the clients'
op-versions in the same CLI, and will see whether it is feasible to implement
within the 3.10 timelines. We'll hear from Samikshan on this.
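
As a rough illustration of what such a query could look like from the CLI once
the feature lands (the exact option and command names below are assumptions
based on the feature description, not the final interface):

  # hypothetical: highest op-version every peer/client can support
  gluster volume get all cluster.max-op-version
  # hypothetical: op-version the cluster is currently operating at
  gluster volume get all cluster.op-version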


>
>
> 6) Introducing block CLI commands
> http://review.gluster.org/#/c/16092/


This needs further discussion/details on whether this has to be supported
through the Gluster CLI or other tools.


>
> Feature page missing:
> 1) Implement statedump for gfapi applications (Rajesh)
> 2) In gfapi fix memory leak during graph switch (Rajesh)
> 3) switch to storhaug for HA for ganesha and samba (Kaleb)
> 4) volume expansion/contraction for tiered volumes (stage 1 - tier as a
> service) (Dan)
>   - Google doc present, need a feature page
> 5) multithreaded promotion/demotion in tiering (Dan)
>   - Google doc present, need a feature page
>   - Google doc not accessible
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>



-- 

~ Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Please pause merging patches to 3.9 waiting for just one patch

2016-11-09 Thread Atin Mukherjee
On Thu, Nov 10, 2016 at 1:04 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

> I am trying to understand the criticality of these patches. Raghavendra's
> patch is crucial because gfapi workloads(for samba and qemu) are affected
> severely. I waited for Krutika's patch because VM usecase can lead to disk
> corruption on replace-brick. If you could let us know the criticality and
> we are in agreement that they are this severe, we can definitely take them
> in. Otherwise next release is better IMO. Thoughts?
>

If you are asking how critical they are, then the first two are definitely
not, but the third one is critical: if a user upgrades from 3.6 to the latest
release with quota enabled, further peer probes get rejected, and the only
workaround is to disable quota and re-enable it.
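
For the record, the workaround amounts to something like the following (with a
placeholder volume name; note that disabling quota usually drops the
configured limits, so they may need to be set again afterwards):

  gluster volume quota <VOLNAME> disable
  gluster volume quota <VOLNAME> enable
  # re-apply any limits that were configured earlier, for example:
  gluster volume quota <VOLNAME> limit-usage /dir 10GB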

On a different note, the 3.9 head is not static and keeps moving forward. So
if you are really expecting only critical patches to go in, that's not
happening; just a word of caution!


> On Thu, Nov 10, 2016 at 12:56 PM, Atin Mukherjee <amukh...@redhat.com>
> wrote:
>
>> Pranith,
>>
>> I'd like to see following patches getting in:
>>
>> http://review.gluster.org/#/c/15722/
>> http://review.gluster.org/#/c/15714/
>> http://review.gluster.org/#/c/15792/
>>
>
>>
>>
>>
>> On Thu, Nov 10, 2016 at 7:12 AM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>> hi,
>>>   The only problem left was EC taking more time. This should affect
>>> small files a lot more. Best way to solve it is using compound-fops. So for
>>> now I think going ahead with the release is best.
>>>
>>> We are waiting for Raghavendra Talur's http://review.gluster.org/#/c/
>>> 15778 before going ahead with the release. If we missed any other
>>> crucial patch please let us know.
>>>
>>> Will make the release as soon as this patch is merged.
>>>
>>> --
>>> Pranith & Aravinda
>>>
>>> ___
>>> maintainers mailing list
>>> maintainers@gluster.org
>>> http://www.gluster.org/mailman/listinfo/maintainers
>>>
>>>
>>
>>
>> --
>>
>> ~ Atin (atinm)
>>
>
>
>
> --
> Pranith
>



-- 

~ Atin (atinm)
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Why a -2 need to be carried over to next patch set?

2016-10-11 Thread Atin Mukherjee
A -2 on a patch indicates that a reviewer has strongly disagreed with the
changes done in that patch, but is it right to carry the same vote forward to
the subsequent patch set(s)? What if the changes in the following patch sets
are in line with the comments on the patch set where the -2 was given? As it
stands, until the same reviewer revokes the -2, the patch can't be merged. Is
this what was intended?

My primary concern here is that if the concerned person is unavailable (for
various reasons), the acceptance of the patch gets delayed even when
co-maintainers of the same module have acked the patch.

What do others think here? Should we continue to carry over a -2 on the
subsequent patch sets?
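
For context, in Gerrit this behaviour comes from the label configuration: a -2
is carried forward to new patch sets when copyMinScore is enabled for the
Code-Review label. Assuming that is how our instance is configured, the knob
in project.config would look roughly like this:

  [label "Code-Review"]
      copyMinScore = false   # do not copy the lowest score (-2) to new patch sets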

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #1682

2016-09-11 Thread Atin Mukherjee
http://review.gluster.org/15457 fixes this failure.

On Sun, Sep 11, 2016 at 8:24 AM,  wrote:

> See 
>
> --
> [...truncated 16035 lines...]
> ./tests/bugs/glusterd/bug-1367478-volume-start-validation-after-glusterd-restart.t
> -  29 second
> ./tests/basic/afr/durability-off.t  -  29 second
> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
> -  28 second
> ./tests/bugs/glusterd/bug-1173414-mgmt-v3-remote-lock-failure.t  -  28
> second
> ./tests/bugs/glusterd/859927/repl.t  -  28 second
> ./tests/basic/gfapi/gfapi-ssl-test.t  -  28 second
> ./tests/basic/geo-replication/marker-xattrs.t  -  28 second
> ./tests/bugs/glusterd/bug-1318591-skip-non-directories-inside-vols.t  -
> 27 second
> ./tests/bugs/glusterd/bug-1293414-import-brickinfo-uuid.t  -  27 second
> ./tests/bugs/glusterd/bug-1260185-donot-allow-detach-commit-unnecessarily.t
> -  27 second
> ./tests/basic/tier/file_with_spaces.t  -  27 second
> ./tests/bugs/glusterd/bug-1231437-rebalance-test-in-cluster.t  -  26
> second
> ./tests/bugs/glusterd/bug-1230121-replica_subvol_count_correct_cal.t  -
> 26 second
> ./tests/basic/op_errnos.t  -  26 second
> ./tests/basic/afr/arbiter-add-brick.t  -  26 second
> ./tests/bugs/glusterd/bug-1352277-spawn-daemons-on-two-node-setup.t  -
> 25 second
> ./tests/bugs/glusterd/bug-1109741-auth-mgmt-handshake.t  -  25 second
> ./tests/bugs/glusterd/bug-1075087.t  -  25 second
> ./tests/bugs/ec/bug-1188145.t  -  25 second
> ./tests/basic/glusterd/volfile_server_switch.t  -  25 second
> ./tests/bugs/distribute/bug-1193636.t  -  24 second
> ./tests/basic/tier/readdir-during-migration.t  -  24 second
> ./tests/basic/ec/quota.t  -  24 second
> ./tests/basic/afr/heal-quota.t  -  24 second
> ./tests/bugs/glusterd/bug-1351021-rebalance-info-post-glusterd-restart.t
> -  23 second
> ./tests/bugs/bitrot/bug-1227996.t  -  23 second
> ./tests/basic/afr/gfid-self-heal.t  -  23 second
> ./tests/bugs/glusterd/bug-1223213-peerid-fix.t  -  22 second
> ./tests/bugs/glusterd/bug-1104642.t  -  22 second
> ./tests/bugs/cli/bug-1113476.t  -  22 second
> ./tests/bugs/changelog/bug-1225542.t  -  22 second
> ./tests/basic/afr/granular-esh/replace-brick.t  -  22 second
> ./tests/basic/0symbol-check.t  -  22 second
> ./tests/bugs/glusterd/bug-1213295-snapd-svc-uninitialized.t  -  21 second
> ./tests/bugs/glusterd/bug-1047955.t  -  21 second
> ./tests/bugs/fuse/bug-983477.t  -  21 second
> ./tests/bitrot/bug-1207627-bitrot-scrub-status.t  -  21 second
> ./tests/basic/ec/statedump.t  -  21 second
> ./tests/bugs/glusterd/bug-1323287-real_path-handshake-test.t  -  20 second
> ./tests/bugs/gfapi/glfs_vol_set_IO_ERR.t  -  20 second
> ./tests/bugs/distribute/bug-853258.t  -  20 second
> ./tests/bugs/distribute/bug-1099890.t  -  20 second
> ./tests/bugs/bitrot/bug-1245981.t  -  20 second
> ./tests/bitrot/br-state-check.t  -  20 second
> ./tests/basic/afr/split-brain-resolution.t  -  20 second
> ./tests/basic/afr/replace-brick-self-heal.t  -  20 second
> ./tests/bugs/glusterd/bug-859927.t  -  19 second
> ./tests/bugs/glusterd/bug-1245142-rebalance_test.t  -  19 second
> ./tests/bugs/glusterd/bug-1209329_daemon-svcs-on-reset-volume.t  -  19
> second
> ./tests/bugs/glusterd/bug-1089668.t  -  19 second
> ./tests/bugs/distribute/bug-973073.t  -  19 second
> ./tests/bugs/distribute/bug-884455.t  -  19 second
> ./tests/bugs/cli/bug-1077682.t  -  19 second
> ./tests/bugs/cli/bug-1047416.t  -  19 second
> ./tests/bugs/changelog/bug-1321955.t  -  19 second
> ./tests/bitrot/bug-1373520.t  -  19 second
> ./tests/basic/glusterd/arbiter-volume-probe.t  -  19 second
> ./tests/basic/afr/resolve.t  -  19 second
> ./tests/bugs/glusterd/bug-765230-remove-quota-related-option-after-disabling-quota.t
> -  18 second
> ./tests/bugs/glusterd/bug-1242543-replace-brick.t  -  18 second
> ./tests/bugs/glusterd/bug-1225716-brick-online-validation-remove-brick.t
> -  18 second
> ./tests/bugs/glusterd/bug-1002556.t  -  18 second
> ./tests/bugs/distribute/bug-862967.t  -  18 second
> ./tests/bugs/distribute/bug-1161156.t  -  18 second
> ./tests/bugs/distribute/bug-1063230.t  -  18 second
> ./tests/bugs/changelog/bug-1211327.t  -  18 second
> ./tests/basic/glusterd/disperse-create.t  -  18 second
> ./tests/basic/cdc.t  -  18 second
> ./tests/basic/bd.t  -  18 second
> ./tests/basic/afr/client-side-heal.t  -  18 second
> ./tests/bugs/glusterd/bug-889630.t  -  17 second
> ./tests/bugs/glusterd/bug-1265479-validate-replica-volume-options.t  -
> 17 second
> ./tests/bugs/cli/bug-822830.t  -  17 second
> ./tests/bugs/bug-1110262.t  -  17 second
> ./tests/bugs/bitrot/bug-1228680.t  -  17 second
> ./tests/bugs/bitrot/1209751-bitrot-scrub-tunable-reset.t  -  17 second
> ./tests/basic/volume-status.t  -  17 second
> ./tests/bugs/glusterd/bug-948729/bug-948729.t  -  16 second
> ./tests/bugs/glusterd/bug-948729/bug-948729-force.t  -  16 second
> 

Re: [Gluster-Maintainers] port allocation change for 3.8, needs release-notes update

2016-09-07 Thread Atin Mukherjee
On Wednesday 7 September 2016, Pranith Kumar Karampuri 
wrote:

> Just wondering, is it going to be something more than the aggregation of
> BZ descriptions? Do you think it is better to callout just some BZs which
> are user facing compared to the normal aggregation?
>

IMHO, users don't go through all the individual BZs mentioned in the release
note, so any major change w.r.t. usability/performance should be explicitly
called out in a separate section.


> On Wed, Sep 7, 2016 at 8:33 PM, Niels de Vos  > wrote:
>
>> Hi Avra,
>>
>> http://review.gluster.org/15308 is one of your patches, and this changes
>> the allocation of ports used. It seems to address a real problem, so it
>> is acceptible to include it in 3.8.
>>
>> Because it is a user facing change (different ports), we need to mention
>> the difference in behaviour in the release notes. Could you provide me
>> with a suitable text that includes the problem being addressed, and how
>> the usage of ports differs from before?
>>
>> Thanks,
>> Niels
>>
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> 
>> http://www.gluster.org/mailman/listinfo/maintainers
>>
>>
>
>
> --
> Pranith
>


-- 
--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] [Gluster-devel] GlusterFs upstream bugzilla components Fine graining

2016-09-06 Thread Atin Mukherjee
On Tue, Sep 6, 2016 at 12:42 PM, Muthu Vigneshwaran 
wrote:

> Hi,
>
>   Actually the currently component list in the Bugzilla appears to be in
> just alphabetical order of all components and sub-components as a flattened
> list.
>
> Planning to better organize the component list. So the bugs can be
> reported on the components( mostly matching different git repositories) and
> sub-components( mostly matching different components in the git repository,
> or functionality ) in the list respectively which will help in easy access
> for the reporter of the bug and as well as the assignee.
>
> Along with these changes we will have only major version number(3.6,
> 3.7..) (as mentioned in an earlier email from Kaleb - check that :) )
> unlike previously we had major version with minor version. Reporter has to
> mention the minor version in the description (the request for the exact
> version is already part of the template)
>
> In order to do so we require the maintainers to list their top-level
> component and sub-components to be listed along with the version for
> each.You should include the version for glusterfs (3.6,3.7,3.8,3.9,mainline
> ) and the sub-components as far as you have them ready. Also give examples
> of other components and their versions (gdeploy etc). It makes a huge
> difference for people to amend that has bits missing, starting from scratch
> without examples it difficult ;-)
>

This is the tree structure for the cli, glusterd & glusterd2 sub-components.
Although glusterd2 is currently maintained as a separate GitHub project under
gluster, going forward it would be integrated into the main repo, and hence
there is no point in maintaining it as a separate component in Bugzilla, IMHO.
@Kaushal - let us know if you think otherwise.

glusterfs
  |- cli
  |- glusterd
  |- glusterd2


> Thanks and regards,
> Muthu Vigneshwaran and Niels
>
>
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #1623

2016-09-01 Thread Atin Mukherjee
This is strange; Ashish has already fixed the issue of the missing SSL certs
through fad93c1.

From the logs it definitely looks like glusterd failed to come up:

[2016-09-01 13:30:28.909555] E [socket.c:4122:socket_init]
0-socket.management: could not load our cert



On Thu, Sep 1, 2016 at 7:38 PM, Vijay Bellur  wrote:

> tests/bugs/cli/bug-1320388.t is failing quite frequently in
> regression-test burn in. Can you please take a look in?
>
> Thx,
> Vijay
>
> On Thu, Sep 1, 2016 at 9:33 AM,   wrote:
> > See 
> >
> > --
> > [...truncated 10509 lines...]
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > touch: cannot touch `/mnt/glusterfs/0/a': Transport endpoint is not
> connected
> > cat: 
> > /var/lib/glusterd/vols/patchy/run/slave28.cloud.gluster.org-d-backends-patchy5.pid:
> No such file or directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > ./tests/bugs/cli/bug-1320388.t: line 19: /mnt/glusterfs/0/a: Transport
> endpoint is not connected
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
> or kill -l [sigspec]
> > sed: read error on /var/run/gluster/: Is a directory
> > rm: cannot remove `/var/run/gluster/': Is a directory
> > ./tests/bugs/cli/bug-1320388.t ..
> > 1..11
> > not ok 1 , LINENUM:11
> > FAILED COMMAND: glusterd
> > not ok 2 , LINENUM:12
> > FAILED COMMAND: pidof glusterd
> > ok 3, LINENUM:13
> > ok 4, LINENUM:14
> > not ok 5 , LINENUM:15
> > FAILED COMMAND: glusterfs --entry-timeout=0 --attribute-timeout=0 -s
> slave28.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/0
> > not ok 6 Got "" instead of "^6$", LINENUM:16
> > FAILED COMMAND: ^6$ ec_child_up_count patchy 0
> > not ok 7 , LINENUM:18
> > FAILED COMMAND: kill_brick patchy slave28.cloud.gluster.org
> /d/backends/patchy5
> > not ok 8 Got "" instead of "^5$", LINENUM:20
> > FAILED COMMAND: ^5$ get_pending_heal_count patchy
> > not ok 9 Got "" instead of "^6$", LINENUM:22
> > FAILED COMMAND: ^6$ ec_child_up_count patchy 0
> > ok 10, LINENUM:23
> > not ok 11 Got "" instead of "^0$", LINENUM:24
> > FAILED COMMAND: ^0$ get_pending_heal_count patchy
> > Failed 8/11 subtests
> >
> > Test Summary Report
> > ---
> > ./tests/bugs/cli/bug-1320388.t (Wstat: 0 Tests: 11 Failed: 8)
> >   Failed tests:  1-2, 5-9, 11
> > Files=1, Tests=11, 202 wallclock secs ( 0.02 usr  0.01 sys + 12.55 cusr
> 3.28 csys = 15.86 CPU)
> > Result: FAIL
> > End of test ./tests/bugs/cli/bug-1320388.t

Re: [Gluster-Maintainers] Request to provide PASS flags to a patch in gerrit

2016-08-31 Thread Atin Mukherjee
On Wed, Aug 31, 2016 at 4:23 PM, Raghavendra Talur 
wrote:

> Hi All,
>
> We have a test [1] which is causing hangs in NetBSD. We have not been able
> to debug the issue yet.
> It could be because the bash script does not comply with posix guidelines
> or that there is a bug in the brick code.
>
> However, as we have 3.9 merge deadline tomorrow this is causing the test
> pipeline to grow a lot and needing manual intervention.
> I recommend we disable this test for now. I request Kaushal to provide
> pass flags to the patch [2] for faster merge.
>

+1 to this as we have a very long regression queue in the pipeline and this
patch may get its turn pretty late.


>
>
> [1] ./tests/features/lock_revocation.t
> [2] http://review.gluster.org/#/c/15374/
>
>
> Thanks,
> Raghavendra Talur
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>
>


-- 

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] NetBSD aborted runs

2016-08-27 Thread Atin Mukherjee
This is still bothering us a lot, and it looks like there is a genuine issue
in the code which is causing the process to hang/deadlock.

Raghavendra T - any more findings?

On Friday 19 August 2016, Atin Mukherjee <amukh...@redhat.com> wrote:

> https://bugzilla.redhat.com/show_bug.cgi?id=1368421
>
> NetBSD regressions are getting aborted very frequently. Apart from the
> infra issue related to connectivity (Nigel has started looking into it),
> lock_revocation.t is getting hung in these instances, which causes the run to
> be aborted after 300 minutes. This has already started impacting patches
> getting in, which eventually impacts the upcoming release cycles.
>
> I'd request the feature owner/maintainer to have a look at it asap.
>
> --Atin
>


-- 
--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] NetBSD aborted runs

2016-08-19 Thread Atin Mukherjee
https://bugzilla.redhat.com/show_bug.cgi?id=1368421

NetBSD regressions are getting aborted very frequently. Apart from the
infra issue related to connectivity (Nigel has started looking into it),
lock_revocation.t is getting hung in these instances, which causes the run to
be aborted after 300 minutes. This has already started impacting patches
getting in, which eventually impacts the upcoming release cycles.

I'd request the feature owner/maintainer to have a look at it asap.

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #1481

2016-08-09 Thread Atin Mukherjee
+Mohit

On Tue, Aug 9, 2016 at 5:17 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

> This looks to be an SSL issue. GlusterD is not coming up after enabling
> mgmt encryption with the following error log:
>
> [2016-08-09 11:37:08.854932] E [socket.c:4107:socket_init]
> 0-socket.management: could not load our cert
>
> I believe the prerequisite of this test is to have all the certificates
> available in the right location, which is missing here. Is this a new slave,
> by any chance, where the certs are yet to be created?
>
> P.S : I am not an expert on this part, but Jeff/Kaushal/Mohit can chime in
> if I am wrong.
>
> On Tue, Aug 9, 2016 at 1:37 PM, Ashish Pandey <aspan...@redhat.com> wrote:
>
>> I could see that this test is failing on my laptop with and without my
>> unlink patch.
>> So, I don't think unlink patch is responsible for failure.
>>
>>
>>
>>
>> --
>> *From: *"Pranith Kumar Karampuri" <pkara...@redhat.com>
>> *To: *jenk...@build.gluster.org
>> *Cc: *maintainers@gluster.org, "Ashish Pandey" <aspan...@redhat.com>
>> *Sent: *Tuesday, August 9, 2016 11:05:58 AM
>> *Subject: *Re: [Gluster-Maintainers] Build failed in Jenkins:
>> regression-test-burn-in #1481
>>
>>
>> Ashish could you take a look at this failure to see if it is because of
>> the change in posix xlator we did for unlink dir?
>>
>> On Mon, Aug 8, 2016 at 7:08 PM, <jenk...@build.gluster.org> wrote:
>>
>>> See <http://build.gluster.org/job/regression-test-burn-in/1481/changes>
>>>
>>> Changes:
>>>
>>> [Pranith Kumar K] posix: Do not move and recreate .glusterfs/unlink
>>> directory
>>>
>>> --
>>> [...truncated 9683 lines...]
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>>> or kill -l [sigspec]
>>> sed: read error on /var/run/gluster/: Is a directory
>>> rm: cannot remove `/var/run/gluster/': Is a directory
>>> ./tests/bugs/cli/bug-1320388.t ..
>>> 1..11
>>> not ok 1 , LINENUM:11
>>> FAILED COMMAND: glusterd
>>> not ok 2 , LINENUM:12
>>> FAILED COMMAND: pidof glusterd
>>> ok 3, LINENUM:13
>>> ok 4, LINENUM:14
>>> not ok 5 , LINENUM:15
>>> FAILED COMMAND: glusterfs --entry-timeout=0 --attribute-timeout=0 -s
>>> slave26.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/0
>>> not ok 6 Got "" instead of "^6$", LINENUM:16
>>> FAILED COMMAND: ^6$ ec_child_up_count patchy 0
>>> not ok 7 , LINENUM:18
>>> FAILED COMMAND: kill_brick patchy slave26.cloud.gluster.org
>>> /d/backends/patchy5
>>> not ok 8 Got "" instead of "^5$", LINENUM:20
>>> FAILED COMMAND: ^5$ get_pending_heal_count patchy
>>> not ok 9 Got "" instead of "^6$", LINENUM:22
>>> FAILED COMMAND: ^6$ ec_child_up_count patchy 0
>>> ok 10, LINENUM:23
>>> not ok 11 Got "" instead of "^0$", LINENUM:24
>>> FAILED COMMAND: ^0$ get_pending_heal_count patchy
>>> Failed 8/11 subt

Re: [Gluster-Maintainers] Build failed in Jenkins: regression-test-burn-in #1481

2016-08-09 Thread Atin Mukherjee
This looks to be an SSL issue. GlusterD is not coming up after enabling
mgmt encryption with the following error log:

[2016-08-09 11:37:08.854932] E [socket.c:4107:socket_init]
0-socket.management: could not load our cert

I believe the prerequisite of this test is to have all the certificates
available in the right location, which is missing here. Is this a new slave,
by any chance, where the certs are yet to be created?
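
For reference, provisioning a slave for the SSL tests boils down to something
like the sketch below, assuming the default certificate locations GlusterFS
looks at; the exact paths/CN may differ on the build slaves:

  # self-signed certificate, also used as its own CA
  openssl genrsa -out /etc/ssl/glusterfs.key 2048
  openssl req -new -x509 -key /etc/ssl/glusterfs.key \
          -subj "/CN=$(hostname)" -out /etc/ssl/glusterfs.pem
  cp /etc/ssl/glusterfs.pem /etc/ssl/glusterfs.ca
  # management encryption is switched on by the presence of this file
  touch /var/lib/glusterd/secure-access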

P.S : I am not an expert on this part, but Jeff/Kaushal/Mohit can chime in
if I am wrong.

On Tue, Aug 9, 2016 at 1:37 PM, Ashish Pandey  wrote:

> I could see that this test is failing on my laptop with and without my
> unlink patch.
> So, I don't think unlink patch is responsible for failure.
>
>
>
>
> --
> *From: *"Pranith Kumar Karampuri" 
> *To: *jenk...@build.gluster.org
> *Cc: *maintainers@gluster.org, "Ashish Pandey" 
> *Sent: *Tuesday, August 9, 2016 11:05:58 AM
> *Subject: *Re: [Gluster-Maintainers] Build failed in Jenkins:
> regression-test-burn-in #1481
>
>
> Ashish could you take a look at this failure to see if it is because of
> the change in posix xlator we did for unlink dir?
>
> On Mon, Aug 8, 2016 at 7:08 PM,  wrote:
>
>> See 
>>
>> Changes:
>>
>> [Pranith Kumar K] posix: Do not move and recreate .glusterfs/unlink
>> directory
>>
>> --
>> [...truncated 9683 lines...]
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ...
>> or kill -l [sigspec]
>> sed: read error on /var/run/gluster/: Is a directory
>> rm: cannot remove `/var/run/gluster/': Is a directory
>> ./tests/bugs/cli/bug-1320388.t ..
>> 1..11
>> not ok 1 , LINENUM:11
>> FAILED COMMAND: glusterd
>> not ok 2 , LINENUM:12
>> FAILED COMMAND: pidof glusterd
>> ok 3, LINENUM:13
>> ok 4, LINENUM:14
>> not ok 5 , LINENUM:15
>> FAILED COMMAND: glusterfs --entry-timeout=0 --attribute-timeout=0 -s
>> slave26.cloud.gluster.org --volfile-id patchy /mnt/glusterfs/0
>> not ok 6 Got "" instead of "^6$", LINENUM:16
>> FAILED COMMAND: ^6$ ec_child_up_count patchy 0
>> not ok 7 , LINENUM:18
>> FAILED COMMAND: kill_brick patchy slave26.cloud.gluster.org
>> /d/backends/patchy5
>> not ok 8 Got "" instead of "^5$", LINENUM:20
>> FAILED COMMAND: ^5$ get_pending_heal_count patchy
>> not ok 9 Got "" instead of "^6$", LINENUM:22
>> FAILED COMMAND: ^6$ ec_child_up_count patchy 0
>> ok 10, LINENUM:23
>> not ok 11 Got "" instead of "^0$", LINENUM:24
>> FAILED COMMAND: ^0$ get_pending_heal_count patchy
>> Failed 8/11 subtests
>>
>> Test Summary Report
>> ---
>> ./tests/bugs/cli/bug-1320388.t (Wstat: 0 Tests: 11 Failed: 8)
>>   Failed tests:  1-2, 5-9, 11
>> Files=1, Tests=11, 201 wallclock secs ( 0.02 usr  0.00 sys +  7.79 cusr
>> 2.70 csys = 10.51 CPU)
>> Result: FAIL
>> End of test ./tests/bugs/cli/bug-1320388.t
>> 
>> 
>>
>>
>> Run complete
>> 
>> 
>> Number of tests found: 158
>> Number of tests selected for run based on pattern: 158
>> Number of tests skipped as they were marked bad:   6
>> Number of tests skipped because of known_issues:   1
>> Number of tests that were run: 151
>>
>> 1 test(s) failed
>> ./tests/bugs/cli/bug-1320388.t
>>
>> 0 test(s) generated core
>>
>>
>> Tests ordered by time taken, slowest to fastest:
>> 
>> 
>> 

Re: [Gluster-Maintainers] Updates to the GlusterFS release process document

2016-08-07 Thread Atin Mukherjee
I had similar concerns (when to branch) to the ones Aravinda shared on the
etherpad. Otherwise, the rest of the contents look good.

On Mon, Aug 8, 2016 at 9:56 AM, Kaushal M  wrote:

> On Tue, Aug 2, 2016 at 2:21 PM, Kaushal M  wrote:
> > Hi All.
> >
> > We've been discussing about improvements to our release process and
> > schedules for a while now. As a result of these discussions I had
> > started an etherpad [1] to put together a release process document.
> >
> > I've created a pull-request [2] to glusterdocs based on this etherpad,
> > to make it official. The document isn't complete yet. I'll be adding
> > more information and doing cleanups as required. Most of the required
> > information regarding the release-process has been added. I'd like the
> > maintainers to go over the pull-request and give comments.
>
> Bumping this again. I request all maintainers to please review the
> document and provide your approvals.
> Currently I've gotten comments only from Neils and Aravinda.
>
> ~kaushal
>
> >
> > Thanks,
> > Kaushal
> >
> > [1] https://public.pad.fsfe.org/p/glusterfs-release-process-201606
> > [2] https://github.com/gluster/glusterdocs/pull/139
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>



-- 

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Gluster Events API - Help required to identify the list of Events from each component

2016-07-18 Thread Atin Mukherjee
So the framework is now in the mainline branch [1]. As a next step, I'd
request all of you to start thinking about the important events that need to
be captured, and to share your feedback.

[1] http://review.gluster.org/14248
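
As a quick illustration of the consumer side (command names as per the
eventing feature and demo linked above; treat this as a sketch rather than the
final interface): a webhook can be registered, and every captured event is
then pushed to it as a JSON payload.

  gluster-eventsapi webhook-add http://listener.example.com:9000/listen
  gluster-eventsapi status    # lists registered webhooks and eventsd status per node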

~Atin

On Thu, Jul 14, 2016 at 1:45 PM, Aravinda  wrote:

> +gluster-users
>
> regards
> Aravinda
>
> On 07/13/2016 09:03 PM, Vijay Bellur wrote:
>
>> On 07/13/2016 10:23 AM, Aravinda wrote:
>>
>>> Hi,
>>>
>>> We are working on Eventing feature for Gluster, Sent feature patch for
>>> the same.
>>> Design: http://review.gluster.org/13115
>>> Patch:  http://review.gluster.org/14248
>>> Demo: http://aravindavk.in/blog/10-mins-intro-to-gluster-eventing
>>>
>>> Following document lists the events(mostly user driven events are
>>> covered in the doc). Please let us know the Events from your components
>>> to be supported by the Eventing Framework.
>>>
>>>
>>> https://docs.google.com/document/d/1oMOLxCbtryypdN8BRdBx30Ykquj4E31JsaJNeyGJCNo/edit?usp=sharing
>>>
>>>
>>>
>> Thanks for putting this together, Aravinda! Might be worth to poll -users
>> ML also about events of interest.
>>
>> -Vijay
>>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>



-- 

--Atin
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

2016-07-12 Thread Atin Mukherjee
On Tue, Jul 12, 2016 at 12:05 PM, Aravinda <avish...@redhat.com> wrote:

>
> regards
> Aravinda
>
> On 07/12/2016 11:51 AM, Atin Mukherjee wrote:
>
>
>
> On Tue, Jul 12, 2016 at 11:40 AM, Aravinda <avish...@redhat.com> wrote:
>
>> How about running the same upgrade steps again after %post
>> geo-replication. Upgrade steps will run twice(fails in first step) but it
>> solves these issues.
>>
>
> I'd not do that if we can solve the problem in first upgrade attempt
> itself which looks feasible.
>
> I think we can't safely handle this in first call unless we skip
> checking/calling gsyncd.
>

That's what I proposed earlier: we'd need to call configure_syncdaemon ()
conditionally. Kotresh already has a patch [1] for this now.

[1] http://review.gluster.org/#/c/14898

>
>
>
>>
>> regards
>> Aravinda
>>
>> On 07/11/2016 01:56 PM, Niels de Vos wrote:
>>
>> On Mon, Jul 11, 2016 at 12:56:24PM +0530, Kaushal M wrote:
>>
>> On Sat, Jul 9, 2016 at 10:02 PM, Atin Mukherjee <amukh...@redhat.com> 
>> <amukh...@redhat.com> wrote:
>>
>> ...
>>
>>
>> GlusterD depends on the cluster op-version when generating volfiles,
>> to insert new features/xlators into the volfile graph.
>> This was done to make sure that the homogeneity of the volfiles is
>> preserved across the cluster.
>> This behaviour makes running GlusterD in upgrade mode after a package
>> upgrade, essentially a noop.
>> The cluster op-version doesn't change automatically when packages are 
>> upgraded,
>> so the regenerated volfiles in the post-upgrade section are basically
>> the same as before.
>> (If something is getting added into volfiles after this, it is
>> incorrect, and is something I'm yet to check).
>>
>> The correct time to regenerate the volfiles is after all members of
>> the cluster have been upgraded and the cluster op-version has been
>> bumped.
>> (Bumping op-version doesn't regenerate anything, it is just an
>> indication that the cluster is now ready to use new features.)
>>
>> We don't have a direct way to get volfiles regenerated on all members
>> with a single command yet. We can implement such a command with
>> relative ease.
>> For now, volfiles can regenerated by making use of the `volume set`
>> command, by setting a `user.upgrade` option on a volume.
>> Options in the `user.` namespace are passed on to hook scripts and not
>> added into any volfiles, but setting such an option on a volume causes
>> GlusterD to regenerate volfiles for the volume.
>>
>> My suggestion would be to stop using glusterd in upgrade mode during
>> post-upgrade to regenerate volfiles, and document the above way to get
>> volfiles regenerated across the cluster correctly.
>> We could do away with upgrade mode itself, but it could be useful for
>> other things (Though I can't think of any right now).
>>
>> What do the other maintainers feel about this?
>>
>> Would it make sense to have the volfiles regenerated when changing the
>> op-version? For environments where multiple volumes are used, I do not
>> like the need to regenerate them manually for all of them.
>>
>> On the other hand, a regenerate+reload/restart results in a short
>> interruption. This may not be suitable for all volumes at the same time.
>> A per volume option might be preferred by some users. Getting the
>> feedback from users would be good before deciding on an approach.
>>
>> Running GlusterD in upgrade mode while updating the installed binaries
>> is something that easily gets forgotten. I'm not even sure if this is
>> done in all packages, and I guess it is skipped a lot when people have
>> installations from source. We should probably put the exact steps in our
>> release-notes to remind everyone.
>>
>> Thanks,
>> Niels
>>
>>
>>
>> ___
>> maintainers mailing 
>> listmaintainers@gluster.orghttp://www.gluster.org/mailman/listinfo/maintainers
>>
>>
>>
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://www.gluster.org/mailman/listinfo/maintainers
>>
>>
>
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

2016-07-12 Thread Atin Mukherjee
On Tue, Jul 12, 2016 at 11:40 AM, Aravinda <avish...@redhat.com> wrote:

> How about running the same upgrade steps again after %post
> geo-replication. Upgrade steps will run twice(fails in first step) but it
> solves these issues.
>

I'd not do that if we can solve the problem in the first upgrade attempt
itself, which looks feasible.


>
> regards
> Aravinda
>
> On 07/11/2016 01:56 PM, Niels de Vos wrote:
>
> On Mon, Jul 11, 2016 at 12:56:24PM +0530, Kaushal M wrote:
>
> On Sat, Jul 9, 2016 at 10:02 PM, Atin Mukherjee <amukh...@redhat.com> wrote:
>
> ...
>
>
> GlusterD depends on the cluster op-version when generating volfiles,
> to insert new features/xlators into the volfile graph.
> This was done to make sure that the homogeneity of the volfiles is
> preserved across the cluster.
> This behaviour makes running GlusterD in upgrade mode after a package
> upgrade essentially a no-op.
> The cluster op-version doesn't change automatically when packages are 
> upgraded,
> so the regenerated volfiles in the post-upgrade section are basically
> the same as before.
> (If something is getting added into volfiles after this, it is
> incorrect, and is something I have yet to check.)
>
> The correct time to regenerate the volfiles is after all members of
> the cluster have been upgraded and the cluster op-version has been
> bumped.
> (Bumping op-version doesn't regenerate anything, it is just an
> indication that the cluster is now ready to use new features.)
>
> We don't have a direct way to get volfiles regenerated on all members
> with a single command yet. We can implement such a command with
> relative ease.
> For now, volfiles can be regenerated by using the `volume set` command
> to set a `user.upgrade` option on a volume.
> Options in the `user.` namespace are passed on to hook scripts and not
> added into any volfiles, but setting such an option on a volume causes
> GlusterD to regenerate volfiles for the volume.
>
> My suggestion would be to stop using glusterd in upgrade mode during
> post-upgrade to regenerate volfiles, and document the above way to get
> volfiles regenerated across the cluster correctly.
> We could do away with upgrade mode itself, but it could be useful for
> other things (Though I can't think of any right now).
>
> What do the other maintainers feel about this?
>
> Would it make sense to have the volfiles regenerated when changing the
> op-version? For environments where multiple volumes are used, I do not
> like the need to regenerate them manually for all of them.
>
> On the other hand, a regenerate+reload/restart results in a short
> interruption. This may not be suitable for all volumes at the same time.
> A per volume option might be preferred by some users. Getting the
> feedback from users would be good before deciding on an approach.
>
> Running GlusterD in upgrade mode while updating the installed binaries
> is something that easily gets forgotten. I'm not even sure if this is
> done in all packages, and I guess it is skipped a lot when people have
> installations from source. We should probably put the exact steps in our
> release-notes to remind everyone.
>
> Thanks,
> Niels
>
>
>
>
>
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Fwd: gluster_strfmt - Build # 4 - Failure!

2016-07-12 Thread Atin Mukherjee
Attached.

-- Forwarded message --
From: 
Date: Tue, Jul 12, 2016 at 9:43 AM
Subject: [Gluster-Maintainers] gluster_strfmt - Build # 4 - Failure!
To: maintainers@gluster.org


String formatting warnings have been detected. See the attached
warnings.txt for details.
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


warnings.txt
Description: Binary data
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

2016-07-11 Thread Atin Mukherjee
I still see the release notes for 3.8.1 & 3.7.13 not reflecting this
change.

Niels, Kaushal,

Shouldn't we highlight this to the users as early as possible, given that
the release notes are the best possible medium to capture all the known
issues and the workarounds?


~Atin


On Sat, Jul 9, 2016 at 10:02 PM, Atin Mukherjee <amukh...@redhat.com> wrote:

> We have hit a bug 1347250 in downstream (applicable upstream too) where it
> was seen that glusterd didn't regenerate the volfiles when it was temporarily
> brought up in upgrade mode by yum. The log file captured that gsyncd
> --version failed to execute and hence glusterd init couldn't proceed till
> the volfile regeneration. Since the return code is not handled in the spec
> file, users wouldn't come to know about this, and going forward this is
> going to cause major issues in healing and greatly increases the
> possibility of split-brains.
>
> Further analysis by Kotresh & Raghavendra Talur reveals that gsyncd failed
> here because of a compatibility issue: gsyncd was still not upgraded
> whereas glusterfs-server was, and the failure was mainly because of a
> change in the mem type enum. We have seen a similar issue for RDMA as well
> (probably a year back). So, generically, this can happen in any upgrade
> path from one version to another where a new mem type is introduced.
> We have seen this from 3.7.8 to 3.7.12 and 3.8. People upgrading from 3.6
> to 3.7/3.8 will also experience this issue.
>
> Till we work on this fix, I suggest all the release managers highlight
> this in the release notes of the latest releases with the following
> workaround after yum update:
>
> 1. grep -irns "geo-replication module not working as desired" 
> /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l
>
>  If the output is non-zero, go to step 2, else follow the rest of the
> steps as per the guide.
>
> 2. Check whether a glusterd instance is running with 'ps aux | grep glusterd';
> if it is, stop the glusterd service.
>
>  3. glusterd --xlator-option *.upgrade=on -N
>
> and then proceed with the rest of the steps as per the guide.
>
> Thoughts?
>
> P.S.: this email is limited to maintainers till we decide on the approach
> to highlight this issue to the users.
>
>
> --
> Atin
> Sent from iPhone
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] gluster_strfmt - Build # 4 - Failure!

2016-07-11 Thread Atin Mukherjee
The cli-rpc-ops.c warnings are from cli_populate_req_dict_for_delete(), which
is related to snapshot functionality.
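
If anyone wants to see this class of warning locally, here is a rough,
standalone sketch (my assumption is that the job simply surfaces gcc
format-string warnings from a normal build; the demo below is not gluster
code, just an illustration of the warning class):

cat > /tmp/fmt-demo.c <<'EOF'
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        uint64_t count = 42;
        /* %d does not match uint64_t; gcc's -Wformat flags exactly this */
        printf("count = %d\n", count);
        return 0;
}
EOF
gcc -Wall -Wformat=2 -c /tmp/fmt-demo.c -o /tmp/fmt-demo.o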

Rajesh/Avra - could you take a look at it?


On Tue, Jul 12, 2016 at 9:43 AM,  wrote:

> String formatting warnings have been detected. See the attached
> warnings.txt for details.
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

2016-07-11 Thread Atin Mukherjee
My intention in initiating the email was more about how to prevent users
from hitting this problem, with a proper workaround captured in the release
notes. We can fork a separate thread on the approach to fixing this issue.
Now, on the fix, my take is that since glusterd coming up in upgrade mode is
just meant to regenerate the volfiles, we don't need to call gsyncd --version
with upgrade=ON, and that should solve this specific issue. However, from a
long-term perspective we do need to think about versioning the other
libraries, as pointed out by Kaushal/Niels.

The BZ is not yet filed upstream; I think Kotresh will be taking care of
that.

~Atin

On Mon, Jul 11, 2016 at 1:19 PM, Niels de Vos <nde...@redhat.com> wrote:

> On Mon, Jul 11, 2016 at 12:56:24PM +0530, Kaushal M wrote:
> > On Sat, Jul 9, 2016 at 10:02 PM, Atin Mukherjee <amukh...@redhat.com> wrote:
> > > We have hit a bug 1347250 in downstream (applicable upstream too) where
> > > it was seen that glusterd didn't regenerate the volfiles when it was
> > > temporarily brought up in upgrade mode by yum. The log file captured
> > > that gsyncd --version failed to execute and hence glusterd init couldn't
> > > proceed till the volfile regeneration. Since the return code is not
> > > handled in the spec file, users wouldn't come to know about this, and
> > > going forward this is going to cause major issues in healing and greatly
> > > increases the possibility of split-brains.
> > >
> > > Further analysis by Kotresh & Raghavendra Talur reveals that gsyncd
> > > failed here because of a compatibility issue: gsyncd was still not
> > > upgraded whereas glusterfs-server was, and the failure was mainly
> > > because of a change in the mem type enum. We have seen a similar issue
> > > for RDMA as well (probably a year back). So, generically, this can
> > > happen in any upgrade path from one version to another where a new mem
> > > type is introduced. We have seen this from 3.7.8 to 3.7.12 and 3.8.
> > > People upgrading from 3.6 to 3.7/3.8 will also experience this issue.
> > >
> > > Till we work on this fix, I suggest all the release managers highlight
> > > this in the release notes of the latest releases with the following
> > > workaround after yum update:
> > >
> > > 1. grep -irns "geo-replication module not working as desired"
> > > /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l
> > >
> > >  If the output is non-zero, go to step 2, else follow the rest of the
> > > steps as per the guide.
> > >
> > > 2. Check whether a glusterd instance is running with 'ps aux | grep
> > > glusterd'; if it is, stop the glusterd service.
> > >
> > >  3. glusterd --xlator-option *.upgrade=on -N
> > >
> > > and then proceed with the rest of the steps as per the guide.
> > >
> > > Thoughts?
> >
> > Proper .so versioning of libglusterfs should help with problems like
> > this. I don't know how to do this though.
>
> We could provide the 'current' version of libglusterfs with the same
> number as the op-version. For 3.7.13 it would be 030713, dropping the
> prefixed 0 makes that 30713, so libglusterfs.so.30713. The same should
> probably be done for all other internal libraries.
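>
> As a rough illustration (my own sketch of the proposal, nothing that is
> implemented yet; the library path will differ per distro): with the
> op-version for 3.7.13 rendered as 30713, the installed file would end up
> as /usr/lib64/libglusterfs.so.30713. The SONAME of the library currently
> installed can be checked with:
>
>   objdump -p /usr/lib64/libglusterfs.so.0 | grep SONAME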
>
> Some more details about library versioning can be found here:
>
> https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/versioning.md
>
> Note that libgfapi uses symbol versioning, which is a more fine-grained
> solution. It avoids the need for applications using the library to get
> re-compiled. Details about that, and the more involved changes needed to
> get it to work correctly, are in this document:
>
> https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/gfapi-symbol-versions.md
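>
> The effect is easy to see on an installed system (a quick check, assuming
> the usual library path; adjust for your distro):
>
>   objdump -T /usr/lib64/libgfapi.so.0 | grep GFAPI_ | head
>
> Each exported symbol carries a GFAPI_x.y.z version tag, so existing
> applications keep resolving against the version they were built with.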
>
> Is there already a bug filed to get this fixed?
>
> Thanks,
> Niels
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


[Gluster-Maintainers] Upgrade issue when new mem type is added in libglusterfs

2016-07-09 Thread Atin Mukherjee
We have hit a bug 1347250 in downstream (applicable upstream too) where it
was seen that glusterd didn't regenerate the volfiles when it was temporarily
brought up in upgrade mode by yum. The log file captured that gsyncd
--version failed to execute and hence glusterd init couldn't proceed till
the volfile regeneration. Since the return code is not handled in the spec
file, users wouldn't come to know about this, and going forward this is
going to cause major issues in healing and greatly increases the possibility
of split-brains.

Further analysis by Kotresh & Raghavendra Talur reveals that gsyncd failed
here because of a compatibility issue: gsyncd was still not upgraded whereas
glusterfs-server was, and the failure was mainly because of a change in the
mem type enum. We have seen a similar issue for RDMA as well (probably a
year back). So, generically, this can happen in any upgrade path from one
version to another where a new mem type is introduced. We have seen this
from 3.7.8 to 3.7.12 and 3.8. People upgrading from 3.6 to 3.7/3.8 will also
experience this issue.

Till we work on this fix, I suggest all the release managers highlight this
in the release notes of the latest releases with the following workaround
after yum update:

1. grep -irns "geo-replication module not working as desired"
/var/log/glusterfs/etc-glusterfs-glusterd.vol.log | wc -l

 If the output is non-zero, go to step 2, else follow the rest of
the steps as per the guide.

2. Check whether a glusterd instance is running with 'ps aux | grep
glusterd'; if it is, stop the glusterd service.

 3. glusterd --xlator-option *.upgrade=on -N

and then proceed with the rest of the steps as per the guide (a consolidated
sketch of these steps follows).
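
Putting the three steps together, something like this should work (an
untested sketch on my part, assuming the stock log path and a systemd-managed
glusterd service; adjust for your init system. Note the quoting around the
xlator option so the shell doesn't expand the '*'):

if [ "$(grep -ic 'geo-replication module not working as desired' \
        /var/log/glusterfs/etc-glusterfs-glusterd.vol.log)" -gt 0 ]; then
    # volfiles were not regenerated; glusterd must not be running while
    # we rerun upgrade mode
    if pgrep -x glusterd >/dev/null; then
        systemctl stop glusterd
    fi
    # one-shot upgrade mode: regenerate the volfiles and exit
    glusterd --xlator-option '*.upgrade=on' -N
fi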

Thoughts?

P.S.: this email is limited to maintainers till we decide on the approach
to highlight this issue to the users.


-- 
Atin
Sent from iPhone
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Glusterfs-3.7.13 release plans

2016-06-30 Thread Atin Mukherjee
On Thu, Jun 30, 2016 at 11:56 AM, Atin Mukherjee <amukh...@redhat.com>
wrote:

>
>
> On Thu, Jun 30, 2016 at 11:08 AM, Kaushal M <kshlms...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'm (or was) planning to do a 3.7.13 release on schedule today. 3.7.12
>> has a huge issue with libgfapi, solved by [1].
>> I'm not sure if this fixes the other issues with libgfapi noticed by
>> Lindsay on gluster-users.
>>
>> This patch has been included in the 3.7.12 packages built for CentOS,
>> Fedora, Ubuntu, Debian and SUSE. I guess Lindsay is using one of these
>> packages, so it might be that the issue seen is new. So I'd like to do
>> a quick release once we have a fix.
>>
>
>  http://review.gluster.org/14835 probably is the one you are looking for.
>
>

Ignore it. I had a chance to talk to Poornima and she mentioned that this
is a different problem.


>
>> Maintainers can merge changes into release-3.7 that follow the
>> criteria given in [2]. Please make sure the bugs for the patches
>> you are merging are added as dependencies for the 3.7.13 tracker bug
>> [3].
>>
>> Thanks,
>> Kaushal
>>
>> [1]: https://review.gluster.org/14822
>> [2]: https://public.pad.fsfe.org/p/glusterfs-release-process-201606
>> under the GlusterFS minor release heading
>> [3]: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.7.13
>> ___
>> maintainers mailing list
>> maintainers@gluster.org
>> http://www.gluster.org/mailman/listinfo/maintainers
>>
>
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers


Re: [Gluster-Maintainers] Glusterfs-3.7.13 release plans

2016-06-30 Thread Atin Mukherjee
On Thu, Jun 30, 2016 at 11:08 AM, Kaushal M  wrote:

> Hi all,
>
> I'm (or was) planning to do a 3.7.13 release on schedule today. 3.7.12
> has a huge issue with libgfapi, solved by [1].
> I'm not sure if this fixes the other issues with libgfapi noticed by
> Lindsay on gluster-users.
>
> This patch has been included in the 3.7.12 packages built for CentOS,
> Fedora, Ubuntu, Debian and SUSE. I guess Lindsay is using one of these
> packages, so it might be that the issue seen is new. So I'd like to do
> a quick release once we have a fix.
>

 http://review.gluster.org/14835 probably is the one you are looking for.


>
> Maintainers can merge changes into release-3.7 that follow the
> criteria given in [2]. Please make sure the bugs for the patches
> you are merging are added as dependencies for the 3.7.13 tracker bug
> [3].
>
> Thanks,
> Kaushal
>
> [1]: https://review.gluster.org/14822
> [2]: https://public.pad.fsfe.org/p/glusterfs-release-process-201606
> under the GlusterFS minor release heading
> [3]: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.7.13
> ___
> maintainers mailing list
> maintainers@gluster.org
> http://www.gluster.org/mailman/listinfo/maintainers
>
___
maintainers mailing list
maintainers@gluster.org
http://www.gluster.org/mailman/listinfo/maintainers

