Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
We should document (or better still, add a check in rfc.sh that warns the
user to upgrade) that we need clang version x or greater.
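
Such a guard could look roughly like this (a sketch only, not rfc.sh's actual code; the helper name, the banner parsing, and the minimum major version of 6 are assumptions, the latter based on the report later in this thread):

```shell
# Hypothetical version guard for rfc.sh (not the real script).
# clang_format_ok succeeds only when the given `clang-format --version`
# banner reports at least the requested major version.
clang_format_ok()
{
    banner=$1
    min=$2
    # First number in the banner, e.g. "LLVM version 3.4.2" -> 3,
    # "clang-format version 6.0.1" -> 6.
    major=$(printf '%s\n' "$banner" | grep -oE '[0-9]+' | head -n 1)
    [ -n "$major" ] && [ "$major" -ge "$min" ]
}

# rfc.sh could then call it like this; `|| true` keeps `set -e` from
# aborting when an old clang-format exits non-zero from --version:
#   banner=$(clang-format --version 2>/dev/null) || true
#   clang_format_ok "$banner" 6 || echo "please upgrade clang-format to >= 6"
```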

On Fri, Oct 5, 2018 at 10:45 AM Sachidananda URS  wrote:

>
>
> On Fri, Oct 5, 2018 at 10:41 AM, Raghavendra Gowdappa  > wrote:
>
>> General options:
>>
>>   -help - Display available options (-help-hidden for
>> more)
>>   -help-list- Display list of available options
>> (-help-list-hidden for more)
>>   -version  - Display the version of this program
>> [rgowdapp@rgowdapp ~]$ clang-format -version ; echo $?
>> LLVM (http://llvm.org/):
>>   LLVM version 3.4.2
>>   Optimized build.
>>   Built Dec  7 2015 (09:37:36).
>>   Default target: x86_64-redhat-linux-gnu
>>   Host CPU: x86-64
>> 1
>>
>>
> It is a bug then, which they've fixed later on. Upgrade is the only choice.
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Sachidananda URS
On Fri, Oct 5, 2018 at 10:41 AM, Raghavendra Gowdappa 
wrote:

> General options:
>
>   -help - Display available options (-help-hidden for
> more)
>   -help-list- Display list of available options
> (-help-list-hidden for more)
>   -version  - Display the version of this program
> [rgowdapp@rgowdapp ~]$ clang-format -version ; echo $?
> LLVM (http://llvm.org/):
>   LLVM version 3.4.2
>   Optimized build.
>   Built Dec  7 2015 (09:37:36).
>   Default target: x86_64-redhat-linux-gnu
>   Host CPU: x86-64
> 1
>
>
It is a bug then, which they've fixed later on. Upgrade is the only choice.

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
General options:

  -help - Display available options (-help-hidden for
more)
  -help-list- Display list of available options
(-help-list-hidden for more)
  -version  - Display the version of this program
[rgowdapp@rgowdapp ~]$ clang-format -version ; echo $?
LLVM (http://llvm.org/):
  LLVM version 3.4.2
  Optimized build.
  Built Dec  7 2015 (09:37:36).
  Default target: x86_64-redhat-linux-gnu
  Host CPU: x86-64
1


On Fri, Oct 5, 2018 at 10:37 AM Sachidananda URS  wrote:

>
>
>
>  [rgowdapp@rgowdapp glusterfs]$ clang-format --version ; echo $?
>> LLVM (http://llvm.org/):
>>   LLVM version 3.4.2
>>   Optimized build.
>>   Built Dec  7 2015 (09:37:36).
>>   Default target: x86_64-redhat-linux-gnu
>>   Host CPU: x86-64
>> 1
>>
>> Wonder why clang-format --version has to return non-zero return code
>> though.
>>
>
> Maybe because the syntax is `clang-format -version' not --version.
> In the newer releases both -version and --version return 0.
> You can try -version, if it returns 0. We can fix `rfc.sh'
>
> But, what they document is -version.
>
>
>

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Sachidananda URS
 [rgowdapp@rgowdapp glusterfs]$ clang-format --version ; echo $?
> LLVM (http://llvm.org/):
>   LLVM version 3.4.2
>   Optimized build.
>   Built Dec  7 2015 (09:37:36).
>   Default target: x86_64-redhat-linux-gnu
>   Host CPU: x86-64
> 1
>
> Wonder why clang-format --version has to return non-zero return code
> though.
>

Maybe because the documented syntax is `clang-format -version', not --version.
In newer releases both -version and --version return 0.
You can try -version; if it returns 0, we can fix `rfc.sh'.

But what they document is -version.
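
A way to probe the flag without tripping `set -e` (a sketch; which spelling succeeds, and with what code, depends on the installed clang-format):

```shell
# Capture clang-format's version banner without letting `set -e` abort:
# old 3.4.x builds exit 1 from --version even on success, while newer
# releases exit 0 from both -version and --version. Wrapping the
# assignment in `if` swallows the non-zero status.
if banner=$(clang-format --version 2>/dev/null); then
    rc=0
else
    rc=$?   # exit status of the clang-format invocation
fi
echo "clang-format --version exit code: $rc"
probe_done=yes
```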

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
On Fri, Oct 5, 2018 at 9:58 AM Sachidananda URS  wrote:

>
>
> On Fri, Oct 5, 2018 at 9:45 AM, Raghavendra Gowdappa 
> wrote:
>
>>
>>
>> On Fri, Oct 5, 2018 at 9:34 AM Raghavendra Gowdappa 
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 5, 2018 at 9:11 AM Kaushal M  wrote:
>>>
 On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:
 >
 >
 >
 > On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi 
 wrote:
 >>
 >> Can you try below diff in your rfc, and let me know if it works?
 >
 >
 > No. it didn't. I see the same error.
 >  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
 > + rebase_changes
 > + GIT_EDITOR=./rfc.sh
 > + git rebase -i origin/master
 > [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
 >  1 file changed, 10674 insertions(+), 11166 deletions(-)
 >  rewrite xlators/cluster/dht/src/dht-common.c (88%)
 > [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid
 linkto file
 >  1 file changed, 1 insertion(+), 1 deletion(-)
 > [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
 >  1 file changed, 8 insertions(+), 3 deletions(-)
 > Successfully rebased and updated refs/heads/1635145.
 > + check_backport
 > + moveon=N
 > + '[' master = master ']'
 > + return
 > + assert_diverge
 > + git diff origin/master..HEAD
 > + grep -q .
 > ++ git log -n1 --format=%b
 > ++ grep -ow -E
 '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
 > ++ awk -F '#' '{print $2}'
 > + reference=1635145
 > + '[' -z 1635145 ']'
 > ++ clang-format --version
 > + clang_format='LLVM (http://llvm.org/):
 >   LLVM version 3.4.2
 >   Optimized build.
 >   Built Dec  7 2015 (09:37:36).
 >   Default target: x86_64-redhat-linux-gnu
 >   Host CPU: x86-64'

 This is a pretty old version of clang. Maybe this is the problem?

>>>
>>> Yes. That's what I suspected too. Trying to get repos for the upgrade.
>>>
>>
>> But, what's surprising is that script exits.
>>
>
> What is the return code of clang-format? If it is non-zero then script
> will silently exit because that is what
> it is told to do.
>
> `#!/bin/sh -e' means exit on error.
>

You are right :).

 [rgowdapp@rgowdapp glusterfs]$ clang-format --version ; echo $?
LLVM (http://llvm.org/):
  LLVM version 3.4.2
  Optimized build.
  Built Dec  7 2015 (09:37:36).
  Default target: x86_64-redhat-linux-gnu
  Host CPU: x86-64
1

I wonder, though, why clang-format --version returns a non-zero exit code.


> -sac
>
>

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Poornima Gurusiddaiah
I encountered the same issue: if your clang version is less than 6, the
script exits. Upgrading clang fixed it.

Regards,
Poornima

On Fri, Oct 5, 2018, 9:58 AM Sachidananda URS  wrote:

>
>
> On Fri, Oct 5, 2018 at 9:45 AM, Raghavendra Gowdappa 
> wrote:
>
>>
>>
>> On Fri, Oct 5, 2018 at 9:34 AM Raghavendra Gowdappa 
>> wrote:
>>
>>>
>>>
>>> On Fri, Oct 5, 2018 at 9:11 AM Kaushal M  wrote:
>>>
 On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:
 >
 >
 >
 > On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi 
 wrote:
 >>
 >> Can you try below diff in your rfc, and let me know if it works?
 >
 >
 > No. it didn't. I see the same error.
 >  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
 > + rebase_changes
 > + GIT_EDITOR=./rfc.sh
 > + git rebase -i origin/master
 > [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
 >  1 file changed, 10674 insertions(+), 11166 deletions(-)
 >  rewrite xlators/cluster/dht/src/dht-common.c (88%)
 > [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid
 linkto file
 >  1 file changed, 1 insertion(+), 1 deletion(-)
 > [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
 >  1 file changed, 8 insertions(+), 3 deletions(-)
 > Successfully rebased and updated refs/heads/1635145.
 > + check_backport
 > + moveon=N
 > + '[' master = master ']'
 > + return
 > + assert_diverge
 > + git diff origin/master..HEAD
 > + grep -q .
 > ++ git log -n1 --format=%b
 > ++ grep -ow -E
 '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
 > ++ awk -F '#' '{print $2}'
 > + reference=1635145
 > + '[' -z 1635145 ']'
 > ++ clang-format --version
 > + clang_format='LLVM (http://llvm.org/):
 >   LLVM version 3.4.2
 >   Optimized build.
 >   Built Dec  7 2015 (09:37:36).
 >   Default target: x86_64-redhat-linux-gnu
 >   Host CPU: x86-64'

 This is a pretty old version of clang. Maybe this is the problem?

>>>
>>> Yes. That's what I suspected too. Trying to get repos for the upgrade.
>>>
>>
>> But, what's surprising is that script exits.
>>
>
> What is the return code of clang-format? If it is non-zero then script
> will silently exit because that is what
> it is told to do.
>
> `#!/bin/sh -e' means exit on error.
>
> -sac
>

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Sachidananda URS
On Fri, Oct 5, 2018 at 9:45 AM, Raghavendra Gowdappa 
wrote:

>
>
> On Fri, Oct 5, 2018 at 9:34 AM Raghavendra Gowdappa 
> wrote:
>
>>
>>
>> On Fri, Oct 5, 2018 at 9:11 AM Kaushal M  wrote:
>>
>>> On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa 
>>> wrote:
>>> >
>>> >
>>> >
>>> > On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi 
>>> wrote:
>>> >>
>>> >> Can you try below diff in your rfc, and let me know if it works?
>>> >
>>> >
>>> > No. it didn't. I see the same error.
>>> >  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
>>> > + rebase_changes
>>> > + GIT_EDITOR=./rfc.sh
>>> > + git rebase -i origin/master
>>> > [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
>>> >  1 file changed, 10674 insertions(+), 11166 deletions(-)
>>> >  rewrite xlators/cluster/dht/src/dht-common.c (88%)
>>> > [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid linkto
>>> file
>>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>>> > [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
>>> >  1 file changed, 8 insertions(+), 3 deletions(-)
>>> > Successfully rebased and updated refs/heads/1635145.
>>> > + check_backport
>>> > + moveon=N
>>> > + '[' master = master ']'
>>> > + return
>>> > + assert_diverge
>>> > + git diff origin/master..HEAD
>>> > + grep -q .
>>> > ++ git log -n1 --format=%b
>>> > ++ grep -ow -E '([fF][iI][xX][eE][sS]|[uU][
>>> pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)
>>> ?(bz)?#[[:digit:]]+'
>>> > ++ awk -F '#' '{print $2}'
>>> > + reference=1635145
>>> > + '[' -z 1635145 ']'
>>> > ++ clang-format --version
>>> > + clang_format='LLVM (http://llvm.org/):
>>> >   LLVM version 3.4.2
>>> >   Optimized build.
>>> >   Built Dec  7 2015 (09:37:36).
>>> >   Default target: x86_64-redhat-linux-gnu
>>> >   Host CPU: x86-64'
>>>
>>> This is a pretty old version of clang. Maybe this is the problem?
>>>
>>
>> Yes. That's what I suspected too. Trying to get repos for the upgrade.
>>
>
> But, what's surprising is that script exits.
>

What is the return code of clang-format? If it is non-zero, the script will
silently exit, because that is what it has been told to do.

`#!/bin/sh -e' means exit on error.
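
The failure mode is easy to reproduce in isolation (a minimal sketch; the inner `exit 1` stands in for the old clang-format):

```shell
# Under `sh -e`, an assignment whose command substitution fails kills the
# shell with no message of its own -- exactly how rfc.sh died after
# `clang_format=$(clang-format --version)` with clang-format 3.4.2.
out=$(sh -ec '
    version=$(echo "LLVM version 3.4.2"; exit 1)  # substitution exits 1
    echo "never reached: $version"                # set -e aborts before this
') && status=0 || status=$?
echo "inner shell exit=$status, output=[$out]"
```

The inner shell exits with status 1 and prints nothing, which matches the silent exit seen in the trace.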

-sac

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
On Fri, Oct 5, 2018 at 9:34 AM Raghavendra Gowdappa 
wrote:

>
>
> On Fri, Oct 5, 2018 at 9:11 AM Kaushal M  wrote:
>
>> On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa 
>> wrote:
>> >
>> >
>> >
>> > On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi 
>> wrote:
>> >>
>> >> Can you try below diff in your rfc, and let me know if it works?
>> >
>> >
>> > No. it didn't. I see the same error.
>> >  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
>> > + rebase_changes
>> > + GIT_EDITOR=./rfc.sh
>> > + git rebase -i origin/master
>> > [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
>> >  1 file changed, 10674 insertions(+), 11166 deletions(-)
>> >  rewrite xlators/cluster/dht/src/dht-common.c (88%)
>> > [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid linkto
>> file
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
>> >  1 file changed, 8 insertions(+), 3 deletions(-)
>> > Successfully rebased and updated refs/heads/1635145.
>> > + check_backport
>> > + moveon=N
>> > + '[' master = master ']'
>> > + return
>> > + assert_diverge
>> > + git diff origin/master..HEAD
>> > + grep -q .
>> > ++ git log -n1 --format=%b
>> > ++ grep -ow -E
>> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
>> > ++ awk -F '#' '{print $2}'
>> > + reference=1635145
>> > + '[' -z 1635145 ']'
>> > ++ clang-format --version
>> > + clang_format='LLVM (http://llvm.org/):
>> >   LLVM version 3.4.2
>> >   Optimized build.
>> >   Built Dec  7 2015 (09:37:36).
>> >   Default target: x86_64-redhat-linux-gnu
>> >   Host CPU: x86-64'
>>
>> This is a pretty old version of clang. Maybe this is the problem?
>>
>
> Yes. That's what I suspected too. Trying to get repos for the upgrade.
>

But what's surprising is that the script exits.


>
>
>>
>> >
>> >>
>> >> ```
>> >>>
>> >>> diff --git a/rfc.sh b/rfc.sh
>> >>> index 607fd7528f..4ffef26ca1 100755
>> >>> --- a/rfc.sh
>> >>> +++ b/rfc.sh
>> >>> @@ -321,21 +321,21 @@ main()
>> >>>  fi
>> >>>
>> >>>  # TODO: add clang-format command here. It will after the changes
>> are done everywhere else
>> >>> +set +e
>> >>>  clang_format=$(clang-format --version)
>> >>>  if [ ! -z "${clang_format}" ]; then
>> >>>  # Considering git show may not give any files as output
>> matching the
>> >>>  # criteria, good to tell script not to fail on error
>> >>> -set +e
>> >>>  list_of_files=$(git show --pretty="format:" --name-only |
>> >>>  grep -v "contrib/" | egrep --color=never
>> "*\.[ch]$");
>> >>>  if [ ! -z "${list_of_files}" ]; then
>> >>>  echo "${list_of_files}" | xargs clang-format -i
>> >>>  fi
>> >>> -set -e
>> >>>  else
>> >>>  echo "High probability of your patch not passing smoke due
>> to coding standard check"
>> >>>  echo "Please install 'clang-format' to format the patch
>> before submitting"
>> >>>  fi
>> >>> +set -e
>> >>>
>> >>>  if [ "$DRY_RUN" = 1 ]; then
>> >>>  drier='echo -e Please use the following command to send your
>> commits to review:\n\n'
>> >>
>> >> ```
>> >> -Amar
>> >>
>> >> On Fri, Oct 5, 2018 at 8:09 AM Raghavendra Gowdappa <
>> rgowd...@redhat.com> wrote:
>> >>>
>> >>> All,
>> >>>
>> >>> [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
>> >>> + rebase_changes
>> >>> + GIT_EDITOR=./rfc.sh
>> >>> + git rebase -i origin/master
>> >>> [detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
>> >>>  1 file changed, 10674 insertions(+), 11166 deletions(-)
>> >>>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
>> >>> [detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid
>> linkto file
>> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>> [detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
>> >>>  1 file changed, 8 insertions(+), 3 deletions(-)
>> >>> Successfully rebased and updated refs/heads/1635145.
>> >>> + check_backport
>> >>> + moveon=N
>> >>> + '[' master = master ']'
>> >>> + return
>> >>> + assert_diverge
>> >>> + git diff origin/master..HEAD
>> >>> + grep -q .
>> >>> ++ git log -n1 --format=%b
>> >>> ++ grep -ow -E
>> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
>> >>> ++ awk -F '#' '{print $2}'
>> >>> + reference=1635145
>> >>> + '[' -z 1635145 ']'
>> >>> ++ clang-format --version
>> >>> + clang_format='LLVM (http://llvm.org/):
>> >>>   LLVM version 3.4.2
>> >>>   Optimized build.
>> >>>   Built Dec  7 2015 (09:37:36).
>> >>>   Default target: x86_64-redhat-linux-gnu
>> >>>   Host CPU: x86-64'
>> >>>
>> >>> Looks like the script is exiting right after it completes
>> clang-format --version. Nothing after that statement gets executed (did it
>> crash? I don't see any cores). Any help is appreciated
>> >>>
>> >>> regards,
>> >>> Raghavendra
>> >>>

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
On Fri, Oct 5, 2018 at 9:11 AM Kaushal M  wrote:

> On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa 
> wrote:
> >
> >
> >
> > On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi 
> wrote:
> >>
> >> Can you try below diff in your rfc, and let me know if it works?
> >
> >
> > No. it didn't. I see the same error.
> >  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
> > + rebase_changes
> > + GIT_EDITOR=./rfc.sh
> > + git rebase -i origin/master
> > [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
> >  1 file changed, 10674 insertions(+), 11166 deletions(-)
> >  rewrite xlators/cluster/dht/src/dht-common.c (88%)
> > [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid linkto
> file
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > Successfully rebased and updated refs/heads/1635145.
> > + check_backport
> > + moveon=N
> > + '[' master = master ']'
> > + return
> > + assert_diverge
> > + git diff origin/master..HEAD
> > + grep -q .
> > ++ git log -n1 --format=%b
> > ++ grep -ow -E
> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
> > ++ awk -F '#' '{print $2}'
> > + reference=1635145
> > + '[' -z 1635145 ']'
> > ++ clang-format --version
> > + clang_format='LLVM (http://llvm.org/):
> >   LLVM version 3.4.2
> >   Optimized build.
> >   Built Dec  7 2015 (09:37:36).
> >   Default target: x86_64-redhat-linux-gnu
> >   Host CPU: x86-64'
>
> This is a pretty old version of clang. Maybe this is the problem?
>

Yes. That's what I suspected too. Trying to get repos for the upgrade.


>
> >
> >>
> >> ```
> >>>
> >>> diff --git a/rfc.sh b/rfc.sh
> >>> index 607fd7528f..4ffef26ca1 100755
> >>> --- a/rfc.sh
> >>> +++ b/rfc.sh
> >>> @@ -321,21 +321,21 @@ main()
> >>>  fi
> >>>
> >>>  # TODO: add clang-format command here. It will after the changes
> are done everywhere else
> >>> +set +e
> >>>  clang_format=$(clang-format --version)
> >>>  if [ ! -z "${clang_format}" ]; then
> >>>  # Considering git show may not give any files as output
> matching the
> >>>  # criteria, good to tell script not to fail on error
> >>> -set +e
> >>>  list_of_files=$(git show --pretty="format:" --name-only |
> >>>  grep -v "contrib/" | egrep --color=never
> "*\.[ch]$");
> >>>  if [ ! -z "${list_of_files}" ]; then
> >>>  echo "${list_of_files}" | xargs clang-format -i
> >>>  fi
> >>> -set -e
> >>>  else
> >>>  echo "High probability of your patch not passing smoke due to
> coding standard check"
> >>>  echo "Please install 'clang-format' to format the patch
> before submitting"
> >>>  fi
> >>> +set -e
> >>>
> >>>  if [ "$DRY_RUN" = 1 ]; then
> >>>  drier='echo -e Please use the following command to send your
> commits to review:\n\n'
> >>
> >> ```
> >> -Amar
> >>
> >> On Fri, Oct 5, 2018 at 8:09 AM Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
> >>>
> >>> All,
> >>>
> >>> [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
> >>> + rebase_changes
> >>> + GIT_EDITOR=./rfc.sh
> >>> + git rebase -i origin/master
> >>> [detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
> >>>  1 file changed, 10674 insertions(+), 11166 deletions(-)
> >>>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
> >>> [detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid linkto
> file
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>> [detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
> >>>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>> Successfully rebased and updated refs/heads/1635145.
> >>> + check_backport
> >>> + moveon=N
> >>> + '[' master = master ']'
> >>> + return
> >>> + assert_diverge
> >>> + git diff origin/master..HEAD
> >>> + grep -q .
> >>> ++ git log -n1 --format=%b
> >>> ++ grep -ow -E
> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
> >>> ++ awk -F '#' '{print $2}'
> >>> + reference=1635145
> >>> + '[' -z 1635145 ']'
> >>> ++ clang-format --version
> >>> + clang_format='LLVM (http://llvm.org/):
> >>>   LLVM version 3.4.2
> >>>   Optimized build.
> >>>   Built Dec  7 2015 (09:37:36).
> >>>   Default target: x86_64-redhat-linux-gnu
> >>>   Host CPU: x86-64'
> >>>
> >>> Looks like the script is exiting right after it completes clang-format
> --version. Nothing after that statement gets executed (did it crash? I
> don't see any cores). Any help is appreciated
> >>>
> >>> regards,
> >>> Raghavendra
> >>>
> >>
> >>
> >>
> >> --
> >> Amar Tumballi (amarts)
> >

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Kaushal M
On Fri, Oct 5, 2018 at 9:05 AM Raghavendra Gowdappa  wrote:
>
>
>
> On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi  wrote:
>>
>> Can you try below diff in your rfc, and let me know if it works?
>
>
> No. it didn't. I see the same error.
>  [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
> + rebase_changes
> + GIT_EDITOR=./rfc.sh
> + git rebase -i origin/master
> [detached HEAD e50667e] cluster/dht: clang-format dht-common.c
>  1 file changed, 10674 insertions(+), 11166 deletions(-)
>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
> [detached HEAD 0734847] cluster/dht: fixes to unlinking invalid linkto file
>  1 file changed, 1 insertion(+), 1 deletion(-)
> [detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
>  1 file changed, 8 insertions(+), 3 deletions(-)
> Successfully rebased and updated refs/heads/1635145.
> + check_backport
> + moveon=N
> + '[' master = master ']'
> + return
> + assert_diverge
> + git diff origin/master..HEAD
> + grep -q .
> ++ git log -n1 --format=%b
> ++ grep -ow -E 
> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
> ++ awk -F '#' '{print $2}'
> + reference=1635145
> + '[' -z 1635145 ']'
> ++ clang-format --version
> + clang_format='LLVM (http://llvm.org/):
>   LLVM version 3.4.2
>   Optimized build.
>   Built Dec  7 2015 (09:37:36).
>   Default target: x86_64-redhat-linux-gnu
>   Host CPU: x86-64'

This is a pretty old version of clang. Maybe this is the problem?

>
>>
>> ```
>>>
>>> diff --git a/rfc.sh b/rfc.sh
>>> index 607fd7528f..4ffef26ca1 100755
>>> --- a/rfc.sh
>>> +++ b/rfc.sh
>>> @@ -321,21 +321,21 @@ main()
>>>  fi
>>>
>>>  # TODO: add clang-format command here. It will after the changes are 
>>> done everywhere else
>>> +set +e
>>>  clang_format=$(clang-format --version)
>>>  if [ ! -z "${clang_format}" ]; then
>>>  # Considering git show may not give any files as output matching 
>>> the
>>>  # criteria, good to tell script not to fail on error
>>> -set +e
>>>  list_of_files=$(git show --pretty="format:" --name-only |
>>>  grep -v "contrib/" | egrep --color=never 
>>> "*\.[ch]$");
>>>  if [ ! -z "${list_of_files}" ]; then
>>>  echo "${list_of_files}" | xargs clang-format -i
>>>  fi
>>> -set -e
>>>  else
>>>  echo "High probability of your patch not passing smoke due to 
>>> coding standard check"
>>>  echo "Please install 'clang-format' to format the patch before 
>>> submitting"
>>>  fi
>>> +set -e
>>>
>>>  if [ "$DRY_RUN" = 1 ]; then
>>>  drier='echo -e Please use the following command to send your 
>>> commits to review:\n\n'
>>
>> ```
>> -Amar
>>
>> On Fri, Oct 5, 2018 at 8:09 AM Raghavendra Gowdappa  
>> wrote:
>>>
>>> All,
>>>
>>> [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
>>> + rebase_changes
>>> + GIT_EDITOR=./rfc.sh
>>> + git rebase -i origin/master
>>> [detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
>>>  1 file changed, 10674 insertions(+), 11166 deletions(-)
>>>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
>>> [detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid linkto file
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> [detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>> Successfully rebased and updated refs/heads/1635145.
>>> + check_backport
>>> + moveon=N
>>> + '[' master = master ']'
>>> + return
>>> + assert_diverge
>>> + git diff origin/master..HEAD
>>> + grep -q .
>>> ++ git log -n1 --format=%b
>>> ++ grep -ow -E 
>>> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
>>> ++ awk -F '#' '{print $2}'
>>> + reference=1635145
>>> + '[' -z 1635145 ']'
>>> ++ clang-format --version
>>> + clang_format='LLVM (http://llvm.org/):
>>>   LLVM version 3.4.2
>>>   Optimized build.
>>>   Built Dec  7 2015 (09:37:36).
>>>   Default target: x86_64-redhat-linux-gnu
>>>   Host CPU: x86-64'
>>>
>>> Looks like the script is exiting right after it completes clang-format 
>>> --version. Nothing after that statement gets executed (did it crash? I 
>>> don't see any cores). Any help is appreciated
>>>
>>> regards,
>>> Raghavendra
>>>
>>
>>
>>
>> --
>> Amar Tumballi (amarts)
>


Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
On Fri, Oct 5, 2018 at 8:53 AM Amar Tumballi  wrote:

> Can you try below diff in your rfc, and let me know if it works?
>

No, it didn't. I see the same error.
 [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
+ rebase_changes
+ GIT_EDITOR=./rfc.sh
+ git rebase -i origin/master
[detached HEAD e50667e] cluster/dht: clang-format dht-common.c
 1 file changed, 10674 insertions(+), 11166 deletions(-)
 rewrite xlators/cluster/dht/src/dht-common.c (88%)
[detached HEAD 0734847] cluster/dht: fixes to unlinking invalid linkto file
 1 file changed, 1 insertion(+), 1 deletion(-)
[detached HEAD 7aeba07] rfc.sh: test - DO NOT MERGE
 1 file changed, 8 insertions(+), 3 deletions(-)
Successfully rebased and updated refs/heads/1635145.
+ check_backport
+ moveon=N
+ '[' master = master ']'
+ return
+ assert_diverge
+ git diff origin/master..HEAD
+ grep -q .
++ git log -n1 --format=%b
++ grep -ow -E
'([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
++ awk -F '#' '{print $2}'
+ reference=1635145
+ '[' -z 1635145 ']'
++ clang-format --version
+ clang_format='LLVM (http://llvm.org/):
  LLVM version 3.4.2
  Optimized build.
  Built Dec  7 2015 (09:37:36).
  Default target: x86_64-redhat-linux-gnu
  Host CPU: x86-64'
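
As an aside, the `grep -ow -E ... | awk` pipeline in the trace above is what pulls the bug reference out of the commit message footer. For example, using the regex from the trace verbatim on a sample message (the message text here is illustrative):

```shell
# Extract the bug reference the way the check_backport trace shows:
# match a "Fixes:"/"Updates:" footer line and print the digits after '#'.
msg='cluster/dht: fixes to unlinking invalid linkto file

Fixes: bz#1635145'
ref=$(printf '%s\n' "$msg" |
    grep -ow -E '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+' |
    awk -F '#' '{print $2}')
echo "$ref"   # 1635145
```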


> ```
>
>> diff --git a/rfc.sh b/rfc.sh
>> index 607fd7528f..4ffef26ca1 100755
>> --- a/rfc.sh
>> +++ b/rfc.sh
>> @@ -321,21 +321,21 @@ main()
>>  fi
>>
>>  # TODO: add clang-format command here. It will after the changes are
>> done everywhere else
>> +set +e
>>  clang_format=$(clang-format --version)
>>  if [ ! -z "${clang_format}" ]; then
>>  # Considering git show may not give any files as output matching
>> the
>>  # criteria, good to tell script not to fail on error
>> -set +e
>>  list_of_files=$(git show --pretty="format:" --name-only |
>>  grep -v "contrib/" | egrep --color=never
>> "*\.[ch]$");
>>  if [ ! -z "${list_of_files}" ]; then
>>  echo "${list_of_files}" | xargs clang-format -i
>>  fi
>> -set -e
>>  else
>>  echo "High probability of your patch not passing smoke due to
>> coding standard check"
>>  echo "Please install 'clang-format' to format the patch before
>> submitting"
>>  fi
>> +set -e
>>
>>  if [ "$DRY_RUN" = 1 ]; then
>>  drier='echo -e Please use the following command to send your
>> commits to review:\n\n'
>
> ```
> -Amar
>
> On Fri, Oct 5, 2018 at 8:09 AM Raghavendra Gowdappa 
> wrote:
>
>> All,
>>
>> [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
>> + rebase_changes
>> + GIT_EDITOR=./rfc.sh
>> + git rebase -i origin/master
>> [detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
>>  1 file changed, 10674 insertions(+), 11166 deletions(-)
>>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
>> [detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid linkto
>> file
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> [detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>> Successfully rebased and updated refs/heads/1635145.
>> + check_backport
>> + moveon=N
>> + '[' master = master ']'
>> + return
>> + assert_diverge
>> + git diff origin/master..HEAD
>> + grep -q .
>> ++ git log -n1 --format=%b
>> ++ grep -ow -E
>> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
>> ++ awk -F '#' '{print $2}'
>> + reference=1635145
>> + '[' -z 1635145 ']'
>> ++ clang-format --version
>> + clang_format='LLVM (http://llvm.org/):
>>   LLVM version 3.4.2
>>   Optimized build.
>>   Built Dec  7 2015 (09:37:36).
>>   Default target: x86_64-redhat-linux-gnu
>>   Host CPU: x86-64'
>>
>> Looks like the script is exiting right after it completes clang-format
>> --version. Nothing after that statement gets executed (did it crash? I
>> don't see any cores). Any help is appreciated
>>
>> regards,
>> Raghavendra
>>
>
>
>
> --
> Amar Tumballi (amarts)
>

Re: [Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Amar Tumballi
Can you try the diff below in your rfc.sh, and let me know if it works?

```

> diff --git a/rfc.sh b/rfc.sh
> index 607fd7528f..4ffef26ca1 100755
> --- a/rfc.sh
> +++ b/rfc.sh
> @@ -321,21 +321,21 @@ main()
>  fi
>
>  # TODO: add clang-format command here. It will after the changes are
> done everywhere else
> +set +e
>  clang_format=$(clang-format --version)
>  if [ ! -z "${clang_format}" ]; then
>  # Considering git show may not give any files as output matching
> the
>  # criteria, good to tell script not to fail on error
> -set +e
>  list_of_files=$(git show --pretty="format:" --name-only |
>  grep -v "contrib/" | egrep --color=never
> "*\.[ch]$");
>  if [ ! -z "${list_of_files}" ]; then
>  echo "${list_of_files}" | xargs clang-format -i
>  fi
> -set -e
>  else
>  echo "High probability of your patch not passing smoke due to
> coding standard check"
>  echo "Please install 'clang-format' to format the patch before
> submitting"
>  fi
> +set -e
>
>  if [ "$DRY_RUN" = 1 ]; then
>  drier='echo -e Please use the following command to send your
> commits to review:\n\n'

```
-Amar

On Fri, Oct 5, 2018 at 8:09 AM Raghavendra Gowdappa 
wrote:

> All,
>
> [rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
> + rebase_changes
> + GIT_EDITOR=./rfc.sh
> + git rebase -i origin/master
> [detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
>  1 file changed, 10674 insertions(+), 11166 deletions(-)
>  rewrite xlators/cluster/dht/src/dht-common.c (88%)
> [detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid linkto file
>  1 file changed, 1 insertion(+), 1 deletion(-)
> [detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
>  1 file changed, 8 insertions(+), 3 deletions(-)
> Successfully rebased and updated refs/heads/1635145.
> + check_backport
> + moveon=N
> + '[' master = master ']'
> + return
> + assert_diverge
> + git diff origin/master..HEAD
> + grep -q .
> ++ git log -n1 --format=%b
> ++ grep -ow -E
> '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
> ++ awk -F '#' '{print $2}'
> + reference=1635145
> + '[' -z 1635145 ']'
> ++ clang-format --version
> + clang_format='LLVM (http://llvm.org/):
>   LLVM version 3.4.2
>   Optimized build.
>   Built Dec  7 2015 (09:37:36).
>   Default target: x86_64-redhat-linux-gnu
>   Host CPU: x86-64'
>
> Looks like the script is exiting right after it completes clang-format
> --version. Nothing after that statement gets executed (did it crash? I
> don't see any cores). Any help is appreciated
>
> regards,
> Raghavendra
>



-- 
Amar Tumballi (amarts)

[Gluster-devel] ./rfc.sh not pushing patch to gerrit

2018-10-04 Thread Raghavendra Gowdappa
All,

[rgowdapp@rgowdapp glusterfs]$ ./rfc.sh
+ rebase_changes
+ GIT_EDITOR=./rfc.sh
+ git rebase -i origin/master
[detached HEAD 34fabdd] cluster/dht: clang-format dht-common.c
 1 file changed, 10674 insertions(+), 11166 deletions(-)
 rewrite xlators/cluster/dht/src/dht-common.c (88%)
[detached HEAD 4bbcbf9] cluster/dht: fixes to unlinking invalid linkto file
 1 file changed, 1 insertion(+), 1 deletion(-)
[detached HEAD c5583ea] rfc.sh: test - DO NOT MERGE
 1 file changed, 8 insertions(+), 3 deletions(-)
Successfully rebased and updated refs/heads/1635145.
+ check_backport
+ moveon=N
+ '[' master = master ']'
+ return
+ assert_diverge
+ git diff origin/master..HEAD
+ grep -q .
++ git log -n1 --format=%b
++ grep -ow -E
'([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+'
++ awk -F '#' '{print $2}'
+ reference=1635145
+ '[' -z 1635145 ']'
++ clang-format --version
+ clang_format='LLVM (http://llvm.org/):
  LLVM version 3.4.2
  Optimized build.
  Built Dec  7 2015 (09:37:36).
  Default target: x86_64-redhat-linux-gnu
  Host CPU: x86-64'

Looks like the script is exiting right after it completes clang-format
--version. Nothing after that statement gets executed (did it crash? I
don't see any cores). Any help is appreciated.

regards,
Raghavendra
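As an aside, the `grep -ow -E ... | awk` pair in the trace is simply extracting the bug reference from the commit message footer. That step can be reproduced standalone; the commit body below is illustrative, reconstructed from the trace:

```shell
#!/bin/sh
# Illustrative commit message body ending in the footer rfc.sh looks for.
commit_body="rfc.sh: test - DO NOT MERGE

Fixes: bz#1635145"

# Same pipeline as in the trace: match a Fixes:/Updates: footer (any case,
# optional gluster/glusterfs or bz prefix) and keep the digits after '#'.
reference=$(printf '%s\n' "${commit_body}" |
    grep -ow -E '([fF][iI][xX][eE][sS]|[uU][pP][dD][aA][tT][eE][sS])(:)?[[:space:]]+(gluster\/glusterfs)?(bz)?#[[:digit:]]+' |
    awk -F '#' '{print $2}')

echo "${reference}"    # prints 1635145
```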

Re: [Gluster-devel] index_lookup segfault in glusterfsd brick process

2018-10-04 Thread 김경표
Thanks for the reply.

On Thu, Oct 4, 2018 at 5:31 PM, Ravishankar N wrote:

>
>
> On 10/04/2018 01:57 PM, Pranith Kumar Karampuri wrote:
>
> it indicates that inode-table is NULL. Is there a possibility to upload
> the core somewhere for us to take a closer look?
>
> Here is the core file:

http://ac2repo.gluesys.com/ac2repo/down/core.50570.tgz

Best regards.

- kpkim

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread FNU Raghavendra Manjunath
On Thu, Oct 4, 2018 at 2:47 PM Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

>
>
> On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
> wrote:
>
>> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
>> > RC1 would be around 24th of Sep. with final release tagging around 1st
>> > of Oct.
>>
>> RC1 now stands to be tagged tomorrow, and patches that are being
>> targeted for a back port include,
>>
>> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
>> mux cases)
>>
>> @RaBhat working on this.
>>
>>

The following patch addresses the issue in release-5 branch.

https://review.gluster.org/#/c/glusterfs/+/21347/


Regards,
Raghavendra

> 2) Py3 corrections in master
>>
>> @Kotresh are all changes made to master backported to release-5 (may not
>> be merged, but looking at if they are backported and ready for merge)?
>>
>
> All changes made to master are backported to release-5. But py3 support is
> still not complete.
>
>>
>> 3) Release notes review and updates with GD2 content pending
>>
>> @Kaushal/GD2 team can we get the updates as required?
>> https://review.gluster.org/c/glusterfs/+/21303
>>
>> 4) This bug [2] was filed when we released 4.0.
>>
>> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
>> missing and hence post-upgrade clients failing the mount). This is
>> possibly the last chance to fix it.
>>
>> Glusterd and protocol maintainers, can you chime in, if this bug needs
>> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>>
>> The tracker bug [1] does not have any other blockers against it, hence
>> assuming we are not tracking/waiting on anything other than the set above.
>>
>> Thanks,
>> Shyam
>>
>> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
>> [2] Potential upgrade bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>>
>
>
> --
> Thanks and Regards,
> Kotresh H R
>

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Kotresh Hiremath Ravishankar
On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
wrote:

> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> > RC1 would be around 24th of Sep. with final release tagging around 1st
> > of Oct.
>
> RC1 now stands to be tagged tomorrow, and patches that are being
> targeted for a back port include,
>
> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> mux cases)
>
> @RaBhat working on this.
>
> 2) Py3 corrections in master
>
> @Kotresh are all changes made to master backported to release-5 (may not
> be merged, but looking at if they are backported and ready for merge)?
>

All changes made to master are backported to release-5. But py3 support is
still not complete.

>
> 3) Release notes review and updates with GD2 content pending
>
> @Kaushal/GD2 team can we get the updates as required?
> https://review.gluster.org/c/glusterfs/+/21303
>
> 4) This bug [2] was filed when we released 4.0.
>
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
>
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>
> The tracker bug [1] does not have any other blockers against it, hence
> assuming we are not tracking/waiting on anything other than the set above.
>
> Thanks,
> Shyam
>
> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> [2] Potential upgrade bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>


-- 
Thanks and Regards,
Kotresh H R

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Sanju Rakonde
Deepshika,

I see that the tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t test
failed in today's run #273. But I couldn't get the logs from
https://ci-logs.gluster.org/distributed-regression-logs-273.tgz; I get a 404
Not Found error with a message saying "The requested URL
/distributed-regression-logs-273.tgz was not found on this server."
Please help me get the logs.

On Thu, Oct 4, 2018 at 10:31 PM Atin Mukherjee  wrote:

> Deepshika,
>
> Please keep us posted on if you see the particular glusterd test failing
> again.  It’ll be great to see this nightly job green sooner than later :-) .
>
> On Thu, 4 Oct 2018 at 15:07, Deepshikha Khandelwal 
> wrote:
>
>> On Thu, Oct 4, 2018 at 6:10 AM Sanju Rakonde  wrote:
>> >
>> >
>> >
>> > On Wed, Oct 3, 2018 at 3:26 PM Deepshikha Khandelwal <
>> dkhan...@redhat.com> wrote:
>> >>
>> >> Hello folks,
>> >>
>> >> Distributed-regression job[1] is now a part of Gluster's
>> >> nightly-master build pipeline. The following are the issues we have
>> >> resolved since we started working on this:
>> >>
>> >> 1) Collecting gluster logs from servers.
>> >> 2) Tests failed due to infra-related issues have been fixed.
>> >> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>> >>
>> >> To get time down to 40 minutes needs your help!
>> >>
>> >> Currently, there is a test that is failing:
>> >>
>> >> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>> >>
>> >> This needs fixing first.
>> >
>> >
>> > Where can I get the logs of this test case? In
>> https://build.gluster.org/job/distributed-regression/264/console I see
>> this test case is failed and re-attempted. But I couldn't find logs.
>> There's a link in the end of console output where you can look for the
>> logs of failed tests.
>> We had a bug in the setup and the logs were not getting saved. We've
>> fixed this and future jobs should have the logs at the log collector's
>> link show up in the console output.
>>
>> >>
>> >>
>> >> There's a test that takes 14 minutes to complete -
>> >> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
>> >> 14 minutes is not something we can distribute. Can we look at how we
>> >> can speed this up[2]? When this test fails, it is re-attempted,
>> >> further increasing the time. This happens in the regular
>> >> centos7-regression job as well.
>> >>
>> >> If you see any other issues, please file a bug[3].
>> >>
>> >> [1]: https://build.gluster.org/job/distributed-regression
>> >> [2]: https://build.gluster.org/job/distributed-regression/264/console
>> >> [3]:
>> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs&component=project-infrastructure
>> >>
>> >> Thanks,
>> >> Deepshikha Khandelwal
>> >> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
>> wrote:
>> >> >>
>> >> >>
>> >> >>
>> >> >>> There are currently a few known issues:
>> >> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
>> >> >>
>> >> >>
>> >> >> If I look at the activities involved with regression failures, this
>> can wait.
>> >> >
>> >> >
>> >> > Well, we can't debug the current failures without having the logs.
>> So this has to be fixed first.
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> * A few tests fail due to infra-related issues like geo-rep tests.
>> >> >>
>> >> >>
>> >> >> Please open bugs for this, so we can track them, and take it to
>> closure.
>> >> >
>> >> >
>> >> > These are failing due to infra reasons. Most likely subtle
>> differences in the setup of these nodes vs our normal nodes. We'll only be
>> able to debug them once we get the logs. I know the geo-rep ones are easy
>> to fix. The playbook for setting up geo-rep correctly just didn't make it
>> over to the playbook used for these images.
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> * Takes ~80 minutes with 7 distributed servers (targetting 60
>> minutes)
>> >> >>
>> >> >>
>> >> >> Time can change with more tests added, and also please plan to have
>> number of server as 1 to n.
>> >> >
>> >> >
>> >> > While the n is configurable, however it will be fixed to a single
>> digit number for now. We will need to place *some* limitation somewhere or
>> else we'll end up not being able to control our cloud bills.
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> * We've only tested plain regressions. ASAN and Valgrind are
>> currently untested.
>> >> >>
>> >> >>
>> >> >> Great to have it running not 'per patch', but as nightly, or weekly
>> to start with.
>> >> >
>> >> >
>> >> > This is currently not targeted until we phase out current
>> regressions.
>> >> >
>> >> >>>
>> >> >>>
>> >> >>> Before bringing it into production, we'll run this job nightly and
>> >> >>> watch it for a month to debug the other failures.
>> >> >>>
>> >> >>
>> >> >> I would say, bring it to production sooner, say 2 weeks, and also
>> plan to have the current regression as is with a special command like 'run
>> regression 

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Atin Mukherjee
Deepshika,

Please keep us posted if you see the particular glusterd test failing
again. It’ll be great to see this nightly job go green sooner rather than later :-).

On Thu, 4 Oct 2018 at 15:07, Deepshikha Khandelwal 
wrote:

> On Thu, Oct 4, 2018 at 6:10 AM Sanju Rakonde  wrote:
> >
> >
> >
> > On Wed, Oct 3, 2018 at 3:26 PM Deepshikha Khandelwal <
> dkhan...@redhat.com> wrote:
> >>
> >> Hello folks,
> >>
> >> Distributed-regression job[1] is now a part of Gluster's
> >> nightly-master build pipeline. The following are the issues we have
> >> resolved since we started working on this:
> >>
> >> 1) Collecting gluster logs from servers.
> >> 2) Tests failed due to infra-related issues have been fixed.
> >> 3) Time taken to run regression testing reduced to ~50-60 minutes.
> >>
> >> To get time down to 40 minutes needs your help!
> >>
> >> Currently, there is a test that is failing:
> >>
> >> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
> >>
> >> This needs fixing first.
> >
> >
> > Where can I get the logs of this test case? In
> https://build.gluster.org/job/distributed-regression/264/console I see
> this test case is failed and re-attempted. But I couldn't find logs.
> There's a link in the end of console output where you can look for the
> logs of failed tests.
> We had a bug in the setup and the logs were not getting saved. We've
> fixed this and future jobs should have the logs at the log collector's
> link show up in the console output.
>
> >>
> >>
> >> There's a test that takes 14 minutes to complete -
> >> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
> >> 14 minutes is not something we can distribute. Can we look at how we
> >> can speed this up[2]? When this test fails, it is re-attempted,
> >> further increasing the time. This happens in the regular
> >> centos7-regression job as well.
> >>
> >> If you see any other issues, please file a bug[3].
> >>
> >> [1]: https://build.gluster.org/job/distributed-regression
> >> [2]: https://build.gluster.org/job/distributed-regression/264/console
> >> [3]:
> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs&component=project-infrastructure
> >>
> >> Thanks,
> >> Deepshikha Khandelwal
> >> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
> wrote:
> >> >>
> >> >>
> >> >>
> >> >>> There are currently a few known issues:
> >> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
> >> >>
> >> >>
> >> >> If I look at the activities involved with regression failures, this
> can wait.
> >> >
> >> >
> >> > Well, we can't debug the current failures without having the logs. So
> this has to be fixed first.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * A few tests fail due to infra-related issues like geo-rep tests.
> >> >>
> >> >>
> >> >> Please open bugs for this, so we can track them, and take it to
> closure.
> >> >
> >> >
> >> > These are failing due to infra reasons. Most likely subtle
> differences in the setup of these nodes vs our normal nodes. We'll only be
> able to debug them once we get the logs. I know the geo-rep ones are easy
> to fix. The playbook for setting up geo-rep correctly just didn't make it
> over to the playbook used for these images.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * Takes ~80 minutes with 7 distributed servers (targetting 60
> minutes)
> >> >>
> >> >>
> >> >> Time can change with more tests added, and also please plan to have
> number of server as 1 to n.
> >> >
> >> >
> >> > While the n is configurable, however it will be fixed to a single
> digit number for now. We will need to place *some* limitation somewhere or
> else we'll end up not being able to control our cloud bills.
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>> * We've only tested plain regressions. ASAN and Valgrind are
> currently untested.
> >> >>
> >> >>
> >> >> Great to have it running not 'per patch', but as nightly, or weekly
> to start with.
> >> >
> >> >
> >> > This is currently not targeted until we phase out current regressions.
> >> >
> >> >>>
> >> >>>
> >> >>> Before bringing it into production, we'll run this job nightly and
> >> >>> watch it for a month to debug the other failures.
> >> >>>
> >> >>
> >> >> I would say, bring it to production sooner, say 2 weeks, and also
> plan to have the current regression as is with a special command like 'run
> regression in-one-machine' in gerrit (or something similar) with voting
> rights, so we can fall back to this method if something is broken in
> parallel testing.
> >> >>
> >> >> I have seen that regardless of amount of time we put some scripts in
> testing, the day we move to production, some thing would be broken. So, let
> that happen earlier than later, so it would help next release branching
> out. Don't want to be stuck for branching due to infra failures.
> >> >
> >> >
> >> > Having two regression jobs that can vote is going to cause more
> confusion than it's worth. There are a 

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Shyam Ranganathan
On 10/04/2018 12:01 PM, Atin Mukherjee wrote:
> 4) This bug [2] was filed when we released 4.0.
> 
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
> 
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
> 
> 
> This is a bad bug to live with. OTOH, I do not have an immediate
> solution in my mind on how to make sure (a) these options when
> reintroduced are made no-ops, especially they will be disallowed to tune
> (with out dirty option check hacks at volume set staging code) . If
> we're to tag RC1 tomorrow, I wouldn't be able to take a risk to commit
> this change.
> 
> Can we actually have a note in our upgrade guide to document that if
> you're upgrading to 4.1 or higher version make sure to disable these
> options before the upgrade to mitigate this?

Yes, adding this to the "Major Issues" section in the release notes as
well as noting it in the upgrade guide is possible. I will go with this
option for now, as we do not have complaints around this from 4.0/4.1
releases (which have the same issue as well).


Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Atin Mukherjee
On Thu, Oct 4, 2018 at 9:03 PM Shyam Ranganathan 
wrote:

> On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> > RC1 would be around 24th of Sep. with final release tagging around 1st
> > of Oct.
>
> RC1 now stands to be tagged tomorrow, and patches that are being
> targeted for a back port include,
>
> 1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
> mux cases)
>
> @RaBhat working on this.
>
> 2) Py3 corrections in master
>
> @Kotresh are all changes made to master backported to release-5 (may not
> be merged, but looking at if they are backported and ready for merge)?
>
> 3) Release notes review and updates with GD2 content pending
>
> @Kaushal/GD2 team can we get the updates as required?
> https://review.gluster.org/c/glusterfs/+/21303
>
> 4) This bug [2] was filed when we released 4.0.
>
> The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
> missing and hence post-upgrade clients failing the mount). This is
> possibly the last chance to fix it.
>
> Glusterd and protocol maintainers, can you chime in, if this bug needs
> to be and can be fixed? (thanks to @anoopcs for pointing it out to me)
>

This is a bad bug to live with. OTOH, I do not have an immediate solution in
mind for how to make sure that these options, when reintroduced, are made
no-ops, and especially that tuning them is disallowed (without dirty
option-check hacks in the volume set staging code). If we're to tag RC1
tomorrow, I wouldn't be able to take the risk of committing this change.

Can we instead have a note in our upgrade guide documenting that, if you're
upgrading to 4.1 or a higher version, you should disable these options before
the upgrade to mitigate this?


> The tracker bug [1] does not have any other blockers against it, hence
> assuming we are not tracking/waiting on anything other than the set above.
>
> Thanks,
> Shyam
>
> [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
> [2] Potential upgrade bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1540659
>

Re: [Gluster-devel] Release 5: Branched and further dates

2018-10-04 Thread Shyam Ranganathan
On 09/13/2018 11:10 AM, Shyam Ranganathan wrote:
> RC1 would be around 24th of Sep. with final release tagging around 1st
> of Oct.

RC1 now stands to be tagged tomorrow, and patches that are being
targeted for a back port include,

1) https://review.gluster.org/c/glusterfs/+/21314 (snapshot volfile in
mux cases)

@RaBhat working on this.

2) Py3 corrections in master

@Kotresh are all changes made to master backported to release-5 (may not
be merged, but looking at if they are backported and ready for merge)?

3) Release notes review and updates with GD2 content pending

@Kaushal/GD2 team can we get the updates as required?
https://review.gluster.org/c/glusterfs/+/21303

4) This bug [2] was filed when we released 4.0.

The issue has not bitten us in 4.0 or in 4.1 (yet!) (i.e the options
missing and hence post-upgrade clients failing the mount). This is
possibly the last chance to fix it.

Glusterd and protocol maintainers, can you chime in, if this bug needs
to be and can be fixed? (thanks to @anoopcs for pointing it out to me)

The tracker bug [1] does not have any other blockers against it, hence
assuming we are not tracking/waiting on anything other than the set above.

Thanks,
Shyam

[1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-5.0
[2] Potential upgrade bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1540659


Re: [Gluster-devel] struct stat.st_blocks inconsistent across cluster nodes

2018-10-04 Thread Ralph Böhme

Hey folks,

On Mon, Sep 03, 2018 at 01:16:13PM +0200, Ralph Böhme wrote:

On Fri, Aug 31, 2018 at 09:55:33PM +0530, Amar Tumballi wrote:

A RFE would help to track this. Will have a look into this.


https://github.com/gluster/glusterfs/issues/509


is this on anyones radar? Thanks!

-slow

--
Ralph Boehme, Samba Team   https://samba.org/
Samba Developer, SerNet GmbH   https://sernet.de/en/samba/
GPG Key Fingerprint:   FAE2 C608 8A24 2520 51C5
  59E4 AA1E 9B71 2639 9E46

[Gluster-devel] Gluster prometheus: update

2018-10-04 Thread Ram Edara
Hello All,

Here is the status update for the gluster-prometheus project.

Last week we worked on the following:

   - Discussed how to ship the GD1 vs. GD2 exporters; PRs were sent for this.
   - Designed the interface for getting metrics from GD1 vs. GD2; PRs were
     sent and discussion is ongoing.
   - Added configuration to enable/disable certain Prometheus collectors; PRs
     were sent for this.
   - To get volume metrics we need to choose one glusterd node; a PR was sent
     for this.
   - The Prometheus schema for Gluster was designed, and the PR for it is
     approved.

Plans for next week:

   - Discuss how to ship RPMs for the GD1 and GD2 exporters, and the
     dependencies for doing so.
   - Ship the GD1 RPM and edit the Gluster container templates so that
     glusterd and the exporter run in the same pod.
   - Work on a uniform interface for providing GD1 and GD2 metrics to
     gluster-prometheus; the code needs refactoring for this. Address PR
     comments and incorporate them.

-Thanks
Venkat

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Deepshikha Khandelwal
On Thu, Oct 4, 2018 at 6:10 AM Sanju Rakonde  wrote:
>
>
>
> On Wed, Oct 3, 2018 at 3:26 PM Deepshikha Khandelwal  
> wrote:
>>
>> Hello folks,
>>
>> Distributed-regression job[1] is now a part of Gluster's
>> nightly-master build pipeline. The following are the issues we have
>> resolved since we started working on this:
>>
>> 1) Collecting gluster logs from servers.
>> 2) Tests failed due to infra-related issues have been fixed.
>> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>>
>> To get time down to 40 minutes needs your help!
>>
>> Currently, there is a test that is failing:
>>
>> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>>
>> This needs fixing first.
>
>
> Where can I get the logs of this test case? In 
> https://build.gluster.org/job/distributed-regression/264/console I see this 
> test case is failed and re-attempted. But I couldn't find logs.
There's a link in the end of console output where you can look for the
logs of failed tests.
We had a bug in the setup and the logs were not getting saved. We've
fixed this and future jobs should have the logs at the log collector's
link show up in the console output.

>>
>>
>> There's a test that takes 14 minutes to complete -
>> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
>> 14 minutes is not something we can distribute. Can we look at how we
>> can speed this up[2]? When this test fails, it is re-attempted,
>> further increasing the time. This happens in the regular
>> centos7-regression job as well.
>>
>> If you see any other issues, please file a bug[3].
>>
>> [1]: https://build.gluster.org/job/distributed-regression
>> [2]: https://build.gluster.org/job/distributed-regression/264/console
>> [3]: 
>> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs&component=project-infrastructure
>>
>> Thanks,
>> Deepshikha Khandelwal
>> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
>> >
>> >
>> >
>> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi  wrote:
>> >>
>> >>
>> >>
>> >>> There are currently a few known issues:
>> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
>> >>
>> >>
>> >> If I look at the activities involved with regression failures, this can 
>> >> wait.
>> >
>> >
>> > Well, we can't debug the current failures without having the logs. So this 
>> > has to be fixed first.
>> >
>> >>
>> >>
>> >>>
>> >>> * A few tests fail due to infra-related issues like geo-rep tests.
>> >>
>> >>
>> >> Please open bugs for this, so we can track them, and take it to closure.
>> >
>> >
>> > These are failing due to infra reasons. Most likely subtle differences in 
>> > the setup of these nodes vs our normal nodes. We'll only be able to debug 
>> > them once we get the logs. I know the geo-rep ones are easy to fix. The 
>> > playbook for setting up geo-rep correctly just didn't make it over to the 
>> > playbook used for these images.
>> >
>> >>
>> >>
>> >>>
>> >>> * Takes ~80 minutes with 7 distributed servers (targetting 60 minutes)
>> >>
>> >>
>> >> Time can change with more tests added, and also please plan to have 
>> >> number of server as 1 to n.
>> >
>> >
>> > While the n is configurable, however it will be fixed to a single digit 
>> > number for now. We will need to place *some* limitation somewhere or else 
>> > we'll end up not being able to control our cloud bills.
>> >
>> >>
>> >>
>> >>>
>> >>> * We've only tested plain regressions. ASAN and Valgrind are currently 
>> >>> untested.
>> >>
>> >>
>> >> Great to have it running not 'per patch', but as nightly, or weekly to 
>> >> start with.
>> >
>> >
>> > This is currently not targeted until we phase out current regressions.
>> >
>> >>>
>> >>>
>> >>> Before bringing it into production, we'll run this job nightly and
>> >>> watch it for a month to debug the other failures.
>> >>>
>> >>
>> >> I would say, bring it to production sooner, say 2 weeks, and also plan to 
>> >> have the current regression as is with a special command like 'run 
>> >> regression in-one-machine' in gerrit (or something similar) with voting 
>> >> rights, so we can fall back to this method if something is broken in 
>> >> parallel testing.
>> >>
>> >> I have seen that regardless of amount of time we put some scripts in 
>> >> testing, the day we move to production, some thing would be broken. So, 
>> >> let that happen earlier than later, so it would help next release 
>> >> branching out. Don't want to be stuck for branching due to infra failures.
>> >
>> >
>> > Having two regression jobs that can vote is going to cause more confusion 
>> > than it's worth. There are a couple of intermittent memory issues with the 
>> > test script that we need to debug and fix before I'm comfortable in making 
>> > this job a voting job. We've worked around these problems right now. It 
>> > still pops up now and again. The fact that things break often is not an 
>> > excuse to prevent avoidable failures.  The one month timeline was taken 
>> > with all these 

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Pranith Kumar Karampuri
On Thu, Oct 4, 2018 at 2:15 PM Xavi Hernandez  wrote:

> On Thu, Oct 4, 2018 at 9:47 AM Amar Tumballi  wrote:
>
>>
>>
>> On Thu, Oct 4, 2018 at 12:54 PM Xavi Hernandez 
>> wrote:
>>
>>> On Wed, Oct 3, 2018 at 11:57 AM Deepshikha Khandelwal <
>>> dkhan...@redhat.com> wrote:
>>>
 Hello folks,

 Distributed-regression job[1] is now a part of Gluster's
 nightly-master build pipeline. The following are the issues we have
 resolved since we started working on this:

 1) Collecting gluster logs from servers.
 2) Tests failed due to infra-related issues have been fixed.
 3) Time taken to run regression testing reduced to ~50-60 minutes.

 To get time down to 40 minutes needs your help!

 Currently, there is a test that is failing:

 tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t

 This needs fixing first.

 There's a test that takes 14 minutes to complete -
 `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
 14 minutes is not something we can distribute. Can we look at how we
 can speed this up[2]? When this test fails, it is re-attempted,
 further increasing the time. This happens in the regular
 centos7-regression job as well.

>>>
>>> I made a change [1] to reduce the amount of time this tests needs. With
>>> this change the test completes in about 90 seconds. It would need some
>>> reviews from maintainers though.
>>>
>>> Do you want I send a patch with this change alone ?
>>>
>>> Xavi
>>>
>>> [1]
>>> https://review.gluster.org/#/c/glusterfs/+/19254/22/tests/bugs/index/bug-1559004-EMLINK-handling.t
>>>
>>>
>>
>> Yes please! It would be useful! We can merge it sooner that way!
>>
>
> Patch: https://review.gluster.org/21341
>

Merged!


>
>
>>
>> -Amar
>>
>>
>>>
 If you see any other issues, please file a bug[3].

 [1]: https://build.gluster.org/job/distributed-regression
 [2]: https://build.gluster.org/job/distributed-regression/264/console
 [3]:
 https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs&component=project-infrastructure

 Thanks,
 Deepshikha Khandelwal
 On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
 >
 >
 >
 > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
 wrote:
 >>
 >>
 >>
 >>> There are currently a few known issues:
 >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
 >>
 >>
 >> If I look at the activities involved with regression failures, this
 can wait.
 >
 >
 > Well, we can't debug the current failures without having the logs. So
 this has to be fixed first.
 >
 >>
 >>
 >>>
 >>> * A few tests fail due to infra-related issues like geo-rep tests.
 >>
 >>
 >> Please open bugs for this, so we can track them, and take it to
 closure.
 >
 >
 > These are failing due to infra reasons. Most likely subtle
 differences in the setup of these nodes vs our normal nodes. We'll only be
 able to debug them once we get the logs. I know the geo-rep ones are easy
 to fix. The playbook for setting up geo-rep correctly just didn't make it
 over to the playbook used for these images.
 >
 >>
 >>
 >>>
>>> * Takes ~80 minutes with 7 distributed servers (targeting 60
 minutes)
 >>
 >>
 >> Time can change with more tests added, and also please plan to have
 the number of servers range from 1 to n.
 >
 >
 > While n is configurable, it will be fixed to a single-digit number
 for now. We will need to place *some* limitation somewhere or
 else we'll end up not being able to control our cloud bills.
 >
 >>
 >>
 >>>
 >>> * We've only tested plain regressions. ASAN and Valgrind are
 currently untested.
 >>
 >>
 >> Great to have it running not 'per patch', but as nightly, or weekly
 to start with.
 >
 >
 > This is currently not targeted until we phase out current regressions.
 >
 >>>
 >>>
 >>> Before bringing it into production, we'll run this job nightly and
 >>> watch it for a month to debug the other failures.
 >>>
 >>
 >> I would say, bring it to production sooner, say 2 weeks, and also
 plan to have the current regression as is with a special command like 'run
 regression in-one-machine' in gerrit (or something similar) with voting
 rights, so we can fall back to this method if something is broken in
 parallel testing.
 >>
 >> I have seen that regardless of the amount of time we put some scripts in
 testing, the day we move to production, something would be broken. So, let
 that happen earlier than later, so it would help next release branching
 out. Don't want to be stuck for branching due to infra failures.
 >
 >
 > Having two regression jobs that can vote is going to cause more
 confusion than it's worth.

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Xavi Hernandez
On Thu, Oct 4, 2018 at 9:47 AM Amar Tumballi  wrote:

>
>
> On Thu, Oct 4, 2018 at 12:54 PM Xavi Hernandez 
> wrote:
>
>> On Wed, Oct 3, 2018 at 11:57 AM Deepshikha Khandelwal <
>> dkhan...@redhat.com> wrote:
>>
>>> Hello folks,
>>>
>>> Distributed-regression job[1] is now a part of Gluster's
>>> nightly-master build pipeline. The following are the issues we have
>>> resolved since we started working on this:
>>>
>>> 1) Collecting gluster logs from servers.
>>> 2) Tests that failed due to infra-related issues have been fixed.
>>> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>>>
>>> Getting the time down to 40 minutes needs your help!
>>>
>>> Currently, there is a test that is failing:
>>>
>>> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>>>
>>> This needs fixing first.
>>>
>>> There's a test that takes 14 minutes to complete -
>>> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
>>> 14 minutes is not something we can distribute. Can we look at how we
>>> can speed this up[2]? When this test fails, it is re-attempted,
>>> further increasing the time. This happens in the regular
>>> centos7-regression job as well.
>>>
>>
>> I made a change [1] to reduce the amount of time this tests needs. With
>> this change the test completes in about 90 seconds. It would need some
>> reviews from maintainers though.
>>
>> Do you want me to send a patch with this change alone?
>>
>> Xavi
>>
>> [1]
>> https://review.gluster.org/#/c/glusterfs/+/19254/22/tests/bugs/index/bug-1559004-EMLINK-handling.t
>>
>>
>
> Yes please! It would be useful! We can merge it sooner that way!
>

Patch: https://review.gluster.org/21341


>
> -Amar
>
>
>>
>>> If you see any other issues, please file a bug[3].
>>>
>>> [1]: https://build.gluster.org/job/distributed-regression
>>> [2]: https://build.gluster.org/job/distributed-regression/264/console
>>> [3]:
>>> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs=project-infrastructure
>>>
>>> Thanks,
>>> Deepshikha Khandelwal
>>> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
>>> >
>>> >
>>> >
>>> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
>>> wrote:
>>> >>
>>> >>
>>> >>
>>> >>> There are currently a few known issues:
>>> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
>>> >>
>>> >>
>>> >> If I look at the activities involved with regression failures, this
>>> can wait.
>>> >
>>> >
>>> > Well, we can't debug the current failures without having the logs. So
>>> this has to be fixed first.
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> * A few tests fail due to infra-related issues like geo-rep tests.
>>> >>
>>> >>
>>> >> Please open bugs for this, so we can track them, and take it to
>>> closure.
>>> >
>>> >
>>> > These are failing due to infra reasons. Most likely subtle differences
>>> in the setup of these nodes vs our normal nodes. We'll only be able to
>>> debug them once we get the logs. I know the geo-rep ones are easy to fix.
>>> The playbook for setting up geo-rep correctly just didn't make it over to
>>> the playbook used for these images.
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> * Takes ~80 minutes with 7 distributed servers (targeting 60
>>> minutes)
>>> >>
>>> >>
>>> >> Time can change with more tests added, and also please plan to have
>>> the number of servers range from 1 to n.
>>> >
>>> >
>>> > While n is configurable, it will be fixed to a single-digit number
>>> for now. We will need to place *some* limitation somewhere or
>>> else we'll end up not being able to control our cloud bills.
>>> >
>>> >>
>>> >>
>>> >>>
>>> >>> * We've only tested plain regressions. ASAN and Valgrind are
>>> currently untested.
>>> >>
>>> >>
>>> >> Great to have it running not 'per patch', but as nightly, or weekly
>>> to start with.
>>> >
>>> >
>>> > This is currently not targeted until we phase out current regressions.
>>> >
>>> >>>
>>> >>>
>>> >>> Before bringing it into production, we'll run this job nightly and
>>> >>> watch it for a month to debug the other failures.
>>> >>>
>>> >>
>>> >> I would say, bring it to production sooner, say 2 weeks, and also
>>> plan to have the current regression as is with a special command like 'run
>>> regression in-one-machine' in gerrit (or something similar) with voting
>>> rights, so we can fall back to this method if something is broken in
>>> parallel testing.
>>> >>
>>> >> I have seen that regardless of the amount of time we put some scripts in
>>> testing, the day we move to production, something would be broken. So, let
>>> that happen earlier than later, so it would help next release branching
>>> out. Don't want to be stuck for branching due to infra failures.
>>> >
>>> >
>>> > Having two regression jobs that can vote is going to cause more
>>> confusion than it's worth. There are a couple of intermittent memory issues
>>> with the test script that we need to debug and fix before I'm comfortable
>>> in making this job a voting job. We've worked around these problems right
>>> now. It still pops up now and again.

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Sankarshan Mukhopadhyay
On Thu, Oct 4, 2018 at 6:10 AM Sanju Rakonde  wrote:
> On Wed, Oct 3, 2018 at 3:26 PM Deepshikha Khandelwal  
> wrote:
>>
>> Hello folks,
>>
>> Distributed-regression job[1] is now a part of Gluster's
>> nightly-master build pipeline. The following are the issues we have
>> resolved since we started working on this:
>>
>> 1) Collecting gluster logs from servers.
>> 2) Tests that failed due to infra-related issues have been fixed.
>> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>>
>> Getting the time down to 40 minutes needs your help!
>>
>> Currently, there is a test that is failing:
>>
>> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>>
>> This needs fixing first.
>
>
> Where can I get the logs of this test case? In
> https://build.gluster.org/job/distributed-regression/264/console I see this
> test case failed and was re-attempted, but I couldn't find the logs.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] index_lookup segfault in glusterfsd brick process

2018-10-04 Thread Ravishankar N



On 10/04/2018 01:57 PM, Pranith Kumar Karampuri wrote:



On Wed, Oct 3, 2018 at 11:20 PM 김경표 wrote:


Hello folks.

A few days ago I found my EC(4+2) volume was degraded.
I am using 3.12.13-1.el7.x86_64.
One brick was down; below is the brick log.
I suspect a loc->inode bug in index.c (see attached picture).
In GDB, loc->inode is null

inode_find (loc->inode->table, loc->gfid);


I see that loc->inode is coming from resolve_gfid() where the 
following should have been executed.

    resolve_loc->inode = server_inode_new (state->itable,
                                           resolve_loc->gfid);

As per the log:
"[2018-09-29 13:22:36.536579] W [inode.c:680:inode_new] 
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048) 
[0x7f9bd2494048] 
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d) 
[0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x

7f9be70900ba] ) 0-gluvol02-05-server: inode not found"

it indicates that the inode table is NULL. Could you upload
the core somewhere for us to take a closer look?


https://bugzilla.redhat.com/show_bug.cgi?id=1635784 has been raised by
kpkim; it would be best to attach the core to the BZ.

-Ravi




Thanks to the Gluster Community!!!

- kpkim

--
[2018-09-29 13:22:36.536532] W [inode.c:942:inode_find]
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd01c)
[0x7f9bd249401c]
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc638)
[0x7f9bd2493638] -->/lib64/libglusterfs.so.0(inode_find+0x92) [
0x7f9be7090a82] ) 0-gluvol02-05-server: table not found
[2018-09-29 13:22:36.536579] W [inode.c:680:inode_new]
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)
[0x7f9bd2494048]
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d)
[0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x
7f9be70900ba] ) 0-gluvol02-05-server: inode not found
[2018-09-29 13:22:36.537568] W [inode.c:2305:inode_is_linked]
(-->/usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)
[0x7f9bd2b1cfc6]
-->/usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bb9)
[0x7f9bd2d43bb9] -->/lib64/libglusterfs.so.0(inode_is_linke
d+0x8a) [0x7f9be70927ea] ) 0-gluvol02-05-index: inode not found
pending frames:
frame : type(0) op(18)
frame : type(0) op(18)
frame : type(0) op(28)
--snip --
frame : type(0) op(28)
frame : type(0) op(28)
frame : type(0) op(18)
patchset: git://git.gluster.org/glusterfs.git

signal received: 11
time of crash:
2018-09-29 13:22:36
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.12.13
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f9be70804c0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f9be708a3f4]
/lib64/libc.so.6(+0x362f0)[0x7f9be56e02f0]

/usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bc4)[0x7f9bd2d43bc4]

/usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)[0x7f9bd2b1cfc6]

/usr/lib64/glusterfs/3.12.13/xlator/debug/io-stats.so(+0x4e53)[0x7f9bd28eee53]
/lib64/libglusterfs.so.0(default_lookup+0xbd)[0x7f9be70fddfd]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc342)[0x7f9bd2493342]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)[0x7f9bd2494048]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd2c0)[0x7f9bd24942c0]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc89e)[0x7f9bd249389e]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd354)[0x7f9bd2494354]

/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0x2f829)[0x7f9bd24b6829]
/lib64/libgfrpc.so.0(rpcsvc_request_handler+0x96)[0x7f9be6e42246]
/lib64/libpthread.so.0(+0x7e25)[0x7f9be5edfe25]
/lib64/libc.so.6(clone+0x6d)[0x7f9be57a8bad]





--
Pranith





Re: [Gluster-devel] index_lookup segfault in glusterfsd brick process

2018-10-04 Thread Pranith Kumar Karampuri
On Wed, Oct 3, 2018 at 11:20 PM 김경표  wrote:

> Hello folks.
>
> A few days ago I found my EC(4+2) volume was degraded.
> I am using 3.12.13-1.el7.x86_64.
> One brick was down; below is the brick log.
> I suspect a loc->inode bug in index.c (see attached picture).
> In GDB, loc->inode is null
>
>> inode_find (loc->inode->table, loc->gfid);
>>
>
I see that loc->inode is coming from resolve_gfid() where the following
should have been executed.
    resolve_loc->inode = server_inode_new (state->itable,
                                           resolve_loc->gfid);

As per the log:
"[2018-09-29 13:22:36.536579] W [inode.c:680:inode_new]
(-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)
[0x7f9bd2494048]
-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d)
[0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x
7f9be70900ba] ) 0-gluvol02-05-server: inode not found"

it indicates that the inode table is NULL. Could you upload the
core somewhere for us to take a closer look?


> Thanks to the Gluster Community!!!
>
> - kpkim
>
> --
> [2018-09-29 13:22:36.536532] W [inode.c:942:inode_find]
> (-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd01c)
> [0x7f9bd249401c]
> -->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc638)
> [0x7f9bd2493638] -->/lib64/libglusterfs.so.0(inode_find+0x92) [
> 0x7f9be7090a82] ) 0-gluvol02-05-server: table not found
> [2018-09-29 13:22:36.536579] W [inode.c:680:inode_new]
> (-->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)
> [0x7f9bd2494048]
> -->/usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc14d)
> [0x7f9bd249314d] -->/lib64/libglusterfs.so.0(inode_new+0x8a) [0x
> 7f9be70900ba] ) 0-gluvol02-05-server: inode not found
> [2018-09-29 13:22:36.537568] W [inode.c:2305:inode_is_linked]
> (-->/usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)
> [0x7f9bd2b1cfc6]
> -->/usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bb9)
> [0x7f9bd2d43bb9] -->/lib64/libglusterfs.so.0(inode_is_linke
> d+0x8a) [0x7f9be70927ea] ) 0-gluvol02-05-index: inode not found
> pending frames:
> frame : type(0) op(18)
> frame : type(0) op(18)
> frame : type(0) op(28)
> --snip --
> frame : type(0) op(28)
> frame : type(0) op(28)
> frame : type(0) op(18)
> patchset: git://git.gluster.org/glusterfs.git
> signal received: 11
> time of crash:
> 2018-09-29 13:22:36
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.12.13
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f9be70804c0]
> /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f9be708a3f4]
> /lib64/libc.so.6(+0x362f0)[0x7f9be56e02f0]
>
> /usr/lib64/glusterfs/3.12.13/xlator/features/index.so(+0x4bc4)[0x7f9bd2d43bc4]
>
> /usr/lib64/glusterfs/3.12.13/xlator/features/quota.so(+0x4fc6)[0x7f9bd2b1cfc6]
>
> /usr/lib64/glusterfs/3.12.13/xlator/debug/io-stats.so(+0x4e53)[0x7f9bd28eee53]
> /lib64/libglusterfs.so.0(default_lookup+0xbd)[0x7f9be70fddfd]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc342)[0x7f9bd2493342]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd048)[0x7f9bd2494048]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd2c0)[0x7f9bd24942c0]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xc89e)[0x7f9bd249389e]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0xd354)[0x7f9bd2494354]
>
> /usr/lib64/glusterfs/3.12.13/xlator/protocol/server.so(+0x2f829)[0x7f9bd24b6829]
> /lib64/libgfrpc.so.0(rpcsvc_request_handler+0x96)[0x7f9be6e42246]
> /lib64/libpthread.so.0(+0x7e25)[0x7f9be5edfe25]
> /lib64/libc.so.6(clone+0x6d)[0x7f9be57a8bad]
>




-- 
Pranith

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Amar Tumballi
On Thu, Oct 4, 2018 at 12:54 PM Xavi Hernandez  wrote:

> On Wed, Oct 3, 2018 at 11:57 AM Deepshikha Khandelwal 
> wrote:
>
>> Hello folks,
>>
>> Distributed-regression job[1] is now a part of Gluster's
>> nightly-master build pipeline. The following are the issues we have
>> resolved since we started working on this:
>>
>> 1) Collecting gluster logs from servers.
>> 2) Tests that failed due to infra-related issues have been fixed.
>> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>>
>> Getting the time down to 40 minutes needs your help!
>>
>> Currently, there is a test that is failing:
>>
>> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>>
>> This needs fixing first.
>>
>> There's a test that takes 14 minutes to complete -
>> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
>> 14 minutes is not something we can distribute. Can we look at how we
>> can speed this up[2]? When this test fails, it is re-attempted,
>> further increasing the time. This happens in the regular
>> centos7-regression job as well.
>>
>
> I made a change [1] to reduce the amount of time this tests needs. With
> this change the test completes in about 90 seconds. It would need some
> reviews from maintainers though.
>
> Do you want me to send a patch with this change alone?
>
> Xavi
>
> [1]
> https://review.gluster.org/#/c/glusterfs/+/19254/22/tests/bugs/index/bug-1559004-EMLINK-handling.t
>
>

Yes please! It would be useful! We can merge it sooner that way!

-Amar


>
>> If you see any other issues, please file a bug[3].
>>
>> [1]: https://build.gluster.org/job/distributed-regression
>> [2]: https://build.gluster.org/job/distributed-regression/264/console
>> [3]:
>> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs=project-infrastructure
>>
>> Thanks,
>> Deepshikha Khandelwal
>> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
>> >
>> >
>> >
>> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
>> wrote:
>> >>
>> >>
>> >>
>> >>> There are currently a few known issues:
>> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
>> >>
>> >>
>> >> If I look at the activities involved with regression failures, this
>> can wait.
>> >
>> >
>> > Well, we can't debug the current failures without having the logs. So
>> this has to be fixed first.
>> >
>> >>
>> >>
>> >>>
>> >>> * A few tests fail due to infra-related issues like geo-rep tests.
>> >>
>> >>
>> >> Please open bugs for this, so we can track them, and take it to
>> closure.
>> >
>> >
>> > These are failing due to infra reasons. Most likely subtle differences
>> in the setup of these nodes vs our normal nodes. We'll only be able to
>> debug them once we get the logs. I know the geo-rep ones are easy to fix.
>> The playbook for setting up geo-rep correctly just didn't make it over to
>> the playbook used for these images.
>> >
>> >>
>> >>
>> >>>
>> >>> * Takes ~80 minutes with 7 distributed servers (targeting 60 minutes)
>> >>
>> >>
>> >> Time can change with more tests added, and also please plan to have
>> the number of servers range from 1 to n.
>> >
>> >
>> > While n is configurable, it will be fixed to a single-digit
>> number for now. We will need to place *some* limitation somewhere or else
>> we'll end up not being able to control our cloud bills.
>> >
>> >>
>> >>
>> >>>
>> >>> * We've only tested plain regressions. ASAN and Valgrind are
>> currently untested.
>> >>
>> >>
>> >> Great to have it running not 'per patch', but as nightly, or weekly to
>> start with.
>> >
>> >
>> > This is currently not targeted until we phase out current regressions.
>> >
>> >>>
>> >>>
>> >>> Before bringing it into production, we'll run this job nightly and
>> >>> watch it for a month to debug the other failures.
>> >>>
>> >>
>> >> I would say, bring it to production sooner, say 2 weeks, and also plan
>> to have the current regression as is with a special command like 'run
>> regression in-one-machine' in gerrit (or something similar) with voting
>> rights, so we can fall back to this method if something is broken in
>> parallel testing.
>> >>
>> >> I have seen that regardless of the amount of time we put some scripts in
>> testing, the day we move to production, something would be broken. So, let
>> that happen earlier than later, so it would help next release branching
>> out. Don't want to be stuck for branching due to infra failures.
>> >
>> >
>> > Having two regression jobs that can vote is going to cause more
>> confusion than it's worth. There are a couple of intermittent memory issues
>> with the test script that we need to debug and fix before I'm comfortable
>> in making this job a voting job. We've worked around these problems right
>> now. It still pops up now and again. The fact that things break often is
>> not an excuse to prevent avoidable failures.  The one month timeline was
>> taken with all these factors into consideration. The 2-week timeline is a
>> no-go at this point.
>> >
>> When we are ready to make the switch, we won't be switching 100% of the job.

Re: [Gluster-devel] POC- Distributed regression testing framework

2018-10-04 Thread Xavi Hernandez
On Wed, Oct 3, 2018 at 11:57 AM Deepshikha Khandelwal 
wrote:

> Hello folks,
>
> Distributed-regression job[1] is now a part of Gluster's
> nightly-master build pipeline. The following are the issues we have
> resolved since we started working on this:
>
> 1) Collecting gluster logs from servers.
> 2) Tests that failed due to infra-related issues have been fixed.
> 3) Time taken to run regression testing reduced to ~50-60 minutes.
>
> Getting the time down to 40 minutes needs your help!
>
> Currently, there is a test that is failing:
>
> tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t
>
> This needs fixing first.
>
> There's a test that takes 14 minutes to complete -
> `tests/bugs/index/bug-1559004-EMLINK-handling.t`. A single test taking
> 14 minutes is not something we can distribute. Can we look at how we
> can speed this up[2]? When this test fails, it is re-attempted,
> further increasing the time. This happens in the regular
> centos7-regression job as well.
>

I made a change [1] to reduce the amount of time this tests needs. With
this change the test completes in about 90 seconds. It would need some
reviews from maintainers though.

Do you want me to send a patch with this change alone?

Xavi

[1]
https://review.gluster.org/#/c/glusterfs/+/19254/22/tests/bugs/index/bug-1559004-EMLINK-handling.t


>
> If you see any other issues, please file a bug[3].
>
> [1]: https://build.gluster.org/job/distributed-regression
> [2]: https://build.gluster.org/job/distributed-regression/264/console
> [3]:
> https://bugzilla.redhat.com/enter_bug.cgi?product=glusterfs=project-infrastructure
>
> Thanks,
> Deepshikha Khandelwal
> On Tue, Jun 26, 2018 at 9:02 AM Nigel Babu  wrote:
> >
> >
> >
> > On Mon, Jun 25, 2018 at 7:28 PM Amar Tumballi 
> wrote:
> >>
> >>
> >>
> >>> There are currently a few known issues:
> >>> * Not collecting the entire logs (/var/log/glusterfs) from servers.
> >>
> >>
> >> If I look at the activities involved with regression failures, this can
> wait.
> >
> >
> > Well, we can't debug the current failures without having the logs. So
> this has to be fixed first.
> >
> >>
> >>
> >>>
> >>> * A few tests fail due to infra-related issues like geo-rep tests.
> >>
> >>
> >> Please open bugs for this, so we can track them, and take it to closure.
> >
> >
> > These are failing due to infra reasons. Most likely subtle differences
> in the setup of these nodes vs our normal nodes. We'll only be able to
> debug them once we get the logs. I know the geo-rep ones are easy to fix.
> The playbook for setting up geo-rep correctly just didn't make it over to
> the playbook used for these images.
> >
> >>
> >>
> >>>
> >>> * Takes ~80 minutes with 7 distributed servers (targeting 60 minutes)
> >>
> >>
> >> Time can change with more tests added, and also please plan to have
> the number of servers range from 1 to n.
> >
> >
> > While n is configurable, it will be fixed to a single-digit
> number for now. We will need to place *some* limitation somewhere or else
> we'll end up not being able to control our cloud bills.
> >
> >>
> >>
> >>>
> >>> * We've only tested plain regressions. ASAN and Valgrind are currently
> untested.
> >>
> >>
> >> Great to have it running not 'per patch', but as nightly, or weekly to
> start with.
> >
> >
> > This is currently not targeted until we phase out current regressions.
> >
> >>>
> >>>
> >>> Before bringing it into production, we'll run this job nightly and
> >>> watch it for a month to debug the other failures.
> >>>
> >>
> >> I would say, bring it to production sooner, say 2 weeks, and also plan
> to have the current regression as is with a special command like 'run
> regression in-one-machine' in gerrit (or something similar) with voting
> rights, so we can fall back to this method if something is broken in
> parallel testing.
> >>
> >> I have seen that regardless of the amount of time we put some scripts in
> testing, the day we move to production, something would be broken. So, let
> that happen earlier than later, so it would help next release branching
> out. Don't want to be stuck for branching due to infra failures.
> >
> >
> > Having two regression jobs that can vote is going to cause more
> confusion than it's worth. There are a couple of intermittent memory issues
> with the test script that we need to debug and fix before I'm comfortable
> in making this job a voting job. We've worked around these problems right
> now. It still pops up now and again. The fact that things break often is
> not an excuse to prevent avoidable failures.  The one month timeline was
> taken with all these factors into consideration. The 2-week timeline is a
> no-go at this point.
> >
> > When we are ready to make the switch, we won't be switching 100% of the
> job. We'll start with a sliding scale so that we can monitor failures and
> machine creation adequately.
> >
> > --
> > nigelb