Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

2019-05-08 Thread Milind Changire
awesome! well done!
thank you for taking the pains to fix the memory leak


On Wed, May 8, 2019 at 1:28 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> Hi 'Milind Changire' ,
>
> The leak is getting clearer to me now. The unresolved memory
> leak occurs because, in glusterfs version 3.12.15 (in my environment), the SSL context
> is a shared one. When we do SSL_accept, OpenSSL allocates read/write
> buffers for the SSL object; however, on SSL_free in socket_reset or the fini function of
> socket.c, the buffers are returned to the SSL context's free list instead of
> being completely freed.
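For context, a minimal standalone sketch (not the Gluster code itself) of the
teardown being described: free both the per-connection SSL object and a
per-connection SSL_CTX, so that on OpenSSL 1.0.2 the read/write buffers are
actually released instead of being parked on a shared context's free list.
The function and variable names here are illustrative.

    #include <openssl/ssl.h>

    /* Sketch only: tear down one accepted TLS connection so that its
     * read/write buffers are really released. */
    static void teardown_tls_connection(SSL *ssl, SSL_CTX *per_conn_ctx)
    {
        if (ssl) {
            SSL_shutdown(ssl);  /* best-effort close_notify */
            SSL_free(ssl);      /* frees the SSL object and its buffers ... */
        }
        /* ... but, as described above, with a shared SSL_CTX (OpenSSL 1.0.2)
         * those buffers can end up cached on the context's free list;
         * freeing a per-connection SSL_CTX returns the memory for good. */
        if (per_conn_ctx)
            SSL_CTX_free(per_conn_ctx);
    }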
>
>
>
> So the following patch is able to fix the memory leak issue
> completely (created for the gluster master branch).
>
>
>
> --- a/rpc/rpc-transport/socket/src/socket.c
> +++ b/rpc/rpc-transport/socket/src/socket.c
> @@ -446,6 +446,7 @@ ssl_setup_connection_postfix(rpc_transport_t *this)
>  gf_log(this->name, GF_LOG_DEBUG,
> "SSL verification succeeded (client: %s) (server: %s)",
> this->peerinfo.identifier, this->myinfo.identifier);
> +X509_free(peer);
>  return gf_strdup(peer_CN);
>
>  /* Error paths. */
> @@ -1157,7 +1158,21 @@ __socket_reset(rpc_transport_t *this)
>  memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>  event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
> -
> +    if (priv->use_ssl && priv->ssl_ssl) {
> +        gf_log(this->name, GF_LOG_TRACE,
> +               "clear and reset for socket(%d), free ssl ",
> +               priv->sock);
> +        if (priv->ssl_ctx) {
> +            SSL_CTX_free(priv->ssl_ctx);
> +            priv->ssl_ctx = NULL;
> +        }
> +        SSL_shutdown(priv->ssl_ssl);
> +        SSL_clear(priv->ssl_ssl);
> +        SSL_free(priv->ssl_ssl);
> +        priv->ssl_ssl = NULL;
> +    }
>  priv->sock = -1;
>  priv->idx = -1;
>  priv->connected = -1;
> @@ -4675,6 +4690,21 @@ fini(rpc_transport_t *this)
>  pthread_mutex_destroy(&priv->out_lock);
>  pthread_mutex_destroy(&priv->cond_lock);
>  pthread_cond_destroy(&priv->cond);
> +    if (priv->use_ssl && priv->ssl_ssl) {
> +        gf_log(this->name, GF_LOG_TRACE,
> +               "clear and reset for socket(%d), free ssl ",
> +               priv->sock);
> +        if (priv->ssl_ctx) {
> +            SSL_CTX_free(priv->ssl_ctx);
> +            priv->ssl_ctx = NULL;
> +        }
> +        SSL_shutdown(priv->ssl_ssl);
> +        SSL_clear(priv->ssl_ssl);
> +        SSL_free(priv->ssl_ssl);
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* Monday, May 06, 2019 2:12 PM
> *To:* 'Amar Tumballi Suryanarayan' 
> *Cc:* 'Milind Changire' ; 'gluster-devel@gluster.org'
> 
> *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> Hi,
>
> From our tests, both valgrind and libleak blame ssl3_accept:
>
> ///from valgrind attached to
> glusterfsd///
>
> ==16673== 198,720 bytes in 12 blocks are definitely lost in loss record
> 1,114 of 1,123
> ==16673==at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
> ==16673==by 0x63E1977: CRYPTO_malloc (in /usr/lib64/
> *libcrypto.so.1.0.2p*)
> ==16673==by 0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/
> *libssl.so.1.0.2p*)
> ==16673==by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/
> *libssl.so.1.0.2p*)
> ==16673==by 0xA8485D9: ssl3_accept (in /usr/lib64/*libssl.so.1.0.2p*)
> ==16673==by 0xA610DDF: ssl_complete_connection (socket.c:400)
> ==16673==by 0xA617F38: ssl_handle_server_connection_attempt
> (socket.c:2409)
> ==16673==by 0xA618420: socket_complete_connection (socket.c:2554)
> ==16673==by 0xA618788: socket_event_handler (socket.c:2613)
> ==16673==by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
> ==16673==by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
> ==16673==by 0x615C5D9: start_thread (in /usr/lib64/*libpthread-2.27.so*)
> ==16673==
> ==16673== 200,544 bytes in 12 blocks are definitely lost in loss record
> 1,115 of 1,123
> ==16673==at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
> ==16673==by 0x63E1977: CRYPTO_malloc (in /usr/lib64/
> *libcrypto.so.1.0.2p*)
> ==16673==by 0xA855D12: ssl3_setup_r

Re: [Gluster-devel] Should we enable contention notification by default ?

2019-05-02 Thread Milind Changire
On Thu, May 2, 2019 at 6:44 PM Xavi Hernandez  wrote:

> Hi Ashish,
>
> On Thu, May 2, 2019 at 2:17 PM Ashish Pandey  wrote:
>
>> Xavi,
>>
>> I would like to keep this option (features.lock-notify-contention)
>> enabled by default.
>> However, I can see that there is one more option which will impact the
>> working of this option which is "notify-contention-delay"
>>
>
Just a nit: I wish the option were called "notify-contention-interval".
The "delay" part doesn't really convey where the delay is applied.


>  .description = "This value determines the minimum amount of time "
>> "(in seconds) between upcall contention notifications
>> "
>> "on the same inode. If multiple lock requests are "
>> "received during this period, only one upcall will "
>> "be sent."},
>>
>> I am not sure what should be the best value for this option if we want to
>> keep features.lock-notify-contention ON by default?
>> It looks like if we keep the value of notify-contention-delay higher, say 5
>> seconds, it will wait for that much time to send the upcall
>> notification, which does not look good.
>>
>
> No, the first notification is sent immediately. What this option does is
> to define the minimum interval between notifications. This interval is per
> lock. This is done to avoid storms of notifications if many requests come
> referencing the same lock.
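A small sketch of the per-lock throttling described above (illustrative only,
not the actual locks xlator implementation):

    #include <stdbool.h>
    #include <time.h>

    /* Sketch only: rate-limit contention notifications to one per lock
     * every 'interval' seconds. */
    struct lock_state {
        time_t last_notification;   /* 0 = never notified */
    };

    static bool should_notify(struct lock_state *lk, time_t now, time_t interval)
    {
        /* The first contention notifies immediately; later ones are
         * suppressed until 'interval' seconds have passed for this lock. */
        if (lk->last_notification == 0 ||
            now - lk->last_notification >= interval) {
            lk->last_notification = now;
            return true;
        }
        return false;
    }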
>
> Is my understanding correct?
>> What will be the impact of this value and what should be the default value of
>> this option?
>>
>
> I think the current default value of 5 seconds seems good enough. If there
> are many bricks, each brick could send a notification per lock. 1000 bricks
> would mean a client would receive 1000 notifications every 5 seconds. It
> doesn't seem too much, but in those cases, and considering we could have
> other locks, maybe a higher value (say 10 seconds) could be better.
>
> Xavi
>
>
>>
>> ---
>> Ashish
>>
>>
>>
>>
>>
>>
>> --
>> *From: *"Xavi Hernandez" 
>> *To: *"gluster-devel" 
>> *Cc: *"Pranith Kumar Karampuri" , "Ashish Pandey" <
>> aspan...@redhat.com>, "Amar Tumballi" 
>> *Sent: *Thursday, May 2, 2019 4:15:38 PM
>> *Subject: *Should we enable contention notification by default ?
>>
>> Hi all,
>>
>> there's a feature in the locks xlator that sends a notification to
>> current owner of a lock when another client tries to acquire the same lock.
>> This way the current owner is made aware of the contention and can release
>> the lock as soon as possible to allow the other client to proceed.
>>
>> This is specially useful when eager-locking is used and multiple clients
>> access the same files and directories. Currently both replicated and
>> dispersed volumes use eager-locking and can use contention notification to
>> force an early release of the lock.
>>
>> Eager-locking reduces the number of network requests required for each
>> operation, improving performance, but could add delays to other clients
>> while it keeps the inode or entry locked. With the contention notification
>> feature we avoid this delay, so we get the best performance with minimal
>> issues in multiclient environments.
>>
>> Currently the contention notification feature is controlled by the
>> 'features.lock-notify-contention' option and it's disabled by default.
>> Should we enable it by default ?
>>
>> I don't see any reason to keep it disabled by default. Does anyone
>> foresee any problem ?
>>
>> Regards,
>>
>> Xavi
>>
>> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

2019-04-22 Thread Milind Changire
According to the BIO_new_socket() man page ...

*If the close flag is set then the socket is shut down and closed when the
BIO is freed.*

For Gluster to retain more control over the socket shutdown, the BIO_NOCLOSE
flag is set. Otherwise, OpenSSL takes control of the socket shutdown whenever
the BIO is freed.
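A minimal sketch of the close-flag semantics referred to above (plain OpenSSL,
error handling omitted; the function name is illustrative):

    #include <openssl/bio.h>
    #include <openssl/ssl.h>

    /* Sketch only: attach a socket BIO to an SSL object. */
    static void attach_socket_bio(SSL *ssl, int sockfd)
    {
        /* BIO_NOCLOSE: freeing the BIO (directly, or indirectly via
         * SSL_free) leaves the socket open, so the caller keeps control
         * of shutdown()/close() on the fd. With BIO_CLOSE the socket
         * would be shut down and closed when the BIO is freed. */
        BIO *sbio = BIO_new_socket(sockfd, BIO_NOCLOSE);

        /* Use the same BIO for both reads and writes. */
        SSL_set_bio(ssl, sbio, sbio);
    }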
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

2019-04-22 Thread Milind Changire
Looks like using BIO_free() is not right. Here's what the SSL_set_bio() man
page says ...

SSL_set_bio() is similar to SSL_set0_rbio() and SSL_set0_wbio() except that
it connects both the rbio and the wbio at the same time, and *transfers the
ownership of rbio and wbio to ssl* according to the following set of rules:

So, I think you were right about SSL_free() doing the job for the bio.
However, SSL_free() has no reason to set the priv->ssl_sbio pointer to
NULL. I think priv->ssl_sbio should be set to NULL immediately after the
call to SSL_set_bio() is successful. And we need to add a comment while
setting priv->ssl_sbio to NULL that the ownership of the bio has now been
transferred to SSL and SSL will free the related memory appropriately.
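A sketch of the pointer hygiene being suggested; the struct below is a
hypothetical, trimmed-down stand-in for the socket private structure referenced
in this thread:

    #include <openssl/bio.h>
    #include <openssl/ssl.h>

    /* Hypothetical stand-in for the relevant fields of the socket private
     * structure. */
    struct ssl_conn {
        int  sock;
        SSL *ssl_ssl;
        BIO *ssl_sbio;
    };

    static int attach_bio_and_drop_ref(struct ssl_conn *priv)
    {
        priv->ssl_sbio = BIO_new_socket(priv->sock, BIO_NOCLOSE);
        if (!priv->ssl_sbio)
            return -1;

        /* SSL_set_bio() transfers ownership of the BIO to the SSL object;
         * from here on SSL_free(priv->ssl_ssl) releases it, so drop our
         * reference right away and never call BIO_free() on it ourselves. */
        SSL_set_bio(priv->ssl_ssl, priv->ssl_sbio, priv->ssl_sbio);
        priv->ssl_sbio = NULL;
        return 0;
    }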



On Mon, Apr 22, 2019 at 11:37 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> I tried to print priv->ssl_sbio after SSL_free() and found that the pointer is not
> NULL, so I added a BIO_free of ssl_sbio; however, this causes glusterfsd to
> dump core:
>
> (gdb) bt
>
> #0  0x7f3047867f9b in raise () from /lib64/libc.so.6
>
> #1  0x7f3047869351 in abort () from /lib64/libc.so.6
>
> #2  0x7f30478aa8c7 in __libc_message () from /lib64/libc.so.6
>
> #3  0x7f30478b0e6a in malloc_printerr () from /lib64/libc.so.6
>
> #4  0x7f30478b2835 in _int_free () from /lib64/libc.so.6
>
> #5  0x7f3047c5bbbd in CRYPTO_free () from /lib64/libcrypto.so.10
>
> #6  0x7f3047d07582 in BIO_free () from /lib64/libcrypto.so.10
>
> #7  0x7f3043f9ba4b in __socket_reset (this=0x7f303c1ae710) at
> socket.c:1032
>
> #8  0x7f3043f9c4aa in socket_event_poll_err (this=0x7f303c1ae710,
> gen=1, idx=17) at socket.c:1232
>
> #9  0x7f3043fa1b7d in socket_event_handler (fd=26, idx=17, gen=1,
> data=0x7f303c1ae710, poll_in=1, poll_out=0, poll_err=0) at socket.c:2669
>
> #10 0x7f3049307984 in event_dispatch_epoll_handler
> (event_pool=0x1035610, event=0x7f3043b14e84) at event-epoll.c:587
>
> #11 0x7f3049307c5b in event_dispatch_epoll_worker (data=0x107e3e0) at
> event-epoll.c:663
>
> #12 0x7f30480535da in start_thread () from /lib64/libpthread.so.0
>
> #13 0x7f3047929eaf in clone () from /lib64/libc.so.6
>
>
>
> @@ -1019,7 +1019,20 @@ static void __socket_reset(rpc_transport_t *this) {
>
>  memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
>
> -
>
> +  if (priv->use_ssl && priv->ssl_ssl)
>
> +  {
>
> +gf_log(this->name, GF_LOG_INFO,
>
> +   "clear and reset for socket(%d), free ssl ",
>
> +   priv->sock);
>
> +SSL_shutdown(priv->ssl_ssl);
>
> +SSL_clear(priv->ssl_ssl);
>
> +SSL_free(priv->ssl_ssl);
>
> +gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p
> ",priv->sock,priv->ssl_sbio);
>
> + if(priv->ssl_sbio != NULL)
>
> + BIO_free(priv->ssl_sbio);
>
> +priv->ssl_ssl = NULL;
>
> + priv->ssl_sbio = NULL;
>
> +  }
>
>priv->sock = -1;
>
>priv->idx = -1;
>
>priv->connected = -1;
>
> @@ -4238,6 +4251,20 @@ void fini(rpc_transport_t *this) {
>
>  pthread_mutex_destroy(&priv->out_lock);
>
>  pthread_mutex_destroy(&priv->cond_lock);
>
>  pthread_cond_destroy(&priv->cond);
>
> + if (priv->use_ssl && priv->ssl_ssl)
>
> +  {
>
> +gf_log(this->name, GF_LOG_INFO,
>
> +   "clear and reset for socket(%d), free ssl ",
>
> +   priv->sock);
>
> +SSL_shutdown(priv->ssl_ssl);
>
> +SSL_clear(priv->ssl_ssl);
>
> +SSL_free(priv->ssl_ssl);
>
> + gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p
> ",priv->sock,priv->ssl_sbio);
>
> +if(priv->ssl_sbio != NULL)
>
> +BIO_free(priv->ssl_sbio);
>
> +priv->ssl_ssl = NULL;
>
> + priv->ssl_sbio = NULL;
>
> +  }
>
>  if (priv->ssl_private_key) {
>
>    GF_FREE(priv->ssl_private_key);
>
>
>
>
>
> *From:* Milind Changire 
> *Sent:* Monday, April 22, 2019 1:35 PM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
> *Cc:* Atin Mukherjee ; gluster-devel@gluster.org
> *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> This probably went unnoticed until now.
>
>
>
>
>
>
>
> On Mon, Apr 22, 2019 at 10:45 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
> Why there is no bio_free called

Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

2019-04-21 Thread Milind Changire
This probably went unnoticed until now.



On Mon, Apr 22, 2019 at 10:45 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.z...@nokia-sbell.com> wrote:

> Why is there no bio_free called in ssl_teardown_connection then?
>
>
>
> cynthia
>
>
>
> *From:* Milind Changire 
> *Sent:* Monday, April 22, 2019 10:21 AM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) 
> *Cc:* Atin Mukherjee ; gluster-devel@gluster.org
> *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> After patch 22334 <https://review.gluster.org/c/glusterfs/+/22334>, the
> priv->ssl_ctx is now maintained per socket connection and is no longer
> shared.
>
> So you might want to SSL_CTX_free(priv->ssl_ctx) as well and set priv->ssl_ctx
> to NULL.
>
>
>
> There might be some strings that are duplicated (gf_strdup()) via the
> socket_init() code path. Please take a look at those as well.
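A rough sketch of the kind of cleanup being asked for; only ssl_private_key is
visible in the snippets in this thread, so the structure, macro, and the other
field names below are stand-ins rather than the real socket private structure:

    #include <stdlib.h>

    /* Hypothetical stand-ins for the Gluster structure and free macro. */
    typedef struct {
        char *ssl_own_cert;      /* assumed field name */
        char *ssl_private_key;
        char *ssl_ca_list;       /* assumed field name */
    } ssl_opts_t;

    #define FREE_AND_NULL(p)  do { free(p); (p) = NULL; } while (0)

    /* Strings duplicated during init (gf_strdup() in socket_init()) need a
     * matching free during fini()/reset, or they leak per connection. */
    static void free_ssl_option_strings(ssl_opts_t *opts)
    {
        FREE_AND_NULL(opts->ssl_own_cert);
        FREE_AND_NULL(opts->ssl_private_key);
        FREE_AND_NULL(opts->ssl_ca_list);
    }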
>
>
>
> Sorry about that. I missed it.
>
>
>
>
>
> On Mon, Apr 22, 2019 at 7:25 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.z...@nokia-sbell.com> wrote:
>
>
>
> Hi,
>
> From my code study it seems priv->ssl_ssl is not properly released. I made
> a patch, and the glusterfsd memory leak is alleviated with it, but something
> else is still leaking, and I have no clue about the other leak
> points.
>
>
>
> --- a/rpc/rpc-transport/socket/src/socket.c
>
> +++ b/rpc/rpc-transport/socket/src/socket.c
>
> @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) {
>
>  memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
>
> -
>
> +  if (priv->use_ssl && priv->ssl_ssl)
>
> +  {
>
> +gf_log(this->name, GF_LOG_TRACE,
>
> +   "clear and reset for socket(%d), free ssl ",
>
> +   priv->sock);
>
> +SSL_shutdown(priv->ssl_ssl);
>
> +SSL_clear(priv->ssl_ssl);
>
> +SSL_free(priv->ssl_ssl);
>
> +priv->ssl_ssl = NULL;
>
> +  }
>
>priv->sock = -1;
>
>priv->idx = -1;
>
>priv->connected = -1;
>
> @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) {
>
>  pthread_mutex_destroy(&priv->out_lock);
>
>  pthread_mutex_destroy(&priv->cond_lock);
>
>  pthread_cond_destroy(&priv->cond);
>
> + if (priv->use_ssl && priv->ssl_ssl)
>
> +  {
>
> +gf_log(this->name, GF_LOG_TRACE,
>
> +   "clear and reset for socket(%d), free ssl ",
>
> +   priv->sock);
>
> +SSL_shutdown(priv->ssl_ssl);
>
> +SSL_clear(priv->ssl_ssl);
>
> +SSL_free(priv->ssl_ssl);
>
> +priv->ssl_ssl = NULL;
>
> +  }
>
>  if (priv->ssl_private_key) {
>
>GF_FREE(priv->ssl_private_key);
>
>  }
>
>
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* Thursday, April 18, 2019 5:31 PM
> *To:* 'Atin Mukherjee' 
> *Cc:* 'Raghavendra Gowdappa' ; '
> gluster-devel@gluster.org' 
> *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> We scanned it with a memory-leak tool and got the following prints. We suspect that some
> OpenSSL library allocations are not properly freed by the glusterfs code.
>
> er+0x2af [*libglusterfs.so.0.0.1]\n\t\tstart_thread+0xda* [*libpthread-2.27.so*]'
> *13580* bytes in 175 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p*]'
> *232904* bytes in 14 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> [15:41:56] Top 10 stacks with outstanding allocations:
> *8792* bytes in 14 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> *9408* bytes in 42 allocations from stack
> b'CRYPTO_realloc+0x4d [*libcrypto.so.1.0.2p*]'
> *9723* bytes in 14 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> *10696* bytes in 21 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> *11319* bytes in 602 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> *11431* bytes in 518 allocations from stack
> b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> *11704* bytes in 371 allocations from stack
> 

Re: [Gluster-devel] [Gluster-users] Version uplift query

2019-02-27 Thread Milind Changire
You might want to check what build.log says, especially at the very
bottom.

Here's a hint from StackExchange.

On Thu, Feb 28, 2019 at 12:42 PM ABHISHEK PALIWAL 
wrote:

> I am trying to build Gluster5.4 but getting below error at the time of
> configure
>
> conftest.c:11:28: fatal error: ac_nonexistent.h: No such file or directory
>
> Could you please help me what is the reason of the above error.
>
> Regards,
> Abhishek
>
> On Wed, Feb 27, 2019 at 8:42 PM Amar Tumballi Suryanarayan <
> atumb...@redhat.com> wrote:
>
>> GlusterD2 is not yet called out for standalone deployments.
>>
>> You can happily update to glusterfs-5.x (recommend you to wait for
>> glusterfs-5.4 which is already tagged, and waiting for packages to be
>> built).
>>
>> Regards,
>> Amar
>>
>> On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL 
>> wrote:
>>
>>> Hi,
>>>
>>> Could you please update on this and also let us know what GlusterD2 is
>>> (as it is under development in the 5.0 release); is it OK to uplift to 5.0?
>>>
>>> Regards,
>>> Abhishek
>>>
>>> On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL <
>>> abhishpali...@gmail.com> wrote:
>>>
 Hi,

 Currently we are using GlusterFS 3.7.6 and are thinking of switching to
 GlusterFS 4.1 or 5.0. Since there are a lot of code changes between
 these versions, could you please let us know whether there are any compatibility
 issues when we uplift to either of the newly mentioned versions?

 Regards
 Abhishek

>>>
>>>
>>> --
>>>
>>>
>>>
>>>
>>> Regards
>>> Abhishek Paliwal
>>> ___
>>> Gluster-users mailing list
>>> gluster-us...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>> --
>> Amar Tumballi (amarts)
>>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [master][FAILED] brick-mux-regression

2018-12-02 Thread Milind Changire
On Mon, Dec 3, 2018 at 8:32 AM Raghavendra Gowdappa 
wrote:

> On Mon, Dec 3, 2018 at 8:25 AM Raghavendra Gowdappa 
> wrote:
>
>> On Sat, Dec 1, 2018 at 11:02 AM Milind Changire 
>> wrote:
>>
>>> failed brick-mux-regression job:
>>> https://build.gluster.org/job/regression-on-demand-multiplex/411/console
>>>
>>> patch:
>>> https://review.gluster.org/c/glusterfs/+/21719
>>>
>>
>> Does this happen only with the above patch? Does brick-mux regression
>> succeed on current master without this patch? Wondering whether the
>> parallelism introduced by bumping up event-threads to 2, is opening up some
>> races in multiplexed environment (though there were always more than one
>> event-thread when more than one brick is multiplexed).
>>
>
> Also, is this bug locally reproducible on your setup if you run test
> following test with brick-mux enabled (with and without your patch)?
>
> ./tests/bugs/glusterd/add-brick-and-validate-replicated-volume-options.t
>
Running the above test with brick-mux enabled works well both with and without
the "event-threads bump" patch on my CentOS 7 setup, so the issue is not
reproducible on my setup.


>>
>>>
>>> stack trace:
>>> $ gdb -ex 'set sysroot ./' -ex 'core-file
>>> ./build/install/cores/glfs_epoll000-964.core'
>>> ./build/install/sbin/glusterfsd
>>> GNU gdb (GDB) Fedora 8.2-4.fc29
>>> Copyright (C) 2018 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <
>>> http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.
>>> Type "show copying" and "show warranty" for details.
>>> This GDB was configured as "x86_64-redhat-linux-gnu".
>>> Type "show configuration" for configuration details.
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>.
>>> Find the GDB manual and other documentation resources online at:
>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>
>>> For help, type "help".
>>> Type "apropos word" to search for commands related to "word"...
>>> Reading symbols from ./build/install/sbin/glusterfsd...done.
>>> [New LWP 970]
>>> [New LWP 992]
>>> [New LWP 993]
>>> [New LWP 1005]
>>> [New LWP 1241]
>>> [New LWP 964]
>>> [New LWP 968]
>>> [New LWP 996]
>>> [New LWP 995]
>>> [New LWP 994]
>>> [New LWP 967]
>>> [New LWP 969]
>>> [New LWP 1003]
>>> [New LWP 1181]
>>> [New LWP 1242]
>>> [New LWP 966]
>>> [New LWP 965]
>>> [New LWP 999]
>>> [New LWP 1000]
>>> [New LWP 1002]
>>> [New LWP 989]
>>> [New LWP 990]
>>> [New LWP 991]
>>> [New LWP 971]
>>> warning: Ignoring non-absolute filename: <./lib64/libz.so.1>
>>> Missing separate debuginfo for ./lib64/libz.so.1
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/ea/8e45dc8e395cc5e26890470112d97a1f1e0b65.debug
>>> warning: Ignoring non-absolute filename: <./lib64/libuuid.so.1>
>>> Missing separate debuginfo for ./lib64/libuuid.so.1
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/71/de190dc0c93504abacc17b9747cd772a1e4b0d.debug
>>> warning: Ignoring non-absolute filename: <./lib64/libm.so.6>
>>> Missing separate debuginfo for ./lib64/libm.so.6
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/f4/cae74047f9aa2d5a71fdec67c4285d75753eba.debug
>>> warning: Ignoring non-absolute filename: <./lib64/librt.so.1>
>>> Missing separate debuginfo for ./lib64/librt.so.1
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/d3/3989ec31efe745eb0d3b68a92d19e77d7ddfda.debug
>>> warning: Ignoring non-absolute filename: <./lib64/libdl.so.2>
>>> Missing separate debuginfo for ./lib64/libdl.so.2
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/5c/db5a56336e7e2bd14ffa189411e44a834afcd8.debug
>>> warning: Ignoring non-absolute filename: <./lib64/libpthread.so.0>
>>> Missing separate debuginfo for ./lib64/libpthread.so.0
>>> Try: dnf --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/f4/c04bce85d2d269d0a2af4972fc69805b50345b.debug
>>> warning:

[Gluster-devel] [master][FAILED] brick-mux-regression

2018-11-30 Thread Milind Changire
failed brick-mux-regression job:
https://build.gluster.org/job/regression-on-demand-multiplex/411/console

patch:
https://review.gluster.org/c/glusterfs/+/21719

stack trace:
$ gdb -ex 'set sysroot ./' -ex 'core-file
./build/install/cores/glfs_epoll000-964.core'
./build/install/sbin/glusterfsd
GNU gdb (GDB) Fedora 8.2-4.fc29
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./build/install/sbin/glusterfsd...done.
[New LWP 970]
[New LWP 992]
[New LWP 993]
[New LWP 1005]
[New LWP 1241]
[New LWP 964]
[New LWP 968]
[New LWP 996]
[New LWP 995]
[New LWP 994]
[New LWP 967]
[New LWP 969]
[New LWP 1003]
[New LWP 1181]
[New LWP 1242]
[New LWP 966]
[New LWP 965]
[New LWP 999]
[New LWP 1000]
[New LWP 1002]
[New LWP 989]
[New LWP 990]
[New LWP 991]
[New LWP 971]
warning: Ignoring non-absolute filename: <./lib64/libz.so.1>
Missing separate debuginfo for ./lib64/libz.so.1
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/ea/8e45dc8e395cc5e26890470112d97a1f1e0b65.debug
warning: Ignoring non-absolute filename: <./lib64/libuuid.so.1>
Missing separate debuginfo for ./lib64/libuuid.so.1
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/71/de190dc0c93504abacc17b9747cd772a1e4b0d.debug
warning: Ignoring non-absolute filename: <./lib64/libm.so.6>
Missing separate debuginfo for ./lib64/libm.so.6
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/f4/cae74047f9aa2d5a71fdec67c4285d75753eba.debug
warning: Ignoring non-absolute filename: <./lib64/librt.so.1>
Missing separate debuginfo for ./lib64/librt.so.1
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/d3/3989ec31efe745eb0d3b68a92d19e77d7ddfda.debug
warning: Ignoring non-absolute filename: <./lib64/libdl.so.2>
Missing separate debuginfo for ./lib64/libdl.so.2
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/5c/db5a56336e7e2bd14ffa189411e44a834afcd8.debug
warning: Ignoring non-absolute filename: <./lib64/libpthread.so.0>
Missing separate debuginfo for ./lib64/libpthread.so.0
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/f4/c04bce85d2d269d0a2af4972fc69805b50345b.debug
warning: Expected absolute pathname for libpthread in the inferior, but got
./lib64/libpthread.so.0.
warning: Unable to find libthread_db matching inferior's thread library,
thread debugging will not be available.
warning: Ignoring non-absolute filename: <./lib64/libcrypto.so.10>
Missing separate debuginfo for ./lib64/libcrypto.so.10
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/67/ceb4edd36bfe0eb31cd92da2694aca5377a599.debug
warning: Ignoring non-absolute filename: <./lib64/libc.so.6>
Missing separate debuginfo for ./lib64/libc.so.6
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/cb/4b7554d1adbef2f001142dd6f0a5139fc9aa69.debug
warning: Ignoring non-absolute filename: <./lib64/ld-linux-x86-64.so.2>
Missing separate debuginfo for ./lib64/ld-linux-x86-64.so.2
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/d2/66b1f6650927e18108323bcca8f7b68e68eb92.debug
warning: Ignoring non-absolute filename: <./lib64/libssl.so.10>
Missing separate debuginfo for ./lib64/libssl.so.10
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/64/68a4e28a19cdd885a3cbc30e009589ca4c2e92.debug
warning: Ignoring non-absolute filename: <./lib64/libgssapi_krb5.so.2>
Missing separate debuginfo for ./lib64/libgssapi_krb5.so.2
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/16/fe0dc6cefc5f444bc876516d02efe9cc2d432f.debug
warning: Ignoring non-absolute filename: <./lib64/libkrb5.so.3>
Missing separate debuginfo for ./lib64/libkrb5.so.3
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/d1/cd1b94855a85fbc735c745db39bc096f7d8cc3.debug
warning: Ignoring non-absolute filename: <./lib64/libcom_err.so.2>
Missing separate debuginfo for ./lib64/libcom_err.so.2
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/2c/7ef64ef0c5af8bcfa8f9e628e5605a7d8c52d3.debug
warning: Ignoring non-absolute filename: <./lib64/libk5crypto.so.3>
Missing separate debuginfo for ./lib64/libk5crypto.so.3
Try: dnf --enablerepo='*debug*' install
/usr/lib/debug/.build-id/a2/0f715c514b3ea873f4cc77d585a50cb670e266.debug
warning: Ignoring non-absolute filename: <./lib64/libkrb5support.so.0>
Missing separate debuginfo for 

Re: [Gluster-devel] [Gluster-Maintainers] Release 5: Master branch health report (Week of 30th July)

2018-08-03 Thread Milind Changire
On Fri, Aug 3, 2018 at 11:04 AM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

> On Thu, Aug 2, 2018 at 10:03 PM Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>> On Thu, Aug 2, 2018 at 7:19 PM Atin Mukherjee 
>> wrote:
>>
>>> New addition - tests/basic/volume.t - failed twice atleast with shd core.
>>>
>>> One such ref - https://build.gluster.org/job/centos7-regression/2058/
>>> console
>>>
>>
>> I will take a look.
>>
>
> The crash is happening inside libc and there are no line numbers to debug
> further. Is there any way to get symbols and line numbers even for that? We can
> find hints as to what could be going wrong. Let me try to re-create it on
> the machines I have in the meanwhile.
>
> (gdb) bt
> #0  0x7feae916bb4f in _IO_cleanup () from ./lib64/libc.so.6
> #1  0x7feae9127b8b in __run_exit_handlers () from ./lib64/libc.so.6
> #2  0x7feae9127c27 in exit () from ./lib64/libc.so.6
> #3  0x00408ba5 in cleanup_and_exit (signum=15) at
> /home/jenkins/root/workspace/centos7-regression/glusterfsd/
> src/glusterfsd.c:1570
> #4  0x0040a75f in glusterfs_sigwaiter (arg=0x7ffe6faa7540) at
> /home/jenkins/root/workspace/centos7-regression/glusterfsd/
> src/glusterfsd.c:2332
> #5  0x7feae9b27e25 in start_thread () from ./lib64/libpthread.so.0
> #6  0x7feae91ecbad in clone () from ./lib64/libc.so.6
>
You could install glibc-debuginfo and the other relevant debuginfo packages on the
system you are trying to reproduce this issue on. That will get you the
line numbers and symbols.


>>
>>>
>>>
>>> On Thu, Aug 2, 2018 at 6:28 PM Sankarshan Mukhopadhyay <
>>> sankarshan.mukhopadh...@gmail.com> wrote:
>>>
 On Thu, Aug 2, 2018 at 5:48 PM, Kotresh Hiremath Ravishankar
  wrote:
 > I am facing different issue in softserve machines. The fuse mount
 itself is
 > failing.
 > I tried day before yesterday to debug geo-rep failures. I discussed
 with
 > Raghu,
 > but could not root cause it. So none of the tests were passing. It
 happened
 > on
 > both machine instances I tried.
 >

 Ugh! -infra team should have an issue to work with and resolve this.


 --
 sankarshan mukhopadhyay
 
 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 https://lists.gluster.org/mailman/listinfo/gluster-devel

>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>>
>>
>> --
>> Pranith
>>
>
>
> --
> Pranith
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [master][FAILED] ./tests/bugs/rpc/bug-954057.t

2018-06-15 Thread Milind Changire
Jenkins Job: https://build.gluster.org/job/centos7-regression/1421/console

Patch: https://review.gluster.org/15811


-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [master][FAILED] test ./tests/bugs/cli/bug-1169302.t

2018-06-10 Thread Milind Changire
The test fails for my patch https://review.gluster.org/15811

Could somebody take a look and see what the issue is?
https://build.gluster.org/job/centos7-regression/1372/consoleFull

I've tried to reproduce the issue on my CentOS 7.x VM system but the test
passes without problems.

@Poornima
Since git log on that test file shows your name at the topmost commit, I've
decided to ping you for the same.


-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [master] double-free corruption

2018-05-09 Thread Milind Changire
double-free corruption when running mutrace during a CREATE 5000 test run
using small-file test script on upstream master

following is the backtrace:

# mutrace --max=30 -d /usr/local/sbin/glusterfsd -s testsystem1
--volfile-id testvol.testsystem1.gluster-brick1-testvol -p
/var/run/gluster/vols/testvol/testsystem1-gluster-brick1-testvol.pid -S
/var/run/gluster/003b1b6d97b89ba7.socket --brick-name
/gluster/brick1/testvol -l
/var/log/glusterfs/bricks/gluster-brick1-testvol.log --xlator-option
*-posix.glusterd-uuid=94a150c5-d14f-4ecf-aa58-1caf45f40b90 --process-name
brick --brick-port 49152 --xlator-option testvol-server.listen-port=49152
--no-daemon
mutrace: 0.2 successfully initialized for process glusterfsd (PID: 7249).
BFD: /usr/local/lib/libglusterfs.so.0: warning: loop in section
dependencies detected
*** Error in `/usr/local/sbin/glusterfsd': double free or corruption
(!prev): 0x7f30dc0ed1f0 ***
=== Backtrace: =
/lib64/libc.so.6(+0x81429)[0x7f31429f6429]
/usr/local/lib/libmutrace-backtrace-symbols.so(+0x31fd8)[0x7f3144695fd8]
/usr/local/lib/libmutrace-backtrace-symbols.so(backtrace_symbols+0xda)[0x7f31446960da]
/usr/local/lib/libmutrace.so(backtrace_symbols+0x40)[0x7f3144991dc0]
/usr/local/lib/libglusterfs.so.0(+0x2aa87)[0x7f314435ba87]
/usr/local/lib/libglusterfs.so.0(_gf_msg+0x19d)[0x7f314435e044]
/usr/local/lib/libglusterfs.so.0(dict_get_bin+0x175)[0x7f3144351fdc]
/usr/local/lib/libglusterfs.so.0(gf_replace_old_iatt_in_dict+0x3b)[0x7f314436adb1]
/usr/local/lib/glusterfs/4.2dev/xlator/protocol/server.so(+0x3038d)[0x7f312f4f938d]
/usr/local/lib/glusterfs/4.2dev/xlator/debug/io-stats.so(+0x12254)[0x7f312f991254]
/usr/local/lib/glusterfs/4.2dev/xlator/features/marker.so(+0x15ab2)[0x7f313439cab2]
/usr/local/lib/glusterfs/4.2dev/xlator/features/selinux.so(+0x3868)[0x7f31345ba868]
/usr/local/lib/libglusterfs.so.0(default_setxattr_cbk+0x383)[0x7f31444148a2]
/usr/local/lib/glusterfs/4.2dev/xlator/features/upcall.so(+0x18055)[0x7f31349e5055]
/usr/local/lib/glusterfs/4.2dev/xlator/features/locks.so(+0x1c964)[0x7f313524e964]
/usr/local/lib/glusterfs/4.2dev/xlator/features/access-control.so(+0x127aa)[0x7f31354877aa]
/usr/local/lib/glusterfs/4.2dev/xlator/features/changelog.so(+0x10aa3)[0x7f31358c2aa3]
/usr/local/lib/glusterfs/4.2dev/xlator/features/changetimerecorder.so(+0xed2c)[0x7f3135fb8d2c]
/usr/local/lib/glusterfs/4.2dev/xlator/storage/posix.so(+0x3826f)[0x7f3136a3f26f]
/usr/local/lib/libglusterfs.so.0(default_setxattr+0x1f8)[0x7f314442ceb3]
/usr/local/lib/glusterfs/4.2dev/xlator/features/changetimerecorder.so(+0xf4e2)[0x7f3135fb94e2]
/usr/local/lib/glusterfs/4.2dev/xlator/features/changelog.so(+0x118f1)[0x7f31358c38f1]
/usr/local/lib/glusterfs/4.2dev/xlator/features/bitrot-stub.so(+0xc314)[0x7f313569c314]
/usr/local/lib/glusterfs/4.2dev/xlator/features/access-control.so(+0x12c0c)[0x7f3135487c0c]
/usr/local/lib/glusterfs/4.2dev/xlator/features/locks.so(+0x1d51e)[0x7f313524f51e]
/usr/local/lib/libglusterfs.so.0(default_setxattr+0x1f8)[0x7f314442ceb3]
/usr/local/lib/glusterfs/4.2dev/xlator/features/read-only.so(+0x8a4a)[0x7f3134e19a4a]
/usr/local/lib/libglusterfs.so.0(default_setxattr+0x1f8)[0x7f314442ceb3]
/usr/local/lib/glusterfs/4.2dev/xlator/features/upcall.so(+0x184fa)[0x7f31349e54fa]
/usr/local/lib/libglusterfs.so.0(default_setxattr_resume+0x3d0)[0x7f3144420c51]
/usr/local/lib/libglusterfs.so.0(call_resume_wind+0x637)[0x7f314437bcea]
/usr/local/lib/libglusterfs.so.0(call_resume+0xc7)[0x7f314438dc6e]
/usr/local/lib/glusterfs/4.2dev/xlator/performance/io-threads.so(+0x58bb)[0x7f31347c28bb]
/lib64/libpthread.so.0(+0x7dd5)[0x7f31431aadd5]
/lib64/libc.so.6(clone+0x6d)[0x7f3142a73b3d]
=== Memory map: 
0040-00418000 r-xp  fd:00 68968720
/usr/local/sbin/glusterfsd
00617000-00618000 r--p 00017000 fd:00 68968720
/usr/local/sbin/glusterfsd
00618000-0061a000 rw-p 00018000 fd:00 68968720
/usr/local/sbin/glusterfsd
00fad000-01362000 rw-p  00:00 0
[heap]
7f30d400-7f30d4068000 rw-p  00:00 0
7f30d4068000-7f30d800 ---p  00:00 0
7f30dc00-7f30dc307000 rw-p  00:00 0
7f30dc307000-7f30e000 ---p  00:00 0
7f30e000-7f30e029b000 rw-p  00:00 0
7f30e029b000-7f30e400 ---p  00:00 0
7f30e400-7f30e42ec000 rw-p  00:00 0
7f30e42ec000-7f30e800 ---p  00:00 0
7f30e800-7f30e82dd000 rw-p  00:00 0
7f30e82dd000-7f30ec00 ---p  00:00 0
7f30ec00-7f30ec021000 rw-p  00:00 0
7f30ec021000-7f30f000 ---p  00:00 0
7f30f000-7f30f0331000 rw-p  00:00 0
7f30f0331000-7f30f400 ---p  00:00 0
7f30f400-7f30f4021000 rw-p  00:00 0
7f30f4021000-7f30f800 ---p  00:00 0
7f30f800-7f30f8021000 rw-p  00:00 0
7f30f8021000-7f30fc00 ---p  00:00 0
7f30fc00-7f30fc021000 rw-p  00:00 0
7f30fc021000-7f31 ---p  00:00 0
7f31-7f3100021000 rw-p  00:00 0
7f3100021000-7f310400 

Re: [Gluster-devel] Release 3.12.8: Scheduled for the 12th of April

2018-04-13 Thread Milind Changire
On Wed, Apr 11, 2018 at 8:46 AM, Jiffin Tony Thottan 
wrote:

> Hi,
>
> It's time to prepare the 3.12.8 release, which falls on the 10th of
> each month, and hence would be 12-04-2018 this time around.
>
> This mail is to call out the following,
>
> 1) Are there any pending **blocker** bugs that need to be tracked for
> 3.12.7? If so mark them against the provided tracker [1] as blockers
> for the release, or at the very least post them as a response to this
> mail
>
> 2) Pending reviews in the 3.12 dashboard will be part of the release,
> **iff** they pass regressions and have the review votes, so use the
> dashboard [2] to check on the status of your patches to 3.12 and get
> these going
>
> 3) I have made checks on what went into 3.10 post 3.12 release and if
> these fixes are already included in 3.12 branch, then status on this is
> **green**
> as all fixes ported to 3.10, are ported to 3.12 as well.
>
> @Mlind
>
> IMO https://review.gluster.org/19659 looks like a minor feature to me. Can you
> please provide a justification for why it needs to be included in the 3.12 stable
> release?
>
If rpcsvc request handler threads are not scaled, the rpc request handling
will be serialized (not concurrent) until the request is handed over to the
io-thread pool. This might come back as a performance issue.

> And please rebase the change as well
>
> @Raghavendra
>
> The smoke failed for https://review.gluster.org/#/c/19818/. Can please
> check the same?
> Thanks,
> Jiffin
>
> [1] Release bug tracker:
> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.12.8
>
> [2] 3.12 review dashboard:
> https://review.gluster.org/#/projects/glusterfs,dashboards/
> dashboard:3-12-dashboard
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [release-4.0] FAILED ./tests/bugs/ec/bug-1236065.t

2018-03-28 Thread Milind Changire
This is now failing with brick-mux *on*:
https://build.gluster.org/job/centos7-regression/535/consoleFull

Patch: https://review.gluster.org/19786


On Tue, Mar 20, 2018 at 11:57 PM, Raghavendra Gowdappa <rgowd...@redhat.com>
wrote:

> Patch at https://review.gluster.org/19746
>
> On Tue, Mar 20, 2018 at 8:42 PM, Milind Changire <mchan...@redhat.com>
> wrote:
>
>> Jenkins Job: https://build.gluster.org/job/centos7-regression/405/console
>> Full
>>
>> --
>> Milind
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>


-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [release-4.0] FAILED ./tests/bugs/ec/bug-1236065.t

2018-03-20 Thread Milind Changire
Jenkins Job:
https://build.gluster.org/job/centos7-regression/405/consoleFull

-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] tests/bugs/rpc/bug-921072.t - fails almost all the times in mainline

2018-02-20 Thread Milind Changire
Wow! The very first test run on my CentOS 6.9 VM passed successfully within
1 minute.
I'll now try this on a CentOS 7 VM.


On Wed, Feb 21, 2018 at 9:59 AM, Nigel Babu  wrote:

> The immediate cause of this failure is that we merged the timeout patch
> which gives each test 200 seconds to finish. This test and another one
> takes over 200 seconds on regression nodes.
>
> I have a patch up to change the timeout https://review.gluster.org/#/
> c/19605/1
>
> However, tests/bugs/rpc/bug-921072.t taking 897 seconds is in itself an
> abnormality and is worth looking into.
>
> On Wed, Feb 21, 2018 at 7:47 AM, Atin Mukherjee 
> wrote:
>
>>
>>
>> *https://build.gluster.org/job/centos7-regression/15/consoleFull 
>> 20:24:36* 
>> [20:24:39] Running tests in file ./tests/bugs/rpc/bug-921072.t*20:27:56* 
>> ./tests/bugs/rpc/bug-921072.t timed out after 200 seconds*20:27:56* 
>> ./tests/bugs/rpc/bug-921072.t: bad status 124
>>
>> This is just one of the instances, but I have seen this test failing in last 
>> 3-4 days at least 10 times.
>>
>> Unfortunately, it doesn't look like the regression actually passes in 
>> mainline for any of the patches atm.
>>
>>
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> nigelb
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] gluster volume stop and the regressions

2018-02-13 Thread Milind Changire
Volume stop, in brick-mux mode, reveals a race with my patch [1].
Although this behavior is 100% reproducible with my patch, this by no
means implies that my patch is buggy.

In brick-mux mode, during volume stop, when glusterd sends a brick-detach
message to the brick process for the last brick, the brick process responds
back to glusterd with an acknowledgment and then kills itself with a
SIGTERM signal. All this sounds fine. However, somehow, the response
doesn't reach glusterd and instead a socket disconnect notification reaches
glusterd before the response. This causes glusterd to presume that
something has gone wrong during volume stop and glusterd then fails the
volume stop operation causing the test to fail.

This race is reproducible by running the test
tests/basic/distribute/rebal-all-nodes-migrate.t in brick-mux mode for my
patch [1]

[1] https://review.gluster.org/19308


On Thu, Feb 1, 2018 at 9:54 AM, Atin Mukherjee <amukh...@redhat.com> wrote:

> I don't think that's the right way. Ideally the test shouldn't be
> attempting to stop a volume if rebalance session is in progress. If we do
> see such a situation even with we check for rebalance status and wait till
> it finishes for 30 secs and still volume stop fails with rebalance session
> in progress error, that means either (a) rebalance session took more than
> the timeout which has been passed to EXPECT_WITHIN or (b) there's a bug in
> the code.
>
> On Thu, Feb 1, 2018 at 9:46 AM, Milind Changire <mchan...@redhat.com>
> wrote:
>
>> If a *volume stop* fails at a user's production site with a reason like
>> *rebalance session is active* then the admin will wait for the session to
>> complete and then reissue a *volume stop*;
>>
>> So, in essence, the failed volume stop is not fatal; for the regression
>> tests, I would like to propose to change a single volume stop to
>> *EXPECT_WITHIN 30* so that if a volume cannot be stopped even after 30
>> seconds, then it could be termed fatal in the regressions scenario
>>
>> Any comments about the proposal ?
>>
>> --
>> Milind
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>


-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] gluster volume stop and the regressions

2018-01-31 Thread Milind Changire
If a *volume stop* fails at a user's production site with a reason like
*rebalance session is active* then the admin will wait for the session to
complete and then reissue a *volume stop*;

So, in essence, the failed volume stop is not fatal; for the regression
tests, I would like to propose to change a single volume stop to
*EXPECT_WITHIN 30* so that if a volume cannot be stopped even after 30
seconds, then it could be termed fatal in the regressions scenario

Any comments about the proposal ?

-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [FAILED][master] tests/basic/afr/durability-off.t

2018-01-25 Thread Milind Changire
Could AFR engineers check why tests/basic/afr/durability-off.t fails in
brick-mux mode?

here's the job URL:
https://build.gluster.org/job/centos6-regression/8654/console

-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Setting up dev environment

2018-01-19 Thread Milind Changire
If you are a Gluster contributor then you'll need to have a GitHub Account
with your Public SSH Key uploaded at GitHub to use the ssh transport.

If you are not a Gluster contributor, then you might just want to use the
https transport instead of the ssh transport to clone the glusterfs repo
off git.gluster.org.


On Sat, Jan 20, 2018 at 7:47 AM, Ram Ankireddypalle 
wrote:

> Hi,
>
>   I am trying to set  a dev environment and send out the code that I
> worked on out for review.
>
>   I tried setting up a build environment using doc @
> http://docs.gluster.org/en/latest/Developer-guide/Development-Workflow/
>
>
>
>  I am seeing the following error.
>
>
>
>  git clone ssh://ram-ankireddypa...@git.gluster.org/glusterfs.git
> glusterfs
>
>  Cloning into 'glusterfs'...
>
>  ssh: connect to host git.gluster.org port 22: Connection timed out
>
>
>
> Please suggest what could be the issue here.
>
>
>
> Thanks and Regards,
>
> Ram
>
>
> ***Legal Disclaimer***
> "This communication may contain confidential and privileged material for
> the
> sole use of the intended recipient. Any unauthorized review, use or
> distribution
> by others is strictly prohibited. If you have received the message by
> mistake,
> please advise the sender by reply email and delete the message. Thank you."
> **
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] cluster/dht: restrict migration of opened files

2018-01-18 Thread Milind Changire
On Tue, Jan 16, 2018 at 2:52 PM, Raghavendra Gowdappa 
wrote:

> All,
>
> Patch [1] prevents migration of opened files during rebalance operation.
> If patch [1] affects you, please voice out your concerns. [1] is a stop-gap
> fix for the problem discussed in issues [2][3]
>
> [1] https://review.gluster.org/#/c/19202/
> [2] https://github.com/gluster/glusterfs/issues/308
> [3] https://github.com/gluster/glusterfs/issues/347
>
> regards,
> Raghavendra
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel



Would this patch affect tiering as well ?
Do we need to worry about tiering anymore ?

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Integration of GPU with glusterfs

2018-01-10 Thread Milind Changire
Bit-rot is another feature that consumes a lot of CPU to calculate the file
content hash.
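For a sense of where those CPU cycles go, a minimal sketch of per-file content
hashing with plain OpenSSL (illustrative only, not Gluster's actual bit-rot
code):

    #include <openssl/evp.h>
    #include <stdio.h>

    /* Sketch only: hash a file's contents with SHA-256; the read/update
     * loop below is where the CPU time is spent. */
    static int hash_file_sha256(const char *path, unsigned char *out,
                                unsigned int *outlen)
    {
        unsigned char buf[128 * 1024];
        size_t n;
        FILE *fp = fopen(path, "rb");
        if (!fp)
            return -1;

        EVP_MD_CTX *ctx = EVP_MD_CTX_create();
        EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
        while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
            EVP_DigestUpdate(ctx, buf, n);
        EVP_DigestFinal_ex(ctx, out, outlen);   /* 'out' needs EVP_MAX_MD_SIZE bytes */

        EVP_MD_CTX_destroy(ctx);
        fclose(fp);
        return 0;
    }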


On Thu, Jan 11, 2018 at 11:42 AM, Ashish Pandey  wrote:

> Hi,
>
> We have been thinking of exploiting GPU capabilities to enhance
> performance of glusterfs. We would like to know others thoughts on this.
> In EC, we have been doing CPU intensive computations to encode and decode
> data before writing and reading. This requires a lot of CPU cycles and we
> have
> been observing 100% CPU usage on client side. Data healing will also have
> the same impact as it also needs to do read-decode-encode-write cycle.
> As most of the  modern servers comes with GPU feature, having glusterfs
> GPU ready might give us performance improvements.
> This is not only specific to EC volume, there are other features which
> will require a lot of computations and could use this capability; For
> Example:
> 1 - Encryption/Decryption
> 2 - Compression and de-duplication
> 3 - Hashing
> 4 - Any other? [Please add if you have something in mind]
>
> Before proceeding further we would like to have your inputs on this.
> Do you have any other use case (existing or future) which could perform
> better on GPU?
> Do you think that it is worth to integrate GPU with glusterfs? The effort
> to have this performance gain could be achieved by some other better ways.
> Any input on the way we should implement it.
>
> There is a gihub issue opened for this. Please provide your comment or
> reply to this mail.
>
> A - https://github.com/gluster/glusterfs/issues/388
>
> ---
> Ashish
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] trash.t failure with brick multiplexing [Was Re: Build failed in Jenkins: regression-test-with-multiplex #574]

2018-01-02 Thread Milind Changire
On Tue, Jan 2, 2018 at 5:32 PM, Milind Changire <mchan...@redhat.com> wrote:

>
> On Tue, Jan 2, 2018 at 10:44 AM, Atin Mukherjee <amukh...@redhat.com>
> wrote:
>
>>
>>
>> On Thu, Dec 21, 2017 at 7:27 PM, Atin Mukherjee <amukh...@redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Dec 20, 2017 at 11:58 AM, Atin Mukherjee <amukh...@redhat.com>
>>> wrote:
>>>
>>>> ./tests/bugs/glusterd/bug-1230121-replica_subvol_count_correct_cal.t
>>>>
>>>
>>> Unfortunately the above is passing in my setup. I'll be checking the
>>> logs to see if I can figure out the issue.
>>>
>>> ./tests/features/trash.t
>>>>
>>>
>>> Rebalance fails here consistently when brick mux is enabled with the
>>> following message:
>>>
>>> [2017-12-21 13:55:31.881268] I [MSGID: 109081]
>>> [dht-common.c:5538:dht_setxattr] 0-patchy-dht: fixing the layout of /
>>> [2017-12-21 13:55:31.881289] W [MSGID: 109016]
>>> [dht-selfheal.c:1930:dht_fix_layout_of_directory] 0-patchy-dht: Layout
>>> fix failed: 1 subvolume(s) are down. Skipping fix layout. path:/
>>> gfid:----0001
>>> [2017-12-21 13:55:31.881525] E [MSGID: 109026]
>>> [dht-rebalance.c::gf_defrag_start_crawl] 0-patchy-dht: fix layout
>>> on / failed [Transport endpoint is not connected]
>>>
>>> When I revert the commit 56e5fda "rpc: merge ssl infra with epoll infra"
>>> the test passes through.
>>> I tend to believe that the other failures especially ssl-ciphers.t
>>> <http://git.gluster.org/cgit/glusterfs.git/tree/tests/features/ssl-ciphers.t>
>>> from fstat.gluster.log could be due to the same patch.
>>>
>>> @Milind - Need your attention here.
>>>
>>
>> Since the test failures are constant, I propose to revert the commit
>> 56e5fda from mainline till these issues are looked at.
>>
>
> After setting the SSL CRL path to NULL, if the volume is restarted, the
> test tests/features/ssl-ciphers.t passes in multiplex mode.
> SSL socket options aren't handled in the reconfigure() entry-point for
> the socket transport.
> I'll post a patch soon to fix the test case.
>

Or maybe not.
Need to take a closer look here.


>
>>
>>>
>>>> The above two are new failures since day before yesterday. Job link is
>>>> at https://build.gluster.org/job/regression-test-with-multiplex
>>>> /574/consoleFull .
>>>>
>>>>
>>>>
>>>> -- Forwarded message --
>>>> From: <jenk...@build.gluster.org>
>>>> Date: Wed, Dec 20, 2017 at 12:24 AM
>>>> Subject: Build failed in Jenkins: regression-test-with-multiplex #574
>>>> To: maintain...@gluster.org, amukh...@redhat.com, j...@pl.atyp.us,
>>>> jaher...@redhat.com, jda...@fb.com, kdhan...@redhat.com,
>>>> rgowd...@redhat.com, khire...@redhat.com, ama...@redhat.com,
>>>> nbala...@redhat.com, nig...@redhat.com, srang...@redhat.com
>>>>
>>>>
>>>> See <https://build.gluster.org/job/regression-test-with-multiple
>>>> x/574/display/redirect?page=changes>
>>>>
>>>> Changes:
>>>>
>>>> [Jeff Darcy] protocol/server: add dump_metrics method
>>>>
>>>> [Jeff Darcy] snapshot: Fix several coverity issues in
>>>> glusterd-snapshot-utils.c
>>>>
>>>> [Kotresh H R] feature/bitrot: remove internal xattrs from lookup cbk
>>>>
>>>> --
>>>> [...truncated 784.50 KB...]
>>>> ./tests/bugs/nfs/bug-904065.t  -  9 second
>>>> ./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  9 second
>>>> ./tests/bugs/md-cache/bug-1211863.t  -  9 second
>>>> ./tests/bugs/glusterd/bug-949930.t  -  9 second
>>>> ./tests/bugs/glusterd/bug-1420637-volume-sync-fix.t  -  9 second
>>>> ./tests/bugs/glusterd/bug-1121584-brick-existing-validation-
>>>> for-remove-brick-status-stop.t  -  9 second
>>>> ./tests/bugs/glusterd/bug-1104642.t  -  9 second
>>>> ./tests/bugs/distribute/bug-961615.t  -  9 second
>>>> ./tests/bugs/distribute/bug-1247563.t  -  9 second
>>>> ./tests/bugs/distribute/bug-1086228.t  -  9 second
>>>> ./tests/bugs/cli/bug-1087487.t  -  9 second
>>>> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
>>>> -  9 second

Re: [Gluster-devel] trash.t failure with brick multiplexing [Was Re: Build failed in Jenkins: regression-test-with-multiplex #574]

2018-01-02 Thread Milind Changire
On Tue, Jan 2, 2018 at 10:44 AM, Atin Mukherjee  wrote:

>
>
> On Thu, Dec 21, 2017 at 7:27 PM, Atin Mukherjee 
> wrote:
>
>>
>>
>> On Wed, Dec 20, 2017 at 11:58 AM, Atin Mukherjee 
>> wrote:
>>
>>> ./tests/bugs/glusterd/bug-1230121-replica_subvol_count_correct_cal.t
>>>
>>
>> Unfortunately the above is passing in my setup. I'll be checking the logs
>> to see if I can figure out the issue.
>>
>> ./tests/features/trash.t
>>>
>>
>> Rebalance fails here consistently when brick mux is enabled with the
>> following message:
>>
>> [2017-12-21 13:55:31.881268] I [MSGID: 109081]
>> [dht-common.c:5538:dht_setxattr] 0-patchy-dht: fixing the layout of /
>> [2017-12-21 13:55:31.881289] W [MSGID: 109016]
>> [dht-selfheal.c:1930:dht_fix_layout_of_directory] 0-patchy-dht: Layout
>> fix failed: 1 subvolume(s) are down. Skipping fix layout. path:/
>> gfid:----0001
>> [2017-12-21 13:55:31.881525] E [MSGID: 109026]
>> [dht-rebalance.c::gf_defrag_start_crawl] 0-patchy-dht: fix layout on
>> / failed [Transport endpoint is not connected]
>>
>> When I revert the commit 56e5fda "rpc: merge ssl infra with epoll infra"
>> the test passes through.
>> I tend to believe that the other failures especially ssl-ciphers.t
>> 
>> from fstat.gluster.log could be due to the same patch.
>>
>> @Milind - Need your attention here.
>>
>
> Since the test failures are constant, I propose to revert the commit
> 56e5fda from mainline till these issues are looked at.
>

After setting the SSL CRL path to NULL, if the volume is restarted, the
test tests/features/ssl-ciphers.t passes in multiplex mode.
SSL socket options aren't handled in the reconfigure() entry-point for the
socket transport.
I'll post a patch soon to fix the test case.


>
>>
>>> The above two are new failures since day before yesterday. Job link is
>>> at https://build.gluster.org/job/regression-test-with-multiplex
>>> /574/consoleFull .
>>>
>>>
>>>
>>> -- Forwarded message --
>>> From: 
>>> Date: Wed, Dec 20, 2017 at 12:24 AM
>>> Subject: Build failed in Jenkins: regression-test-with-multiplex #574
>>> To: maintain...@gluster.org, amukh...@redhat.com, j...@pl.atyp.us,
>>> jaher...@redhat.com, jda...@fb.com, kdhan...@redhat.com,
>>> rgowd...@redhat.com, khire...@redhat.com, ama...@redhat.com,
>>> nbala...@redhat.com, nig...@redhat.com, srang...@redhat.com
>>>
>>>
>>> See >> x/574/display/redirect?page=changes>
>>>
>>> Changes:
>>>
>>> [Jeff Darcy] protocol/server: add dump_metrics method
>>>
>>> [Jeff Darcy] snapshot: Fix several coverity issues in
>>> glusterd-snapshot-utils.c
>>>
>>> [Kotresh H R] feature/bitrot: remove internal xattrs from lookup cbk
>>>
>>> --
>>> [...truncated 784.50 KB...]
>>> ./tests/bugs/nfs/bug-904065.t  -  9 second
>>> ./tests/bugs/nfs/bug-1157223-symlink-mounting.t  -  9 second
>>> ./tests/bugs/md-cache/bug-1211863.t  -  9 second
>>> ./tests/bugs/glusterd/bug-949930.t  -  9 second
>>> ./tests/bugs/glusterd/bug-1420637-volume-sync-fix.t  -  9 second
>>> ./tests/bugs/glusterd/bug-1121584-brick-existing-validation-
>>> for-remove-brick-status-stop.t  -  9 second
>>> ./tests/bugs/glusterd/bug-1104642.t  -  9 second
>>> ./tests/bugs/distribute/bug-961615.t  -  9 second
>>> ./tests/bugs/distribute/bug-1247563.t  -  9 second
>>> ./tests/bugs/distribute/bug-1086228.t  -  9 second
>>> ./tests/bugs/cli/bug-1087487.t  -  9 second
>>> ./tests/bugs/bitrot/1209752-volume-status-should-show-bitrot-scrub-info.t
>>> -  9 second
>>> ./tests/basic/tier/ctr-rename-overwrite.t  -  9 second
>>> ./tests/basic/stats-dump.t  -  9 second
>>> ./tests/basic/quota_aux_mount.t  -  9 second
>>> ./tests/basic/inode-quota-enforcing.t  -  9 second
>>> ./tests/basic/fop-sampling.t  -  9 second
>>> ./tests/gfid2path/get-gfid-to-path.t  -  8 second
>>> ./tests/bugs/upcall/bug-1227204.t  -  8 second
>>> ./tests/bugs/replicate/bug-1448804-check-quorum-type-values.t  -  8
>>> second
>>> ./tests/bugs/glusterfs/bug-902610.t  -  8 second
>>> ./tests/bugs/glusterd/bug-889630.t  -  8 second
>>> ./tests/bugs/glusterd/bug-859927.t  -  8 second
>>> ./tests/bugs/glusterd/bug-1323287-real_path-handshake-test.t  -  8
>>> second
>>> ./tests/bugs/glusterd/bug-1213295-snapd-svc-uninitialized.t  -  8 second
>>> ./tests/bugs/glusterd/bug-1109741-auth-mgmt-handshake.t  -  8 second
>>> ./tests/bugs/glusterd/bug-1046308.t  -  8 second
>>> ./tests/bugs/ec/bug-1179050.t  -  8 second
>>> ./tests/bugs/distribute/bug-1122443.t  -  8 second
>>> ./tests/bugs/distribute/bug-1088231.t  -  8 second
>>> ./tests/bugs/changelog/bug-1208470.t  -  8 second
>>> ./tests/bugs/bitrot/1209818-vol-info-show-scrub-process-properly.t  -
>>> 8 second
>>> ./tests/bugs/bitrot/1207029-bitrot-daemon-should-start-on-valid-node.t

Re: [Gluster-devel] Need inputs on patch #17985

2017-12-06 Thread Milind Changire
With the tests conducted, I could not find any evidence of a performance
regression in quick-read.



On Thu, Nov 30, 2017 at 11:01 AM, Raghavendra G 
wrote:

> I think this caused regression in quick-read. On going through code, I
> realized Quick-read doesn't fetch content of a file pointed by dentry in
> readdirplus. Since, the patch in question prevents any lookup from
> resolver, reads on the file till the duration of "entry-timeout" (a cmdline
> option to fuse mount, whose default value is 1 sec) after the entry was
>> discovered in readdirplus will not be served by quick-read even though the
>> size of the file is eligible to be cached. This may cause a perf regression in
>> read-heavy workloads on small files. We'll be doing more testing to identify
>> this.
>
> On Tue, Sep 12, 2017 at 11:31 AM, Raghavendra G 
> wrote:
>
>> Update. Two more days to go for the deadline. Till now, there are no open
>> issues identified against this patch.
>>
>> On Fri, Sep 8, 2017 at 6:54 AM, Raghavendra Gowdappa > > wrote:
>>
>>>
>>>
>>> - Original Message -
>>> > From: "FNU Raghavendra Manjunath" 
>>> > To: "Raghavendra Gowdappa" 
>>> > Cc: "Raghavendra G" , "Nithya Balachandran"
>>> , anoo...@redhat.com,
>>> > "Gluster Devel" , "Raghavendra Bhat" <
>>> raghaven...@redhat.com>
>>> > Sent: Thursday, September 7, 2017 6:44:51 PM
>>> > Subject: Re: [Gluster-devel] Need inputs on patch #17985
>>> >
>>> > From snapview client perspective one important thing to note. For
>>> building
>>> > the context for the entry point (by default ".snaps") an explicit
>>> > lookup has
>>> > to be done on it. The dentry for ".snaps" is not returned when readdir
>>> is
>>> > done on its parent directory (Not even when ls -a is done). So for
>>> building
>>> > the context of .snaps (in the context snapview client saves the
>>> information
>>> > about whether it is a real inode or virtual inode) we need a lookup.
>>>
>>> Since the dentry corresponding to ".snaps" is not returned, there won't
>>> be an inode for this directory linked in itable. Also, glusterfs wouldn't
>>> have given nodeid corresponding to ".snaps" during readdir response (as
>>> dentry itself is not returned). So, kernel would do an explicit lookup
>>> before doing any operation on ".snaps" (unlike for those dentries which
>>> contain nodeid kernel can choose to skip a lookup) and we are safe. So,
>>> #17985 is safe in its current form.
>>>
>>> >
>>> > From snapview server perspective as well a lookup might be needed. In
>>> > snapview server a glfs handle is established between the snapview
>>> server
>>> > and the snapshot brick. So a inode in snapview server process contains
>>> the
>>> > glfs handle for the object being accessed from snapshot.  In snapview
>>> > server readdirp does not build the inode context (which contains the
>>> glfs
>>> > handle etc) because glfs handle is returned only in lookup.
>>>
>>> Same argument I've given holds good for this case too. Important point
>>> to note is that "there is no dentry and hence no nodeid corresponding to
>>> .snaps is passed to kernel and kernel is forced to do an explicit lookup".
>>>
>>> >
>>> > Regards,
>>> > Raghavendra
>>> >
>>> >
>>> > On Tue, Aug 29, 2017 at 12:53 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com>
>>> > wrote:
>>> >
>>> > >
>>> > >
>>> > > - Original Message -
>>> > > > From: "Raghavendra G" 
>>> > > > To: "Nithya Balachandran" 
>>> > > > Cc: "Raghavendra Gowdappa" ,
>>> anoo...@redhat.com,
>>> > > "Gluster Devel" ,
>>> > > > raghaven...@redhat.com
>>> > > > Sent: Tuesday, August 29, 2017 8:52:28 AM
>>> > > > Subject: Re: [Gluster-devel] Need inputs on patch #17985
>>> > > >
>>> > > > On Thu, Aug 24, 2017 at 2:53 PM, Nithya Balachandran <
>>> > > nbala...@redhat.com>
>>> > > > wrote:
>>> > > >
>>> > > > > It has been a while but iirc snapview client (loaded above
>>> > > > > dht/tier etc)
>>> > > had
>>> > > > > some issues when we ran tiering tests. Rafi might have more info
>>> on
>>> > > this -
>>> > > > > basically it was expecting to find the inode_ctx populated but
>>> it was
>>> > > not.
>>> > > > >
>>> > > >
>>> > > > Thanks Nithya. @Rafi, @Raghavendra Bhat, is it possible to take the
>>> > > > ownership of,
>>> > > >
>>> > > > * Identifying whether the patch in question causes the issue?
>>> > >
>>> > > gf_svc_readdirp_cbk is setting relevant state in inode [1]. I quickly
>>> > > checked whether it's the same state stored by gf_svc_lookup_cbk and
>>> it looks
>>> > > like the same state. So, I guess readdirp is handled correctly by
>>> > > snapview-client and an explicit lookup is not required. But, will
>>> wait for
>>> > > inputs from rabhat and rafi.
>>> > >
>>> > > [1] 

[Gluster-devel] [FAILED] [master] snapshot test failed and generated core on master

2017-12-01 Thread Milind Changire
Snapshot team,
Please take a look at
https://build.gluster.org/job/centos6-regression/7804/console to help me
understand if there's anything amiss from my end.

Job URL: https://build.gluster.org/job/centos6-regression/7804/console

-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [FAILED] [master] ./tests/basic/afr/split-brain-favorite-child-policy.t

2017-11-23 Thread Milind Changire
Request AFR team to take a peek at:
https://build.gluster.org/job/centos6-regression/7623/console

FYI: My patch addresses changes related to SSL communication.

-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-Maintainers] Changing Submit Type on review.gluster.org

2017-09-07 Thread Milind Changire
*Squashed Patches*
I believe individual engineers have to own the responsibility of
maintaining the history of all appropriate Change-Ids as part of the commit
message when multiple patches have been squashed/merged into one commit.
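One way to do that (illustrative only; the IDs below are made up) is to keep
the single Change-Id footer Gerrit expects for the squashed commit and record
the superseded Change-Ids in the commit body, for example:

    glusterd: combine two related cleanups

    Squashes the changes originally posted as
    Change-Id I1111111111111111111111111111111111111111 and
    Change-Id I2222222222222222222222222222222222222222.

    Change-Id: I3333333333333333333333333333333333333333
    Signed-off-by: Jane Developer <jane@example.com>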




On Thu, Sep 7, 2017 at 11:50 AM, Nigel Babu  wrote:

> Hello folks,
>
> A few times, we've merged dependent patches out of order because the Submit
> type[1] did not block us from doing so. The last few times we've talked
> about
> this, we didn't actually take a strong decision either way. In yesterday's
> maintainers meeting, we agreed to change the Submit type to
> Rebase-If-Necessary. This change will happen on 18th September 2017.
>
> What this means:
> * No more metadata flags added by Gerrit. There will only be a Change-Id,
>   Signed-off-by, and BUG (if you've added it). Gerrit itself will not add
> any
>   metadata.
> * If you push a patch on top of another patch, the Submit button will
> either be
>   grayed out because the dependent patches cannot be merged or they will be
>   submitted in the correct order in one go.
>
> Some of the concerns that have been raised:
> Q: With the Reviewed-on flag gone, how do we keep track of changesets
>(especially backports)?
> A: The Change-Id will get you all the data directly on Gerrit. As long as you
>    retain the Change-Id, Gerrit will get you the matching changesets.
>
> Q: Will who-wrote-what continue to work?
> A: As far as I can see, it continues to work. I ran the script against
>build-jobs repo and it works correctly. Additionally, we'll be setting
> up an
>instance of Gerrit Stats[2] to provide more detailed stats.
>
> Q: Can we have some of the metadata if not all?
> Q: Why can't we have the metadata if we change the submit type?
> A: There's no good answer to this other than, this is how Gerrit works and
>I can neither change it nor control it.
>
> [1]: https://review.gluster.org/Documentation/intro-project-
> owner.html#submit-type
> [2]: http://gerritstats-demo.firebaseapp.com/
>
> --
> nigelb
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>



-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Glusterd2 - Some anticipated changes to glusterfs source

2017-08-03 Thread Milind Changire
On Thu, Aug 3, 2017 at 12:56 PM, Kaushal M  wrote:

> On Thu, Aug 3, 2017 at 2:14 AM, Niels de Vos  wrote:
> > On Wed, Aug 02, 2017 at 05:03:35PM +0530, Prashanth Pai wrote:
> >> Hi all,
> >>
> >> The ongoing work on glusterd2 necessitates following non-breaking and
> >> non-exhaustive list of changes to glusterfs source code:
> >>
> >> Port management
> >> - Remove hard-coding of glusterd's port as 24007 in clients and
> elsewhere.
> >>   Glusterd2 can be configured to listen to clients on any port (still
> >> defaults to
> >>   24007 though)
> >> - Let the bricks and daemons choose any available port and if needed
> report
> >>   the port used to glusterd during the "sign in" process. Prasanna has a
> >> patch
> >>   to do this.
> >> - Glusterd <--> brick (or any other local daemon) communication should
> >>   always happen over Unix Domain Socket. Currently glusterd and brick
> >>   process communicates over UDS and also port 24007. This will allow us
> >>   to set better authentication and rules for port 24007 as it shall
> only be
> >> used
> >>   by clients.
> >
> > I prefer this last point to be configurable. At least for debugging we
> > should be able to capture network traces and display the communication
> > in Wireshark. Defaulting to UNIX Domain Sockets is fine though.
>
> This is the communication between GD2 and bricks, of which there is
> not a lot happening, and not much to capture.
> But I agree, it will be nice to have this configurable.
>
>
Could glusterd start attempting port binding at 24007, progress on to
higher port numbers until successful, and register the bound port number
with rpcbind? This way the setup would be auto-configurable and admins need
not scratch their heads to decide upon one port number. Gluster clients
could always talk to rpcbind on the nodes to get the glusterd service port
whenever a reconnect is required.
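
A rough stand-alone sketch of that idea, using the classic SunRPC portmapper
calls (the program number below is a made-up placeholder, not glusterd's real
one; on current distros this needs libtirpc and a running rpcbind):

/* Illustrative only: probe for a free port starting at 24007 and register
 * it with rpcbind.  GLUSTERD_FAKE_PROG is a placeholder program number. */
#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <rpc/rpc.h>
#include <rpc/pmap_clnt.h>

#define GLUSTERD_FAKE_PROG 0x20000999
#define GLUSTERD_FAKE_VERS 1

int
main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    unsigned short port;

    if (fd < 0)
        return 1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    /* Start at 24007 and walk upwards until a bind succeeds. */
    for (port = 24007; port < 24107; port++) {
        addr.sin_port = htons(port);
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            break;
    }
    if (port == 24107) {
        fprintf(stderr, "no free port in range\n");
        return 1;
    }

    /* Tell rpcbind where we ended up, so clients can look us up
     * instead of assuming 24007. */
    pmap_unset(GLUSTERD_FAKE_PROG, GLUSTERD_FAKE_VERS);
    if (!pmap_set(GLUSTERD_FAKE_PROG, GLUSTERD_FAKE_VERS, IPPROTO_TCP, port)) {
        fprintf(stderr, "rpcbind registration failed\n");
        return 1;
    }
    printf("listening on %u, registered with rpcbind\n", port);
    return 0;
}

Clients would then call pmap_getport() with the same program number instead of
hard-coding 24007.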


> >
> >
> >> Changes to xlator options
> >> - Xlator authors do not have to modify glusterd2 code to expose new
> xlator
> >>   options. IOW, glusterd2 will not contain the "glusterd_volopt_map"
> table.
> >>   Most of its fields will be moved to the xlator itself. Glusterd2 can
> load
> >>   xlator's shared object and read it's volume_options table. This also
> means
> >>   xlators have to adhere to some naming conventions for options.
> >> - Add following additional fields (names are indicative) to
> volume_option_t:
> >> - Tag: This is to enable users to list only options having a certain
> >> tag.
> >>  IOW, it allows us to filter "volume set help" like output.
> >>  Example of tags: debug, perf, network etc.
> >> - Opversion: The minimum (or a range) op-version required by the
> xlator.
> >> - Configurable: A bool to indicate whether this option is
> >> user-configurable.
> >>   This may also be clubbed with DOC/NO_DOC
> >> functionality.
> >
> > This is something I have been thinking about to do as well. libgfapi
> > users would like to list all the valid options before mounting (and
> > receiving the .vol file) is done. Similar to how many mount options are
> > set over FUSE, the options should be available through libgfapi.
> > Hardcoding the options is just wrong, inspecting the available xlators
> > (.so files) seems to make more sense. Each option would have to describe
> > if it can be client-side so that we can apply some reasonable filters by
> > default.
> >
>
> Looks like we'd missed this. All the fields available in the vol opt
> map will move to xlator option tables, including the client flag.
>
> > A GitHub Issue with this feature request is at
> > https://github.com/gluster/glusterfs/issues/263. I appreciate additional
> > comments and ideas about it :-)
> >
>
> We need to open an issue for our requested changes as well, which
> will be a superset of this request. We'll make sure to mention this
> feature request in it.
> Or we could use a single issue as a tracker for all the xlator option
> changes, in which case I'd prefer we update the existing issue.
>
> >
> >> - Xlators like AFR, changelog require non-static information such as
> >>   brick path to be present in their options in the volfile. Currently,
> >>   xlator authors have to modify glusterd code to get it.
> >>   This can rather be indicated by the xlator itself using
> >>   templates/placeholders.
> >>   For example, "changelog-dir" can be set in the xlator's options as
> >>   <>/.glusterfs/changelogs and then glusterd2 will ensure to
> >>   replace <> with the actual path during volfile generation.
> >
> > I suggest to stick with whatever is a common syntax for other
> > configuration files that uses placeholders. Maybe just {variable} or
> > $VARIABLE, the <> looks a bit awkward.
>
> The exact syntax for these variables hasn't been decided yet. But I'm
> leaning towards '{{ variable }}' used in the Go template package,
> which is what we'll mostly end up using to 

Re: [Gluster-devel] brick multiplexing and memory consumption

2017-06-24 Thread Milind Changire
could we build with GF_DISABLE_MEMPOOL and use -fsanitize=address,
-fsanitize=thread and -fsanitize=leak?

Or have these already been tried and tested, or are they implied by the other
compiler options we use?
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] glusterfind: request for reviews

2017-06-08 Thread Milind Changire
tools/glusterfind: add --end-time option 

tools/glusterfind: add --field-separator option


-- 
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [master] [FAILED] ./tests/bugs/core/bug-1432542-mpx-restart-crash.t: 12 new core files

2017-05-22 Thread Milind Changire

Job: https://build.gluster.org/job/centos6-regression/4731/console

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [release-3.8] FAILED ./tests/bitrot/br-state-check.t: 1 new core files

2017-05-18 Thread Milind Changire

FYI: https://build.gluster.org/job/netbsd7-regression/4167/consoleFull

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Gluster RPC Internals - Lecture #2 - recording

2017-03-07 Thread Milind Changire

https://bluejeans.com/s/G4Nx@/

To download the recording:
Please hover mouse over the thumbnail at the bottom and you should see
a download icon appear at the right bottom corner of the thumbnail.
The icon remains hidden until you move the mouse over the thumbnail.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [REMINDER] Gluster RPC Internals - Lecture #2 - TODAY

2017-03-06 Thread Milind Changire

Blue Jeans Meeting ID: 1546612044
Start Time: 7:30pm India Time (UTC+0530)
Duration: 2 hours

https://www.bluejeans.com/

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Gluster RPC Internals - Lecture #2

2017-03-02 Thread Milind Changire via Blue Jeans Network
[Calendar invite attachment (invite.ics), summarized]
Event: Gluster RPC Internals - Lecture #2
When: Tuesday, 2017-03-07, 19:30-21:30 IST (Asia/Kolkata)
Join: https://bluejeans.com/1546612044 (Meeting ID: 1546612044)
Dial-in: +656 484 0858, Meeting ID 1546612044; room systems: 199.48.152.152 or bjn.vc
Organizer: mchan...@redhat.com
Attendee: gluster-devel@gluster.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Gluster RPC Internals - Lecture #1 - recording

2017-03-01 Thread Milind Changire

https://bluejeans.com/s/e59Wh/

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] [FAILED] [centos6] tests/bitrot/bug-1373520.t

2017-02-28 Thread Milind Changire

2 of 2 runs failed:

https://build.gluster.org/job/centos6-regression/3451/consoleFull

https://build.gluster.org/job/centos6-regression/3458/consoleFull
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] updatedb and gluster volumes

2017-02-25 Thread Milind Changire

Would it be wise to prevent updatedb from crawling ALL Gluster volumes?
i.e. at the brick for servers as well as at the mount point for clients.

The implementation would be to add glusterfs as a file system type to the
PRUNEFS variable setting in updatedb.conf.
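
For example, on hosts using mlocate this would amount to a line along these
lines in /etc/updatedb.conf (the exact tokens are an assumption -- FUSE client
mounts usually report the fs type as fuse.glusterfs, so both spellings are
listed; this should be merged into the distro's existing PRUNEFS value rather
than replacing it):

PRUNEFS = "<existing entries> glusterfs fuse.glusterfs"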

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Gluster RPC Internals Lecture by Raghavendra G

2017-02-21 Thread Milind Changire via Blue Jeans Network
[Calendar invite attachment (invite.ics), summarized]
Event: Gluster RPC Internals Lecture by Raghavendra G
When: Tuesday, 2017-02-28, 20:30-21:30 IST (Asia/Kolkata)
Join: https://bluejeans.com/1546612044 (Meeting ID: 1546612044)
Dial-in: +656 484 0858, Meeting ID 1546612044; room systems: 199.48.152.152 or bjn.vc
Organizer: mchan...@redhat.com
Attendee: gluster-devel@gluster.org

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] patch for "limited performance for disperse volumes"

2017-02-10 Thread Milind Changire

Here's a quote from a paper titled: Non-blocking Writes to Files
https://www.usenix.org/conference/fast15/technical-sessions/presentation/campello

-
Ordering of Page Updates.
Non-blocking writes may alter the sequence in which patches to
different pages get applied since the page fetches may complete
out-of-order. Non-blocking writes only replace writes that are
to memory that are not guaranteed to be reflected to persistent
storage in any particular sequence. Thus, ordering violations in
updates of in-memory pages are crash-safe.

Page Persistence and Syncs.
If an application would like explicit disk ordering for memory
page updates, it would execute a blocking flush operation
(e.g., fsync ) subsequent to each operation. The flush operation
causes the OS to force the fetch of any page indexed as NBW even
if it has not been allocated yet. The OS then obtains the page
lock, waits for the page fetch, and applies any outstanding
patches, before flushing the page and returning control to the
application. Ordering of disk writes are thus preserved with
non-blocking writes.
-

Milind

On 02/10/2017 01:37 PM, Xavier Hernandez wrote:

Hi Raghavendra,

On 10/02/17 04:51, Raghavendra Gowdappa wrote:

+gluster-devel

- Original Message -

From: "Milind Changire" <mchan...@redhat.com>
To: "Raghavendra Gowdappa" <rgowd...@redhat.com>
Cc: "rhs-zteam" <rhs-zt...@redhat.com>
Sent: Thursday, February 9, 2017 11:00:18 PM
Subject: patch for "limited performance for disperse volumes"

My first comment was:
looks like patch for "limited performance for disperse volume" [1] is
going
to be helpful for all other types of volumes as well; but how do we
guarantee ordering for writes over the same fd for the same offset and
length in the file ?

then thinking over a bit and in case you missed my comment over IRC:
I was thinking about network multi-pathing and rpc requests(two writes)
being routed through different interfaces to gluster nodes which might
lead to a non-increasing transaction ID sequence and hence might lead
to incorrect final value if the older write is committed to the same
offset+length

then it dawned on me that for blocking operations the write() call
won't return until the data is safe on the disk across the network or
the intermediate translators have cached it appropriately to be
written behind.

so would the patch work for two non-blocking writes originating for the
same fd from the same thread for the same offset+length and being
routed over multi-pathing and write #2 getting routed quicker than
write #1 ?


To be honest I've not considered the case of asynchronous writes from
application till now. What is the ordering guarantee the
OS/filesystems provide for two async writes? For eg., if there are two
writes w1 and w2, when is w2 issued?
* After cbk of w1 is called or
* parallely just after async_write (w1) returns (cbk of w1 is not
invoked yet)?

What do POSIX or other standards (or expectation from OS) say about
ordering in case 2 above?


I'm not an expert on POSIX. But I've found this [1]:

2.9.7 Thread Interactions with Regular File Operations

All of the following functions shall be atomic with respect to
each other in the effects specified in POSIX.1-2008 when they
operate on regular files or symbolic links: [...] write [...]

If two threads each call one of these functions, each call shall
either see all of the specified effects of the other call, or none
of them. The requirement on the close() function shall also apply
whenever a file descriptor is successfully closed, however caused
(for example, as a consequence of calling close(), calling dup2(),
or of process termination).

Not sure if this also applies to write requests issued asynchronously
from the same thread, but this would be the worst case (if the OS
already orders it, we won't have any problem).

As I see it, this is already satisfied by EC because it doesn't allow
two concurrent writes to happen at the same time. They can be reordered
if the second one arrives before the first one, but they are executed
atomically as POSIX requires. Not sure if AFR also satisfies this
condition, but I think so.

From the point of view of EC it's irrelevant if the write comes from the
same thread or from different processes on different clients. They are
handled in the same way.

However a thing to be aware of (from the man page of write):

[...] among the effects that should be atomic across threads (and
processes) are updates of the file offset. However, on Linux before
version 3.14, this was not the case: if two processes that share an
open file description (see open(2)) perform a write() (or
writev(2)) at the same time, then the I/O operations were not atomic
with respect updating the file offset, with the result that the
blocks of data output by the two processes might (incorrectly

Re: [Gluster-devel] decoupling network.ping-timeout and transport.tcp-user-timeout

2017-01-11 Thread Milind Changire

+gluster-users

Milind

On 01/11/2017 03:21 PM, Milind Changire wrote:

The management connection uses network.ping-timeout to time out and
retry connection to a different server if the existing connection
end-point is unreachable from the client.
Due to the nature of the parameters involved in the TCP/IP network
stack, it becomes imperative to control the other network connections
using the socket level tunables:
* SO_KEEPALIVE
* TCP_KEEPIDLE
* TCP_KEEPINTVL
* TCP_KEEPCNT

So, I'd like to decouple the network.ping-timeout and
transport.tcp-user-timeout since they are tunables for different
aspects of the gluster application. network.ping-timeout monitors the
brick/node level responsiveness and transport.tcp-user-timeout is one
of the attributes that is used to manage the state of the socket.

Saying so, we could do away with network.ping-timeout altogether and
stick with transport.tcp-user-timeout for all types of sockets. It becomes
increasingly difficult to work with different tunables across gluster.

I believe, there have not been many cases in which the community has
found the existing defaults for socket timeout unusable. So we could
stick with the system defaults and add the following socket level
tunables and make them open for configuration:
* client.tcp-user-timeout
 which sets transport.tcp-user-timeout
* client.keepalive-time
 which sets transport.socket.keepalive-time
* client.keepalive-interval
 which sets transport.socket.keepalive-interval
* client.keepalive-count
 which sets transport.socket.keepalive-count
* server.tcp-user-timeout
 which sets transport.tcp-user-timeout
* server.keepalive-time
 which sets transport.socket.keepalive-time
* server.keepalive-interval
 which sets transport.socket.keepalive-interval
* server.keepalive-count
 which sets transport.socket.keepalive-count

However, these settings would affect all sockets in gluster.
In cases where aggressive timeouts are needed, the community can find
gluster options which have 1:1 mapping with socket level options as
documented in tcp(7).

Please share your thoughts about the risks or effectiveness of the
decoupling.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] decoupling network.ping-timeout and transport.tcp-user-timeout

2017-01-11 Thread Milind Changire

The management connection uses network.ping-timeout to time out and
retry connection to a different server if the existing connection
end-point is unreachable from the client.
Due to the nature of the parameters involved in the TCP/IP network
stack, it becomes imperative to control the other network connections
using the socket level tunables:
* SO_KEEPALIVE
* TCP_KEEPIDLE
* TCP_KEEPINTVL
* TCP_KEEPCNT

So, I'd like to decouple the network.ping-timeout and
transport.tcp-user-timeout since they are tunables for different
aspects of the gluster application. network.ping-timeout monitors the
brick/node level responsiveness and transport.tcp-user-timeout is one
of the attributes that is used to manage the state of the socket.

Saying so, we could do away with network.ping-timeout altogether and
stick with transport.tcp-user-timeout for all types of sockets. It becomes
increasingly difficult to work with different tunables across gluster.

I believe, there have not been many cases in which the community has
found the existing defaults for socket timeout unusable. So we could
stick with the system defaults and add the following socket level
tunables and make them open for configuration:
* client.tcp-user-timeout
 which sets transport.tcp-user-timeout
* client.keepalive-time
 which sets transport.socket.keepalive-time
* client.keepalive-interval
 which sets transport.socket.keepalive-interval
* client.keepalive-count
 which sets transport.socket.keepalive-count
* server.tcp-user-timeout
 which sets transport.tcp-user-timeout
* server.keepalive-time
 which sets transport.socket.keepalive-time
* server.keepalive-interval
 which sets transport.socket.keepalive-interval
* server.keepalive-count
 which sets transport.socket.keepalive-count

However, these settings would affect all sockets in gluster.
In cases where aggressive timeouts are needed, the community can find
gluster options which have 1:1 mapping with socket level options as
documented in tcp(7).
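
For reference, a minimal sketch of how those proposed options would map onto
the underlying socket calls on Linux (the values below are examples only, not
proposed defaults):

/* Illustrative only: mapping the proposed client./server. options onto
 * socket-level settings on Linux. */
#include <stdio.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

static int
apply_keepalive(int fd, int idle, int interval, int count, int user_timeout_ms)
{
    int on = 1;

    /* client.keepalive-* / server.keepalive-* */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;

    /* client.tcp-user-timeout / server.tcp-user-timeout (Linux, milliseconds) */
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) < 0)
        return -1;

    return 0;
}

int
main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0 || apply_keepalive(fd, 20, 2, 9, 42000) < 0) {
        perror("keepalive setup");
        return 1;
    }
    printf("keepalive and TCP_USER_TIMEOUT applied\n");
    return 0;
}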

Please share your thoughts about the risks or effectiveness of the
decoupling.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Release 3.10 feature proposal: multi-threaded promotions and demotions in tiering

2016-12-09 Thread Milind Changire

Currently, there's a single promotion thread and a single demotion
thread serving every tier volume. The individual threads iterate
over the bricks every pass. Performance can be improved by
assigning multiple promotion and demotion threads to a single brick.
Different bricks can then independently promote hot files as well as
demote cooler files more quickly to make space for newly heated files.
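A bare-bones sketch of the proposed threading shape (stand-alone pthread code
with placeholder brick names and placeholder work, not the actual tier daemon):

/* Illustrative only: one promotion and one demotion worker per brick,
 * instead of one of each per volume. */
#include <pthread.h>
#include <stdio.h>

#define NUM_BRICKS 4

struct brick_worker {
    const char *brick;
    const char *kind;   /* "promote" or "demote" */
};

static void *
tier_worker(void *arg)
{
    struct brick_worker *w = arg;
    /* real code would scan this brick's query results and migrate files */
    printf("%s worker running for brick %s\n", w->kind, w->brick);
    return NULL;
}

int
main(void)
{
    const char *bricks[NUM_BRICKS] = { "brick0", "brick1", "brick2", "brick3" };
    struct brick_worker work[NUM_BRICKS][2];
    pthread_t tids[NUM_BRICKS][2];
    int i;

    for (i = 0; i < NUM_BRICKS; i++) {
        work[i][0] = (struct brick_worker){ bricks[i], "promote" };
        work[i][1] = (struct brick_worker){ bricks[i], "demote" };
        pthread_create(&tids[i][0], NULL, tier_worker, &work[i][0]);
        pthread_create(&tids[i][1], NULL, tier_worker, &work[i][1]);
    }
    for (i = 0; i < NUM_BRICKS; i++) {
        pthread_join(tids[i][0], NULL);
        pthread_join(tids[i][1], NULL);
    }
    return 0;
}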

Issue opened at [1]

--
Milind

[1] https://github.com/gluster/glusterfs/issues/53
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tiering: emergency demotions

2016-10-13 Thread Milind Changire

Dilemma:
*without* my patch, the demotions in degraded (hi-watermark breached)
mode happen every 10 seconds by listing *all* files colder than the
last 10 seconds and sorting them in ascending order w.r.t. the
(write,read) access time ... so the existing query could take more than
a minute to list files if there are millions of them

*with* my patch we currently select a random set of 20 files and demote
them ... even if they are actively used ... so we either wait for more
than a minute for the exact listing of cold files in the worst case or
trade off by demoting hot files without imposing a file selection
criteria for a quicker turnaround time

The exponential time window scheme to select files, discussed over Google
Hangout, has an issue with deciding the start time of the time window,
although we know the end time is the current time.


So, I think it would be either of the strategies discussed above with a 
trade-off in one way or the other.


Comments are requested regarding the approach to take for the
implementation.

Rafi has also suggested avoiding file creation on the hot tier if the
hot tier has breached the hi-watermark, to avoid further stress on storage
capacity and eventual file migration to the cold tier.

Do we introduce demotion policies like "strict" and "approximate" to
let user choose the demotion strategy ?
1. strict
   Choosing this strategy could mean we wait for the full and ordered
   query to complete and only then start demoting the coldest file first

2. approximate
   Choosing this strategy could mean we choose the first available
   file from the database query and demote it even if it is hot and
   actively written to


Milind

On 08/12/2016 08:25 PM, Milind Changire wrote:

Patch for review: http://review.gluster.org/15158

Milind

On 08/12/2016 07:27 PM, Milind Changire wrote:

On 08/10/2016 12:06 PM, Milind Changire wrote:

Emergency demotions will be required whenever writes breach the
hi-watermark. Emergency demotions are required to avoid ENOSPC in case
of continuous writes that originate on the hot tier.

There are two concerns in this area:

1. enforcing max-cycle-time during emergency demotions
   max-cycle-time is the time the tiering daemon spends in promotions or
   demotions
   I tend to think that the tiering daemon skip this check for the
   emergency situation and continue demotions until the watermark drops
   below the hi-watermark


Update:
To keep matters simple and manageable, it has been decided to *enforce*
max-cycle-time to yield the worker threads to attend to impending tier
management tasks if the need arises.



2. file demotion policy
   I tend to think that evicting the largest file with the most recent
   *write* should be chosen for eviction when write-freq-threshold is
   NON-ZERO.
   Choosing a least written file is just going to delay file migration
   of an active file which might consume hot tier disk space resulting
   in a ENOSPC, in the worst case.
   In cases where write-freq-threshold are ZERO, the most recently
   *written* file can be chosen for eviction.
   In the case of choosing the largest file within the
   write-freq-threshold, a stat() on the files would be required to
   calculate the number of files that need to be demoted to take the
   watermark below the hi-watermark. Finding the number of most recently
   written files to demote could also help make demotions in parallel
   rather than in the sequential manner currently in place.


Update:
The idea of choosing the files wrt file size has been dropped.
Iteratively, the most recently written file will be chosen for eviction
from the hot tier in case of a hi-watermark breach and until the
watermark drops below hi-watermark.
The idea of parallelizing multiple promotions/demotions has been
deferred.

-

Sustained writes creating large files in the hot tier which
cumulatively breach the hi-watermark do NOT seem to be a good
workload for making use of tiering. The assumption is that, to make the
most of the hot tier, the hi-watermark would be closer to 100.
In this case a sustained large file copy might easily breach the
hi-watermark and may even consume the entire hot tier space, resulting
in a ENOSPC.

eg. an example of a sustained write

# cp file1 /mnt/glustervol/dir

Workloads that would seem to make the most of tiering are:
1. Many smaller files, which are created in small bursts of write
   activity and then closed
2. Few large files where updates are in-place and the file size
   does not grow beyond the hi-watermark eg. database, with frequent
   in-line compaction/de-fragmentation policy enabled
3. Frequent reads of few large files, mostly static in size, which
   cumulatively don't breach the hi-watermark. Frequently reading
   a large number of smaller, mostly static, files would be good
   tiering workload candidates as well.




Comments are requested.


___
Gluster-devel mailin

Re: [Gluster-devel] tiering: emergency demotions

2016-08-12 Thread Milind Changire

Patch for review: http://review.gluster.org/15158

Milind

On 08/12/2016 07:27 PM, Milind Changire wrote:

On 08/10/2016 12:06 PM, Milind Changire wrote:

Emergency demotions will be required whenever writes breach the
hi-watermark. Emergency demotions are required to avoid ENOSPC in case
of continuous writes that originate on the hot tier.

There are two concerns in this area:

1. enforcing max-cycle-time during emergency demotions
   max-cycle-time is the time the tiering daemon spends in promotions or
   demotions
   I tend to think that the tiering daemon skip this check for the
   emergency situation and continue demotions until the watermark drops
   below the hi-watermark


Update:
To keep matters simple and manageable, it has been decided to *enforce*
max-cycle-time to yield the worker threads to attend to impending tier
management tasks if the need arises.



2. file demotion policy
   I tend to think that evicting the largest file with the most recent
   *write* should be chosen for eviction when write-freq-threshold is
   NON-ZERO.
   Choosing a least written file is just going to delay file migration
   of an active file which might consume hot tier disk space resulting
   in a ENOSPC, in the worst case.
   In cases where write-freq-threshold are ZERO, the most recently
   *written* file can be chosen for eviction.
   In the case of choosing the largest file within the
   write-freq-threshold, a stat() on the files would be required to
   calculate the number of files that need to be demoted to take the
   watermark below the hi-watermark. Finding the number of most recently
   written files to demote could also help make demotions in parallel
   rather than in the sequential manner currently in place.


Update:
The idea of choosing the files wrt file size has been dropped.
Iteratively, the most recently written file will be chosen for eviction
from the hot tier in case of a hi-watermark breach and until the
watermark drops below hi-watermark.
The idea of parallelizing multiple promotions/demotions has been
deferred.

-

Sustained writes creating large files in the hot tier which
cumulatively breach the hi-watermark do NOT seem to be a good
workload for making use of tiering. The assumption is that, to make the
most of the hot tier, the hi-watermark would be closer to 100.
In this case a sustained large file copy might easily breach the
hi-watermark and may even consume the entire hot tier space, resulting
in a ENOSPC.

eg. an example of a sustained write

# cp file1 /mnt/glustervol/dir

Workloads that would seem to make the most of tiering are:
1. Many smaller files, which are created in small bursts of write
   activity and then closed
2. Few large files where updates are in-place and the file size
   does not grow beyond the hi-watermark eg. database, with frequent
   in-line compaction/de-fragmentation policy enabled
3. Frequent reads of few large files, mostly static in size, which
   cumulatively don't breach the hi-watermark. Frequently reading
   a large number of smaller, mostly static, files would be good
   tiering workload candidates as well.




Comments are requested.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tiering: emergency demotions

2016-08-12 Thread Milind Changire

On 08/10/2016 12:06 PM, Milind Changire wrote:

Emergency demotions will be required whenever writes breach the
hi-watermark. Emergency demotions are required to avoid ENOSPC in case
of continuous writes that originate on the hot tier.

There are two concerns in this area:

1. enforcing max-cycle-time during emergency demotions
   max-cycle-time is the time the tiering daemon spends in promotions or
   demotions
   I tend to think that the tiering daemon skip this check for the
   emergency situation and continue demotions until the watermark drops
   below the hi-watermark


Update:
To keep matters simple and manageable, it has been decided to *enforce*
max-cycle-time to yield the worker threads to attend to impending tier
management tasks if the need arises.



2. file demotion policy
   I tend to think that evicting the largest file with the most recent
   *write* should be chosen for eviction when write-freq-threshold is
   NON-ZERO.
   Choosing a least written file is just going to delay file migration
   of an active file which might consume hot tier disk space resulting
   in a ENOSPC, in the worst case.
   In cases where write-freq-threshold are ZERO, the most recently
   *written* file can be chosen for eviction.
   In the case of choosing the largest file within the
   write-freq-threshold, a stat() on the files would be required to
   calculate the number of files that need to be demoted to take the
   watermark below the hi-watermark. Finding the number of most recently
   written files to demote could also help make demotions in parallel
   rather than in the sequential manner currently in place.


Update:
The idea of choosing the files wrt file size has been dropped.
Iteratively, the most recently written file will be chosen for eviction
from the hot tier in case of a hi-watermark breach and until the
watermark drops below hi-watermark.
The idea of parallelizing multiple promotions/demotions has been
deferred.

-

Sustained writes creating large files in the hot tier which
cumulatively breach the hi-watermark do NOT seem to be a good
workload for making use of tiering. The assumption is that, to make the
most of the hot tier, the hi-watermark would be closer to 100.

In this case a sustained large file copy might easily breach the
hi-watermark and may even consume the entire hot tier space, resulting
in a ENOSPC.

eg. an example of a sustained write

# cp file1 /mnt/glustervol/dir

Workloads that would seem to make the most of tiering are:
1. Many smaller files, which are created in small bursts of write
   activity and then closed
2. Few large files where updates are in-place and the file size
   does not grow beyond the hi-watermark eg. database, with frequent
   in-line compaction/de-fragmentation policy enabled
3. Frequent reads of few large files, mostly static in size, which
   cumulatively don't breach the hi-watermark. Frequently reading
   a large number of smaller, mostly static, files would be good
   tiering workload candidates as well.




Comments are requested.


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] tiering: emergency demotions

2016-08-10 Thread Milind Changire

Emergency demotions will be required whenever writes breach the
hi-watermark. Emergency demotions are required to avoid ENOSPC in case
of continuous writes that originate on the hot tier.

There are two concerns in this area:

1. enforcing max-cycle-time during emergency demotions
   max-cycle-time is the time the tiering daemon spends in promotions or
   demotions
   I tend to think that the tiering daemon should skip this check for the
   emergency situation and continue demotions until the watermark drops
   below the hi-watermark.

2. file demotion policy
   I tend to think that evicting the largest file with the most recent
   *write* should be chosen for eviction when write-freq-threshold is
   NON-ZERO.
   Choosing a least written file is just going to delay file migration
   of an active file which might consume hot tier disk space resulting
   in a ENOSPC, in the worst case.
   In cases where write-freq-threshold is ZERO, the most recently
   *written* file can be chosen for eviction.
   In the case of choosing the largest file within the
   write-freq-threshold, a stat() on the files would be required to
   calculate the number of files that need to be demoted to take the
   watermark below the hi-watermark. Finding the number of most recently
   written files to demote could also help make demotions in parallel
   rather than in the sequential manner currently in place.

Comments are requested.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] mount_dir value seems clobbered in all /var/lib/glusterd/vols//bricks/: files

2016-08-05 Thread Milind Changire

The bricks are NOT lvm mounted.
The bricks are just directories on the root file-system.

Milind

On 08/05/2016 11:25 AM, Avra Sengupta wrote:

Hi Milind,

Are the bricks lvm mounted bricks? This field is populated for lvm
mounted bricks, and used by them. For regular bricks, which don't have a
mount point, this value is ignored.

Regards,
Avra

On 08/04/2016 07:44 PM, Atin Mukherjee wrote:

glusterd_get_brick_mount_dir () does a brick_dir++  which seems to
cause this problem and removing this line fixes the problem. Commit
f846e54b introduced it.
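
A stand-alone model of that off-by-one (not the actual glusterd code) shows
how the leading 'g' disappears:

/* Illustrative only: one extra pointer increment eats the 'g' in
 * /glustervols/... and yields mount_dir=/lustervols/... */
#include <stdio.h>

int
main(void)
{
    const char *brick_path = "/glustervols/twoXtwo/dir";
    const char *brick_dir = brick_path + 1;   /* "glustervols/twoXtwo/dir" */
    char mount_dir[64];

    brick_dir++;                              /* the suspect ++: now "lustervols/..." */
    snprintf(mount_dir, sizeof(mount_dir), "/%s", brick_dir);
    printf("%s\n", mount_dir);                /* prints /lustervols/twoXtwo/dir */
    return 0;
}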

Ccing Avra/Rajesh

mount_dir is used by snapshot; however, I am just wondering how we are
surviving this case.

~Atin

On Thu, Aug 4, 2016 at 5:39 PM, Milind Changire <mchan...@redhat.com
<mailto:mchan...@redhat.com>> wrote:

here's one of the brick definition files for a volume named "twoXtwo"

[root@f24node0 bricks]# cat f24node1\:-glustervols-twoXtwo-dir
hostname=f24node1
path=/glustervols/twoXtwo/dir
real_path=/glustervols/twoXtwo/dir
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=twoXtwo-client-1
mount_dir=/lustervols/twoXtwo/dir  <-- shouldn't the value be
   /glustervols/...
   there's a missing 'g'
   after the first '/'
snap-status=0


This *should* happen for all volumes and for all such brick definition
files or whatever they are called.
BTW, I'm working with the upstream mainline sources, if that helps.

I'm running a 2x2 distribute-replicate volume.
4 nodes with 1 brick per node.
1 brick for the hot tier for tiering.

As far as I can tell, I haven't done anything fancy with the setup.
And I have confirmed that there is no directory named '/lustervols'
on any of my cluster nodes.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org <mailto:Gluster-devel@gluster.org>
http://www.gluster.org/mailman/listinfo/gluster-devel




--

--Atin



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] regression burn-in summary over the last 7 days

2016-08-04 Thread Milind Changire

On 08/04/2016 05:40 PM, Kaleb KEITHLEY wrote:

On 08/04/2016 08:07 AM, Niels de Vos wrote:

On Thu, Aug 04, 2016 at 12:00:53AM +0200, Niels de Vos wrote:

On Wed, Aug 03, 2016 at 10:30:28AM -0400, Vijay Bellur wrote:


...

./tests/bugs/gfapi/bug-1093594.t ; Failed 1 times
Regression Links:
https://build.gluster.org/job/regression-test-burn-in/1423/consoleFull


I have not seen this fail yet... All gfapi tests are running in a loop
on a test-system now; we'll see if it is reproducible in a few days or so.


It seems that glfs_fini() returns -1 every now and then (once after 1027
iterations, once after 287). Some of the gfapi test cases actually
succeed their intended test, but still return an error when glfs_fini()
fails. I am tempted to just skip this error in most tests and have only
tests/basic/gfapi/libgfapi-fini-hang error out on it. (Obviously also
intend to fix the failure.)


If you fix the bug in glfs_fini() then it should not be necessary to
ignore the failure in the tests, right?

Just fix the bug, don't hack the test.

--

Kaleb




I've faced similar issues with glfs_fini() while working on the bareos
integration. When using a libgfapi built with --enable-debug, an assert
causes the process to dump core.

https://bugzilla.redhat.com/show_bug.cgi?id=1233136 may be worth addressing.

Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] mount_dir value seems clobbered in all /var/lib/glusterd/vols//bricks/: files

2016-08-04 Thread Milind Changire

here's one of the brick definition files for a volume named "twoXtwo"

[root@f24node0 bricks]# cat f24node1\:-glustervols-twoXtwo-dir
hostname=f24node1
path=/glustervols/twoXtwo/dir
real_path=/glustervols/twoXtwo/dir
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=twoXtwo-client-1
mount_dir=/lustervols/twoXtwo/dir  <-- shouldn't the value be
   /glustervols/...
   there's a missing 'g'
   after the first '/'
snap-status=0


This *should* happen for all volumes and for all such brick definition
files or whatever they are called.
BTW, I'm working with the upstream mainline sources, if that helps.

I'm running a 2x2 distribute-replicate volume.
4 nodes with 1 brick per node.
1 brick for the hot tier for tiering.

As far as I can tell, I haven't done anything fancy with the setup.
And I have confirmed that there is no directory named '/lustervols'
on any of my cluster nodes.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tier: breaking down the monolith processing function

2016-07-21 Thread Milind Changire

On 07/21/2016 03:03 AM, Vijay Bellur wrote:

On 07/19/2016 07:54 AM, Milind Changire wrote:

I've attempted to break the tier_migrate_using_query_file() function
into relatively smaller functions. The important one is
tier_migrate_link().




Can tier_migrate_link() be broken down further? Having more than 80-100
LOC in a function does normally look excessive to me.

Thanks,
Vijay



I've broken it [1] down some more ... please take a look.

1. http://review.gluster.org/14957

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tier: breaking down the monolith processing function for glusterfs-3.9.0

2016-07-19 Thread Milind Changire

I'm planning to get this into the upstream 3.9 release.

Milind

On 07/19/2016 05:24 PM, Milind Changire wrote:

I've attempted to break the tier_migrate_using_query_file() function
into relatively smaller functions. The important one is
tier_migrate_link().

Please take a look at http://review.gluster.org/14957 and voice your
opinions.

A prelude to this effort is similar work as part of
http://review.gluster.org/14780


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] tier: breaking down the monolith processing function

2016-07-19 Thread Milind Changire

I've attempted to break the tier_migrate_using_query_file() function
into relatively smaller functions. The important one is
tier_migrate_link().

Please take a look at http://review.gluster.org/14957 and voice your
opinions.

A prelude to this effort is similar work as part of
http://review.gluster.org/14780

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Reduce memcpy in glfs read and write

2016-06-21 Thread Milind Changire

Would https://bugzilla.redhat.com/show_bug.cgi?id=1233136 be related to
Sachin's problem?

Milind

On 06/21/2016 06:28 PM, Pranith Kumar Karampuri wrote:

Hey!!
Hope you are doing good. I took a look at the bt. So when flush
comes write-behind has to flush all the writes down. I see the following
frame hung in iob_unref:
Thread 7 (Thread 0x7fa601a30700 (LWP 16218)):
#0  0x7fa60cc55225 in pthread_spin_lock () from
/lib64/libpthread.so.0 << Does it always hang there?
#1  0x7fa60e1f373e in iobref_unref (iobref=0x19dc7e0) at iobuf.c:907
#2  0x7fa60e246fb2 in args_wipe (args=0x19e70ec) at default-args.c:1593
#3  0x7fa60e1ea534 in call_stub_wipe_args (stub=0x19e709c) at
call-stub.c:2466
#4  0x7fa60e1ea5de in call_stub_destroy (stub=0x19e709c) at
call-stub.c:2482

Is this on top of the master branch? It seems like we missed an unlock of
the spin-lock, or the iobref has a junk value which gives the feeling that
it is in a locked state (maybe a double free?). Do you have any extra
patches in your repo which make changes in iobuf?

On Tue, Jun 21, 2016 at 4:07 AM, Sachin Pandit > wrote:

Hi all,


I bid adieu to you all with the hope of crossing paths again, and the
time has come rather quickly. It feels great to work on GlusterFS
again.



Currently we are trying to write data backed up by Commvault Simpana
to a glusterfs volume (Disperse volume). To improve the performance, I
have implemented the proposal put forward by Rafi K C [1]. I have
some questions regarding libgfapi and the iobuf pool.


To reduce an extra level of copy in glfs read and write, I have
implemented a few APIs to request a buffer (similar to the one
represented in [1]) from the iobuf pool which can be used by the
application to write data to. With this implementation, when I try
to reuse the buffer for consecutive writes, I could see a hang in
syncop_flush of glfs_close (BT of the hang can be found in [2]). I
wanted to know if reusing the buffer is recommended. If not, do we
need to request a buffer for each write?


Setup : Distributed-Disperse ( 4 * (2+1)). Bricks scattered over 3
nodes.


[1]
http://www.gluster.org/pipermail/gluster-devel/2015-February/043966.html

[2] Attached file -  bt.txt


Thanks & Regards,

Sachin Pandit.

***Legal Disclaimer***
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or 
distribution
by others is strictly prohibited. If you have received the message by 
mistake,
please advise the sender by reply email and delete the message. Thank you."
**


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel




--
Pranith


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] FAILED jobs: freebsd-smoke and glusterfs-devrpms

2016-05-27 Thread Milind Changire

These seem to be failing at the same points for the same patch:

http://build.gluster.org/job/freebsd-smoke/15026/
http://build.gluster.org/job/freebsd-smoke/15024/

http://build.gluster.org/job/glusterfs-devrpms/16744/
http://build.gluster.org/job/glusterfs-devrpms/16742/

Any advice for mitigation?

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [release-3.7] smoke build failed

2016-05-13 Thread Milind Changire

Job: https://build.gluster.org/job/smoke/27731/console
Error:  bin/mkdir: cannot create directory 
`/usr/lib/python2.6/site-packages/gluster': Permission denied


Please advise.
Do I just resubmit the job?
Would restarting the VM be of help here?

This is the second time the smoke test has failed for this patch.
First time round, it failed with a different compiler error.
Job: https://build.gluster.org/job/netbsd6-smoke/13646/console

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [release-3.8] Need update on the status of "Glusterfind and Bareos Integration"

2016-05-11 Thread Milind Changire

Looks like all relevant patches to glusterfind and libgfapi have made
it to the release-3.8 branch. Once the official release has been done
it can be communicated to Bareos and they can resume testing against
the release.

Milind

On 05/11/2016 11:46 PM, Niels de Vos wrote:

Hi Milind,

could you reply to this email with a status update of "Glusterfind and
Bareos Integration" that is listed on the roadmap?

   https://www.gluster.org/community/roadmap/3.8/

The last status that is listed is "Implementation ready, needs
communication and testing by Bareos developers". Please pass on any of
the missing details so that they can get added to the roadmap and
release notes so that users (or the Bareos devs?) can start testing.

Thanks,
Niels


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] MKDIR_P and mkdir_p

2016-05-09 Thread Milind Changire

On Mon, May 09, 2016 at 12:02:56PM +0530, Milind Changire wrote:
> Niels, Kaleb,
> With Niels' commit 4ac2ff18db62db192c49affd8591e846c810667a
> reverting Manu's commit 1fbcecb72ef3525823536b640d244d1e5127a37f
> on upstream master, and w.r.t. Kaleb's patch
> http://review.gluster.org/14243
> with Niels' comment to move the lines to the Makefile, it looks
> like the Makefile.am in tools/glusterfind/ is already coded to do
> what Kaleb has proposed in the patch.
>
> If you look at the upstream SPEC file, there's been an %install
> entry added to create a %{_sharedstatedir}/glusterd/glusterfind/.keys
> directory. But looking at tools/glusterfind/Makefile.am, this looks
> like an already solved problem. However, I think this came about because
> MKDIR_P/mkdir_p does not work on some platforms, hence leading to
> the SPEC file kludges.
>
> How do we get out of this MKDIR_P vs mkdir_p issue once and for all
> for all the platforms? This is especially painful during downstream
> packaging.

Current upstream only seems to use mkdir_p in Makefile.am files:

$ git grep MKDIR_P -- '*/Makefile.am'

$ git grep mkdir_p -- '*/Makefile.am'
cli/src/Makefile.am:    $(mkdir_p) $(DESTDIR)$(localstatedir)/run/gluster
extras/Makefile.am: $(mkdir_p) $(DESTDIR)$(tmpfilesdir); \
extras/Makefile.am: $(mkdir_p) $(DESTDIR)$(GLUSTERD_WORKDIR)/groups
extras/init.d/Makefile.am:  $(mkdir_p) $(DESTDIR)$(INIT_DIR); \
extras/init.d/Makefile.am:  $(mkdir_p) $(DESTDIR)$(LAUNCHD_DIR)
extras/systemd/Makefile.am: $(mkdir_p) $(DESTDIR)$(SYSTEMD_DIR); \
tools/glusterfind/Makefile.am:  $(mkdir_p) $(DESTDIR)$(GLUSTERD_WORKDIR)/glusterfind/.keys
tools/glusterfind/Makefile.am:  $(mkdir_p) $(DESTDIR)$(GLUSTERD_WORKDIR)/hooks/1/delete/post/
xlators/mgmt/glusterd/src/Makefile.am:  $(mkdir_p) $(DESTDIR)$(GLUSTERD_WORKDIR)


This might, or might not, be correct. Because you found the commit where
Manu mentioned that mkdir_p is not available on all distributions, you
may want to get his opinion too. Just send the email to the devel list
and we can discuss it there.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [master] FAILED: NetBSD regression for tests/performance/open-behind.t

2016-03-14 Thread Milind Changire
Thanks manu.

--
Milind


- Original Message -
From: "Emmanuel Dreyfus" <m...@netbsd.org>
To: "Milind Changire" <mchan...@redhat.com>, gluster-devel@gluster.org
Sent: Tuesday, March 15, 2016 8:12:09 AM
Subject: Re: [Gluster-devel] [master] FAILED: NetBSD regression for 
tests/performance/open-behind.t

Milind Changire <mchan...@redhat.com> wrote:

> not ok 16 Got "" instead of "hello-this-is-a-test-message1", LINENUM:59
> FAILED COMMAND: hello-this-is-a-test-message1 cat
> /mnt/glusterfs/1/test-file1

I was not able to reproduce it after 350 runs, hence I retriggered it
in Jenkins and it passed. File it under the rare spurious failure category.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] FAILED: NetBSD regression for tests/performance/open-behind.t

2016-03-14 Thread Milind Changire
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/15170/


[06:30:49] Running tests in file ./tests/performance/open-behind.t
tar: Failed open to read/write on 
/build/install/var/log/glusterfs/open-behind.tar (No such file or directory)
tar: Unexpected EOF on archive file
cat: /mnt/glusterfs/1/test-file0: No such file or directory
cat: /mnt/glusterfs/1/test-file1: No such file or directory
tar: Failed open to read/write on 
/build/install/var/log/glusterfs/open-behind.tar (No such file or directory)
tar: Unexpected EOF on archive file
./tests/performance/open-behind.t .. 
1..18
ok 1, LINENUM:8
ok 2, LINENUM:9
ok 3, LINENUM:10
ok 4, LINENUM:12
ok 5, LINENUM:14
ok 6, LINENUM:17
ok 7, LINENUM:19
ok 8, LINENUM:33
ok 9, LINENUM:34
ok 10, LINENUM:40
ok 11, LINENUM:42
ok 12, LINENUM:49
ok 13, LINENUM:51
ok 14, LINENUM:53
ok 15, LINENUM:58
not ok 16 Got "" instead of "hello-this-is-a-test-message1", LINENUM:59
FAILED COMMAND: hello-this-is-a-test-message1 cat /mnt/glusterfs/1/test-file1
ok 17, LINENUM:61
ok 18, LINENUM:64
Failed 1/18 subtests 

Test Summary Report
---
./tests/performance/open-behind.t (Wstat: 0 Tests: 18 Failed: 1)
  Failed test:  16
Files=1, Tests=18, 22 wallclock secs ( 0.03 usr  0.02 sys +  1.51 cusr  1.77 
csys =  3.33 CPU)
Result: FAIL
End of test ./tests/performance/open-behind.t




Please advise.

--
Milind

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] FAILED: freebsd smoke

2016-03-11 Thread Milind Changire

https://build.gluster.org/job/freebsd-smoke/12914/

=


Making install in nsr-server
--- install-recursive ---
Making install in src
--- nsr-cg.c ---
/usr/local/bin/python 
/usr/home/jenkins/root/workspace/freebsd-smoke/xlators/experimental/nsr-server/src/gen-fops.py
 
/usr/home/jenkins/root/workspace/freebsd-smoke/xlators/experimental/nsr-server/src/all-templates.c
 
/usr/home/jenkins/root/workspace/freebsd-smoke/xlators/experimental/nsr-server/src/nsr.c
 > nsr-cg.c
--- nsr-cg.lo ---
  CC   nsr-cg.lo
nsr-cg.c: In function 'nsr_get_changelog_dir':
nsr-cg.c:9666:24: error: 'ENODATA' undeclared (first use in this function)
 return ENODATA;
^
nsr-cg.c:9666:24: note: each undeclared identifier is reported only once for 
each function it appears in
nsr-cg.c: In function 'nsr_get_terms':
nsr-cg.c:9692:20: error: 'ENODATA' undeclared (first use in this function)
 op_errno = ENODATA; /* Most common error after this. */
^
nsr-cg.c: In function 'nsr_open_term':
nsr-cg.c:9872:28: error: 'ENODATA' undeclared (first use in this function)
 op_errno = ENODATA;
^
nsr-cg.c: In function 'nsr_next_entry':
nsr-cg.c:9929:28: error: 'ENODATA' undeclared (first use in this function)
 op_errno = ENODATA;
^
*** [nsr-cg.lo] Error code 1


=

Please advise.

--
Milind

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] FAILED NetBSD regression: quota.t

2016-03-08 Thread Milind Changire
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14776/

==
Running tests in file ./tests/basic/quota.t
[06:08:03] ./tests/basic/quota.t .. 
not ok 75 
not ok 76 
Failed 2/76 subtests 
[06:08:03]

Test Summary Report
---
./tests/basic/quota.t (Wstat: 0 Tests: 76 Failed: 2)
  Failed tests:  75-76
Files=1, Tests=76, 432 wallclock secs ( 0.06 usr  0.01 sys +  6.89 cusr  8.00 
csys = 14.96 CPU)
Result: FAIL
End of test ./tests/basic/quota.t
==

Please advise.

--
Milind

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [master] FAILED: bug-1303028-Rebalance-glusterd-rpc-connection-issue.t

2016-03-08 Thread Milind Changire
oops! how did I miss that :)
https://build.gluster.org/job/rackspace-regression-2GB-triggered/18683/


--
Milind


- Original Message -
From: "Mohammed Rafi K C" <rkavu...@redhat.com>
To: "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Milind Changire" 
<mchan...@redhat.com>
Cc: gluster-devel@gluster.org
Sent: Tuesday, March 8, 2016 1:56:51 PM
Subject: Re: [Gluster-devel] [master] FAILED: 
bug-1303028-Rebalance-glusterd-rpc-connection-issue.t

HI Du,

I will take a look.

Milind,

Can you please provide a link for the failed case.

Rafi

On 03/08/2016 12:59 PM, Raghavendra Gowdappa wrote:
> +rafi.
>
> Rafi, can you have an initial analysis on this?
>
> regards,
> Raghavendra.
>
> - Original Message -
>> From: "Milind Changire" <mchan...@redhat.com>
>> To: gluster-devel@gluster.org
>> Sent: Tuesday, March 8, 2016 12:53:27 PM
>> Subject: [Gluster-devel] [master] FAILED: 
>> bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>>
>> ==
>> Running tests in file
>> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>> [07:27:48]
>> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>> ..
>> not ok 11 Got "1" instead of "0"
>> not ok 14 Got "1" instead of "0"
>> not ok 15 Got "1" instead of "0"
>> not ok 16 Got "1" instead of "0"
>> Failed 4/16 subtests
>> [07:27:48]
>>
>> Test Summary Report
>> ---
>> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>> (Wstat: 0 Tests: 16 Failed: 4)
>>   Failed tests:  11, 14-16
>> Files=1, Tests=16, 23 wallclock secs ( 0.02 usr  0.00 sys +  1.13 cusr  0.39
>> csys =  1.54 CPU)
>> Result: FAIL
>> End of test
>> ./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
>> ==
>>
>> Please advise.
>>
>> --
>> Milind
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [master] FAILED: bug-1303028-Rebalance-glusterd-rpc-connection-issue.t

2016-03-07 Thread Milind Changire
==
Running tests in file 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
[07:27:48] 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t .. 
not ok 11 Got "1" instead of "0"
not ok 14 Got "1" instead of "0"
not ok 15 Got "1" instead of "0"
not ok 16 Got "1" instead of "0"
Failed 4/16 subtests 
[07:27:48]

Test Summary Report
---
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t 
(Wstat: 0 Tests: 16 Failed: 4)
  Failed tests:  11, 14-16
Files=1, Tests=16, 23 wallclock secs ( 0.02 usr  0.00 sys +  1.13 cusr  0.39 
csys =  1.54 CPU)
Result: FAIL
End of test 
./tests/bugs/glusterd/bug-1303028-Rebalance-glusterd-rpc-connection-issue.t
==

Please advise.

--
Milind

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/quota-anon-fd-nfs.t, ./tests/basic/tier/fops-during-migration.t, ./tests/basic/tier/record-metadata-heat.t

2016-02-08 Thread Milind Changire
http://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14096/consoleFull

[11:56:33] ./tests/basic/quota-anon-fd-nfs.t ..
not ok 21
not ok 22
not ok 24
not ok 26
not ok 28
not ok 30
not ok 32
not ok 34
not ok 36
Failed 9/40 subtests



[12:10:07] ./tests/basic/tier/fops-during-migration.t ..
not ok 22
Failed 1/22 subtests



[12:14:30] ./tests/basic/tier/record-metadata-heat.t ..
not ok 16 Got "no" instead of "yes"
Failed 1/18 subtests


Looks like some cores are available as well.


Please advise.


--

Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [FAILED] NetBSD-regression for ./tests/basic/afr/self-heald.t

2016-02-08 Thread Milind Changire
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/14089/consoleFull


[08:44:20] ./tests/basic/afr/self-heald.t ..
not ok 37 Got "0" instead of "1"
not ok 52 Got "0" instead of "1"
not ok 67
Failed 4/83 subtests


Please advise.

--

Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] compare-bug-version-and-git-branch.sh FAILING

2016-01-06 Thread Milind Changire
for patch: http://review.gluster.org/13186
Jenkins failed job:
https://build.gluster.org/job/compare-bug-version-and-git-branch/14201/


I had mistakenly entered a downstream BUG ID for rfc.sh and then later
amended the commit message with the correct mainline BUG ID and resubmitted
via rfc.sh. I also corrected the Topic tag in Gerrit to use the correct
BUG ID.

But the job for this patch is failing even after corrections.

Please advise.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] RENAME syscall semantics

2015-12-11 Thread Milind Changire
Gluster uses changelogs to perform geo-replication. The changelogs record
syscalls, which are forwarded from the master cluster and replayed on the
slave cluster to provide the geo-replication feature.

If two hard-links (h1 and h2) point to the same inode and a Python
statement of os.rename(h1, h2) is executed, then no syscall gets logged to
the changelog, i.e. the syscall never reaches Gluster.

Is this behavior of renaming hard-links pointing to the same inode
guaranteed to NOT reach the file-system-specific code?

I'm repeating myself, but I think an example would help me explain much
better:
Consider the following sequence of syscalls:

CREATE f1   /* create file f1 */
LINK   f1   h1  /* create hard-link h1 pointing to f1 */
RENAME h1   h2  /* rename hard-link h1 to h2 */

All of the above goes well and we have f1 and h2 existing on the master and
slave clusters.

However, if geo-replication is stopped and restarted, then due to clock
synchronization issues between nodes, the last changelog is replayed on the
slave cluster. This replay causes problems during hard-link renames. So,
the previously defined set of syscalls are replayed on the slave:

CREATE f1   /* ignored by Gluster since the same gfid already exists */
LINK   f1   h1  /* h1 created since it does not exist */
RENAME h1   h2  /* silently ignored since it never reaches
 * Gluster since h1 and h2 point to the same inode
 */

So, at the slave cluster, we now have f1, h1 and h2.

The issue now is how to get rid of the extra link h1 that accumulates on
the Gluster file-system. Ideally it shouldn't exist after the changelogs
are replayed.

Can Gluster assume that if the operands to a RENAME syscall point to the
same inode, then the file-system-specific code that handles the rename
syscall will never be invoked?
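
For reference, the no-op behavior is easy to observe from user space. Below
is a minimal sketch (the file names f1/h1/h2 are only illustrative; run it
in a scratch directory on a POSIX file-system):

import os

open("f1", "w").close()      # CREATE f1
os.link("f1", "h1")          # LINK f1 h1
os.link("f1", "h2")          # LINK f1 h2 -- h1 and h2 now share one inode

os.rename("h1", "h2")        # returns success, but is a silent no-op

print(os.path.exists("h1"))                          # True -- h1 still exists
print(os.path.exists("h2"))                          # True
print(os.stat("h1").st_ino == os.stat("h2").st_ino)  # True -- same inode

Both names survive the rename, which matches the observation that the
request never reaches the underlying file-system.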

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] compound fop design first cut

2015-12-11 Thread Milind Changire
On Wed, Dec 9, 2015 at 8:02 PM, Jeff Darcy  wrote:

>
>
>
> On December 9, 2015 at 7:07:06 AM, Ira Cooper (i...@redhat.com) wrote:
> > A simple "abort on failure" and let the higher levels clean it up is
> > probably right for the type of compounding I propose. It is what SMB2
> > does. So, if you get an error return value, cancel the rest of the
> > request, and have it return ECOMPOUND as the errno.
>
> This is exactly the part that worries me.  If a compound operation
> fails, some parts of it will often need to be undone.  “Let the higher
> levels clean it up” means that rollback code will be scattered among all
> of the translators that use compound operations.  Some of them will do
> it right.  Others . . . less so.  ;)  All will have to be tested
> separately.  If we centralize dispatch of compound operations into one
> piece of code, we can centralize error detection and recovery likewise.
> That ensures uniformity of implementation, and facilitates focused
> testing (or even formal proof) of that implementation.
>
> Can we gain the same benefits with a more generic design?  Perhaps.  It
> would require that the compounding translator know how to reverse each
> type of operation, so that it can do so after an error.  That’s
> feasible, though it does mean maintaining a stack of undo actions
> instead of a simple state.  It might also mean testing combinations and
> scenarios that will actually never occur in other components’ usage of
> the compounding feature.  More likely it means that people will *think*
> they can use the facility in unanticipated ways, until their
> unanticipated usage creates a combination or scenario that was never
> tested and doesn’t work.  Those are going to be hard problems to debug.
> I think it’s better to be explicit about which permutations we actually
> expect to work, and have those working earlier.
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



Could we have a dry-run phase and a commit phase for the compound operation?
The dry-run phase could test the validity of the transaction and the
commit phase could actually perform the operation.

If any operation in the dry-run sequence returns an error, the compound
operation can be aborted immediately, without the complexity of an undo,
scattered or centralized.

But if the subsequent operations depend on the changed state of the system
from earlier operations, then we'll have to introduce a system state object
for such transactions ... and maybe serialize such operations. The system
state object can be passed through the operation sequence. How well this
idea would work in a multi-threaded world is not clear to me either.
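
To make the idea concrete, here is a minimal, hypothetical sketch of such a
two-phase executor (Python, purely illustrative; the op objects and their
validate()/apply_to_state()/commit() methods are invented names, not an
existing Gluster API):

class CompoundError(Exception):
    """Some sub-operation failed validation; maps to ECOMPOUND."""

def run_compound(ops, state):
    # Dry-run phase: every op checks whether it could succeed against a
    # simulated view of the system state; later ops see earlier effects.
    for op in ops:
        if not op.validate(state):
            raise CompoundError("dry-run failed for %r" % (op,))
        state = op.apply_to_state(state)

    # Commit phase: reached only if the whole sequence validated.
    for op in ops:
        op.commit()

Note that validating against a snapshot cannot rule out a concurrent change
slipping in between the two phases, which is exactly the multi-threading
concern above.
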
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] what rpm sub-package do /usr/{libexec, sbin}/gfind_missing_files belong to?

2015-11-11 Thread Milind Changire
They are indeed part of the geo-rep sub-package ... the package listing (rpm
-qlp) says so.

But I guess if somebody attempts a server build without geo-replication,
then the gfind_missing_files entries will get appended to the %files ganesha
section, which is defined just above the %files geo-replication section.
This needs to be corrected by moving the gfind_missing_files entries inside
the %{!?_without_georeplication:1} conditional under the %files
geo-replication section.

Need to get gfind_missing_files package ownership confirmed from Aravinda.



On Mon, Nov 9, 2015 at 8:03 PM, Kaleb S. KEITHLEY 
wrote:

>
> the in-tree glusterfs.spec(.in) has them immediately following the
> geo-rep sub-package, but outside the %if ... %endif.
>
> Are they part of geo-rep? Or something else?
>
> --
>
> Kaleb
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] RHEL-5 Client build failed

2015-10-16 Thread Milind Changire
The following commit to the release-3.7 branch causes the RHEL-5 client
build to fail because the header it includes isn't available on RHEL-5:

ca5b466d rpc/rpc-transport/socket/src/socket.h  (Emmanuel
Dreyfus   2015-07-30 14:02:43 +0200  22) #include 


This commit is also not available in upstream master yet.

Link to failed build:
RHGS-3.1.2-CLIENT-RHEL-5:
http://brewweb.devel.redhat.com/brew/taskinfo?taskID=9962404


Looks like we need to upgrade RHEL-5 with the latest OpenSSL headers and
libraries.
How else do we fix this?

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] TEST FAILED ./tests/basic/mount-nfs-auth.t

2015-10-09 Thread Milind Changire
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/10776/consoleFull


says


[06:18:00] ./tests/basic/mount-nfs-auth.t ..
not ok 62 Got "N" instead of "Y"
not ok 64 Got "N" instead of "Y"
not ok 65 Got "N" instead of "Y"
not ok 67 Got "N" instead of "Y"
Failed 4/87 subtests


Please advise.


--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] BUILD FAILED while copying log tarball

2015-10-08 Thread Milind Changire
https://build.gluster.org/job/rackspace-regression-2GB-triggered/14782/consoleFull

says

Going to copy log tarball for processing on http://elk.cloud.gluster.org/
scp: 
/srv/jenkins-logs/upload/jenkins-rackspace-regression-2GB-triggered-14782.tgz:
No space left on device
Build step 'Execute shell' marked build as failure
Finished: FAILURE


Please advise.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [FAILED] patch verification builds for release-3.7 branch

2015-08-05 Thread Milind Changire
http://build.gluster.org/job/compare-bug-version-and-git-branch/10799/

http://build.gluster.org/job/rackspace-regression-2GB-triggered/13106/consoleFull

Please advise.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [FAILED] regression tests: tests/bugs/distribute/bug-1066798.t, tests/basic/volume-snapshot.t

2015-07-20 Thread Milind Changire
http://build.gluster.org/job/rackspace-regression-2GB-triggered/12541/consoleFull

http://build.gluster.org/job/rackspace-regression-2GB-triggered/12499/consoleFull


Please advise.

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [FAILED] /opt/qa/tools/posix-compliance/tests/chmod/00.t

2015-06-09 Thread Milind Changire
Job Console Output: http://build.gluster.org/job/smoke/18470/console

My patch is Python code and does not change the behavior of Gluster
internals. This test failure doesn't seem to be directly associated with
my patch.

Please look into the issue.

-

Test Summary Report
---
/opt/qa/tools/posix-compliance/tests/chmod/00.t   (Wstat: 0 Tests: 58 Failed: 1)
  Failed test:  43
Files=191, Tests=1960, 110 wallclock secs ( 0.70 usr  0.27 sys +  4.57
cusr  1.63 csys =  7.17 CPU)
Result: FAIL
+ finish
+ RET=1
+ '[' 1 -ne 0 ']'
++ date +%Y%m%d%T
+ filename=/d/logs/smoke/glusterfs-logs-2015060906:40:27.tgz
+ tar -czf /d/logs/smoke/glusterfs-logs-2015060906:40:27.tgz
/build/install/var/log
tar: Removing leading `/' from member names
tar: tar (child): /d/logs/smoke/glusterfs-logs-2015060906\:40\:27.tgz:
Cannot open/build/install/var/log: Cannot stat: No such file or
directory
: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
smoke.sh returned 2
Build step 'Execute shell' marked build as failure
Finished: FAILURE

-

Regards,
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] [FAILED] tests/bugs/glusterd/bug-857330/xml.t

2015-06-02 Thread Milind Changire
Please see
http://build.gluster.org/job/rackspace-regression-2GB-triggered/9994/consoleFull
for details

Kindly advise regarding resolution

--
Milind
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel