Re: [Gluster-devel] Readdir plus implementation in tier xlator

2016-04-22 Thread Raghavendra Gowdappa


- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Vijay Bellur" 
> Cc: "Gluster Devel" 
> Sent: Friday, April 22, 2016 11:34:35 PM
> Subject: Re: [Gluster-devel] Readdir plus implementation in tier xlator
> 
> 
> 
> - Original Message -
> > From: "Vijay Bellur" 
> > To: "Mohammed Rafi K C" 
> > Cc: "Gluster Devel" 
> > Sent: Friday, April 22, 2016 9:41:34 AM
> > Subject: Re: [Gluster-devel] Readdir plus implementation in tier xlator
> > 
> > On Mon, Apr 18, 2016 at 3:28 AM, Mohammed Rafi K C 
> > wrote:
> > >
> > > Hi All,
> > >
> > > Currently we are experiencing some issues with the implementation of
> > > readdirp in data tiering.
> > >
> > > Problem statement:
> > >
> > > When we do a readdirp, tiering reads entries only from cold tier. Since
> > > the hashed subvol for all files has been set as cold tier by default we
> > > will have all the files in cold tier. Some of them will be data files
> > > and the remaining will be pointer files (T files), which point to the original
> > > file in hot tier. The motivation behind this implementation was to
> > > increase the performance of readdir by only looking up entries in one
> > > tier. Also we ran into an issue where some files were not listed while
> > > using the default dht_readdirp. This is because dht_readdir reads
> > > entries from each subvol sequentially. Since tiering migrates files
> > > frequently this led to an issue where if a file was migrated off a
> > > subvol before the readdir got to it, but after the readdir had processed
> > > the target subvol, it would not show up in the listing [1].
> 
> IIRC, files missing from the directory listing, rather than performance, was
> the primary motivation for the existing implementation of tier_readdirp. With
> the cold tier being some sort of MDS (at least for all dentries of a directory),
> we don't have to address a lot of the complexity that comes with frequent
> migration of files across subvols.
> 
> > >
> > > So for the files residing in the hot tier we will fall back to readdir, i.e.,
> > > we won't give stat for such entries to the application. This is because the
> > > corresponding pointer file in the cold tier won't have a proper stat.
> > > So we forced fuse clients to do an explicit lookup/stat for such entries
> > > by setting the nodeid as null. Similarly, in the case of native NFS, we marked
> > > such entries as having stale stat by setting attributes_follow = FALSE.
> > >
> > 
> > Is the explicit lookup done by the kernel fuse module or is it done in
> > our bridge layer?
> > 
> > Also does md-cache handle the case where nodeid is NULL in a readdirp
> > response?
> > 
> > 
> > 
> > > But the problem comes when we use gf_api, where we don't have any
> > > control over client behavior. So to fix this issue we have to give stat
> > > information for all the entries.
> > >
> > 
> > Apart from Samba, what other consumers of gfapi have this problem?
> > 
> > 
> > > Possible solutions:
> > > 1. Revert the tier_readdirp to something similar to dht_readdirp, then
> > > fix problem in [1].
> > > 2. Have the tier readdirp do a lookup for every linkfile entry it finds
> > > and populate the data (which would cause a performance drop). This would
> > > mean that other translators do not need to be aware of the tier
> > > behaviour.
> > > 3. Do some sort of batched lookup in the tier readdirp layer to improve
> > > the performance.
> > >
> > > Both 2 and 3 won't give any performance benefit, but they solve the problem
> > > in [1]. In fact, even this is not complete, because by the time we do the
> > > lookup (batched or single) the file could have moved off the hot tier or
> > > vice versa, which will again result in stale data.
> 
> Doesn't this mean lookup (irrespective of whether it's done independently or
> as part of readdirp) in tier (for that matter, dht_lookup during rebalance) is
> broken? The file is present in the volume, but lookup returns ENOENT.
> Probably we should think about ways of fixing that. I cannot think of a
> solution right now as not finding a data file even after finding a linkto
> file is a valid scenario (imagine a lookup racing with unlink). But
> nevertheless, this is something that needs to be fixed.

Note that at any instant of time, there are three possibilities:
1. A data file is guaranteed to be present on either the hot or the cold tier - a
condition which holds for most of the lifetime of a file.
2. Two data files are present - one each on the hot and cold tier. This is a minor
race window at the end of migration, but in this window both files are equal in
all respects (at least in terms of the major attributes of the iatt), so that's
not an issue.
3. The data file is not present - somebody unlinked it.

So, for 1 and 2, if we do the lookup "simultaneously" on both the hot and cold
tiers using only the gfid we got from reading the entry from the cold tier, we
are guaranteed to find at least one data file; only in case 3 do both lookups
fail, and returning ENOENT is then genuine.
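
That decision logic can be sketched as a small, self-contained C program. It is
purely illustrative - resolve_tier_lookup() and its inputs are made-up names, not
GlusterFS code - but it shows why a gfid-based lookup issued in parallel on both
tiers only reports ENOENT when the file was genuinely unlinked:

#include <stdio.h>
#include <stdbool.h>

/* hot_found/cold_found stand in for the results of the two gfid-based
 * lookups issued "simultaneously" on the hot and the cold tier. */
static const char *
resolve_tier_lookup (bool hot_found, bool cold_found)
{
        if (hot_found && cold_found)
                return "case 2: end-of-migration race, both copies carry equivalent iatts";
        if (hot_found || cold_found)
                return "case 1: serve the iatt from whichever tier holds the data file";
        return "case 3: no data file on either tier, ENOENT is genuine (unlinked)";
}

int
main (void)
{
        printf ("%s\n", resolve_tier_lookup (true, false));   /* steady state */
        printf ("%s\n", resolve_tier_lookup (true, true));    /* migration window */
        printf ("%s\n", resolve_tier_lookup (false, false));  /* unlinked */
        return 0;
}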

Re: [Gluster-devel] Readdir plus implementation in tier xlator

2016-04-22 Thread Niels de Vos
On Fri, Apr 22, 2016 at 11:16:48AM +0530, Mohammed Rafi K C wrote:
...
> >> But the problem comes when we use gf_api, where we don't have any
> >> control over client behavior. So to fix this issue we have to give stat
> >> information for all the entries.
> >>
> > Apart from Samba, what other consumers of gfapi have this problem?
> 
> In nfs-ganesha, what I understand is that they are not sending readdirp, so
> there we are good. But any other app which always expects a valid
> response from readdirp will fail.

glusterfs-coreutils uses glfs_readdirplus() too. I do not know if it is
a problem there, but you should be able to check that pretty easily.
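
For reference, a libgfapi directory walk of the kind glusterfs-coreutils does
looks roughly like the sketch below; the volume name, server and include path are
placeholders, and glfs_readdirplus() is assumed to follow the signature in the
public glfs.h. The stat buffer is the whole point of this thread: if readdirp does
not return a valid iatt for hot-tier entries, the sizes printed here are
meaningless unless the application stats each entry itself.

#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>
#include <glusterfs/api/glfs.h>   /* include path may differ per install */

int
main (void)
{
        struct stat st;
        struct dirent *d = NULL;

        glfs_t *fs = glfs_new ("testvol");                     /* placeholder */
        glfs_set_volfile_server (fs, "tcp", "server1", 24007); /* placeholder */
        if (glfs_init (fs) != 0)
                return 1;

        glfs_fd_t *fd = glfs_opendir (fs, "/");
        if (!fd) {
                glfs_fini (fs);
                return 1;
        }

        /* readdirplus hands back the dirent and fills 'st' in one call;
         * a consumer like this has no way to tell that the iatt was skipped. */
        while ((d = glfs_readdirplus (fd, &st)) != NULL) {
                printf ("%-32s size=%lld mode=%o\n", d->d_name,
                        (long long) st.st_size, (unsigned int) st.st_mode);
        }

        glfs_closedir (fd);
        glfs_fini (fs);
        return 0;
}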

I do not know what other libgfapi applications do, and I've not seen the
sources of all of the applications in any case.

Niels



Re: [Gluster-devel] Readdir plus implementation in tier xlator

2016-04-22 Thread Raghavendra Gowdappa


- Original Message -
> From: "Vijay Bellur" 
> To: "Mohammed Rafi K C" 
> Cc: "Gluster Devel" 
> Sent: Friday, April 22, 2016 9:41:34 AM
> Subject: Re: [Gluster-devel] Readdir plus implementation in tier xlator
> 
> On Mon, Apr 18, 2016 at 3:28 AM, Mohammed Rafi K C 
> wrote:
> >
> > Hi All,
> >
> > Currently we are experiencing some issues with the implementation of
> > readdirp in data tiering.
> >
> > Problem statement:
> >
> > When we do a readdirp, tiering reads entries only from cold tier. Since
> > the hashed subvol for all files has been set as cold tier by default we
> > will have all the files in cold tier. Some of them will be data files
> > and the remaining will be pointer files (T files), which point to the original
> > file in hot tier. The motivation behind this implementation was to
> > increase the performance of readdir by only looking up entries in one
> > tier. Also we ran into an issue where some files were not listed while
> > using the default dht_readdirp. This is because dht_readdir reads
> > entries from each subvol sequentially. Since tiering migrates files
> > frequently this led to an issue where if a file was migrated off a
> > subvol before the readdir got to it, but after the readdir had processed
> > the target subvol, it would not show up in the listing [1].

IIRC, files missing from the directory listing, rather than performance, was the
primary motivation for the existing implementation of tier_readdirp. With the
cold tier being some sort of MDS (at least for all dentries of a directory), we
don't have to address a lot of the complexity that comes with frequent migration
of files across subvols.

> >
> > So for the files residing in the hot tier we will fall back to readdir, i.e.,
> > we won't give stat for such entries to the application. This is because the
> > corresponding pointer file in the cold tier won't have a proper stat.
> > So we forced fuse clients to do an explicit lookup/stat for such entries
> > by setting the nodeid as null. Similarly, in the case of native NFS, we marked
> > such entries as having stale stat by setting attributes_follow = FALSE.
> >
> 
> Is the explicit lookup done by the kernel fuse module or is it done in
> our bridge layer?
> 
> Also does md-cache handle the case where nodeid is NULL in a readdirp
> response?
> 
> 
> 
> > But the problem comes when we use gf_api, where we don't have any
> > control over client behavior. So to fix this issue we have to give stat
> > information for all the entries.
> >
> 
> Apart from Samba, what other consumers of gfapi have this problem?
> 
> 
> > Possible solutions:
> > 1. Revert the tier_readdirp to something similar to dht_readdirp, then
> > fix problem in [1].
> > 2. Have the tier readdirp do a lookup for every linkfile entry it finds
> > and populate the data (which would cause a performance drop). This would
> > mean that other translators do not need to be aware of the tier behaviour.
> > 3. Do some sort of batched lookup in the tier readdirp layer to improve
> > the performance.
> >
> > Both 2 and 3 won't give any performance benefit, but they solve the problem
> > in [1]. In fact, even this is not complete, because by the time we do the
> > lookup (batched or single) the file could have moved off the hot tier or
> > vice versa, which will again result in stale data.

Doesn't this mean lookup (irrespective of whether it's done independently or as
part of readdirp) in tier (for that matter, dht_lookup during rebalance) is
broken? The file is present in the volume, but lookup returns ENOENT. Probably 
we should think about ways of fixing that. I cannot think of a solution right 
now as not finding a data file even after finding a linkto file is a valid 
scenario (imagine a lookup racing with unlink). But nevertheless, this is 
something that needs to be fixed.

> >
> 
> Isn't this problem common with any of the solutions? Since tiering
> keeps moving data without any of the clients being aware, any
> attribute cache in the client stack can quickly go stale.
> 
> 
> > 4. Revert to dht_readdirp and then instead of taking all entries from
> > hot tier, just take only entries which has T file in cold tier.

I thought that with the existing model of the cold tier being the hashed subvol
for all files, the hot tier will only have data files, with linkto files being
present on the cold tier. Am I missing anything here?
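
For context on what a "T file" is on disk: on the brick backend it is a zero-byte
file with the sticky bit set, carrying a trusted.glusterfs.dht.linkto xattr that
names the subvolume holding the actual data. A small sketch of how one could
recognise it (run as root against a brick backend path; the path handling and
program shape are illustrative only, not GlusterFS code):

#include <stdio.h>
#include <sys/stat.h>
#include <sys/xattr.h>

/* Returns 1 if the file at brick_path looks like a DHT linkto ("T") file:
 * zero bytes, sticky bit set, and a trusted.glusterfs.dht.linkto xattr
 * naming the subvolume that holds the data. */
static int
is_dht_linkto (const char *brick_path)
{
        struct stat st;
        char target[256] = {0};

        if (lstat (brick_path, &st) != 0)
                return 0;
        if (!S_ISREG (st.st_mode) || st.st_size != 0 || !(st.st_mode & S_ISVTX))
                return 0;
        if (lgetxattr (brick_path, "trusted.glusterfs.dht.linkto",
                       target, sizeof (target) - 1) <= 0)
                return 0;

        printf ("%s -> data lives on subvol %s\n", brick_path, target);
        return 1;
}

int
main (int argc, char *argv[])
{
        return (argc > 1 && is_dht_linkto (argv[1])) ? 0 : 1;
}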

> > (We can
> > delay deleting the data file after demotion, so that we will get the stat
> > from the hot tier)
> >
> 
> Going by the architectural model of xlators, tier should provide the
> right entries with attributes to the upper layers (xlators/vfs etc.).
> Relying on a specific behavior from layers above us to mask a problem
> in our layer does not seem ideal.  I would go with something like 2 or
> 3.  If we want to retain the current behavior, we should make it
> conditional as I am not certain that this behavior is foolproof too.
> 
> Thanks,
> Vijay
> 

Re: [Gluster-devel] Readdir plus implementation in tier xlator

2016-04-22 Thread Vijay Bellur
On Fri, Apr 22, 2016 at 1:46 AM, Mohammed Rafi K C  wrote:
> comments are inline.
>
> On 04/22/2016 09:41 AM, Vijay Bellur wrote:
>> On Mon, Apr 18, 2016 at 3:28 AM, Mohammed Rafi K C  
>> wrote:
>>> But the problem comes when we use gf_api, where we don't have any
>>> control over client behavior. So to fix this issue we have to give stat
>>> information for all the entries.
>>>
>> Apart from Samba, what other consumers of gfapi have this problem?
>
> In nfs-ganesha, what I understand is that they are not sending readdirp, so
> there we are good. But any other app which always expects a valid
> response from readdirp will fail.


For such consumers that need strict readdirplus from tiering, we can
make this behavior optional. Exposing a tunable that can either be set
by administrators on the volume that the consumer acts on, or set by
the application itself via glfs_set_xlator_option(), would be nice.
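
A rough sketch of what that could look like from the application side. The xlator
pattern "*-tier" and the option name "strict-readdirp" below are hypothetical;
glfs_set_xlator_option() itself is the existing libgfapi call being referred to,
and it has to be issued before glfs_init() so the option reaches the client graph:

#include <glusterfs/api/glfs.h>   /* include path may differ per install */

int
open_volume_with_strict_readdirp (const char *volname, const char *host)
{
        glfs_t *fs = glfs_new (volname);
        if (!fs)
                return -1;

        glfs_set_volfile_server (fs, "tcp", host, 24007);

        /* Hypothetical xlator pattern and option key, shown only to
         * illustrate the mechanism; must be set before glfs_init(). */
        glfs_set_xlator_option (fs, "*-tier", "strict-readdirp", "on");

        return glfs_init (fs);
}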


>>
>>
>>> 4. Revert to dht_readdirp and then instead of taking all entries from
>>> hot tier, just take only entries which has T file in cold tier. (We can
>>> delay deleting of data file after demotion, so that we will get the stat
>>> from hot tier)
>>>
>> Going by the architectural model of xlators, tier should provide the
>> right entries with attributes to the upper layers (xlators/vfs etc.).
>> Relying on a specific behavior from layers above us to mask a problem
>> in our layer does not seem ideal.  I would go with something like 2 or
>> 3.  If we want to retain the current behavior, we should make it
>> conditional as I am not certain that this behavior is foolproof too.
>
> If we make the changes in tier_readdirp, then it affects the performance
> of plain readdir (if md-cache is on). We may need to turn off the volume
> option "performance.force-readdirp". What do you think here?
>

If we make the behavior optional as I describe above, then tier would
not have to impact readdir/readdirplus performance for accesses
through fuse/NFS etc., and the current behavior can remain.

Regards,
Vijay


[Gluster-devel] WORM/Retention Feature - 22/04/2016

2016-04-22 Thread Karthik Subrahmanya
Hi all,

This week's status:

-Tested the program with different modes of retention
 and by setting different values for the retention profile
-Uploaded the test case (worm.t) with the patch
-Updated the WORM design-specs
-Wrote blogs about the Semantics of WORM/Retention and WORM on Gluster


Plan for next week:

-Exploring distaf and writing tests
-Writing blogs on the implementation of the WORM/Retention feature
 on Gluster and how to use the feature


Current work:

POC: http://review.gluster.org/#/c/13429/
Spec: http://review.gluster.org/13538
Feature page: 
http://www.gluster.org/community/documentation/index.php/Features/gluster_compliance_archive
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1326308
Blog: http://uskarthik.blogspot.in/

Your valuable suggestions, reviews, and wish lists are most welcome

Thanks & Regards,
Karthik Subrahmanya


Re: [Gluster-devel] [Gluster-infra] freebsd-smoke failures

2016-04-22 Thread Polakis Vaggelis
Thanks for the support!

By the way, the same problem also occurs in review
http://review.gluster.org/#/c/14045/

br, vangelis

On Tue, Apr 19, 2016 at 5:01 PM, Michael Scherer  wrote:
> On Tuesday, 19 April 2016 at 09:58 -0400, Jeff Darcy wrote:
>> > So can a workable solution be pushed to git? Because I plan to force the
>> > checkout to match git, and it will break again (and this time, no
>> > workaround will be possible).
>> >
>>
>> It has been pushed to git, but AFAICT pull requests for that repo go into
>> a black hole.
>
> Indeed, I even made the same PR twice:
> https://github.com/gluster/glusterfs-patch-acceptance-tests/pull/12
>
> I guess no one is officially taking ownership of it, which is kinda bad,
> but can be solved.
>
> No volunteers?
>
> --
> Michael Scherer
> Sysadmin, Community Infrastructure and Platform, OSAS
>
>
>

Re: [Gluster-devel] Core generated by trash.t

2016-04-22 Thread Pranith Kumar Karampuri
+Krutika

- Original Message -
> From: "Anoop C S" 
> To: "Atin Mukherjee" 
> Cc: "Pranith Kumar Karampuri" , "Ravishankar N" 
> , "Anuradha Talur"
> , gluster-devel@gluster.org
> Sent: Friday, April 22, 2016 2:14:28 PM
> Subject: Re: [Gluster-devel] Core generated by trash.t
> 
> On Wed, 2016-04-20 at 16:24 +0530, Atin Mukherjee wrote:
> > I should have said the regression link is irrelevant here. Try
> > running
> > this test on your local setup multiple times on mainline. I do
> > believe
> > you should see the crash.
> > 
> 
> I could see a coredump on running trash.t multiple times in a while loop.
> Info from the coredump:
> 
> Core was generated by `/usr/local/sbin/glusterfs -s localhost --
> volfile-id gluster/glustershd -p /var/'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0040bd31 in glusterfs_handle_translator_op
> (req=0x7feab8001dec) at glusterfsd-mgmt.c:590
> 590   any = active->first;
> [Current thread is 1 (Thread 0x7feac1657700 (LWP 12050))]
> (gdb) l
> 585   goto out;
> 586   }
> 587
> 588   ctx = glusterfsd_ctx;
> 589   active = ctx->active;
> 590   any = active->first;
> 591   input = dict_new ();
> 592   ret = dict_unserialize (xlator_req.input.input_val,
> 593   xlator_req.input.input_len,
> 594   );
> (gdb) p ctx
> $1 = (glusterfs_ctx_t *) 0x7fa010
> (gdb) p ctx->active
> $2 = (glusterfs_graph_t *) 0x0

I think this is because the request came to shd even before the graph is 
initialized? Thanks for the test case. I will take a look at this.
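
If that is indeed the case, one possible shape of a fix (a sketch only, not a
tested patch) is to fail the brick-op gracefully while ctx->active is still NULL,
around the code shown at glusterfsd-mgmt.c:590 in the backtrace:

        ctx = glusterfsd_ctx;
        active = ctx->active;
        if (!active) {
                /* graph not yet initialized - e.g. glustershd received a
                 * brick-op such as 'volume heal' too early - so bail out
                 * instead of dereferencing NULL at active->first */
                ret = -1;
                goto out;
        }
        any = active->first;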

Pranith
> (gdb) p *req
> $1 = {trans = 0x7feab8000e20, svc = 0x83ca50, prog = 0x874810, xid = 1,
> prognum = 4867634, progver = 2, procnum = 3, type = 0, uid = 0, gid =
> 0, pid = 0, lk_owner = {len = 4,
> data = '\000' }, gfs_id = 0, auxgids =
> 0x7feab800223c, auxgidsmall = {0 }, auxgidlarge =
> 0x0, auxgidcount = 0, msg = {{iov_base = 0x7feacc253840,
>   iov_len = 488}, {iov_base = 0x0, iov_len = 0}  times>}, count = 1, iobref = 0x7feab8000c40, rpc_status = 0, rpc_err =
> 0, auth_err = 0, txlist = {next = 0x7feab800256c,
> prev = 0x7feab800256c}, payloadsize = 0, cred = {flavour = 390039,
> datalen = 24, authdata = '\000' , "\004", '\000'
> }, verf = {flavour = 0,
> datalen = 0, authdata = '\000' }, synctask =
> _gf_true, private = 0x0, trans_private = 0x0, hdr_iobuf = 0x82b038,
> reply = 0x0}
> (gdb) p req->procnum
> $3 = 3 <== GLUSTERD_BRICK_XLATOR_OP
> (gdb) t a a bt
> 
> Thread 6 (Thread 0x7feabf178700 (LWP 12055)):
> #0  0x7feaca522043 in epoll_wait () at ../sysdeps/unix/syscall-
> template.S:84
> #1  0x7feacbe5076f in event_dispatch_epoll_worker (data=0x878130)
> at event-epoll.c:664
> #2  0x7feacac4560a in start_thread (arg=0x7feabf178700) at
> pthread_create.c:334
> #3  0x7feaca521a4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> 
> Thread 5 (Thread 0x7feac2659700 (LWP 12048)):
> #0  do_sigwait (sig=0x7feac2658e3c, set=) at
> ../sysdeps/unix/sysv/linux/sigwait.c:64
> #1  __sigwait (set=, sig=0x7feac2658e3c) at
> ../sysdeps/unix/sysv/linux/sigwait.c:96
> #2  0x00409895 in glusterfs_sigwaiter (arg=0x7ffe3debbf00) at
> glusterfsd.c:2032
> #3  0x7feacac4560a in start_thread (arg=0x7feac2659700) at
> pthread_create.c:334
> #4  0x7feaca521a4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> 
> Thread 4 (Thread 0x7feacc2b4780 (LWP 12046)):
> #0  0x7feacac466ad in pthread_join (threadid=140646205064960,
> thread_return=0x0) at pthread_join.c:90
> #1  0x7feacbe509bb in event_dispatch_epoll (event_pool=0x830b80) at
> event-epoll.c:758
> #2  0x7feacbe17a91 in event_dispatch (event_pool=0x830b80) at
> event.c:124
> #3  0x0040a3c8 in main (argc=13, argv=0x7ffe3debd0f8) at
> glusterfsd.c:2376
> 
> Thread 3 (Thread 0x7feac2e5a700 (LWP 12047)):
> #0  0x7feacac4e27d in nanosleep () at ../sysdeps/unix/syscall-
> template.S:84
> #1  0x7feacbdfc152 in gf_timer_proc (ctx=0x7fa010) at timer.c:188
> #2  0x7feacac4560a in start_thread (arg=0x7feac2e5a700) at
> pthread_create.c:334
> #3  0x7feaca521a4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> 
> Thread 2 (Thread 0x7feac1e58700 (LWP 12049)):
> #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
> #1  0x7feacbe2d73d in syncenv_task (proc=0x838310) at syncop.c:603
> #2  0x7feacbe2d9dd in syncenv_processor (thdata=0x838310) at
> syncop.c:695
> #3  0x7feacac4560a in start_thread (arg=0x7feac1e58700) at
> pthread_create.c:334
> #4  0x7feaca521a4d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
> 
> Thread 1 (Thread 0x7feac1657700 (LWP 12050)):
> #0  0x0040bd31 in glusterfs_handle_translator_op
> 

Re: [Gluster-devel] Core generated by trash.t

2016-04-22 Thread Anoop C S
On Wed, 2016-04-20 at 16:24 +0530, Atin Mukherjee wrote:
> I should have said the regression link is irrelevant here. Try
> running
> this test on your local setup multiple times on mainline. I do
> believe
> you should see the crash.
> 

I could see a coredump on running trash.t multiple times in a while loop.
Info from the coredump:

Core was generated by `/usr/local/sbin/glusterfs -s localhost --
volfile-id gluster/glustershd -p /var/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0040bd31 in glusterfs_handle_translator_op
(req=0x7feab8001dec) at glusterfsd-mgmt.c:590
590 any = active->first;
[Current thread is 1 (Thread 0x7feac1657700 (LWP 12050))]
(gdb) l
585 goto out;
586 }
587 
588 ctx = glusterfsd_ctx;
589 active = ctx->active;
590 any = active->first;
591 input = dict_new ();
592 ret = dict_unserialize (xlator_req.input.input_val,
593 xlator_req.input.input_len,
594 );
(gdb) p ctx
$1 = (glusterfs_ctx_t *) 0x7fa010
(gdb) p ctx->active
$2 = (glusterfs_graph_t *) 0x0
(gdb) p *req
$1 = {trans = 0x7feab8000e20, svc = 0x83ca50, prog = 0x874810, xid = 1,
prognum = 4867634, progver = 2, procnum = 3, type = 0, uid = 0, gid =
0, pid = 0, lk_owner = {len = 4, 
data = '\000' }, gfs_id = 0, auxgids =
0x7feab800223c, auxgidsmall = {0 }, auxgidlarge =
0x0, auxgidcount = 0, msg = {{iov_base = 0x7feacc253840, 
  iov_len = 488}, {iov_base = 0x0, iov_len = 0} }, count = 1, iobref = 0x7feab8000c40, rpc_status = 0, rpc_err =
0, auth_err = 0, txlist = {next = 0x7feab800256c, 
prev = 0x7feab800256c}, payloadsize = 0, cred = {flavour = 390039,
datalen = 24, authdata = '\000' , "\004", '\000'
}, verf = {flavour = 0, 
datalen = 0, authdata = '\000' }, synctask =
_gf_true, private = 0x0, trans_private = 0x0, hdr_iobuf = 0x82b038,
reply = 0x0}
(gdb) p req->procnum
$3 = 3 <== GLUSTERD_BRICK_XLATOR_OP
(gdb) t a a bt

Thread 6 (Thread 0x7feabf178700 (LWP 12055)):
#0  0x7feaca522043 in epoll_wait () at ../sysdeps/unix/syscall-
template.S:84
#1  0x7feacbe5076f in event_dispatch_epoll_worker (data=0x878130)
at event-epoll.c:664
#2  0x7feacac4560a in start_thread (arg=0x7feabf178700) at
pthread_create.c:334
#3  0x7feaca521a4d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 5 (Thread 0x7feac2659700 (LWP 12048)):
#0  do_sigwait (sig=0x7feac2658e3c, set=) at
../sysdeps/unix/sysv/linux/sigwait.c:64
#1  __sigwait (set=, sig=0x7feac2658e3c) at
../sysdeps/unix/sysv/linux/sigwait.c:96
#2  0x00409895 in glusterfs_sigwaiter (arg=0x7ffe3debbf00) at
glusterfsd.c:2032
#3  0x7feacac4560a in start_thread (arg=0x7feac2659700) at
pthread_create.c:334
#4  0x7feaca521a4d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 4 (Thread 0x7feacc2b4780 (LWP 12046)):
#0  0x7feacac466ad in pthread_join (threadid=140646205064960,
thread_return=0x0) at pthread_join.c:90
#1  0x7feacbe509bb in event_dispatch_epoll (event_pool=0x830b80) at
event-epoll.c:758
#2  0x7feacbe17a91 in event_dispatch (event_pool=0x830b80) at
event.c:124
#3  0x0040a3c8 in main (argc=13, argv=0x7ffe3debd0f8) at
glusterfsd.c:2376

Thread 3 (Thread 0x7feac2e5a700 (LWP 12047)):
#0  0x7feacac4e27d in nanosleep () at ../sysdeps/unix/syscall-
template.S:84
#1  0x7feacbdfc152 in gf_timer_proc (ctx=0x7fa010) at timer.c:188
#2  0x7feacac4560a in start_thread (arg=0x7feac2e5a700) at
pthread_create.c:334
#3  0x7feaca521a4d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7feac1e58700 (LWP 12049)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x7feacbe2d73d in syncenv_task (proc=0x838310) at syncop.c:603
#2  0x7feacbe2d9dd in syncenv_processor (thdata=0x838310) at
syncop.c:695
#3  0x7feacac4560a in start_thread (arg=0x7feac1e58700) at
pthread_create.c:334
#4  0x7feaca521a4d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7feac1657700 (LWP 12050)):
#0  0x0040bd31 in glusterfs_handle_translator_op
(req=0x7feab8001dec) at glusterfsd-mgmt.c:590
#1  0x7feacbe2cf04 in synctask_wrap (old_task=0x7feab80031c0) at
syncop.c:375
#2  0x7feaca467f30 in ?? () from /lib64/libc.so.6
#3  0x in ?? ()

Looking at the core, the crash was seen in the
glusterfs_handle_translator_op() routine while doing a 'volume heal'
command. I could then easily create a small test case to reproduce the
issue. Please find the attachment for the same.

--Anoop C S.


core-reprod.t
Description: Perl program