[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-972:
------------------------------------

    Fix Version/s: 3.4.0

this should go into 3.3.3 and 3.4.

Any volunteers for fixing this?

> perl Net::ZooKeeper segfaults when setting a watcher on get_children
> --------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-972
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-972
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: contrib-bindings
>    Affects Versions: 3.3.2
>         Environment: rhel 5.3, perl 5.10, Net::Zookeeper-1.35, 
> zookeeper_c_client-3.3.2 and below.
>            Reporter: Robert Powers
>             Fix For: 3.4.0
>
>
> The issue I'm seeing seems strikingly similar to this: 
> https://issues.apache.org/jira/browse/ZOOKEEPER-772
> I have one writer process which adds sequenced children nodes to /queue and a 
> separate reader process which sets a children watcher on /queue, waiting for 
> children to be added or deleted. Long story short, every time a child node is 
> added or deleted by the writer, the reader's watcher is supposed to trigger 
> so the reader can check if it's time to get to work or go back to bed. Bad 
> things seem to happen while the reader is waiting on the watcher and the 
> writer adds or deletes a node.
> In versions prior to 3.3.2, my code that sets a watcher on the children of a 
> node using the perl binding would either lock up when trying to retrieve the 
> children or would segfault when a child node was added while waiting on the 
> watch. In 3.3.2, it seems to just do the locking up.
> I'm seeing this: assertion botched (free()ed/realloc()ed-away memory was 
> overwritten?): !(MallocCfg[MallocCfg_filldead] && MallocCfg[Mall
> ocCfg_fillcheck]) || !cmp_pat_4bytes((unsigned char*)(p + 1), (((1 << 
> ((bucket) >> 0)) + ((bucket >= 15 * 1) ? 4096 : 0)) - (siz
> eof(union overhead) + sizeof (unsigned int))) + sizeof (unsigned int), 
> fill_deadbeef) (malloc.c:1536)
> I managed to get a stack trace
> Program received signal SIGABRT, Aborted.
> 0xffffe410 in __kernel_vsyscall ()
> (gdb) where
> #0  0xffffe410 in __kernel_vsyscall ()
> #1  0xf7b8ed80 in raise () from /lib/libc.so.6
> #2  0xf7b90691 in abort () from /lib/libc.so.6
> #3  0xf7d6d53f in botch (diag=0xa <Address 0xa out of bounds>, 
>     s=0xf7ef42e8 "!(MallocCfg[MallocCfg_filldead] && 
> MallocCfg[MallocCfg_fillcheck]) || !cmp_pat_4bytes((unsigned char*)(p + 1),
>  (((1 << ((bucket) >> 0)) + ((bucket >= 15 * 1) ? 4096 : 0)) - (sizeof(union 
> overhead) + s"..., file=0xf7ef4119 "malloc.c", line
> =1536) at malloc.c:1327
> #4  0xf7d6d97a in Perl_malloc (nbytes=15530) at malloc.c:1535
> #5  0xf7d6f974 in Perl_calloc (elements=1, size=0) at malloc.c:2314
> #6  0xf7929eca in _zk_create_watch (my_perl=0x0) at ZooKeeper.xs:204
> #7  0xf7929f8f in _zk_acquire_watch (my_perl=0x0) at ZooKeeper.xs:240
> #8  0xf793450b in XS_Net__ZooKeeper_watch (my_perl=0x889c008, cv=0x89db8b4) 
> at ZooKeeper.xs:2035
> #9  0xf7e1dd67 in Perl_pp_entersub (my_perl=0x889c008) at pp_hot.c:2847
> #10 0xf7de47ce in Perl_runops_debug (my_perl=0x889c008) at dump.c:1931
> #11 0xf7e0d856 in perl_run (my_perl=0x889c008) at perl.c:2384
> #12 0x08048ace in main (argc=2, argv=0xffe11814, env=0xffe11820) at 
> perlmain.c:113
> The code to reproduce:
> sub bide_time
> {
>   my $root = '/queue';
>   my $timeout = 20*1000;
>   my $zkc = Net::ZooKeeper->new('localhost:2181');
>   while (1) {
>     print "Retrieving $root\n";
>     my $child_watch = $zkc->watch('timeout' => $timeout);
>     my @children = $zkc->get_children($root, watch=>$child_watch);
>     if (scalar(@children)) {
>       return @children if (rand(1) > 0.75);
>     } else {
>       print " - No Children.\n";
>     }
>     print "Time to wait for the Children.\n";
>     if ($child_watch->wait()) {
>       print "watch triggered on node $root:\n";
>       print "  event: $child_watch->{event}\n";
>       print "  state: $child_watch->{state}\n";
>     } else {
>       print "watch timed out\n";
>     }
>   }
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to