This is mission-critical for us, so my boss looked into it further
last night and found where the access violation comes from. When a
job is retrieved from the queue using next_eligible_job(), its 'tube'
attribute is missing: the pointer is 0x0. Here are the tests he ran.

"Ok so basically, from what I can see after bt, f 1, l and peeking at
values, the job is there, with data and values, and the whole struct;
just the tube went away, or the reference to it has been lost. Happens
when you poll 1000 jobs, then rest for a second, then repeat. The bug
appears after a fairly large run, typically 300k+ jobs. Reproducible
under Ubuntu and CentOS, on different machines."

(gdb) run
Starting program: /usr/local/beanstalk-1.2/bin/beanstalkd
[Thread debugging using libthread_db enabled]
[New Thread 0x7f9ff30486e0 (LWP 20084)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f9ff30486e0 (LWP 20084)]
pq_take (q=0xd0) at pq.c:143

(gdb) bt
#0  pq_take (q=0xd0) at pq.c:143
#1  0x0000000000404d42 in process_queue () at prot.c:392
#2  0x0000000000406225 in do_cmd (c=0x62b8b0) at prot.c:1160
#3  0x00000000004071d1 in h_conn (fd=9, which=<value optimized out>, c=0x62b8b0) at prot.c:1454
#4  0x00007f9ff2c2ad98 in event_base_loop (base=0x624210, flags=<value optimized out>) at event.c:387
#5  0x0000000000401e83 in main (argc=1, argv=<value optimized out>) at beanstalkd.c:286

(gdb) f 1
#1  0x0000000000404d42 in process_queue () at prot.c:392
392             j = pq_take(&j->tube->ready);

(gdb) l
387         job j;
388
389         dprintf("processing queue\n");
390         while ((j = next_eligible_job())) {
391             dprintf("got eligible job %llu in %s\n", j->id, j->tube->name);
392             j = pq_take(&j->tube->ready);
393             ready_ct--;
394             if (j->pri < URGENT_THRESHOLD) {
395                 global_stat.urgent_ct--;
396                 j->tube->stat.urgent_ct--;

(gdb) p j->id
$23 = 691194

(gdb) p j->body
$24 = 0x10ba928 "<?xml version=\"1.0\"?>\n<packet><command>send></command><payload><msgid><![CDATA[41194]]></msgid></payload></packet>\r\n"

(gdb) p j->tube
$25 = (tube) 0x0
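
A note on why pq_take() is handed q=0xd0 rather than 0x0: with
j->tube == NULL, the expression &j->tube->ready works out to NULL plus
the offset of 'ready' inside the tube struct. The sketch below only
illustrates that arithmetic and one possible guard. Every type and
function name ending in _sketch, and both helper functions, are
hypothetical stand-ins and not beanstalkd's real definitions. The field
sizes are an assumption picked so that the offset should come out near
the value gdb shows on typical x86 ABIs. It is not a proposed fix for
the underlying bug, which is whatever clears the tube pointer in the
first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a priority queue; the real layout may differ. */
struct pq_sketch {
    void **heap;
    unsigned int cap;
    unsigned int used;
};

/* Hypothetical stand-in for a tube; field sizes are assumptions. */
struct tube_sketch {
    unsigned int refs;
    char name[201];
    struct pq_sketch ready;
};

/* Hypothetical stand-in for a job; only the fields used here. */
struct job_sketch {
    unsigned long long id;
    struct tube_sketch *tube;
};

/*
 * The address that &j->tube->ready would amount to if tube were NULL:
 * base address 0 plus the offset of the 'ready' member.
 */
static uintptr_t ready_addr_if_tube_null(void)
{
    return (uintptr_t)offsetof(struct tube_sketch, ready);
}

/*
 * Defensive lookup: return NULL instead of forming a pointer from a
 * NULL tube, so the caller can log the job and skip it.
 */
static struct pq_sketch *ready_queue_or_null(struct job_sketch *j)
{
    if (j == NULL || j->tube == NULL) {
        return NULL;
    }
    return &j->tube->ready;
}
```

A caller in the style of process_queue() could check the result of
ready_queue_or_null() before taking from the queue and log the job id
when it is NULL, which would at least turn the crash into a diagnostic.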


On Feb 18, 2:20 pm, Keith Rarick <[email protected]> wrote:
> I'll look at this in the next day or two. Sorry for the delay.
>
> kr
>
> On Mon, Feb 16, 2009 at 9:41 AM, Tim Gunter <[email protected]> wrote:
>
> > I forgot to add this:
>
> > (gdb) where
> > #0  pq_take (q=0xd0) at pq.c:143
> > #1  0x0804c5cf in process_queue () at prot.c:392
> > #2  0x0804dd7c in do_cmd (c=0x8614588) at prot.c:1160
> > #3  0x0804e9e8 in h_conn (fd=10, which=2, c=0x8614588) at prot.c:1568
> > #4  0x00d61260 in event_base_loop (base=0x860f1a0, flags=0) at event.c:387
> > #5  0x00d61599 in event_loop (flags=0) at event.c:463
> > #6  0x00d615be in event_dispatch () at event.c:401
> > #7  0x08049555 in main (argc=1, argv=Cannot access memory at address 0x6
> > ) at beanstalkd.c:286