Hi,

I appreciate your help.

On 2015-04-08 00:43, Robert Edmonds wrote:
>Hi, Rogerio:
>
>Thanks for these details, I can easily spin up a dual core amd64 VM
>running Debian jessie soon and try to replicate the problem.
>
>Do you get a segfault immediately, or does it only occur after running
>for some time under load?

The segfault occurs after some time.

>Can you try testing with "num-threads: 1"?  (This will still result in
>multiple threads running in the Unbound process, but the dnstap I/O
>thread will only be consuming data from a single worker thread.)

I get the same error with "num-threads: 1".

>Also, can you compile your unbound package with debugging symbols and
>obtain a backtrace from a crash?  You should be able to build a
>debugging enabled package with:
>
>    DEB_BUILD_OPTIONS='nostrip debug' dpkg-buildpackage -b -uc -us
>
>Then, run "gdb --args unbound -d" until it crashes, and at the gdb
>prompt run:
>
>    thread apply all bt full

The output is attached.
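
For anyone reproducing this, the same backtrace can probably be captured
without the interactive pager by putting the commands in a gdb command
file. This is only a sketch: the file names bt.gdb and
unbound-backtrace.txt are made up for illustration, and it assumes the
debug build of unbound is the one found on the PATH.

    # bt.gdb (hypothetical file name)
    # Disable the pager so no continuation prompts end up in the output.
    set pagination off
    # Copy everything gdb prints into a log file (hypothetical name).
    set logging file unbound-backtrace.txt
    set logging on
    # Start unbound; gdb returns here once the process stops on the segfault.
    run
    # Dump full backtraces, including locals, for every thread.
    thread apply all bt full

It would then be started with "gdb -x bt.gdb --args unbound -d".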


Thanks!

Rogerio Bastos wrote:
On 2015-04-06 22:44, Robert Edmonds wrote:
>Rogerio Bastos wrote:
>>I'm trying to test unbound with dnstap. It works fine under low load, but
>>exits with a segfault under high load. The segfault only happens when dnstap
>>is enabled in the configuration.
>>
>>I am using the Debian package (version 1.5.3) available in [1],
>>recompiled with dnstap enabled.
>>I'm following the instructions described in [2] and using fstrm version
>>0.2.0.
>>
>>To test the server, I'm using dnsblast [3] with the following command:
>>
>>./dnsblast <server address> 50000 500
>
>Hi, Rogerio:
>
>Sorry to hear that.  I would be happy to help debug dnstap (I wrote the
>dnstap patchset for Unbound).  Can I get some information about your
>environment?
>
>Can you show the "dnstap:" block of settings from your config, and the
>"num-threads" server setting?

I'm using optimisation settings based on [1] (the Debian version is compiled
with libevent):

server:
    num-threads: 2

    msg-cache-slabs: 2
    rrset-cache-slabs: 2
    infra-cache-slabs: 2
    key-cache-slabs: 2

    rrset-cache-size: 100m
    msg-cache-size: 50m

    outgoing-range: 8192
    num-queries-per-thread: 4096

    so-rcvbuf: 4m
    so-sndbuf: 4m


I'm using the example from dnstap's site [2]:

dnstap:
    dnstap-enable: yes
    dnstap-socket-path: "/var/run/unbound/dnstap.sock"
    dnstap-send-identity: yes
    dnstap-send-version: yes
    dnstap-log-resolver-response-messages: yes
    dnstap-log-client-query-messages: yes

>Does fstrm's "make check" test suite succeed?

Yes, all tests pass.

>What version of protobuf-c are you using?  (Did you compile from source,
>or did you use a packaged version?)

The packaged version from Debian Jessie (version 1.0.2).

>What OS version are you using?  (Based on your mention of the Debian
>package from experimental, I would guess Debian or Ubuntu.)

Debian Jessie, the next-stable version.

>Are you using a uniprocessor or SMP machine?  Also, since there are some
>architecture-specific parts in fstrm, what architecture are you using?

I'm using an amd64 virtual machine with a two-core CPU.

[1] https://www.unbound.net/documentation/howto_optimise.html
[2] http://dnstap.info/Examples/

--

My email was sent by May First/People Link
https://mayfirst.org
Thread 2 (Thread 0x7ffff5269700 (LWP 2598)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
No locals.
#1  0x00007ffff776e638 in ?? () from /usr/lib/x86_64-linux-gnu/libfstrm.so.0
No symbol table info available.
#2  0x00007ffff776cbbd in ?? () from /usr/lib/x86_64-linux-gnu/libfstrm.so.0
No symbol table info available.
#3  0x00007ffff67880a4 in start_thread (arg=0x7ffff5269700) at pthread_create.c:309
        __res = <optimized out>
        pd = 0x7ffff5269700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737306334976, -4520694634765884115, 1, 140737354125408,
                93824996366800, 140737306334976, 4520679868110988589, 4520673803517945133}, mask_was_saved = 0}}, priv = {
            pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#4  0x00007ffff64bd04d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 1 (Thread 0x7ffff7fe9700 (LWP 2594)):
#0  serviced_tcp_callback (c=0x0, arg=0x555556b853b0, error=-2, rep=0x0) at services/outside_network.c:1596
        r2 = {c = 0x3000000010, addr = {ss_family = 59744, __ss_align = 140737488349344,
            __ss_padding = "\000T\217\\\337`*\333\004\000\000\000}\000\000\000\000\000\000\000\377\177\000\000}", '\000' <repeats 15 times>, "\270\333vVUU\000\000\200pv\367\377\177\000\000\240\333vVUU", '\000' <repeats 14 times>, "\001\000\000\000\252\\T\367\377\177\000\000\000\000\000\000\004\000\000\002\000T\217\\\337`*\333\000\000\000\000\001\000\000"},
          addrlen = 4149479993, srctype = 32767, pktinfo = {v6info = {ipi6_addr = {__in6_u = {
                  __u6_addr8 = "\000\000\000\000\000\000\000\000\060\300]VUU\000", __u6_addr16 = {0, 0, 0, 0, 49200, 22109,
                    21845, 0}, __u6_addr32 = {0, 0, 1448984624, 21845}}}, ipi6_ifindex = 1432264672}, v4info = {
              ipi_ifindex = 0, ipi_spec_dst = {s_addr = 0}, ipi_addr = {s_addr = 1448984624}}}}
#1  0x00005555555e815a in outnet_tcptimer (arg=0x5555566d9630) at services/outside_network.c:1120
        w = 0x5555566d9630
        outnet = 0x555555aa20e0
        cb = 0x5555555e9fe0 <serviced_tcp_callback>
        cb_arg = 0x555556b853b0
        __func__ = "outnet_tcptimer"
#2  0x00007ffff75313dc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
No symbol table info available.
#3  0x000055555558e61c in comm_base_dispatch (b=<optimized out>) at util/netevent.c:305
        retval = <optimized out>
#4  0x00005555555c80eb in worker_work (worker=<optimized out>) at daemon/worker.c:1285
No locals.
#5  daemon_fork (daemon=0x55555583d010) at daemon/daemon.c:566
No locals.
#6  0x00005555555741a8 in run_daemon (debug_mode=<optimized out>, cmdline_verbose=<optimized out>, cfgfile=<optimized out>) at daemon/unbound.c:671
No locals.
#7  main (argc=1434701840, argv=0x55555585fb10) at daemon/unbound.c:766
        c = 1437212896
        winopt = 0x6c <error: Cannot access memory at address 0x6c>
_______________________________________________
Unbound-users mailing list
[email protected]
http://unbound.nlnetlabs.nl/mailman/listinfo/unbound-users
