On Tue, 27 Jun 2006, Fabian Keil wrote:

> There was a "request" for Tor-related problem reports a while ago. I couldn't find the message again, but I believe it was posted here.

I'm very interested in tracking down this problem, but have had a lot of trouble getting reliable reports of problems -- i.e., ones where I could get any debugging information. I had a similar conversation along these lines yesterday with Roger (Tor author) here at the WEIS conference. If this is easily reproducible, I would like you to do the following:

- Compile in options DDB, options KDB, options BREAK_TO_DEBUGGER, options
  WITNESS, options WITNESS_SKIPSPIN, options INVARIANTS, options
  INVARIANT_SUPPORT.

- Make sure you have a kernel built with debugging symbols.

- Turn on core dumps.
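
As a rough sketch only: assuming a custom kernel configuration file (such as the BIGSLEEP one named in the quoted report), the options listed above could be added as the lines below. The makeoptions line is an assumption and is not stated above; it is the usual way to have the build produce a kernel with debugging symbols. The comments are descriptive only.

```
makeoptions	DEBUG=-g		# assumption: build the kernel with debugging symbols
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger
options 	BREAK_TO_DEBUGGER	# serial console BREAK drops into the debugger
options 	INVARIANTS		# extra run-time consistency checks
options 	INVARIANT_SUPPORT	# support code required by INVARIANTS
options 	WITNESS			# lock order verification
options 	WITNESS_SKIPSPIN	# skip spin mutexes to reduce WITNESS overhead
```

After editing the file, the kernel has to be rebuilt and installed before any of this takes effect.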

The above debugging options will have a significant performance impact, and may or may not affect the probability of the race or deadlock being exercised. The questions to answer are:

- Are there any warnings on the console from WITNESS or other debugging
  options?  If so, please copy/paste them into an e-mail for me.

- Does a panic occur?  If so, the output of the following commands would be
  very useful:

  show pcpu
  show allpcpu
  ps
  show locks
  show alllocks
  show lockedvnods
  trace

  Then walk the list of all processes listed in 'show alllocks', and run trace
  on each pid.

- Does the hang occur?  If so, use a serial break to get into DDB, then run
  the commands listed above.

In both of the last two cases, attempt to get a core dump.

Robert N M Watson
Computer Laboratory
University of Cambridge


Last week I installed:
FreeBSD tor.fabiankeil.de 6.1-RELEASE-p2 FreeBSD
6.1-RELEASE-p2 #0: Fri Jun 23 20:06:57 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BIGSLEEP  i386.

At the moment it is only acting as a Tor node
<http://serifos.eecs.harvard.edu/cgi-bin/desc.pl?q=zwiebelsuppe>.
tor-devel (maintainer CC'd) is running jailed in a geli image;
ntpd, named, cron and sshd are running in the host system,
and that's about it. There is no mail or web server and nearly no
traffic besides that caused by Tor.

I started Tor Friday night and have had to reset the box three times
since then. The server just suddenly stops responding and the logs
stop as well, so I assume it either panics or hangs.

I only have remote access. A serial console is available,
but it becomes unresponsive as well. I haven't configured DDB yet,
so maybe that is to be expected?

cron creates some stats every five minutes. A few minutes
before a hang this morning, the load was:

last pid:  7996;  load averages:  0.40,  0.37,  0.36  up 0+18:38:25    05:55:02
83 processes:  2 running, 66 sleeping, 15 waiting
CPU states: 21.3% user,  0.0% nice, 17.8% system, 20.2% interrupt, 40.7% idle
Mem: 100M Active, 157M Inact, 102M Wired, 12K Cache, 60M Buf, 134M Free
Swap: 1024M Total, 1024M Free

 PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  11 root        1 171   52     0K     8K RUN    857:30 53.61% idle
  12 root        1 -44 -163     0K     8K WAIT    45:22  6.54% swi1: net
  23 root        1 -68 -187     0K     8K WAIT    14:48  2.83% irq12: fxp0 fxp1
7973 root        1  96    0  2264K  1544K RUN      0:00  0.51% top
  13 root        1 -32 -151     0K     8K WAIT     5:49  0.10% swi4: clock sio
  33 root        1 171   52     0K     8K pgzero   0:02  0.10% pagezero
   3 root        1  -8    0     0K     8K -        0:16  0.05% g_up
1586 _tor       14  20    0    99M 97912K kserel 188:36  0.00% tor
  15 root        1 -16    0     0K     8K -        1:01  0.00% yarrow
1443 root        1  -8    0     0K     8K geli:w   0:49  0.00% g_eli[0] md0
   4 root        1  -8    0     0K     8K -        0:21  0.00% g_down
  35 root        1  20    0     0K     8K syncer   0:17  0.00% syncer
1439 root        1  -8    0     0K     8K mdwait   0:13  0.00% md0
  24 root        1 -64 -183     0K     8K WAIT     0:08  0.00% irq14: ata0
   2 root        1  -8    0     0K     8K -        0:07  0.00% g_event
  42 root        1 -16    0     0K     8K -        0:06  0.00% schedcpu
 453 root        1  96    0  2920K  1752K select   0:05  0.00% ntpd
 256 _pflogd     1 -58    0  1548K  1216K bpf      0:05  0.00% pflog

pfctl -si:
Status: Enabled for 0 days 18:37:52           Debug: Urgent

Hostid: 0x1ec3da6b

Interface Stats for fxp0              IPv4             IPv6
 Bytes In                     25077859159                0
 Bytes Out                    27498863362                0
 Packets In
   Passed                        36192760                0
   Blocked                          32213                0
 Packets Out
   Passed                        36871432                0
   Blocked                            265                0

State Table                          Total             Rate
 current entries                     5290
 searches                        73567507         1096.8/s
 inserts                           600068            8.9/s
 removals                          594778            8.9/s
Counters
 match                             752600           11.2/s
 bad-offset                             0            0.0/s
 fragment                             102            0.0/s
 short                                  0            0.0/s
 normalize                              2            0.0/s
 memory                                68            0.0/s
 bad-timestamp                          0            0.0/s
 congestion                             0            0.0/s
 ip-option                              0            0.0/s
 proto-cksum                            0            0.0/s
 state-mismatch                     12655            0.2/s
 state-insert                           0            0.0/s
 state-limit                            0            0.0/s
 src-limit                              2            0.0/s
 synproxy

Today's traffic graph:
<http://www.fabiankeil.de/blog-surrogat/2006/06/27/tor.fabiankeil.de-dritter-ausfall-24-stunden-durchsatz-statistik-595x337.png>
(The hang around 14:00 happened while I was logged in doing a buildworld)

At the moment I'm building RELENG_6 with DDB to see if it changes anything
and if I can get a core dump, but so far the problem seems to be
similar to: http://www.freebsd.org/cgi/query-pr.cgi?pr=95180 (closed)
and <http://freebsd.rambler.ru/bsdmail/freebsd-questions_2006/msg08692.html>.

Is anyone on this list running a Tor node on FreeBSD 6.1-RELEASE
or later with similar or higher load?

Fabian
--
http://www.fabiankeil.de/

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
