On 9/7/13 5:57 PM, Charlie Belmer wrote: > Hey Karsten & Dev team, > > I didn't mean to take this off channel, so my apologies for that. > > I ran a few more tests this week (and my apologies on the slowness of some > of this work - life has been hectic this past month), but I am not finding > much of use. I could not find any errors which are not also present in when > N23 is disabled. If you want to browse the logs for yourself, I uploaded a > partial log from a run to > http://www.charliebelmer.com/scallion.n23.log(730MB).
That log ends at simulated time 0:19:34, so 5:30 minutes before the actual requests start. Can you post both full logs somewhere, maybe gzipped to save bandwidth? > The most telling entries I found so far are below, but I see the same > errors whether UseN23 is set to 1 or not, so it is likely not the root > cause. I'll keep digging. Okay. Another option would be to run your branch in Chutney to find the bug and then return to Shadow for performance measurements. That's what I did with #9166 and it helped a lot. Let me know if you need help with that. Thanks! Karsten > Error 1 (Seems to occur when opening channel): > 0:0:49:406270 [thread-2] 0:9:41:348157618 [scallion-info] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] bw_weights_error_t > networkstatus_check_weights(int64_t, int64_t, int64_t, int64_t, int64_t, > int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, > int64_t, int)(): Bw weight mismatch 1. G=214 M=12174 E=165 D=27654 T=40207 > Wmd=4769 Wme=-724848 Wmg=0 Wed=461 Wee=734848 Wgd=4769 Wgg=10000 > Wme=-724848 Wmg=0 > 0:0:49:407692 [thread-2] 0:9:41:348157618 [scallion-info] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] bw_weights_error_t > networkstatus_check_weights(int64_t, int64_t, int64_t, int64_t, int64_t, > int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, > int64_t, int)(): Bw weight mismatch 1. G=214 M=12174 E=165 D=27654 T=40207 > Wmd=4769 Wme=-724848 Wmg=0 Wed=461 Wee=734848 Wgd=4769 Wgg=10000 > Wme=-724848 Wmg=0 > > Error 2: > 0:0:54:227263 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void conn_read_callback(int, > short, void *)(): socket 1000264 wants to read. > 0:0:54:227281 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] int > connection_read_to_buf(connection_t *, ssize_t *, int *)(): 1000264: > starting, inbuf_datalen 0 (0 pending in tls object). at_most 16448. > 0:0:54:227302 [thread-2] 0:10:55:012000001 [shadow-debug] > [4uthority1-73.1.0.0] [_tcp_endOfFileSignalled] 73.1.0.0:10021 (descriptor > 1000264) <-> 91.1.0.0:9111: signaling close to user, socket no longer usable > 0:0:54:227321 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] TLS error: unexpected close > while reading (SSL negotiation finished successfully) > 0:0:54:227338 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] int tor_tls_read(tor_tls_t *, > char *, size_t)(): read returned r=0, err=-8 > 0:0:54:227354 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] int > connection_read_to_buf(connection_t *, ssize_t *, int *)(): TLS connection > closed on read. Closing. (Nickname nonexit10, address 91.1.0.0) > 0:0:54:227373 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > channel_close_for_error(channel_t *)(): Closing channel 0x7f546d03ce40 due > to lower-layer error > 0:0:54:227389 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > channel_change_state(channel_t *, channel_state_t)(): Changing state of > channel 0x7f546d03ce40 (global ID 21) from "open" to "closing" > 0:0:54:227420 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > channel_remove_from_digest_map(channel_t *)(): Removed channel > 0x7f546d03ce40 (global ID 21) from identity map in state closing (4) with > digest BBF25A366EA5376B0026CE6EDE3DF536211429BE > 0:0:54:227440 [thread-2] 0:10:55:012000001 [shadow-debug] > [4uthority1-73.1.0.0] [tcp_close] 73.1.0.0:10021 (descriptor 1000264) <-> > 91.1.0.0:9111: user closed connection > 0:0:54:227452 [thread-2] 0:10:55:012000001 [shadow-debug] > [4uthority1-73.1.0.0] [_tcp_setState] 73.1.0.0:10021 (descriptor 1000264) > <-> 91.1.0.0:9111: moved from TCP state 'TCPS_CLOSEWAIT' to 'TCPS_LASTACK' > 0:0:54:227473 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > connection_mark_for_close_internal_(connection_t *, int, const char *)(): > Calling connection_mark_for_close_internal_() on an OR conn at > /home/coffee/shadow/shadow/build/tor/src/or/connection.c:2932 > 0:0:54:227491 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] int > conn_close_if_marked(int)(): Cleaning up connection (fd -1). > 0:0:54:227509 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > circuit_n_chan_done(channel_t *, int)(): chan to NULL/91.1.0.0:9111, > status=0 > 0:0:54:227546 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] void > channel_change_state(channel_t *, channel_state_t)(): Changing state of > channel 0x7f546d03ce40 (global ID 21) from "closing" to "channel error" > 0:0:54:227565 [thread-2] 0:10:55:012000001 [scallion-debug] > [4uthority1-73.1.0.0] [scalliontor_logmsg_cb] int > connection_remove(connection_t *)(): removing socket -1 (type OR), n_conns > now 26 > > > > > On Tue, Sep 3, 2013 at 10:47 AM, Karsten Loesing > <[email protected]>wrote: > >> On 9/3/13 4:32 PM, Charlie Belmer wrote: >>> Hi Karsten, >>> >>> I have run several tests, but I haven't been able to satisfy myself yet. >> I >>> attached the results from my latest round of tests for your review. I am >>> comparing the n23 branch (UseN23 1 set in each torrc) with the initial >> head >>> I merged n23 with. It seems that for some reason, the n23 branch >> completes >>> far fewer downloads, though I haven't yet found the cause, or found >>> anything interesting in the logs. >>> >>> $ grep "fg-download-complete" m1.large-tor/data/scallion.log | wc -l >>> 8827 >>> $ grep "fg-download-complete" m1.large-n23/scallion.log | wc -l >>> 308 >>> >>> I would definitely appreciate any thoughts, insights, or advice! >> >> Interesting! Can you re-run the n23 simulation on info or even debug >> log level? That might reveal some more information. You'll do that by >> running "scallion --log-level=INFO". A tiny network should be >> sufficient for that. Watch out for filling disks. >> >> If you want me to look at the scallion.log file, feel free to post it >> somewhere, and I'll see if something looks unusual to me. >> >> Thanks, >> Karsten >> >> >>> On Tue, Sep 3, 2013 at 8:58 AM, Karsten Loesing <[email protected] >>> wrote: >>> >>>> Hi Charlie, >>>> >>>> did you have the chance to simulate your rebased n23-5 branch using >> Shadow? >>>> >>>> If not, maybe I can help by simulating it once and writing down more >>>> specific instructions for you? >>>> >>>> Thanks! >>>> Karsten >>>> >>>> >>>> On 8/8/13 10:36 AM, Karsten Loesing wrote: >>>>> (Moving this discussion back to tor-dev@ with Charlie's permission.) >>>>> >>>>> On 8/6/13 4:10 AM, Charlie Belmer wrote: >>>>>> Hey Karsten, >>>>>> >>>>>> I've been reading up on Shadow & Scallion to do some performance >>>> testing on >>>>>> N23. Before I go too far down that road, I wanted to check in with >> you. >>>>>> >>>>>> Are there existing blueprints or best practices I should be following? >>>>> >>>>> Thanks for looking into this! >>>>> >>>>> So, I wouldn't call the following notes blueprints or best practices, >>>>> but they are a description of what I would do to run Shadow simulations >>>>> of your branch. >>>>> >>>>> First of all, be sure to read the Shadow wiki which has contains lots >> of >>>>> interesting stuff: https://github.com/shadow/shadow/wiki >>>>> >>>>> The next step would be to get a vanilla Tor master (that your branch is >>>>> based on) running in a minimal or tiny Shadow network. It's probably >>>>> easiest to use a local Ubuntu VM for this. Or you can start an >> m1.large >>>>> EC2 instance for the tiny network, or an even smaller instance for the >>>>> minimal network. >>>>> >>>>> Once that simulation succeeds, you'll want to run another simulation >>>>> using your branch, extract performance metrics using Shadow's analyze >>>>> script, produce a PDF with graphs also using the analyze script, and >>>>> then wonder if things are faster than before. >>>>> >>>>> The minimal or tiny networks are mostly there to learn that things are >>>>> working as expected. You'll probably want larger networks for real >>>>> performance results, but starting small to get everything working is >>>>> probably easier. >>>>> >>>>> Sounds doable, I guess. However, here's where things start to get >>>>> tricky: in order to simulate Tor master (or a branch based on it) in >>>>> Shadow, you'll have to perform quite a few extra steps like compiling >>>>> your own CMake and Clang/LLVM. Not really hard, but you need to get >> all >>>>> pieces together before Shadow does what you want. >>>>> >>>>> I'm pasting detailed instructions that I used to simulate Steven's uTP >>>>> branch in Shadow below as an example. Not everything applies to your >>>>> branch, but it should give you the general idea. >>>>> >>>>> If something remains unclear, please feel free to ask! >>>>> >>>>>> Depending how it goes, I'd also look at re-trying adaptive 23 or >> tuning >>>> the >>>>>> existing implementation. >>>>> >>>>> Neat! We should probably include Mashael in that discussion once you >>>>> have a working and simulated n23-5 branch. Maybe she has ideas what >>>>> parts need tweaking. I'm cc'ing her in this mail so she knows what >>>>> you're working on. >>>>> >>>>> In any case, having a working rebased n23-5 branch is already a big >> step >>>>> forward. >>>>> >>>>> Thanks! >>>>> Karsten >>>>> >>>>> >>>>> === START OF INSTRUCTIONS === >>>>> >>>>> Launch new EC2 instance running Ubuntu Server 13.04 64 bit >>>>> >>>>> Set ulimit -n to 25000, see Shadow wiki for details >>>>> >>>> >> https://github.com/shadow/shadow/wiki/Preparing-your-machine#system-configs-and-limits >>>>> >>>>> $ sudo apt-get update >>>>> $ sudo apt-get upgrade >>>>> >>>>> Clone and test-compile modified libutp >>>>> >>>>> $ sudo apt-get install build-essential git automake libssl-dev >>>> libevent-dev >>>>> $ mkdir src >>>>> $ cd src/ >>>>> $ git clone https://github.com/sjmurdoch/libutp >>>>> $ cd libutp/ >>>>> $ make >>>>> $ cd ../ >>>>> >>>>> Clone and test-compile Tor branch utp >>>>> >>>>> $ git clone https://git.torproject.org/tor.git >>>>> $ cd tor/ >>>>> $ git remote add sjm217 https://git.torproject.org/sjm217/tor.git >>>>> $ git fetch sjm217 >>>>> $ git checkout -b utp sjm217/utp >>>>> $ ./autogen.sh >>>>> $ LDFLAGS="-L/home/ubuntu/src/libutp" >> CFLAGS="-I/home/ubuntu/src/libutp" >>>>> LIBS="-lutp -lrt" ./configure --disable-asciidoc >>>>> $ make >>>>> $ make distclean >>>>> $ cd ../ >>>>> >>>>> Modify Tor to use uTP for all links (or "0 &&" for not using uTP for >> any >>>>> link) >>>>> >>>>> diff --git a/src/or/channeltls.c b/src/or/channeltls.c >>>>> index 0551b73..b7b36e1 100644 >>>>> --- a/src/or/channeltls.c >>>>> +++ b/src/or/channeltls.c >>>>> @@ -418,7 +418,7 @@ channel_tls_connect(const tor_addr_t *addr, >> uint16_t >>>>> port, >>>>> /* Create a uTP connection */ >>>>> tor_addr_to_sockaddr(addr, port, (struct sockaddr*)&sin, >> sizeof(sin)); >>>>> tor_addr_to_str(addr_str, addr, sizeof(addr_str), 0); >>>>> - if (!strncmp(addr_str, "128.232.10.129", sizeof(addr_str))) { >>>>> + if (1 || !strncmp(addr_str, "128.232.10.129", sizeof(addr_str))) { >>>>> log_info(LD_CHANNEL, >>>>> "Trying uTP connection to %s", addr_str); >>>>> tlschan->utp = UTP_Create(tor_UTPSendToProc, tlschan, (const >> struct >>>>> sockaddr*)&sin, >>>>> >>>>> Install CMake 2.8.10 >>>>> >>>>> $ wget http://www.cmake.org/files/v2.8/cmake-2.8.10.2.tar.gz >>>>> $ tar xf cmake-2.8.10.2.tar.gz >>>>> $ cd cmake-2.8.10.2/ >>>>> $ ./configure >>>>> $ make >>>>> $ export PATH=~/src/cmake-2.8.10.2/bin:$PATH >>>>> $ cd ../ >>>>> $ which cmake # result: /home/ubuntu/src/cmake-2.8.10.2/bin/cmake >>>>> >>>>> Build Clang and LLVM from source >>>>> >>>>> $ wget http://llvm.org/releases/3.2/llvm-3.2.src.tar.gz >>>>> $ wget http://llvm.org/releases/3.2/clang-3.2.src.tar.gz >>>>> $ tar xaf llvm-3.2.src.tar.gz >>>>> $ tar xaf clang-3.2.src.tar.gz >>>>> $ cp -R clang-3.2.src llvm-3.2.src/tools/clang >>>>> $ cd llvm-3.2.src/ >>>>> $ mkdir build >>>>> $ cd build/ >>>>> $ cmake -DCMAKE_INSTALL_PREFIX=/home/ubuntu/.local ../. >>>>> $ make >>>>> $ make install >>>>> $ export PATH=~/.local/bin/:$PATH >>>>> $ cd ../../ >>>>> $ which clang # result: /home/ubuntu/.local/bin//clang >>>>> >>>>> Install Shadow >>>>> >>>>> $ sudo apt-get install gcc xz-utils make automake autoconf cmake tidy >>>>> libtidy-dev libglib2.0 libglib2.0-dev dstat pdftk python2.7 >>>>> python-matplotlib python-numpy python-scipy >>>>> $ git clone https://github.com/shadow/shadow.git >>>>> $ cd shadow >>>>> $ ./setup dependencies >>>>> $ ./setup build -g -i /home/ubuntu/src/libutp -i >>>>> /usr/include/x86_64-linux-gnu/c++/4.7 -l /home/ubuntu/src/libutp >>>>> --tor-prefix /home/ubuntu/src/tor --tor-lib utp >>>>> $ ./setup install >>>>> $ export PATH=~/.shadow/bin/:$PATH >>>>> >>>>> Simulate tiny network in Shadow >>>>> >>>>> $ cd resource/examples/scallion/ >>>>> $ tar xf tiny-4GB-m1.large.tar.xz >>>>> $ cd tiny-4GB-m1.large/ >>>>> $ scallion >>>>> >>>>> Graph simulation results >>>>> >>>>> Add one line to beginning of analyze.py: >>>>> >>>>> import matplotlib; matplotlib.use('Agg') >>>>> >>>>> For additional traffic simulations: >>>>> >>>>> $ grep -v "\[traffic-" scallion.log > scallion-notraffic.log >>>>> >>>>> $ cd ~/src/ >>>>> $ mkdir results >>>>> $ cd results/ >>>>> $ python ~/src/shadow/contrib/analyze.py parse --output uTP-1 >>>>> >>>> >> ~/src/shadow/resource/examples/scallion/tiny-4GB-m1.large/data-utp-1/scallion.log >>>>> $ python ~/src/shadow/contrib/analyze.py parse --output uTP-0 >>>>> >>>> >> ~/src/shadow/resource/examples/scallion/tiny-4GB-m1.large/data-utp-0/scallion.log >>>>> $ python ~/src/shadow/contrib/analyze.py parse --output vanilla >>>>> >>>> >> ~/src/shadow/resource/examples/scallion/tiny-4GB-m1.large/data-vanilla/scallion.log >>>>> $ python ~/src/shadow/contrib/analyze.py plot --title "sjm217's utp >>>>> branch (#9166)" --prefix "sjm217-utp" --data vanilla/ "vanilla >>>>> 0.2.4.4-alpha" --data uTP-1/ "uTP for all links" --data uTP-0/ "uTP for >>>>> none of the links" >>>>> >>>>> === END OF INSTRUCTIONS === >>>>> >>>>> >>>> >>>> >>> >> >> > _______________________________________________ tor-dev mailing list [email protected] https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
