[Nick, can you merge my torspec proposal218 branch, please? Thanks!]
On 2/28/13 2:08 AM, Rob Jansen wrote: > On Mon, Feb 25, 2013 at 10:28 AM, Karsten Loesing <[email protected]> > wrote: > >> On 2/23/13 11:20 PM, Rob Jansen wrote: >>> On Feb 23, 2013 4:22 PM, "Karsten Loesing" <[email protected]> >> wrote: >>>> Your understanding of n_circ_id and p_circ_id matches mine, but are you >>>> sure there's a UID for circuits other than origin circuits? I think you >>>> mean origin_circuit_t->global_identifier. But there's no such field for >>>> or_circuit_t or circuit_t. Or do you mean something else? >>>> >>> Ok. Though I thought my original patch moved the global Id to the base >>> circuit struct. Perhaps I didn't. Anyway, I'm not sure it matters... >> >> Your original patch did move the ID to circuit_t, but I thought we >> wanted to avoid numbering non-origin circuits (mostly because that >> affects relays in non-TestingTorNetwork mode and could lead to busy >> relays running out of IDs at some point), which is why I took this >> change out. >> > > Right, I remember this now. I could imagine ways to get around the 'running > out of ids' problem, like resetting the id counter after a given timeframe. > In fact, we could just let the counter overflow and loop back to 0 on its > own (assuming its a unsigned int type) since we really only need them to be > unique approximately for the life of a circuit. If we see the ID come up > again after an hour, we can assume its a new circuit. I'd rather not want to change behavior in non-simulation mode. If we really need a UID for non-origin circuits, we could add a separate uint64_t global_identifier to circuit_t and only use that in simulating mode. I'm not sure that we need it though. >> Also, even if we moved the field to circuit_t, the new CELL_STATS events >> would be the only ones using these UIDs, because all other events are >> for origin circuits only. I don't see how these IDs would help us. We >> never learn what UID the next or previous node in a circuit picks for a >> given circuit. > > Not currently, but couldn't we implement the functionality where relays log > or export their circuit UIDs and next/prev IDs over the control port? > Though I'm not sure if you'd want this in mainline Tor if its only useful > in simulation mode... We could have relays log local circuit UIDs together with p_circ_id, p_conn_id, n_circ_id, and n_conn_id in simulation mode. But I don't see how that would facilitate circuit tracking compared to the current approach using ORCONN and CELL_STATS events. Here's an example: - fileclient creates a circuit with - UID 14, - n_conn_id 15, and - n_circ_id 19403. - tokenglobal is the first hop in this circuit and reports - (new) UID 12345, - p_conn_id 32, - p_circ_id 19403, - n_conn_id 18, and - n_circ_id 6710. - tokenrelay is the second hop and reports - (new) UID 23456, - p_conn_id 17, - p_circ_id 6710, - n_conn_id 16, - n_circ_id 34402. - exit2 is the third and final hop and reports - (new) UID 34567, - p_conn_id 15, and - p_circ_id 34402. We need ORCONN events to know that fileclient's n_conn_id 15 is the same connection as tokenglobal's p_conn_id 32. How would the new UIDs make this easier? >>>> tl;dr: I _think_ it's possible to reconstruct circuits from ORCONN and >>>> CELL_STATS events as they are currently specified in proposal 218. >>>> >>> Great, but do we really expect every Tor controller parser to get this >>> right? It seems complicated enough that there should be an easier way. >>> Maybe it's just wishful thinking on my part. >> >> I agree that reconstructing circuits from ORCONN and CELL_STATS events >> is far from trivial. I don't really see how to make it simpler though. > > What about linking all of the IDs and UIDs as described above? (See above.) >> From an earlier mail in this thread: >>>> Finally, Rob, should I look into CIRC_BW events you suggested a while >>>> ago? If yes, what did you have in mind how that event would look like, >>>> and when/by whom would it be emitted? >>> >>> If we want to do this, it would likely be an aggregation of all STREAM_BW >>> events for a circuit, but only during the time when those streams >> belonged >>> to that circuit. I don't think it makes sense to emit it for every >>> STREAM_BW event though, so what if we aggregate and emit it once per >>> second? A format similar to the STREAM_BW format should work fine. >> >> Done. I specified and implemented such a CIRC_BW event. >> >> > Great! > > >> >> Here's the updated proposal 218 (Nick, please don't merge this yet): >> >> >> https://gitweb.torproject.org/user/karsten/torspec.git/blob/refs/heads/proposal218:/proposals/218-usage-controller-events.txt > > > In section 5.3/5.4/5.5, these events are emitted in the > second_elapsed_callback(), right? I wanted to verify that a relay who > hasn't sent anything in a few seconds and then starts sending again will > emit the event at the end of the second after which it resumed sending, > rather than the first bytes after it resumed sending. Yes, all these events are emitted in second_elapsed_callback(). > The last word in 5.4 should be 'reading' instead of 'read'. Fixed. > Is the specification of 5.5 and 5.6 complex enough to warrant including > example outputs? Sure, can't hurt. Added a few examples. > In 5.6, were we planning on explaining how buckets can go negative and how > that affects the reporting of the TB_EMPTY events? I tried a better explanation: """ ReadBucketEmpty (WriteBucketEmpty) is the time in millis that the read (write) bucket was empty since the last refill. LastRefill is the time in millis since the last refill. If a bucket went negative and if refilling tokens didn't make it go positive again, there will be multiple consecutive TB_EMPTY events for each refill interval during which the bucket contained zero tokens or less. In such a case, ReadBucketEmpty or WriteBucketEmpty are capped at LastRefill in order not to report empty times more than once. """ > Here's the tor branch: >> >> >> https://gitweb.torproject.org/karsten/tor.git/shortlog/refs/heads/morestats3 >> >> Here's a Shadow log file containing all new events: >> >> https://people.torproject.org/~karsten/volatile/scallion.log.gz >> >> Here's the Java program that I used to parse the Shadow log file: >> >> https://people.torproject.org/~karsten/volatile/ParseProposal218Events.java >> >> > Ugh. I really don't look forward to writing parsers for this. I guess there > may be few projects that actually require this information, and those that > do can use your code:) I hope that some of the complexity goes away once we use a parsing library like Stem. And I think we should provide parsing code to other people looking into this. >> And finally, here's the output, which should be easier to understand now: >> >> https://people.torproject.org/~karsten/volatile/scallion.txt >> >> Search for "Circuit [fileclient-60.1.0.0]:14" to find the circuit I >> mentioned earlier in this thread. >> >> >> Can you review the proposal changes and tell me if they make sense to you? >> >> Best, >> Karsten >> >> > See comments above. Overall, it looks good. I'm still wondering about the > UIDs / IDs issue. It may be that we don't want to include that for other > reasons, in which case the current implementation is fine. Okay, in that case let's consider proposal 218 done for the moment, unless we come up with a better idea to solve circuit-tracking thing. I asked Nick above to merge my changes, so that I can put proposal 218 on the sponsor F year 3 wiki page as one result of the February 28 milestone. (That doesn't mean we cannot make it even better after February.) Thanks! Karsten _______________________________________________ tor-dev mailing list [email protected] https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
