Re: [tor-dev] Torsocks reengineered
Hello! Comments below. warms0x: (snip!) 4) One of the biggest technical issues is that it's not thread safe and fixing it right now would add an important impact on performance adding locks to multiple global data structures. A clean rewrite would allow to take into account this *very* important feature from the start without any ugly hacks in the current code base. Please be extra careful with locking, especially since you're going to redesign this. I haven't personally reviewed torsock's current data structure use, but in general, when somebody looks at a codebase and says this code is not threadsafe - it needs more locking around its data structure use, that person is usually wrong, or at best only half right. It's almost always better to get the locking out of your critical path and use thread local storage, message passing, and/or job scheduling instead of direct lock acquisition+release for commonly used data structures. Since everybody is piling on advice I might as well throw in some of my own, bear in mind I don't claim to fully understand the extent of what torsocks is capable of. Doesn't it's behavior/functionality map well to a libev/libevent loop instead of a multi-threaded model? You can see Torsocks as an inprocess library that is loaded using LD_PRELOAD and hijacks a good amount of symbols of the Libc to make sure any communications goes through the Tor network. Now, this means that Torsocks is quite dependent on the application threading model which means that loading into an application that uses a lot of threads brings the possibility for anything *in* Torsocks to be concurrent. Using a big fat lock would solve the issue but the performance hit would be quite important and it's undesirable. I sent some days ago the locking design to handle such a thing. Maybe you could have a look at it if you are interested? :) Cheers! David It does mean you likely need to approach the problem from a slightly different angle, but with the I/O calls that are being made, it would provide a good high performance alternative without a lot of concurrency overhead. I highly recommend reading some of the redis project's source code (https://github.com/antirez/redis), which also relies on libev and is quite readable (as far as C goes). - warms0x --- xmpp: warm...@riseup.net http: http://warms0x.github.com ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev signature.asc Description: OpenPGP digital signature ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
(snip!) 4) One of the biggest technical issues is that it's not thread safe and fixing it right now would add an important impact on performance adding locks to multiple global data structures. A clean rewrite would allow to take into account this *very* important feature from the start without any ugly hacks in the current code base. Please be extra careful with locking, especially since you're going to redesign this. I haven't personally reviewed torsock's current data structure use, but in general, when somebody looks at a codebase and says this code is not threadsafe - it needs more locking around its data structure use, that person is usually wrong, or at best only half right. It's almost always better to get the locking out of your critical path and use thread local storage, message passing, and/or job scheduling instead of direct lock acquisition+release for commonly used data structures. Since everybody is piling on advice I might as well throw in some of my own, bear in mind I don't claim to fully understand the extent of what torsocks is capable of. Doesn't it's behavior/functionality map well to a libev/libevent loop instead of a multi-threaded model? It does mean you likely need to approach the problem from a slightly different angle, but with the I/O calls that are being made, it would provide a good high performance alternative without a lot of concurrency overhead. I highly recommend reading some of the redis project's source code (https://github.com/antirez/redis), which also relies on libev and is quite readable (as far as C goes). - warms0x --- xmpp: warm...@riseup.net http: http://warms0x.github.com ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
David Goulet: Of what I can see, proxychains and tsocks have not been updated since 2006 and 2002 (http://sourceforge.net/projects/proxychains/). I'm not sure how alive these projects are. The sourceforge project is dead. There are two forks, maybe alive by your definition. 8 months ago: https://github.com/haad/proxychains 1 month ago: https://github.com/rofl0r/proxychains ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
On Mon, Jun 03, 2013 at 09:38:30PM -0400, David Goulet wrote: There might be some rebasing going on in that branch in the next weeks so please don't consider it right now as git stable since I'm in the heavy part of getting everything together before starting a normally growing master branch. If you like to contribute or help, let me know and we'll make something work. Big thanks to nickm, ioerror and sysrqb for their help on all this. David, Does the current version of torsocks support Optimistic Data? That saves a round trip through the Tor network, and makes things snappier. Tor clients, servers, and recently the Tor Browser now support it. Regardless, if you're rewriting torsocks, you might ensure the new version has such support. (Ticket #3711 is about this; there appears to be an unmerged branch for it.) - Ian ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
On 6/4/13 2:08 PM, Ian Goldberg wrote: David, Does the current version of torsocks support Optimistic Data? That saves a round trip through the Tor network, and makes things snappier. Tor clients, servers, and recently the Tor Browser now support it. I'd also say that you should consider extended socks code support, to know if a Tor Hidden Service exists or not: https://trac.torproject.org/projects/tor/ticket/6031 Fabio ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
Mike Perry: David Goulet: Hi everyone, About a week ago I've sent an email to the Torsocks maintainers to address some concerns I had about important issues of the current code base. For the reasons found below, I proposed a rewrite that I am willing to do and submit it to the community once I feel ok with it. Here is the repository I've created and started working on the rewrite. It has been branched from upstream master at commit e0bf65b5d3ca0e28b827c08d80b0fe1841d5a149 https://github.com/dgoulet/torsocks/tree/rewrite Torsocks rewrite reasons: 1) The code needs to be easier to maintain, so that it isn't sporadically maintained. Considering the number of issues (see below), it needs more love on a regular basis right now. 2) It has never been safe from the start so a rewrite will NOT impact the current state of security. There is not much tests right now. 4) One of the biggest technical issues is that it's not thread safe and fixing it right now would add an important impact on performance adding locks to multiple global data structures. A clean rewrite would allow to take into account this *very* important feature from the start without any ugly hacks in the current code base. Please be extra careful with locking, especially since you're going to redesign this. Of course. :) I haven't personally reviewed torsock's current data structure use, but in general, when somebody looks at a codebase and says this code is not threadsafe - it needs more locking around its data structure use, that person is usually wrong, or at best only half right. It's almost always better to get the locking out of your critical path and use thread local storage, message passing, and/or job scheduling instead of direct lock acquisition+release for commonly used data structures. This is not just for performance reasons. Deadlocks are forever. Library call hooking like this is especially tricky, because it could easily lead you to subtle instances of lock order reversal deadlock if you attempt to acquire your data structure lock before an OS/app lock via one socket call path, but after it in another (for example, due to OS/app abstractions that cause certain socket calls to be merely wrappers for combinations of others). Such app-specific and race-prone deadlock conditions will be very annoying to track down. Careful design of how you deploy locking and concurrency is a very wise time investment for this reason. You should probably post a design doc and ask for review of it. That makes sense. I'm not sure yet how complex will be the locking model given that there is simply none right now, I'll surely submit a design proposal if it becomes non trivial. I come from a Linux kernel and RCU (https://en.wikipedia.org/wiki/Read-copy-update) background so I'm used to heavily document any locking scheme and create design proposal before making any production code. So, I agree 100% with you on the importance of thinking this through! Thanks! David 5) This is a very, and I can't stress this enough, _VERY_ intrusive library so extra care MUST be put in making it bullet proof in terms of error handling. Right now, there is multiple call sites that can fail on error. ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev signature.asc Description: OpenPGP digital signature ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
Fabio Pietrosanti (naif): On 6/4/13 2:08 PM, Ian Goldberg wrote: David, Does the current version of torsocks support Optimistic Data? That saves a round trip through the Tor network, and makes things snappier. Tor clients, servers, and recently the Tor Browser now support it. Right now it does not but sysrqb did patches for that. Indeed, this rewrite _must_ support that. I'd also say that you should consider extended socks code support, to know if a Tor Hidden Service exists or not: https://trac.torproject.org/projects/tor/ticket/6031 I'll certainly add this feature! Thanks for the pointers! David Fabio ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev signature.asc Description: OpenPGP digital signature ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] Torsocks reengineered
David Goulet: Hi everyone, About a week ago I've sent an email to the Torsocks maintainers to address some concerns I had about important issues of the current code base. For the reasons found below, I proposed a rewrite that I am willing to do and submit it to the community once I feel ok with it. Here is the repository I've created and started working on the rewrite. It has been branched from upstream master at commit e0bf65b5d3ca0e28b827c08d80b0fe1841d5a149 https://github.com/dgoulet/torsocks/tree/rewrite Torsocks rewrite reasons: 1) The code needs to be easier to maintain, so that it isn't sporadically maintained. Considering the number of issues (see below), it needs more love on a regular basis right now. 2) It has never been safe from the start so a rewrite will NOT impact the current state of security. There is not much tests right now. 4) One of the biggest technical issues is that it's not thread safe and fixing it right now would add an important impact on performance adding locks to multiple global data structures. A clean rewrite would allow to take into account this *very* important feature from the start without any ugly hacks in the current code base. Please be extra careful with locking, especially since you're going to redesign this. I haven't personally reviewed torsock's current data structure use, but in general, when somebody looks at a codebase and says this code is not threadsafe - it needs more locking around its data structure use, that person is usually wrong, or at best only half right. It's almost always better to get the locking out of your critical path and use thread local storage, message passing, and/or job scheduling instead of direct lock acquisition+release for commonly used data structures. This is not just for performance reasons. Deadlocks are forever. Library call hooking like this is especially tricky, because it could easily lead you to subtle instances of lock order reversal deadlock if you attempt to acquire your data structure lock before an OS/app lock via one socket call path, but after it in another (for example, due to OS/app abstractions that cause certain socket calls to be merely wrappers for combinations of others). Such app-specific and race-prone deadlock conditions will be very annoying to track down. Careful design of how you deploy locking and concurrency is a very wise time investment for this reason. You should probably post a design doc and ask for review of it. 5) This is a very, and I can't stress this enough, _VERY_ intrusive library so extra care MUST be put in making it bullet proof in terms of error handling. Right now, there is multiple call sites that can fail on error. -- Mike Perry signature.asc Description: Digital signature ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev