#2116: Application layer protocol for transferring RPC messages + utf8 decoding
error
-------------------+--------------------------------------------------------
 Reporter:  bro    |       Owner:        
     Type:  patch  |      Status:  new   
 Priority:  major  |   Milestone:  Future
Component:  other  |     Version:  1.3.5 
 Keywords:         |  
-------------------+--------------------------------------------------------
 == Introduction ==

 After upgrading my FreeBSD server I suddenly had problems getting the GTK
 client to show the torrent list (as well as the states and trackers info
 on the left). The CLI client wouldn't show the torrents with the ''info''
 command either. The symptoms were pretty much the same as in these bug
 reports, which I presume hit the same bug:
 http://dev.deluge-torrent.org/ticket/1531
 http://dev.deluge-torrent.org/ticket/2095

 Some of those who reported this problem have a lot of torrents (many
 hundreds), and I have about 1600 myself, so it looked like this would be
 a problem only for those with a lot of torrents.

 What happens is that the client can see the daemon being online, but when
 connecting, it will fail to populate the torrent list.
 ''If'' the torrent list is successfully populated, it continues to work
 with no problems.

 After a lot of testing, expecting it was a problem with corruption of the
 state files, I eventually found the cause, which of course, had nothing to
 do with the state files ;-)

 The problem actually turned out to be twofold.

 == Problem one - Application layer protocol for transferring data ==
 The first problem has to do with the application layer protocol, or the
 lack of such a protocol when it comes to transferring the data over the
 network. When RPC commands are sent over the network, they're usually
 successfully decompressed by ''zlib'' on the receiver, but not always.
 When transferring one command (say 20 bytes), with a big enough delay
 until the next command is sent, the 20 bytes are received on the other
 side, and the data is decompressed and decoded (by rencode).

 When transferring multiple messages within a short time period, these
 messages might be received as ''one'' on the other side. So when
 transferring two messages at the same time, one of 20 bytes, and one of 40
 bytes, the receiver might receive 60 bytes, and has no idea that it's
 actually two messages.

 The current protocol trusts that the ''zlib'' library will successfully
 decompress compressed messages that are concatenated by TCP when
 transferred. Usually it will successfully decompress the first message,
 but subsequent messages aren't successfully decompressed as far as I've
 been able to test.
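
 The behaviour described above can be reproduced with the standard ''zlib''
 module alone. This is only a minimal sketch, not the Deluge code: the two
 payloads are made up, and it uses a ''decompressobj'' so that the bytes
 left over after the first stream are visible.

```python
import zlib

# Two independent messages, each compressed separately by the sender.
# The payloads are made-up placeholders, not real Deluge RPC data.
first = zlib.compress(b"first RPC message")
second = zlib.compress(b"second RPC message")

# TCP is a byte stream, so the receiver may get both in a single read.
received = first + second

decoder = zlib.decompressobj()
decoded = decoder.decompress(received)

# The decoder stops at the end of the first zlib stream. Everything after
# it is left untouched in unused_data instead of being decoded.
print(decoded)
print(decoder.unused_data == second)

# A message that has only partly arrived cannot be decompressed at all.
try:
    zlib.decompress(first[:-1])
    truncated_failed = False
except zlib.error:
    truncated_failed = True
print(truncated_failed)
```

 Without a length or delimiter, the receiver has no reliable way to know
 where the first message ends and that a second one follows it.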

 Most messages are received separately, which is why this bug shows its
 ugly face only in rare cases.

 The '''big''' problem is that in some cases, the torrent list being
 transferred isn't successfully decompressed, and thereby lost to the
 client.

 The current deluge protocol will add newly received data onto the current
 buffer of old (but not successfully decompressed) data and then it tries
 to decompress the whole buffer. If this fails it will wait for more data.

 What happens when the torrent list is never populated, is that it fails to
 decompress the data in the buffer even after the whole torrent list has
 been transferred. The result is that it just keeps appending incoming data
 onto the buffer which results in no messages being received at all.

 I'm not sure if this is because the torrent list is received as the second
 message (i.e. it has been concatenated onto the data of another message on
 arrival), or if there is another reason the torrent list message fails to
 be decompressed.

 As far as I can see, the only way to solve this is to use an application
 layer protocol that knows how to separate messages.
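
 One common way to separate messages is to prefix each one with its length.
 The sketch below illustrates that idea only; it is not the code from the
 patch. The 4-byte big-endian header and the names ''pack_message'' and
 ''MessageReader'' are assumptions made for this example.

```python
import struct
import zlib

# Hypothetical header: body length as a 4-byte big-endian unsigned integer.
HEADER = struct.Struct("!I")


def pack_message(payload: bytes) -> bytes:
    """Compress one message and prefix it with the compressed length."""
    body = zlib.compress(payload)
    return HEADER.pack(len(body)) + body


class MessageReader:
    """Collects raw bytes and yields complete messages as they arrive."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes and return every message that is now complete."""
        self._buffer += data
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet, wait for more data
            body = bytes(self._buffer[HEADER.size:end])
            del self._buffer[:end]
            messages.append(zlib.decompress(body))
        return messages


# Two messages arriving back to back, split at an arbitrary point.
reader = MessageReader()
stream = pack_message(b"one") + pack_message(b"two")
early = reader.feed(stream[:5])   # header plus one byte: nothing complete
late = reader.feed(stream[5:])    # the rest: both messages come out
print(early, late)
```

 Because the reader knows exactly how many bytes belong to each message,
 it never hands ''zlib'' a partial or concatenated stream.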

 == Problem two - Decoding error with rencode ==

 After implementing a simple application layer protocol it turned out the
 torrent list still wouldn't load, because the ''rencode'' library fails to
 decode the transferred data with a
 {{{
 UnicodeDecodeError: 'utf8' codec can't decode byte 0xe8 in position 54:
 invalid continuation byte
 }}}
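
 The same kind of error can be triggered in plain Python whenever a byte
 string that is not valid UTF-8 is decoded as UTF-8. This is a sketch
 under an assumption: I have not confirmed what the offending data is, so
 the Latin-1 encoded string below is a made-up stand-in.

```python
# Hypothetical data: "cafè live" encoded as Latin-1, where è is byte 0xe8.
raw = b"caf\xe8 live"

error = None
try:
    raw.decode("utf8")
except UnicodeDecodeError as exc:
    # 0xe8 starts a multi-byte UTF-8 sequence, but the next byte (a space)
    # is not a continuation byte, so decoding fails at position 3.
    error = exc

print(error.reason, error.start)

# Decoding with the encoding the bytes were actually written in succeeds.
print(raw.decode("latin-1"))
```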

 This problem was easily fixed by replacing ''rencode'' with
 [http://docs.python.org/library/pickle.html ''pickle''].

 Other advantages of ''pickle'' (or more precisely cPickle) over
 ''rencode'' are speed in particular, and to a lesser extent size:

 {{{
 ~/test$ ./pickle_test.py
 Data size: 5125267 bytes

 pickle    dumps time: 0:00:00.939099
 pickle     dump size: 3470949
 cPickle   dumps time: 0:00:00.203606
 cPickle    dump size: 3426268
 rencode   dumps time: 0:00:00.448616
 rencode    dump size: 4033409

 pickle    loads time: 0:00:00.673550
 cPickle   loads time: 0:00:00.162311
 rencode   loads time: 0:00:00.690657
 }}}

 These are test results from the actual data of the torrent list
 transferred from my server (1600 torrents).

 == Patch ==
 I've implemented a class ''!DelugeTransferProtocol'' which fixes the
 issues I had, though I can't guarantee that it fixes the issues others
 have reported, or that it doesn't introduce other bugs ;-)

 This patch will render the client unable to communicate with any unpatched
 daemons.

 I've also implemented tests to verify sending and receiving of messages
 through the transfer protocol.


 
https://github.com/bendikro/deluge/commit/8e67c763ab7176bc16b871cac974eef74aaf324f

-- 
Ticket URL: <http://dev.deluge-torrent.org/ticket/2116>
Deluge <http://deluge-torrent.org/>
Deluge project
