[Pyogp] Design of the messaging system (was) Re: Refactoring

Enus Linden Wed, 13 Aug 2008 08:25:14 -0700

Moving this to the list. (I took the liberty of pasting this mail's predecessor below as well.)


This is an excellent discussion...


Locklainn wrote:

Here is a proposal of mine of how we can change the message system to incorporate Tao's suggestions (more Pythonic creation of messages) as well as some other features (having a connection that messages are sent to).



   Overall Design

The user should never have to see any serialized form of a message. He or she should always use the higher level classes like Packet that are sent and received. Serialization and deserialization are done not by the user but by the UDPConnection.







     MessageSystem -> UDPConnection

Represents a udp connection that can send and receive  packets.


*Responsibilities:*

   * Keeping track of all the circuits it is connected to, or has
     sent/received messages on (it seems people want the connection to
     represent only a single circuit)
   * When sending packets, add flags and sequence number, have a
     serializer serialize them and send them to the host
   * When receiving packets, deserialize them, keeping track of the
     flags that were deserialized
         o Accounting:
               + keeping track of the packets we received that need acks
               + keeping track of the packets we sent that weren’t acked
               + sending acks to those we need to ack
               + resending packets that didn’t get acked

*Changes:*

   * No longer has the responsibility of creating messages
   * No longer has the responsibility of getting data from messages
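
The accounting duties listed above could be sketched roughly like this. It is only an illustration: `AckTracker` and its method names are made up for this sketch and are not existing pyogp classes, and a real version would also need timers to decide when to resend.

```python
class AckTracker:
    """Hypothetical per-circuit bookkeeping; all names are illustrative."""

    def __init__(self):
        self.next_sequence = 1
        self.unacked_sent = {}   # sequence number -> packet we sent reliably
        self.acks_to_send = []   # sequence numbers the peer expects us to ack

    def on_send(self, packet, reliable):
        """Assign a sequence number; remember reliable packets until acked."""
        sequence = self.next_sequence
        self.next_sequence += 1
        if reliable:
            self.unacked_sent[sequence] = packet
        return sequence

    def on_receive(self, sequence, reliable):
        """Queue an ack if the peer marked the packet as reliable."""
        if reliable:
            self.acks_to_send.append(sequence)

    def on_ack(self, sequence):
        """The peer acked one of our packets; stop tracking it."""
        self.unacked_sent.pop(sequence, None)

    def packets_to_resend(self):
        """Everything sent reliably that has not been acked yet."""
        return list(self.unacked_sent.values())

    def take_pending_acks(self):
        """Hand over the queued acks and clear the queue."""
        acks, self.acks_to_send = self.acks_to_send, []
        return acks


tracker = AckTracker()
first = tracker.on_send("UseCircuitCode", reliable=True)
second = tracker.on_send("PacketAck", reliable=False)
tracker.on_ack(first)                  # the peer confirmed the reliable packet
tracker.on_receive(42, reliable=True)  # the peer wants packet 42 acked
```

Keeping this state in one small object per circuit would let the connection stay a thin layer over the socket.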




     Packet

Represents everything about a Packet that is needed to send and receive it. It also has methods to add data to the packet. The design says that the user shouldn’t set the flags or sequence number themselves; the UDPConnection does that.






*Responsibilities:*

   * Knows the flags it is being sent with
   * Knows its sequence number
   * Knows its payload
   * Methods to add blocks of data to the packet
   * Passed to a UDPConnection to be sent, and read from the receive



*Changes:*

   * Didn’t exist before. Exists because this design is a
     message-centric design that allows the user to have direct access
     to messages and manipulation of them.
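
As a rough sketch of what such a Packet could look like: the field names, the `add_block` method and the choice of a dataclass are assumptions for illustration, not the agreed design.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Illustrative only: field and method names are assumptions."""
    name: str
    blocks: list = field(default_factory=list)  # payload as (block_name, data) pairs
    flags: int = 0      # left alone by the user; the connection sets it on send
    sequence: int = 0   # likewise assigned by the connection

    def add_block(self, block_name, **data):
        """Append one block of data and return self so calls can be chained."""
        self.blocks.append((block_name, dict(data)))
        return self


packet = Packet("UseCircuitCode").add_block("CircuitCode", Code=1234)
```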






     MessageTemplateReader -> UDPDeserializer

Deserializes a buffer into a UDPPacket… hmmm… how does this work? Whoever is deserializing the buffer MUST know it’s a udp packet, so this must be done by the UDPConnection.




*Responsibilities:*

   * Determines if the buffer is a template message, decodes the data
   * Also deserializes the flags and sequence number as well





*Changes:*

   * It doesn’t work off of a single message anymore. The data isn’t
     gotten by doing a get_data on the reader. Instead, the
     deserializer outputs a Packet object which the user has direct
     access to.

   * Decoding the flags and sequence number wasn’t a responsibility
     before





     MessageTemplateBuilder -> UDPSerializer

Serializes a Packet into a buffer that can be sent.



*Responsibilities:*

   * Serializes a Packet, including its flags and sequence number,
     header, and payload.
   * Outputs a buffer that can be sent.



*Changes:*

   * This was previously done by the template builder, which also built
     the message (adding data and blocks to the message). These
     functionalities are now separated and the Packet is directly
     manipulated to add data, and the serializer is used to put it
     into network format.
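
To make the serializer/deserializer split concrete, here is a minimal round trip using the standard `struct` module. The wire layout (one flags byte followed by a 4-byte big-endian sequence number) is a simplification assumed for this sketch and should be checked against the protocol spec; the function names are hypothetical too.

```python
import struct

# Assumed layout for this sketch only: 1 flags byte, then a 4-byte
# big-endian sequence number, then the payload bytes.
HEADER = struct.Struct(">BI")


def serialize(flags, sequence, payload):
    """Pack the header in front of the already-encoded payload."""
    return HEADER.pack(flags, sequence) + payload


def deserialize(buffer):
    """Split a received buffer back into flags, sequence number and payload."""
    flags, sequence = HEADER.unpack_from(buffer)
    return flags, sequence, buffer[HEADER.size:]


wire = serialize(0x40, 7, b"payload")
```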


-------------------------------------------------------------------------
(previous mail - discussion between Tao and Lock, with Tao in blue....)

In my approach all domain knowledge is encapsulated in an abstract object (abstract not in an OO sense) and the same would be true for the connection. Right now I am not 100% sure what needs to be done when sending and receiving but I would assume that the Connection class will handle this.
I was thinking of a Packet class which holds a message and has a serializer which will take care of getting all information together.
What sort of flags would that be btw? Who defines which one gets set?
Right now I know about the reliable flag and I assume that this is either defined by the template or message.xml. So both could be known by the message.

The flags are added on by the client. Any packet can be reliable, any packet can be resent, any packet can have acks added on. Reliable means the server needs to ack it. Resent means this is a second (or more) attempt at the packet, and so may be a duplicate if it is received. The ack flag means that we have attached acks on the end of our message (saves network traffic). The templates don't define any of this and so the message itself can't know. These are added on at the time of sending.
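
Since the flags are chosen at send time rather than stored in the message, they could be modelled as a small flag set computed by whoever sends. In this sketch the bit values are placeholders and `SendFlags` / `flags_for_send` are hypothetical names; the real values belong to the protocol spec.

```python
from enum import IntFlag


class SendFlags(IntFlag):
    # Placeholder bit values for illustration only.
    NONE = 0
    ACK = 0x10       # acks are appended to the end of this packet
    RESENT = 0x20    # second (or later) attempt, may be a duplicate
    RELIABLE = 0x40  # the receiver must ack this packet


def flags_for_send(reliable, is_resend, has_acks):
    """Decide the flags at send time; the message itself knows none of this."""
    flags = SendFlags.NONE
    if reliable:
        flags |= SendFlags.RELIABLE
    if is_resend:
        flags |= SendFlags.RESENT
    if has_acks:
        flags |= SendFlags.ACK
    return flags


flags = flags_for_send(reliable=True, is_resend=False, has_acks=True)
```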

The Connection class would then have a list of to-be-acked packets and would basically do the same as the message system does.

Similarly, the server can make any packet ackable (by setting the send flag with the ack flag), and so we must ack it, otherwise the server gets angry. We don't know which packet we will need to ack, we have to determine that when we receive one.

It would follow the pattern seen in e.g. smtplib, urllib2 (where the Request is the message). And most network modules actually have a connection object, such as ftplib, nntplib, gopherlib etc. Not all have message classes though because if it's just a file you send, then there is no need to encapsulate this in a separate object.
Here also is another example of message objects in email:

http://docs.python.org/lib/message-objects.html

I'm starting to like the idea of a Message. Maybe this Message could be only the payload of the message, with a Packet class (I think you have suggested this) having the other necessary fields. Then, a serializer can serialize the packet and a net framework can send it on a connection.

Also be aware that connections will change during the lifetime of the client. You don't have a single udp connection. You communicate with neighboring sims, you may switch regions, etc. This causes you to create a new connection to send on. But also remember that for udp messages you don't need a connection, you can simply send the message to any given host. So, it may be extra to have a connection class doing such a thing because you can use a single connection to send and receive on. The target we are sending to changes, but we don't need to change the sockets or anything.
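
The point that one UDP socket can talk to several hosts can be shown with the standard `socket` module. In this sketch two sockets bound to the loopback address stand in for two simulators; the addresses and payloads are placeholders.

```python
import socket

# Two local sockets stand in for two simulators (placeholders, not real sims).
sim_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sim_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for sim in (sim_a, sim_b):
    sim.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    sim.settimeout(2.0)

# One client socket and no connect(): the target is chosen per sendto() call.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"to sim A", sim_a.getsockname())
client.sendto(b"to sim B", sim_b.getsockname())

data_a, sender_a = sim_a.recvfrom(1024)
data_b, sender_b = sim_b.recvfrom(1024)

for sock in (client, sim_a, sim_b):
    sock.close()
```

Both simulators see the same sender address, which is why a per-host "connection" object would be bookkeeping rather than a separate socket.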

*Current Design:*
This is taken from http://wiki.secondlife.com/wiki/Pyogp/Documentation/Specification/pyogp.lib.base
messenger = MessageSystem()
host = Host('sim_ip', 'sim_port')  # note: these aren't true values, of course
messenger.new_message("UseCircuitCode")
messenger.next_block("CircuitCode")
messenger.add_data('Code', circuit_code, MsgType.MVT_U32)
messenger.add_data('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID)
messenger.add_data('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
messenger.send_message(host)

_Explanation:_
The thing to know about the current design is that it is encapsulated into a MessageSystem. Everything from building, reading, sending, and receiving messages all occurs in the Message System (though each of the sets of functionality are performed by other objects that the system HAS).
I think this is quite a good explanation of where we differ. As said before, this feels very uncommon in the Python world to me.
One concern I also have is all the global state in these long-living objects. It doesn't have to, but it might lead to problems with threading or coroutines. I would try to keep locking zones as small as possible. I also might think of this scenario with coroutines:
- You create and send a message in coroutine A
- Sending blocks for whatever reasons
- Coroutine B gets activated, creates a message and sends it. Maybe with the same message system. It also blocks on sending.
- current_msg is now message of B and this is what A sends.
So this would mean that you need a separate message system per thread. This also means though that it's only one host you connect to per message system and thus the host could be in the constructor as it's quite fixed then.
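
The scenario above can be reproduced in a few lines. This sketch uses `asyncio` as a stand-in for whatever coroutine library would be used, and `SharedMessageSystem` is a toy class that only mimics the "current message held in the system" behaviour.

```python
import asyncio


class SharedMessageSystem:
    """Toy stand-in: keeps the message being built as shared state."""

    def __init__(self):
        self.current_msg = None
        self.sent = []

    def new_message(self, name):
        self.current_msg = name

    async def send_message(self):
        await asyncio.sleep(0)               # stand-in for a send that blocks
        self.sent.append(self.current_msg)   # sends whatever is current NOW


async def worker(system, name):
    system.new_message(name)
    await system.send_message()


async def main():
    system = SharedMessageSystem()
    # A and B share one system; B overwrites current_msg while A is blocked.
    await asyncio.gather(worker(system, "A"), worker(system, "B"))
    return system.sent


sent = asyncio.run(main())
```

Here A's message is never sent: both sends pick up B's message, which is the failure described in the list above.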

Yea, I do agree that it is confusing having the message remembered in state by the system, builders, and readers. I'm starting to like the idea of outputting a Message that the user adds data to and sends. Maybe the Message System could remain as the connection you send through and receive on, which automatically serializes outgoing packets and deserializes incoming packets, and keeps track of all acks and such.

In my approach you of course also just would have one Connection per thread/coroutine but additionally you could create messages e.g. outside a thread and pass them into a thread. The send method would also just have method-local variables it works with. Packet ID apparently is something which needs thinking ;-)
For the current design, you don't ever have direct access (handle, object, reference whatever it may be called) to a Message or a Connection. Building is delegated to the Message System, which, underneath the hood, is delegated to the appropriate builder. Sending is delegated to the Message System, which again, is delegated to the appropriate sender, this case being a udp_sender. Also note the user doesn't need to serialize or otherwise perform any functions on the built message.
One point I experienced in my programmer life was that delegation from one object to another (and maybe yet to another) makes debugging hard because you need to keep in mind which method was where (esp. if they are called the same). As I had to debug such systems I feel more comfortable with calls you perform directly on the object you actually want to change.
You’ll also notice that the type is given when adding data. This is not absolutely necessary to have (and can be removed). It is used as a user-check to make sure the user knows what type of data he or she is sending. This makes it a bit easier for coders to think their creation through, as well as other coders who look at it (it may be confusing to see adding a simple 1 where that 1 can be stored as a byte, an int, or a long).
In Python you don't care about this. If there is a 1 you mean 1 and you don't care how it's sent over the wire on a lower level of the system. Yes, you might run into a problem if you don't know the type but in my experience this rarely actually leads into problems. Having no type also makes coding faster as you have to type less and you don't have to consult the documentation.

So let's get rid of the type-checking, I'm fine with that. It IS just extra junk I don't feel like typing anyway :)

*Proposed Design 1:*

*A*

conn = UDPConnection(region)

msg = Message('UseCircuitCode',
        Block('CircuitCode',
            ('Code', circuit_code, MsgType.MVT_U32),
            ('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID),
            ('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
        )
)

conn.send(msg)


*B*

conn = UDPConnection(region)

msg = Message('UseCircuitCode')

block = Block('CircuitCode',
    ('Code', circuit_code, MsgType.MVT_U32),
    ('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID),
    ('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
)

msg.add(block)
conn.send(msg)

BTW, now that I look at it again I think a Message is just a list of blocks so it could even derive from a list object and add() would be append. Blocks seem like dicts to me with the exception that they have a name. But they could be more easily instantiated as
blk = Block('CircuitCode',
    Code=circuit_code,
    SessionID=sessionid,
    ID=agent_id)
msg.append(blk)
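
A minimal sketch of that idea, assuming nothing beyond what is said above (a Message is a named list of blocks, a Block is a named dict). The class bodies and the placeholder values are illustrative, not the pyogp implementation.

```python
class Block(dict):
    """A dict of fields that also carries a block name."""

    def __init__(self, name, /, **fields):
        # `name` is positional-only so a field may itself be called "name".
        super().__init__(**fields)
        self.name = name


class Message(list):
    """A list of blocks that also carries a message name."""

    def __init__(self, name, *blocks):
        super().__init__(blocks)
        self.name = name


circuit_code, session_id, agent_id = 1234, "session", "agent"  # placeholder values
blk = Block("CircuitCode", Code=circuit_code, SessionID=session_id, ID=agent_id)
msg = Message("UseCircuitCode")
msg.append(blk)
```

Because Message is a plain list, repeated blocks with the same name need no special handling.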
_Explanation:_
This takes the code for the current design and makes it more Pythonic. It essentially makes a wrapper class called Message, which can handle Pythonic structures, and can create a message like that of the current design.
In the A version of this design, the constructor takes in all the blocks and data and then would construct the message completely. The B version allows users to create blocks separately and add them into the message. These two methods could be combined, in fact.
You can of course also first create the blocks in separate vars and then pass them into the Message constructor: Message(name, blk1, blk2, blk3)
_Pros n Cons:_
This method allows us to keep most of the same design in place, with an additional layer that wraps the message creation to make it less sequential and more Pythonic. It cuts out all the calls to new_message, next_block, and add_data, allowing users to pass in more Python-structured data (in the form of lists).
This means less typing which to me is always a pro :)
With the above change even less typing.
Messages can have a variable number of blocks with the same name, so this method would consist of the user passing in a list of blocks rather than just a single block into the constructor. This is not too difficult to handle.
Having the constructor take the entire message may be complicated and visually difficult to parse for the user. It is also prone to syntax errors.
I actually see this completely the other way, esp. with

msg = Message('UseCircuitCode',
    Block('CircuitCode',
        Code=circuit_code,
        SessionID=sessionid,
        ID=agent_id)
    )
Of course if the message is more complex you would probably create blocks separately and then pass them in. But both would be possible.
It would also remove one bit of delegation (add_data) and methods would only be defined on those classes which actually implement them.
It also refactors the way the Message System, builders, and readers work. Some messages are template messages, which means the messages MUST be built according to the template. If they are not, then they shouldn’t be allowed to be built and sent. The builder makes sure this doesn’t happen. These designs get rid of a builder and put it directly into the message, which means the message IS the builder. When the message is being created, we somehow have to determine what type of message it is (template or llsd) and use the correct builder (or at least make sure messages are being built correctly).
I have one superclass Message from which I have derived LLSDXMLMessage and UDPMessage. There is a MessageFactory utility which can be used to create such a message:
factory = getUtility(IMessageFactory)
message = factory.new('UseCircuitCode')

You can then look into message.flavor to check the flavor.

To serialize either message you then do

serializer = ISerialization(message)
serializer.serialize()

This is the same pattern as in the rest of the library.

My first thought on this, though, was to create just an LLSDMessage class which doesn't know about its final encoding. This is decided at serialization time. I think this would follow the protocol structure more closely, as both types are actually equivalent.
The problem was that the template-based message was initialized from the template on instantiation, which the XML version could not always be because not every message is in the template. I am not sure if the initialization is necessary or just made to have default values here. I would think it's not necessary as the template is known in both approaches and you can also check for invalid blocks when you add new ones (might raise an InvalidBlock exception).
The serialization step in this case would look like above, just that the serializer would consult the MessageDict (which is a utility in my case).
There might then also be a MessageDispatcher which does the same so it knows over which channel to send this message (I guess for XML messages it's simply the cap we have and we do cap.POST(data)).
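
The idea of checking blocks against the template as they are added, instead of initializing the message from the template, might look like the sketch below. `TEMPLATE` is a tiny hand-written stand-in for the parsed message template, and `TemplateMessage` / `add_block` are hypothetical names; only the `InvalidBlock` exception name comes from the text above.

```python
class InvalidBlock(Exception):
    """Raised when a block does not match the message template."""


# Stand-in for the parsed template: message name -> block name -> field names.
TEMPLATE = {"UseCircuitCode": {"CircuitCode": {"Code", "SessionID", "ID"}}}


class TemplateMessage:
    def __init__(self, name):
        if name not in TEMPLATE:
            raise KeyError(name)
        self.name = name
        self.blocks = []

    def add_block(self, block_name, **fields):
        """Accept the block only if the template knows it with these fields."""
        allowed = TEMPLATE[self.name].get(block_name)
        if allowed is None or set(fields) != allowed:
            raise InvalidBlock(
                f"{block_name!r} does not match the template for {self.name}")
        self.blocks.append((block_name, fields))


msg = TemplateMessage("UseCircuitCode")
msg.add_block("CircuitCode", Code=1234, SessionID="s", ID="a")
try:
    msg.add_block("Bogus", Code=1)
    rejected = False
except InvalidBlock:
    rejected = True
```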

Right, so I'm thinking the Message System could do all this. Maybe the Message System could be the factory and dispatcher, with all messaging being sent and received going through it (but BUILDING messages not going through this).


*Proposed Design 2:*

msg = api.new_message('PacketAck')
msg.next_block('Packets')
msg.add_data('ID', 0x00000001, MsgType.MVT_U32)
msg.next_block('Packets')
msg.add_data('ID', 0x00000001, MsgType.MVT_U32)
data = api.serialize(msg)
connection = UDPConnection(host)
connection.send(data)



_Explanation:_
The new proposed design has a few differences. One is what the responsibilities of each of the objects are. You'll notice in this design you have direct access to the message. The message is also the builder, so you perform building operations directly on the message (whereas in the current design you use a builder to add data to the message). You'll also notice that you have direct access to the UDPConnection and therefore you direct the message to the connection you wish to send it to.
Actually I would prefer the design above with Blocks and Messages.
message (up to 500 I believe) will have its own unique class that will initialize the data attributes.
We would start with the ones we actually use in the library. If somebody needs to use an additional one he can still use the more low level version (Message('name', Block(...), Block(...)) ).
We also have to look at every message in the protocol spec anyway and define it there in detail. When we do this we can go along and define them in code as well. I am also willing to do that.
A pro here would be that you can put default values in the class so that you don't have to specify all parameters.
When receiving a message you would have the possibility to attach an event handler directly to that class using ZCA.
Another pro is that the user of this level doesn't have to know about blocks and the sequence of these. She only needs to know about the actual data to be passed in.

Well, we can do this with ZCA without deriving a class for each message. We can have them all implement an interface and register them with a certain name. This way we don't have to write each individual class, but can have a generic Message which can handle them all, with handlers. The default data can be added in by the Message Factory (which looks into the template and fills in the message with default values). I guess this is the problem then. If we have a single Message which builds itself (add_block methods), we cannot write a Message class which tests that the data being added is correct and expected. Unless the message itself is a UDP message derivation and can look at the message template itself and do the checking.

_Questions:_
The region domain stores the connections, both udp and http. So would sending a message be something like:
msg.send(region)

region.send(msg)

api.send(msg, region)

conn = UDPConnection(region)

conn.send(msg)
How are the UDP packet flags added onto the packets being sent? They are apparently not built into the message itself (because they are only UDP), so need to be added on when actually sending the message. These depend on how you want to send the message (want an ack) and so can vary per message, and they are not always the same even on a single circuit.
Do I get this right that the message type defines the flags needed?
I briefly looked into your code and I think I would do it similarly. You have all the data in MsgData without those flags and you add them on sending. I would maybe move some logic from the msgsystem into the packet like this:
def send(self, message):
    send_flags = ...
    packet = Packet(id, message, send_flags)
    packet.addAcks(self.acklist)  # might be in the constructor as well

    serializer = ISerialization(packet)
    packet_data = serializer.serialize()

    # what defines if it's a reliable packet?

    self.udp_client.send_packet(self.socket, packet_data, self.host)
This is just a quick shot without reading the code in detail so it might be wrong ;-)
The reason there are builders is because the template messages must have the correct data. The template builder makes sure that blocks and data being added to messages follow the template’s specification (LLSD has no format because it is going to be formatted into XML, and deserialized when being received, and so the arrays and dicts can be directly accessed). How is this accomplished without going through the builder? How do we distinguish between creating a template message (making sure it has the correct data) and an llsd message?
As said above, by the message factory. It gives you one of two classes of messages.
See http://svn.secondlife.com/trac/linden/browser/projects/2008/pyogp/pyogp.lib.base/branches/mrtopf-message-refactoring/pyogp/lib/base/message/message.py
in line 46 for the message factory. Message types follow below. This is not using blocks etc. as in the example above though.
Who does the message maintenance? Meaning, who keeps track of the packets that need to be acked, the ones we want acked, and resending messages that weren’t acked? Do we leave this up to the user to create such a system?
Some Connection class which seems to me similar to your message system and circuits.
BTW, what actually is a circuit? Is it a connection to a region? Or can you have many circuits to one region? This part of the protocol is not that clear to me right now. We probably should write it down if it isn't somewhere (but it should be part of the spec at some point anyway).

A circuit is a UDP connection. So, it is a UNIQUE connection to an IP address and port combination. There can only be 1 circuit for each IP and port combination.
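
Put as code, "one circuit per IP and port" is just a dictionary keyed by the (ip, port) pair. `CircuitManager` and the contents of a circuit record are invented for this sketch, and the addresses are documentation placeholders.

```python
class CircuitManager:
    """Sketch: at most one circuit per (ip, port) pair. Names are hypothetical."""

    def __init__(self):
        self._circuits = {}

    def get_circuit(self, ip, port):
        """Return the existing circuit for this host, creating it on first use."""
        key = (ip, port)
        if key not in self._circuits:
            self._circuits[key] = {"host": key, "next_sequence": 1}
        return self._circuits[key]


manager = CircuitManager()
a = manager.get_circuit("192.0.2.1", 13000)
b = manager.get_circuit("192.0.2.1", 13000)  # same ip and port: same circuit
c = manager.get_circuit("192.0.2.1", 13001)  # different port: a new circuit
```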

Thanks for your work, I'm starting to see where we can improve things. I'll start writing down my new proposal and see if we can get something working.

PS: I don't like the idea of ZCAifying things like the dictionary just so that we can register them with ZCA as a global utility. It is an extra abstraction that is confusing and the reasoning not clearly seen. Something else we can do?

