moving this to the list. (I took the liberty of pasting this mail's predecessor below as well)

This is an excellent discussion...

Locklainn wrote:
Here is a proposal of mine of how we can change the message system to incorporate Tao's suggestions (more Pythonic creation of messages) as well as some other features (having a connection that messages are sent to)


   Overall Design

The user should never have to see any serialized form of a message. He or she should always use the higher level classes like Packet that are sent and received. Serialization and deserialization are done not by the user but by the UDPConnection.






     MessageSystem -> UDPConnection

Represents a udp connection that can send and receive  packets.


*Responsibilities:*

   * Keeping track of all the circuits it is connected to, or has
     sent/received messages on (it seems people want the connection to
     represent only a single circuit)
   * When sending packets, add flags and sequence number, have a
     serializer serialize them and send them to the host
   * When receiving packets, deserialize them, keeping track of the
     flags that were deserialized
         o Accounting:
               + keeping track of the packets we received that need acks
               + keeping track of the packets we sent that weren’t acked
               + sending acks to those we need to ack
               + resending packets that didn’t get acked

*Changes:*

   * No longer has the responsibility of creating messages
   * No longer has the responsibility of getting data from messages
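
The accounting responsibilities listed above could be sketched roughly as follows. This is only an illustration, not the library's code: the class name AckTracker and all of its methods are hypothetical, and packets are represented by plain strings to keep the example short.

```python
class AckTracker:
    """Bookkeeping for reliable packets on one circuit (hypothetical helper)."""

    def __init__(self):
        self.next_sequence = 1
        self.unacked = {}        # sequence number -> packet we sent reliably
        self.acks_to_send = []   # sequence numbers the peer expects us to ack

    def register_outgoing(self, packet, reliable):
        """Assign a sequence number and remember reliable packets."""
        sequence = self.next_sequence
        self.next_sequence += 1
        if reliable:
            self.unacked[sequence] = packet
        return sequence

    def register_incoming(self, sequence, reliable):
        """Remember that a reliable packet from the peer needs an ack."""
        if reliable:
            self.acks_to_send.append(sequence)

    def acknowledge(self, sequence):
        """The peer acked one of our packets, so stop tracking it."""
        self.unacked.pop(sequence, None)

    def take_acks(self):
        """Return and clear the acks that should go out with the next send."""
        acks, self.acks_to_send = self.acks_to_send, []
        return acks

    def packets_to_resend(self):
        """Everything still unacked is a candidate for resending."""
        return sorted(self.unacked.items())


tracker = AckTracker()
first = tracker.register_outgoing("UseCircuitCode", reliable=True)
second = tracker.register_outgoing("PacketAck", reliable=False)
tracker.register_incoming(sequence=7, reliable=True)
tracker.acknowledge(first)
```

A real connection would also need timeouts to decide when an unacked packet is due for a resend; that part is left out here.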




     Packet

Represents everything about a Packet that is needed to be sent and received. It also has methods to add data to the packet. The design says that the user shouldn’t set the flags or sequence number themselves, but the UDPConnection will do such a thing.





*Responsibilities:*

   * Knows the flags it is being sent with
   * Knows its sequence number
   * Knows its payload
   * Methods to add blocks of data to the packet
   * Passed to a UDPConnection to be sent, and read from the receive



*Changes:*

   * Didn’t exist before. Exists because this design is a
     message-centric design that allows the user to have direct access
     to messages and manipulation of them.
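
A minimal sketch of what such a Packet might look like, assuming the flags and sequence number stay empty until the connection fills them in. The attribute names and the add_block method are assumptions made for this example, not an agreed API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Packet:
    """A message name plus its blocks; flags and sequence are set on send."""
    name: str
    blocks: list = field(default_factory=list)
    flags: int = 0                   # filled in by the connection, not the user
    sequence: Optional[int] = None   # filled in by the connection, not the user

    def add_block(self, block_name, **fields):
        """Append one named block of data and return self for chaining."""
        self.blocks.append((block_name, dict(fields)))
        return self


packet = Packet("UseCircuitCode").add_block("CircuitCode", Code=1234)
```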






     MessageTemplateReader -> UDPDeserializer

Deserializes a buffer into a UDPPacket…..hmmm….how does this work? Whoever is deserializing the buffer MUST know it’s a udp packet, so this must be done by the UDPConnection.



*Responsibilities:*

   * Determines if the buffer is a template message, decodes the data
   * Also deserializes the flags and sequence number as well





*Changes:*

   * It doesn’t work off of a single message anymore. The data isn’t
     gotten by doing a get_data on the reader. Instead, the
     deserializer outputs a Packet object which the user has direct
     access to.
   * Decoding the flags and sequence number wasn’t a responsibility before




     MessageTemplateBuilder -> UDPSerializer

Serializes a Packet into a buffer that can be sent.



*Responsibilities:*

   * Serializes a Packet, including its flags and sequence number,
     header, and payload.
   * Outputs a buffer that is ready to be sent.



*Changes:*

   * This was previously done by the template builder, which also built
     the message (adding data and blocks to the message). These
     functionalities are now separated: the Packet is directly
     manipulated to add data, and the serializer is used to put it
     into network format.
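
To show the serializer and deserializer as two halves of one round trip, here is a small sketch using the standard struct module. The header layout (one flags byte followed by a 4-byte big-endian sequence number) is assumed for illustration only, and the payload is treated as bytes that were already encoded elsewhere.

```python
import struct

# Assumed layout for illustration only: 1 byte of flags, then a 4-byte
# big-endian sequence number, then the already-encoded payload.
HEADER = struct.Struct(">BI")


def serialize(flags, sequence, payload):
    """Prefix the payload with the packed flags and sequence number."""
    return HEADER.pack(flags, sequence) + payload


def deserialize(buffer):
    """Split a received buffer back into flags, sequence number and payload."""
    flags, sequence = HEADER.unpack_from(buffer)
    return flags, sequence, buffer[HEADER.size:]


wire = serialize(0x40, 1, b"payload")
```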



-------------------------------------------------------------------------
(previous mail - discussion between Tao and Lock, with Tao in blue....)


In my approach all domain knowledge is encapsulated in an abstract object (abstract not in an OO sense) and the same would be true for the connection. Right now I am not 100% sure what needs to be done when sending and receiving but I would assume that the Connection class will handle this.

I was thinking of a Packet class which holds a message and has a serializer which will take care of getting all information together.
What sort of flags would that be btw? Who defines which one gets set?
Right now I know about the reliable flag and I assume that this is either defined by the template or message.xml. So both could be known by the message.

The flags are added on by the client. Any packet can be reliable, any packet can be resent, any packet can have acks added on. Reliable means the server needs to ack it. Resent means this is a second (or more) attempt at the packet, and so may be a duplicate if it is received. The ack flag means that we have attached acks on the end of our message (saves network traffic). The templates don't define any of this and so the message itself can't know. These are added on at the time of sending.
The Connection class would then have a list of to be acked packages and would basically do the same as the message system does.

Similarly, the server can make any packet ackable (by setting the send flag with the ack flag), and so we must ack it, otherwise the server gets angry. We don't know which packet we will need to ack, we have to determine that when we receive one.
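
The three flags described above could be modelled as bit flags that are chosen at send time and inspected at receive time. This is a sketch only: the names SendFlags, flags_for_send and needs_ack are made up for the example, and the bit values are placeholders rather than the protocol's real values.

```python
from enum import IntFlag


class SendFlags(IntFlag):
    # Bit values are placeholders, not the protocol's actual values.
    ACK = 0x1        # acks are appended to the end of this packet
    RESENT = 0x2     # this is a second (or later) attempt at the packet
    RELIABLE = 0x4   # the receiver must ack this packet


def flags_for_send(reliable=False, resend=False, has_acks=False):
    """Build the flags at send time; the message itself never stores them."""
    flags = SendFlags(0)
    if reliable:
        flags |= SendFlags.RELIABLE
    if resend:
        flags |= SendFlags.RESENT
    if has_acks:
        flags |= SendFlags.ACK
    return flags


def needs_ack(received_flags):
    """Decided per received packet: only reliable packets must be acked."""
    return bool(received_flags & SendFlags.RELIABLE)
```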
It would follow the pattern seen in e.g. smtplib, urllib2 (where the Request is the message). And most network modules actually have a connection object, such as ftplib, nntplib, gopherlib etc. Not all have message classes though because if it's just a file you send, then there is no need to encapsulate this in a separate object.

Here also is another example of message objects in email:

http://docs.python.org/lib/message-objects.html

I'm starting to like the idea of a Message. Maybe this Message could be only the payload of the message, with a Packet class (I think you have suggested this) having the other necessary fields. Then, a serializer can serialize the packet and a net framework can send it on a connection.

Also be aware that connections will change during the lifetime of the client. You don't have a single udp connection. You communicate with neighboring sims, you may switch regions, etc. This causes you to create a new connection to send on. But also remember that for udp messages you don't need a connection, you can simply send the message to any given host. So, it may be extra to have a connection class doing such a thing because you can use a single connection to send and receive on. The target we are sending to changes, but we don't need to change the sockets or anything.

*Current Design: *
This is taken from http://wiki.secondlife.com/wiki/Pyogp/Documentation/Specification/pyogp.lib.base

messenger = MessageSystem()
host = Host('sim_ip', 'sim_port') #note: these aren't true values, of course
messenger.new_message("UseCircuitCode")
messenger.next_block("CircuitCode")
messenger.add_data('Code', circuit_code, MsgType.MVT_U32)
messenger.add_data('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID)
messenger.add_data('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
messenger.send_message(host)

_Explanation:_
The thing to know about the current design is that it is encapsulated into a MessageSystem. Everything from building, reading, sending, and receiving messages all occurs in the Message System (though each of the sets of functionality are performed by other objects that the system HAS).

I think this is a quite good explanation where we differ. As said before this feels very uncommon in the python world to me.

One concern I also have is all the sort of global state in these long living objects. It doesn't need to but might lead to problems with threading or coroutines. I would try to keep locking zones as small as possible. I also might think of this scenario with coroutines:

- You create and send a message in coroutine A
- Sending blocks for whatever reasons
- Coroutine B gets activated, creates a message and sends it. Maybe with the same message system. It also blocks on sending.
- current_msg is now message of B and this is what A sends.

So this would mean that you need a separate message system per thread. This also means though that it's only one host you connect to per message system and thus the host could be in the constructor as it's quite fixed then.
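
The scenario can be reproduced in a few lines with plain generators standing in for coroutines. The class SharedMessageSystem and the worker function are invented for this demonstration; the yield marks the point where sending blocks and another coroutine gets to run.

```python
class SharedMessageSystem:
    """Keeps the message under construction as instance state."""

    def __init__(self):
        self.current_msg = None
        self.sent = []

    def new_message(self, name):
        self.current_msg = name

    def send_message(self):
        self.sent.append(self.current_msg)


def worker(system, name):
    system.new_message(name)
    yield                      # sending "blocks" here; another coroutine runs
    system.send_message()


system = SharedMessageSystem()
a = worker(system, "MessageA")
b = worker(system, "MessageB")
next(a)                        # A builds its message, then blocks
next(b)                        # B builds its message on the same system, then blocks
for coroutine in (a, b):
    next(coroutine, None)      # resume each one until it finishes
```

Both coroutines end up sending MessageB, because A's message was overwritten while A was blocked. Passing the message object to send explicitly would remove the shared state.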

Yea, I do agree that it is confusing having the message remembered in state by the system, builders, and readers. I'm starting to like the idea of outputting a Message that the user adds data to and sends. Maybe the Message System could remain as the connection you send through and receive on, which automatically serializes sending packets and deserializes receiving packets, keep track of all acks and such.
In my approach you of course also just would have one Connection per thread/coroutine but additionally you could create messages e.g. outside a thread and pass them into a thread. The send method would also just have method-local variables it works with. Packet ID apparently is something which needs thinking ;-)

For the current design, you don't ever have direct access (handle, object, reference whatever it may be called) to a Message or a Connection. Building is delegated to the Message System, which, underneath the hood, is delegated to the appropriate builder. Sending is delegated to the Message System, which again, is delegated to the appropriate sender, this case being a udp_sender. Also note the user doesn't need to serialize or otherwise perform any functions on the built message.

One point I experienced in my programmer life was that delegation from one object to another (and maybe yet to another) makes debugging hard because you need to keep in mind which method now was where (esp. if they are called the same). As I had to debug such systems I feel more comfortable with calls you perform directly on the object you actually want to change.


You’ll also notice that the type is given when adding data. This is not absolutely necessary to have (and can be removed). It is used as a user-check to make sure the user knows what type of data he or she is sending. This makes it a bit easier for coders to think their creation through, as well as other coders who look at it (it may be confusing to see adding a simple 1 where that 1 can be stored as a byte, an int, or a long).

In Python you don't care about this. If there is a 1 you mean 1 and you don't care how it's sent over the wire on a lower level of the system. Yes, you might run into a problem if you don't know the type but in my experience this rarely actually leads to problems. Having no type also makes coding faster as you have to type less and you don't have to consult the documentation.
So let's get rid of the type-checking, I'm fine with that. It IS just extra junk I don't feel like typing anyway :)

*Proposed Design 1:*

*A*

conn = UDPConnection(region)

msg = Message('UseCircuitCode',
    Block('CircuitCode',
        ('Code', circuit_code, MsgType.MVT_U32),
        ('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID),
        ('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
    )
)

conn.send(msg)


*B*

conn = UDPConnection(region)

msg = Message('UseCircuitCode')

block = Block('CircuitCode',
    ('Code', circuit_code, MsgType.MVT_U32),
    ('SessionID', uuid.UUID(session_id), MsgType.MVT_LLUUID),
    ('ID', uuid.UUID(agent_id), MsgType.MVT_LLUUID)
)

msg.add(block)
conn.send(msg)


BTW, now that I look at it again I think a Message is just a list of blocks so it could even derive from a list object and add() would be append. Blocks seem like dicts to me with the exception that they have a name. But they could be more easily instantiated as

blk = Block('CircuitCode',
    Code=circuit_code,
    SessionID=sessionid,
    ID=agent_id)
msg.append(blk)
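
For illustration, the list/dict idea could be sketched like this. It is a rough sketch rather than a settled design; note that with this signature a block could not have a field that is itself called "name".

```python
class Block(dict):
    """A dict of field values that also carries a block name."""

    def __init__(self, name, **fields):
        super().__init__(**fields)
        self.name = name


class Message(list):
    """A named list of blocks; append() plays the role of add()."""

    def __init__(self, name, *blocks):
        super().__init__(blocks)
        self.name = name


blk = Block("CircuitCode", Code=1234, SessionID="session", ID="agent")
msg = Message("UseCircuitCode")
msg.append(blk)
```

Because Message is a list, repeated blocks with the same name simply become repeated entries, and Message(name, blk1, blk2) works through the same constructor.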



_Explanation:_

This takes the code for the current design and makes it more Pythonic. It essentially makes a wrapper class called Message, which can handle Pythonic structures, and can create a message like that of the current design.

In the A version of this design, the constructor takes in all the blocks and data and then would construct the message completely. The B version allows users to create blocks separately and add them into the message. These two methods could be combined, in fact.

You can of course also first create the blocks in separate vars and then pass them into the Message constructor: Message(name, blk1, blk2, blk3)


_Pros n Cons:_

This method allows us to keep most of the same design in place, with an additional layer that wraps the message creation to make it less sequential and more Pythonic. It cuts out all the calls to new_message, next_block, and add_data, allowing users to pass in more Python structured data (form of lists).

This means less typing which to me is always a pro :)
With the above change even less typing.

Messages can have multiple and variable number of blocks with the same name, so this method would consist of the user passing in a list of blocks rather than just a single block into the constructor. This is not too difficult to handle.

Having the constructor take the entire message may be complicated and visually difficult to parse for the user. It is also prone to syntax errors.

I actually see this the completely other way, esp. with

msg = Message('UseCircuitCode',
    Block('CircuitCode',
        Code=circuit_code,
        SessionID=sessionid,
        ID=agent_id)
    )

Of course if the message is more complex you would probably create blocks separately and then pass them in. But both would be possible.

It would also remove one bit of delegation (add_data) and methods would only be defined on those classes which they actually implement.


It also refactors the way the Message System, builders, and readers work. Some messages are template messages, which means the messages MUST be built according to the template. If they are not, then they shouldn’t be allowed to be built and sent. The builder makes sure this doesn’t happen. These designs get rid of a builder and put it directly into the message, which means the message IS the builder. When the message is being created, we somehow have to determine what type of message it is (template or llsd) and use the correct builder (or at least make sure messages are being built correctly).

I have one superclass Message from which I have derived LLSDXMLMessage and UDPMessage. There is a MessageFactory utility which can be used to create such a message:

factory = getUtility(IMessageFactory)
message = factory.new('UseCircuitCode')

You can then look into message.flavor to check the flavor.

To serialize either message you then do

serializer = ISerialization(message)
serializer.serialize()

This is the same pattern as in the rest of the library.

My first thought on this, though, was to create just an LLSDMessage class
which doesn't know about its final encoding. This is decided at serialization time. I think this would follow the protocol structure more closely, as both types are actually equivalent.

The problem was that the template-based message was initialized from the template on instantiation, which the XML version could not always be because not every message is in the template. I am not sure if the initialization is necessary or just made to have default values here. I would think it's not necessary as the template is known in both approaches and you can also check for invalid blocks when you add new ones (might raise an InvalidBlock exception).
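
Checking blocks against the template as they are added might look roughly like this. The TEMPLATE dict is a hypothetical stand-in for the parsed message template, and TemplateMessage and add_block are names invented for the example; only the InvalidBlock exception name comes from the discussion above.

```python
class InvalidBlock(Exception):
    """Raised when a block does not match the message template."""


# Hypothetical stand-in for the parsed template:
# message name -> block name -> expected field names.
TEMPLATE = {"UseCircuitCode": {"CircuitCode": {"Code", "SessionID", "ID"}}}


class TemplateMessage:
    def __init__(self, name):
        self.name = name
        self.blocks = []
        self._allowed = TEMPLATE[name]

    def add_block(self, block_name, **fields):
        """Accept the block only if the template knows it and its fields."""
        expected = self._allowed.get(block_name)
        if expected is None:
            raise InvalidBlock(f"{block_name!r} is not part of {self.name!r}")
        if set(fields) != expected:
            raise InvalidBlock(f"{block_name!r} expects fields {sorted(expected)}")
        self.blocks.append((block_name, fields))


message = TemplateMessage("UseCircuitCode")
message.add_block("CircuitCode", Code=1, SessionID="s", ID="a")
try:
    message.add_block("Packets", ID=1)   # not in this message's template
    rejected = False
except InvalidBlock:
    rejected = True
```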

The serialization step in this case would look like above just that the serializer would consult the MessageDict (which is a utility in my case).

There might then also be a MessageDispatcher which does the same so it knows over which channel to send this message (I guess for XML messages it's simply the cap we have and we do cap.POST(data)).
Right, so I'm thinking the Message System could do all this. Maybe the Message System could be the factory and dispatcher, with all messaging being sent and received going through it (but BUILDING messages not going through this).



*Proposed Design 2:*

msg = api.new_message('PacketAck')
msg.next_block('Packets')
msg.add_data('ID', 0x00000001, MsgType.MVT_U32)
msg.next_block('Packets')
msg.add_data('ID', 0x00000001, MsgType.MVT_U32)
data = api.serialize(msg)
connection = UDPConnection(host)
connection.send(data)



_Explanation:_
The new proposed design has a few differences. One is what the responsibilities of each of the objects are. You'll notice in this design you have direct access to the message. The message is also the builder, so you perform building operations directly on the message (whereas in the current design you use a builder to add data to the message). You'll also notice that you have direct access to the UDPConnection and therefore you direct the message to the connection you wish to send it to.

Actually I would prefer the design above with Blocks and Messages.

Each message (up to 500 I believe) will have its own unique class that will initialize the data attributes.

We would start with the ones we actually use in the library. If somebody needs to use an additional one he can still use the more low level version (Message('name', Block(...), Block(...)) ).

We also have to look at every message in the protocol spec anyway and define it there in detail. When we do this we can go along and define them in code as well. I am also willing to do that.

A pro here would be that you can put default values in the class so that you don't have to specify all parameters.

When receiving a message you would have the possibility to attach an event handler directly to that class using ZCA.

Another pro is that the user of this level doesn't have to know about blocks and the sequence of these. She only needs to know about the actual data to be passed in.
Well, we can do this with ZCA without deriving a class for each message. We can have them all implement an interface and register them with a certain name. This way we don't have to write each individual class, but can have a generic Message which can handle them all, with handlers. The default data can be added in by the Message Factory (which looks into the template and fills in the message with default values). I guess this is the problem then. If we have a single Message which builds itself (add_block methods), we cannot write a Message class which tests that the data being added is correct and expected. Unless the message itself is a UDP message derivation and can look at the message template itself and do the checking.
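
The "register by name instead of one class per message" idea can be shown without ZCA at all. The sketch below uses a plain dict as the registry; on_message and dispatch are hypothetical names, and a ZCA-based version would register named adapters or utilities instead.

```python
from collections import defaultdict

# Plain-dict registry: message name -> list of handler functions.
_handlers = defaultdict(list)


def on_message(name):
    """Decorator that registers a handler for one message name."""
    def register(func):
        _handlers[name].append(func)
        return func
    return register


def dispatch(name, payload):
    """Call every handler registered for this message name."""
    return [handler(payload) for handler in _handlers.get(name, [])]


@on_message("PacketAck")
def handle_packet_ack(payload):
    return sorted(payload["ids"])


results = dispatch("PacketAck", {"ids": [3, 1]})
```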

_Questions:_

The region domain stores the connections, both udp and http. So would sending a message be something like:

msg.send(region)

region.send(msg)

api.send(msg, region)

conn = UDPConnection(region)

conn.send(msg)

How are the UDP packet flags added onto the packets being sent? They are apparently not built into the message itself (because they are only UDP), so need to be added on when actually sending the message. These depend on how you want to send the message (want an ack) and so can vary per message, and they are not always the same even on a single circuit.

Do I get this right that the message type defines the flags needed?
I briefly looked into your code and I think I would do it similarly. You have all the data in MsgData without those flags and you add them on sending. I would maybe move some logic from the msgsystem into the packet like this:


def send(self, message):
    send_flags = ...
    packet = Packet(id, message, send_flags)
    packet.addAcks(self.acklist) # might be in the constructor as well

    serializer = ISerialization(packet)
    packet_data = serializer.serialize()

    # what defines if it's a reliable packet?

    self.udp_client.send_packet(self.socket, packet_data, self.host)

This is just a quick shot without reading the code in detail so it might be wrong ;-)


The reason there are builders is because the template messages must have the correct data. The template builder makes sure that blocks and data being added to messages follow the template’s specification (LLSD has no format because it is going to be formatted into XML, and deserialized when being received, and so the arrays and dicts can be directly accessed). How is this accomplished without going through the builder? How do we distinguish between creating a template message (making sure it has the correct data) and an llsd message?

As said above, by the message factory. It gives you one of two classes of messages.

See http://svn.secondlife.com/trac/linden/browser/projects/2008/pyogp/pyogp.lib.base/branches/mrtopf-message-refactoring/pyogp/lib/base/message/message.py

in line 46 for the message factory. Message types follow below. This is not using blocks etc. as in the example above though.

Who does the message maintenance? Meaning, who keeps track of the packets that need to be acked, the ones we want acked, and resending messages that weren’t acked? Do we leave this up to the user to create such a system?

Some Connection class which seems to me similar to your message system and circuits.

BTW, what actually is a circuit? Is it a connection to a region? Or can you have many circuits to one region? This part of the protocol is not that clear to me right now. We probably should write it down if it isn't somewhere (but it should be part of the spec at some point anyway).
A circuit is a UDP connection. So, it is a UNIQUE connection to an ip address and port combination. You can only have 1 circuit for each ip and port combination.
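
One way to picture the uniqueness rule is a table keyed by the (ip, port) pair, so asking twice for the same pair returns the same circuit. CircuitManager and get_circuit are hypothetical names, and the circuit is just a dict here.

```python
class CircuitManager:
    """Hands out exactly one circuit per (ip, port) combination."""

    def __init__(self):
        self._circuits = {}

    def get_circuit(self, ip, port):
        key = (ip, port)
        if key not in self._circuits:
            # Per-circuit state, e.g. its own sequence counter.
            self._circuits[key] = {"host": key, "next_sequence": 1}
        return self._circuits[key]


manager = CircuitManager()
first = manager.get_circuit("127.0.0.1", 13000)
again = manager.get_circuit("127.0.0.1", 13000)
other = manager.get_circuit("127.0.0.1", 13001)
```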

Thanks for your work, I'm starting to see where we can improve things. I'll start writing down my new proposal and see if we can get something working.

PS: I don't like the idea of ZCAifying things like the dictionary just so that we can register them with ZCA as a global utility. It is an extra abstraction that is confusing and the reasoning not clearly seen. Something else we can do?


