Re: [tp-devel] Structure - Questions, Questions

Tim Ansell Sat, 15 Sep 2007 23:00:14 -0700

> Actually I was more referring to tpe_util_parse_array().  Parse_packet()
> is the older one which has hard coded structure formats.  It's not
> particularly scalable.
> 
> http://git.thousandparsec.net/gitweb/gitweb.cgi?p=galaxie.git;a=blob;f=tpe_util.c;h=c81d11befdc395b06a452d1c84d5e9c65aa351a6;hb=2fdbebf7acb9d2ae18b6d8289975b32c3d156b9b#l538
> 
> > I was thinking something which is closer to a direct port of the python
> > xstruct.py
> > 
> > On a side note, shouldn't you be using specific type sizes in the file?
> > An int might be 64 bits on an AMD64 machine (while int32_t will always be
> > 32 bits).
> 
> a) Parse_packet does.
> b) Probably.
> c) Yet to find an ILP64 arch in general use, since most em64t (aka amd64),
> alpha, mips and sparcs are LLP64 or occasionally LP64; it's not a major
> issue practically.


Yes, but it makes the code more explicit and thus easier to understand.
There is no confusion about whether you mean an int32, an int64 or an
int16.

A lot of the below is included for completeness (rather than because
it is actually used at the moment).

> > The various formats I want to support, from
> > http://git.thousandparsec.net/gitweb/gitweb.cgi?p=libtpproto-py.git;a=blob;f=tp/netlib/xstruct.py
> > 
> > c Char
> 
> Why is 'c' different from 'b' at a data level?  From a protocol
> perspective they are the same.

Well, I prefer strong typing. A character and a byte are quite different
things at the "semantic level", even if they are represented identically
in the protocol.

It is the same difference for uint32 vs int32 vs sint32. They all have
the same representation in the computer (32 bits) but the bits have a
different meaning. 

You might easily implement them in C using something like,
 case 'i':
 case 'I':
 case 'j':
    <code to extract them>
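
To make that concrete, here is a minimal sketch of the fall-through idea.
The helper names (read_be32, parse_field32) are made up for illustration and
are not taken from galaxie, and I am assuming the wire format is big-endian
(network order).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian (network order) 32-bit value from buf. */
static uint32_t read_be32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}

/* Hypothetical helper: decode one 32-bit field for format character fmt.
 * Returns the number of bytes consumed, or 0 for an unknown character. */
static size_t parse_field32(char fmt, const unsigned char *buf, uint32_t *out)
{
    switch (fmt) {
    case 'i': /* Int32 */
    case 'I': /* UInt32 */
    case 'j': /* SInt32 (semi-signed) */
        /* Same 32 bits on the wire; the caller decides what they mean. */
        *out = read_be32(buf);
        return 4;
    default:
        return 0;
    }
}
```

The three format characters share one code path, but the format string
still records which meaning was intended.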


> > b Int8       (8 bit integer)
> > B UInt8      (8 bit unsigned integer)
> 
> The only byte size data I could find in the TP protocol was for the
> header.  This needs to be handled specially anyway (the tpe_msg code
> handles it).  Did I miss these? I originally started with these, but
> dropped them as I couldn't see them in use.

I parse the header the same way I parse everything else (I pass the
string "4sIII" to xstruct.py). The new TP04 header would be "2sBBIII",
and hence requires these types.
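
As a rough C equivalent of parsing "4sIII", something like the sketch below
would do. Only the layout (a 4 byte string followed by three unsigned 32 bit
integers) comes from the format string; the struct and field names are my
assumptions for illustration, and big-endian byte order is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field names are assumptions; only the "4sIII" layout is given. */
struct tp_header {
    char     magic[4];   /* "4s": four raw bytes, not NUL-terminated */
    uint32_t sequence;   /* "I" */
    uint32_t type;       /* "I" */
    uint32_t length;     /* "I" */
};

static uint32_t hdr_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse a 16 byte header. Returns 0 on success, -1 if buf is too short. */
static int parse_header(const unsigned char *buf, size_t len,
                        struct tp_header *h)
{
    if (len < 16)
        return -1;
    memcpy(h->magic, buf, 4);
    h->sequence = hdr_be32(buf + 4);
    h->type     = hdr_be32(buf + 8);
    h->length   = hdr_be32(buf + 12);
    return 0;
}
```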

> > h Int16      (16 bit integer)
> > H UInt16     (16 bit unsigned integer)
> > n SInt16     (16 bit semi-signed integer)
>
> Are there any in TP protocol?

Not at the moment, but they are trivial to implement, so why not include
them?

> > i Int32      (32 bit integer)
> > I UInt32     (32 bit unsigned integer)
> 
> There is no need to differentiate between signed and unsigned at the
> wire level.  So I don't ;-)

The parser should still accept these as separate format characters; see
above for why.

> > q Int64      (64 bit integer)
> > Q UInt64     (64 bit unsigned integer)
> 
> Handle this: See 'l'.
> /me notes 'q' is a bit flaky - it's used extensively on the BSD based
> unices, but unfortunately the size is not always consistent.  

Well, we can define it as a 64 bit integer, right? If so, then there is
no confusion.

> > f float      (32 bit floating point number)
> > d double     (64 bit floating point number)
> 
> What format?  Floats and exponents aren't always handled that cleanly.
> The number of bits in either aren't really portable.  

> Heck I can't even be sure that a 64 bit float exists.
> And there isn't a float32_t that I am aware of.

Well, we have yet to define these - so we can ignore them for now. In
the future I think we will need them however. 

I think we should refer to the various IEEE standards for floating point
numbers, as they turn out to be quite complicated (i.e. things like NaN,
+Inf, -Inf, +0, -0, etc).
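
If we did settle on IEEE 754 single precision sent big-endian (nothing is
defined yet, so both are assumptions), decoding could look like the sketch
below. It also assumes the host's float is IEEE 754 binary32, and the
function name is made up.

```c
#include <stdint.h>
#include <string.h>

/* Compile-time check: refuse to build if float is not 32 bits wide. */
typedef char float_must_be_32_bits[sizeof(float) == 4 ? 1 : -1];

/* Decode a big-endian IEEE 754 binary32 value from buf.
 * Assumes the host float uses the same IEEE 754 binary32 layout. */
static float read_be_float(const unsigned char *buf)
{
    uint32_t bits = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                    ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    float f;

    /* memcpy reinterprets the bits without breaking aliasing rules. */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Special values such as NaN and the infinities pass through unchanged,
because the bit pattern is copied rather than converted.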

> > S String
> 
> Done.

Okay.

> > [ List Start (unsigned int32 length)
> > ] List End 
> 
> The main thing this function does is the list.  Needs work on lists in
> lists... but that is, as they say... another matter.

I'm not sure what you mean. In xstruct.py a list is handled with a plain
recursive call.

Something like,

parse_structure
 parse_list
   for i in range(0, parse_int):
     parse_structure
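
In C the same recursion might look like the sketch below. It only walks the
data and counts fields rather than storing anything, it handles just 'I',
'S' and '[...]', and it assumes big-endian integers and strings sent as a
32 bit length followed by the bytes. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t walk_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Given a pointer to '[', return the matching ']' (or NULL). */
static const char *find_close(const char *open)
{
    int depth = 0;
    const char *p;

    for (p = open; *p != '\0'; p++) {
        if (*p == '[')
            depth++;
        else if (*p == ']' && --depth == 0)
            return p;
    }
    return NULL;
}

/* Walk fields until the end of fmt or a closing ']', advancing *pos.
 * Returns the character that stopped the walk, or NULL on bad input. */
static const char *walk_structure(const char *fmt, const unsigned char **pos,
                                  const unsigned char *end, unsigned *nfields)
{
    while (*fmt != '\0' && *fmt != ']') {
        uint32_t n, i;

        if (end - *pos < 4)
            return NULL;              /* every field starts with 4 bytes */
        n = walk_be32(*pos);
        *pos += 4;

        if (*fmt == 'I') {
            (*nfields)++;
            fmt++;
        } else if (*fmt == 'S') {     /* n is the string length */
            if ((size_t)(end - *pos) < n)
                return NULL;
            *pos += n;
            (*nfields)++;
            fmt++;
        } else if (*fmt == '[') {     /* n is the item count */
            const char *close = find_close(fmt);

            if (close == NULL)
                return NULL;
            for (i = 0; i < n; i++)   /* the recursive call */
                if (walk_structure(fmt + 1, pos, end, nfields) != close)
                    return NULL;
            fmt = close + 1;
        } else {
            return NULL;              /* unknown format character */
        }
    }
    return fmt;
}
```

Lists in lists need no extra code: the inner '[' is just another field
seen by the recursive call.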

> > { List Start (unsigned int64 length)
> > } List End
> 
> Is this used anywhere?  And in what case are we going to be sending more
> than 4 billion items whose total capacity is less than 1 meg?

You are probably right, we can remove this.

> > j SInt32     (32 bit semi-signed integer)
> > p SInt64     (64 bit semi-signed integer)

See above.

> > t timestamp  (32 bit unsigned integer)
> > T timestamp  (64 bit unsigned integer)
>
> All synonyms for one of short, int or long.

Again see above, these have a "semantic difference", even if they all end up in
the same code path when implemented in C.

> > Some examples would be,
> > Board           - "ISSIT"
> > Category        - "ITSS"
> > Component       - "IT[I]SSS[IS]"
> > Connect         - "S"
> > Design          - "jT[I]SSjj[II]S[IS]"
> > 
> > However, lists of structures don't really map well to C as they are not
> > even constant size thanks to things like "[IS]".
> 
> Not really - we can recurse and return a pointer to go into another
> struct.  
> 
> And a size pointer... 
> 
> It is the plan.

I'm not sure what you mean?

> > --------------
> > 
> > Maybe something like, 
> > 
> > /* In parse.h */
> > typedef struct {
> >         uint32_t len;
> >         char    *s;
> > } string_t;
> > 
> > typedef uint32_t sint32_t;
> > 
> > /* Defined locally */
> > struct my_list {
> >         int32_t  i;
> >         string_t s;
> > };
> > 
> > void do_something() {
> >         sint32_t        arg1; 
> >         struct my_list* arg2;
> >         
> >         p = parse_packet(p, "j[IS]", &arg1, &arg2);
> 
> The problem with this is it requires a particular struct layout and a
> predefined structure - this is the way tpe_util_parse_packet currently
> does it.  It works, but it's pants.

It should be pretty trivial to define a preprocessor macro which
converts something like "il" into a C structure.

Something like;

STRUCT("il", "int1", "long1") ->

struct {
 uint32_t int1;
 uint64_t long1;
};

This would take most of the hassle out of the actual definition? Or is
that not the "pitta" part?
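
One caveat: the C preprocessor cannot split a string literal like "il" into
characters, so the fields would have to be passed as separate macro
arguments. A sketch using the "X-macro" trick, which generates both the
struct and the format string from one list (all names here are made up):

```c
#include <stdint.h>

/* Hypothetical field list: X(format char, C type, member name). */
#define TUT_IL_FIELDS(X)      \
    X('i', uint32_t, int1)    \
    X('l', uint64_t, long1)

/* Two ways of expanding each entry of the list. */
#define AS_MEMBER(fmt, type, name) type name;
#define AS_FORMAT(fmt, type, name) fmt,

/* Expands to: struct tut_il { uint32_t int1; uint64_t long1; }; */
struct tut_il {
    TUT_IL_FIELDS(AS_MEMBER)
};

/* Expands to { 'i', 'l', '\0' }, i.e. the string "il". */
static const char tut_il_format[] = { TUT_IL_FIELDS(AS_FORMAT) '\0' };
```

The struct and its format string cannot drift apart, since both come from
the same list.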

> tpe_util_parse_array is a little more sophisticated.
> 
> So it takes something like:
> ioff = (char *)&tut_il->i - (char *)tut_il;
> loff = (char *)&tut_il->l - (char *)tut_il;
> tut_il = tpe_util_parse_array(tutildata, tutildata + sizeof(tutildata),
>                       sizeof(struct tut_il),"[il]", ioff,loff);
> 
> (I do intend to put some wrappers in to make this a little easier).  
> This allows you to point to parts of structures and the like.

Okay, what the heck does the above do? :)

> > Another question is, should we return pointers to the bits in the string
> > (and then let the person copy/malloc the bits they need) or should you
> > malloc for them?
> 
> Malloc for them; the string itself is in the wire format and endianness.
> So don't really want higher level code to need to deal with byte
> swapping.  If they want to copy again, it's all fine.

If I give you a char* (with an associated S), you will malloc the data
and put the address of the new string in the char*?

While, with a uint32_t* (with an associated I), you will just copy the
data into the given location?
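
In other words, is it something like the sketch below? The function names
are hypothetical, and I am assuming big-endian integers and strings sent as
a 32 bit length followed by the bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static uint32_t own_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* 'I': copy the value into storage the caller already owns.
 * Returns the position after the field, or NULL if data is truncated. */
static const unsigned char *parse_I(const unsigned char *p,
                                    const unsigned char *end, uint32_t *out)
{
    if (end - p < 4)
        return NULL;
    *out = own_be32(p);
    return p + 4;
}

/* 'S': allocate a NUL-terminated copy and store its address in *out.
 * The caller owns the result and must free() it. */
static const unsigned char *parse_S(const unsigned char *p,
                                    const unsigned char *end, char **out)
{
    uint32_t len;
    char *s;

    if (end - p < 4)
        return NULL;
    len = own_be32(p);
    p += 4;
    if ((size_t)(end - p) < len)
        return NULL;
    s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, p, len);
    s[len] = '\0';
    *out = s;
    return p + len;
}
```

With this split, integers never need freeing, while every 'S' field
hands the caller one allocation to release.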

> tpe_util_parse_packet currently will reallocate buffers and copy data up
> for things like arrays of resources, so it will manage the C memory for
> you.

I'm not sure I understand what this means? How does it handle freeing
the memory? (It's not like you have a garbage collector or anything...)

> I intend to do similar for the _parse_array function.
> 
>       Regards,
>       nash

Sorry about taking so long to reply to this, been busy with a new job
and all.

Mithro

