Re: Big and littleendian fields in one mspec

Łukasz Dywicki Tue, 14 Apr 2020 06:46:51 -0700

Hey Chris,
Sorry for the wrongly addressed mail; it was intended to go to the mailing list.


To the point - I think that what you propose, reordering fields
manually, shifts the load from the tool to the developer. It will lead to
further implementation errors caused by the need to reorder fields while
creating a spec. It will also be inconsistent with the rest of the spec,
where things are in "natural order" (big endian), except for the one or two
types which collect bit flags and would have to be declared in a different
order.
What I did in my naive implementation with the "little endian" flag on the
State type is re-arrange the fields at generation time.
It does not require any byte shifting at runtime and keeps the natural
order of fields, without the need to manually reorder flags in the mspec
declaration.
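
A minimal sketch of that generation-time re-arrangement, to make the idea
concrete. This is not the actual PLC4X generator code: the Field class and the
method names are hypothetical, and it assumes that no field crosses a byte
boundary and that the type is a whole number of bytes long. Fields are grouped
into bytes in declaration order and the byte groups are then emitted in
reverse.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FieldReorderSketch {

    /** Hypothetical stand-in for one mspec field: a name and a width in bits. */
    static final class Field {
        final String name;
        final int bits;

        Field(String name, int bits) {
            this.name = name;
            this.bits = bits;
        }
    }

    /** The State fields in the natural order in which they are declared in mspec. */
    static List<Field> stateFields() {
        return Arrays.asList(
            new Field("broadcast", 1), new Field("reserved", 7),
            new Field("initCommand", 1), new Field("updCommand", 1),
            new Field("timestampAdded", 1), new Field("highPriorityCommand", 1),
            new Field("systemCommand", 1), new Field("adsCommand", 1),
            new Field("noReturn", 1), new Field("response", 1));
    }

    /** Groups fields into bytes in declaration order, then reverses the byte groups. */
    static List<Field> toWireOrder(List<Field> declared) {
        List<List<Field>> byteGroups = new ArrayList<>();
        List<Field> current = new ArrayList<>();
        int bitsInGroup = 0;
        for (Field field : declared) {
            current.add(field);
            bitsInGroup += field.bits;
            if (bitsInGroup > 8) {
                // Assumption of this sketch: a field never spans two bytes.
                throw new IllegalArgumentException(
                    "Field '" + field.name + "' crosses a byte boundary");
            }
            if (bitsInGroup == 8) {
                byteGroups.add(current);
                current = new ArrayList<>();
                bitsInGroup = 0;
            }
        }
        if (bitsInGroup != 0) {
            throw new IllegalArgumentException("Type is not a whole number of bytes");
        }
        Collections.reverse(byteGroups);
        List<Field> wireOrder = new ArrayList<>();
        for (List<Field> group : byteGroups) {
            wireOrder.addAll(group);
        }
        return wireOrder;
    }

    /** Joins the field names with commas, for printing and comparison. */
    static String describe(List<Field> fields) {
        List<String> names = new ArrayList<>();
        for (Field field : fields) {
            names.add(field.name);
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        System.out.println(describe(toWireOrder(stateFields())));
    }
}
```

For State this puts the eight command flags first and broadcast plus the
reserved bits last, without any byte swapping at runtime. A variable length
type would not fit this scheme, since its size is not known at generation time.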

I understand that keeping mspec simple is an important priority; however,
doing it at the cost of the developers who will use it to implement codecs
will not help us in the long term.
I'm quite sure that sooner or later we will get a protocol with a mix of
BE/LE fields.
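
To illustrate what a mix of BE and LE fields means for a reader, here is a
small sketch using plain java.nio.ByteBuffer. It is not the PLC4X
ReadBuffer API, and the readUint16 helper is a name made up for this example.
It reads the bytes 0x10 0x27 (the ones from the port number example quoted
below) once as big endian and once as little endian from the same buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MixedEndianSketch {

    /** Reads an unsigned 16-bit value at the current position using the given byte order. */
    static int readUint16(ByteBuffer buffer, ByteOrder order) {
        ByteOrder previous = buffer.order();
        buffer.order(order);
        int value = buffer.getShort() & 0xFFFF; // mask to treat the short as unsigned
        buffer.order(previous);                 // restore the buffer-wide default
        return value;
    }

    public static void main(String[] args) {
        // The same two bytes twice, to compare both interpretations.
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {0x10, 0x27, 0x10, 0x27});

        int bigEndian = readUint16(buffer, ByteOrder.BIG_ENDIAN);       // 0x1027
        int littleEndian = readUint16(buffer, ByteOrder.LITTLE_ENDIAN); // 0x2710

        System.out.println(bigEndian + " " + littleEndian); // prints "4135 10000"
    }
}
```

A buffer-wide endianness set in the constructor covers only one of these two
reads; a protocol that mixes them needs a per-field switch somewhere.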

Cheers,
Łukasz


On 14.04.2020 14:57, Christofer Dutz wrote:
> Hi Lukasz (adding dev@ to the recipients again)
> 
>  
> 
> you are not quite correct. Enum types do declare a base type where they
> are declared … not where they are used.
> 
>  
> 
> If you have bit-fields then why don’t you define them as a sequence of
> bits? I really can’t understand why you need to flip anything.
> 
> I think you’re trying to do something similar to Sebastian when he
> wanted to introduce some bit-mask types.
> 
>  
> 
> I could imagine that in the spec it might be written … these two bytes
> are a bit-mask and the bit 1 means X bit 10 means Y, …
> 
> So why not define the bit fields in the sequence they are sent over the
> wire?
> 
>  
> 
> Assuming you have a 16 bit LE uint bit field. If the spec now says:
> 
>  
> 
>  
> 
> Bit:      15|14|13|12|11|10| 9| 8| 7| 6| 5| 4| 3| 2| 1| 0
> 
> Meaning:   A| B| C| D| E| F| G| H| I| J| K| L| M| N| O| P
> 
>  
> 
> Then the order they are sent is:
> 
>            7| 6| 5| 4| 3| 2| 1| 0|15|14|13|12|11|10| 9| 8
> 
> So you have to declare the bit fields in the order:
> 
>            I| J| K| L| M| N| O| P| A| B| C| D| E| F| G| H
> 
>  
> 
> So I don’t quite understand what you’re trying to flip here.
> 
>  
> 
> Chris
> 
>  
> 
> *From: *Łukasz Dywicki <l...@code-house.org>
> *Date: *Tuesday, 14 April 2020 at 13:26
> *To: *Christofer Dutz <christofer.d...@c-ware.de>
> *Subject: *Re: Big and littleendian fields in one mspec
> 
>  
> 
> I've added the endianness switch to the types in my experiments. These
> are not needed for enums as these are read as int and then mapped to
> constants. I referred to these as an example - enum does have a length
> indication and type behind it.
> 
>  
> 
> Again, I will bring up the issue - StateIO currently fails to do anything
> useful due to the BE/LE switch, because this type uses single bits which
> need to be flipped before reading or after writing; otherwise they end up
> in the wrong order. Out of 16 bits, 9 are in use, so we have a lot of
> possible combinations (2^9), which seems too many to create an enum.
> 
> Since you didn't like my initial proposal and modifications (as they
> might be redundant to what is available in the buffer API), how would
> you handle State serialization without affecting mspec and without
> further complication to code generation?
> 
>  
> 
> Best,
> 
> Łukasz
> 
>  
> 
> Tue, 14 Apr 2020 at 12:47 Christofer Dutz <christofer.d...@c-ware.de
> <mailto:christofer.d...@c-ware.de>> wrote:
> 
>     Hi Lukasz,
> 
>     but we don't have a:
>     [enum uint 16 little endian 'CommandId']
>     But only a:
>     [enum uint 16 'CommandId']
> 
>     And in your case I think perhaps the constants are not correct. A
>     16 bit uint will result in a 4 digit hex string, so if you are
>     having problems mapping them to your constants, perhaps the
>     constants are in the wrong endianness.
> 
>     I have encountered (but can't recall where) that the Enum constants
>     in a BE protocol were written down in LE notation.
>     Of course the thing can't work then.
> 
>     So if for example you have the constant "1" in a BE protocol with a
>     uint 16 enum type, your constant is not 0x0001, but 0x0100 instead.
> 
>     Chris
> 
> 
> 
> 
>     On 14.04.20, 12:37, "Łukasz Dywicki" <l...@code-house.org
>     <mailto:l...@code-house.org>> wrote:
> 
>         The legendary "two bytes" (shall we create a band with this name?!)
>         are coming from the unfortunate State type. I don't mind these being
>         an int/sint/uint or whatever - this is a matter of interpretation.
> 
>         If you could please check my earlier messages again, you will find
>         the struggle I have, which is - how much space a given type takes.
> 
>         We have that for enums
>         [enum uint 16 little endian 'CommandId']
> 
>         but we don't have that for complex types
>         [type 'State']
> 
>         This leads to a situation where we don't know how to read and interpret
>         an incoming byte sequence or how to properly write it back to the stream.
> 
>         It would be great if we could limit the issue just to
>         // read
>         state = StateIO.parseStatic(io.readUnsignedInt(16, true)); // true->LE
>         // write
>         wio = new WriteBuffer(2);
>         StateIO.staticSerialize(wio, new State(...));
>         io.writeUnsignedInt(16, wio.read, true); // true->LE
> 
>         However, to do that we first need to know that State is a uint 16 LE
>         and then do a lot of mumbo-jumbo in the code generators to cope with
>         that. :-)
> 
>         Best,
>         Łukasz
> 
> 
>         On 14.04.2020 12:20, Christofer Dutz wrote:
>         > Oh gee ... I totally remember having exactly the same discussion with Sebastian about "bytes" ...
>         >
>         > The problem is there is generally no "byte", "int" and "uint", because a byte = 8 bits is either signed or unsigned. What should the 3rd option be? Perhaps-signed?
>         > So generally I use "uint 8" for what you refer to as a "byte 8".
>         >
>         > So what would be the difference between your "2 byte little endian" and a "uint 16" with an LE ReadBuffer?
>         >
>         > I think you have to rid yourself of thinking of an int as a number and a byte as not being a number.
>         >
>         > Chris
>         >
>         >
>         >
>         > On 14.04.20, 10:52, "Łukasz Dywicki" <l...@code-house.org <mailto:l...@code-house.org>> wrote:
>         >
>         >     Hey Christian,
>         >     The problem I face is quite simple. The State type in mspec is declared as a
>         >     bunch of bits. The type length is fixed, however it is not available anywhere in
>         >     mspec or the code generator to re-arrange bytes upfront. All we have exposed
>         >     at the reader/writer level is read/write bit.
>         >     To be fair, LE/BE support in the read/write buffers is limited just to
>         >     numbers. There is no such support for raw bytes or bits, because for that
>         >     you need to declare the length of the LE/BE sequence.
>         >     
>         >     I would love it if I could just declare State as '2 byte little endian' so
>         >     it would be read properly upfront and parsed with no changes in the
>         >     generated code; however, I'm not sure how to do it and where. That's why
>         >     I'm playing with the different things described in my earlier mail.
>         >     Since all type handling is general, I am just afraid of more complicated
>         >     scenarios where we have variable length structures such as arrays.
>         >     
>         >     Best regards,
>         >     Łukasz
>         >     
>         >     
>         >     On 14.04.2020 09:41, Christofer Dutz wrote:
>         >     > Hi Lukasz,
>         >     >
>         >     > I am not sure I understand the problems you are facing. We already have LE and BE protocols.
>         >     > For example EIP is LE and the rest is generally BE. It seems that ADS/AMS is also LE.
>         >     > mspec doesn't even know about endianness.
>         >     >
>         >     > Up till now the endianness doesn't have an effect on bit-fields or single-bit ints.
>         >     > It only starts to have an effect if a field goes from one byte to the next, which is usually the case for (u)int and floating point values.
>         >     >
>         >     > That's why we have created the Read/WriteBuffers to set their endianness in the constructor.
>         >     >
>         >     > So if you're creating a driver for ADS/AMS, which is LE, then you write the mspec according to the sequence in which the information is located in the transferred bytes and have the Read/WriteBuffer handle the endianness issue.
>         >     >
>         >     > I do see a problem when there are drivers that use mixed endianness, but we have yet to encounter such a protocol.
>         >     >
>         >     > So I have to admit that I don't like any of the mspec changes you proposed, as I think you are just not using the tools we have the right way.
>         >     >
>         >     > Chris
>         >     >
>         >     >
>         >     >
>         >     > On 14.04.20, 00:32, "Łukasz Dywicki" <l...@code-house.org <mailto:l...@code-house.org>> wrote:
>         >     >
>         >     >     Hey Niclas,
>         >     >     I realized how old the old things are when I started preparing
>         >     >     automation training for mere mortals and got into the history of frames and
>         >     >     even cabling. Mr. Modbus and EIA-485 are definitely older than I am. ;-)
>         >     >     
>         >     >     Getting back to the point - yes. I've been thinking about how to address the byte
>         >     >     order in an effective way. Here are the approaches I have for now:
>         >     >     
>         >     >     A) My initial attempt is just a temporary buffer which is then written
>         >     >     in reverse order to the caller. For reading it is similar - just getting N
>         >     >     bytes in reversed order. The hard part is.. knowing N. I had to add a
>         >     >     static calculation in order to allocate valid buffer sizes. It tends to
>         >     >     work, but I'm not happy with this approach because it involves additional work.
>         >     >     B) The second idea I've got is really simple and relies on code generation.
>         >     >     We know in which order fields are coming. Here I'm referring to a State
>         >     >     field which is just a bunch of bits. If we group fields into bytes and
>         >     >     generate code in reverse order, then it has a chance to work. The requirement
>         >     >     for that - the ability to know basic field sizes upfront.
>         >     >     C) Try to combine the above with bit-io or Read/WriteBuffers, as these are
>         >     >     the places which know the actual position and state of the buffers which
>         >     >     are being read/written.
>         >     >     
>         >     >     Now, getting to the two cases which are a problem: CommandId and State.
>         >     >     With CommandId the situation is simple, as it is declared as an enum and it is
>         >     >     read as a uint. We know the size upfront and can generate a valid method call
>         >     >     (readIntLE).
>         >     >     [enum uint 16 little endian 'CommandId'
>         >     >         ['0x00' INVALID]
>         >     >         ['0x01' ADS_READ_DEVICE_INFO]
>         >     >         ['0x02' ADS_READ]
>         >     >         ['0x03' ADS_WRITE]
>         >     >         ['0x04' ADS_READ_STATE]
>         >     >         ['0x05' ADS_WRITE_CONTROL]
>         >     >         ['0x06' ADS_ADD_DEVICE_NOTIFICATION]
>         >     >         ['0x07' ADS_DELETE_DEVICE_NOTIFICATION]
>         >     >         ['0x08' ADS_DEVICE_NOTIFICATION]
>         >     >         ['0x09' ADS_READ_WRITE]
>         >     >     ]
>         >     >     
>         >     >     The second candidate is where I'm stuck right now, chasing successive rounds of
>         >     >     problems. In the case of State we have a complex type composed of 2
>         >     >     bytes. A note here - instead of two bytes we might have a variable
>         >     >     length type which includes an array or other variable section.
>         >     >     [type little endian 'State'
>         >     >         [simple     bit 'broadcast'             ]
>         >     >         [reserved   int 7 '0x0'                 ]
>         >     >         [simple     bit 'initCommand'           ]
>         >     >         [simple     bit 'updCommand'            ]
>         >     >         [simple     bit 'timestampAdded'        ]
>         >     >         [simple     bit 'highPriorityCommand'   ]
>         >     >         [simple     bit 'systemCommand'         ]
>         >     >         [simple     bit 'adsCommand'            ]
>         >     >         [simple     bit 'noReturn'              ]
>         >     >         [simple     bit 'response'              ]
>         >     >     ]
>         >     >     
>         >     >     The order of reading big endian encoded data to impose the little endian
>         >     >     shift would be (please correct me if I'm wrong):
>         >     >     1) init
>         >     >     2) upd
>         >     >     3) add timestamp
>         >     >     4) priority
>         >     >     5) system
>         >     >     6) ads
>         >     >     7) noreturn
>         >     >     8) response (end of byte 1)
>         >     >     9) broadcast
>         >     >     10) reserved (end of byte 2)
>         >     >     We can do the same trick for writing, by re-arranging fields. This way we
>         >     >     avoid any additional byte level operations.
>         >     >     
>         >     >     The overall trouble with the generated driver is declaring "how many" bytes
>         >     >     should be read and interpreted. We have precise size information at
>         >     >     runtime, due to length fields. We can leverage it at generation time,
>         >     >     but then we won't be able to cover all cases.
>         >     >     
>         >     >     I would love to keep it simple and not break things, thus I need your
>         >     >     advice on how to approach this problem in a valid way.
>         >     >     
>         >     >     Cheers,
>         >     >     Łukasz
>         >     >     
>         >     >     
>         >     >     On 13.04.2020 03:26, Niclas Hedhman wrote:
>         >     >     > <anecdotal-rant>
>         >     >     > For those of us who were around and shaping the protocols in the 1980s, and people
>         >     >     > before us (and before standards like RS-232), a lot of the "specifications"
>         >     >     > came out of "observation of implementation we managed to get to work",
>         >     >     > rather than "implement this spec". A lot was due to extreme memory
>         >     >     > constraints (in my case, multi-tasking operating system, serial protocol
>         >     >     > 187kbps, interpreted programming language with floating point ops and user
>         >     >     > applications in 2kB RAM and 8kB EPROM) and a general lack of information,
>         >     >     > like what other people were doing, sharing experiences and so on.
>         >     >     >
>         >     >     > And there were many "innovative" ways to squeeze just a little bit extra
>         >     >     > out of the hardware, resulting in "hard to understand" consequences. Bit
>         >     >     > packing was a typical one, multiple functions packed into a single byte.
>         >     >     > Look at page 14 in https://www.nxp.com/docs/en/data-sheet/80C31_80C32.pdf
>         >     >     > and read up on "UART Enhanced Mode", and we used this, i.e. 9 bits, no
>         >     >     > parity and clever use of address and mask to create a slave-to-slave direct
>         >     >     > protocol, where the master's role was to signal which slave "owned" the
>         >     >     > cable. Yeah, in that 8kB ROM limitation (I think protocol was about 1kB
>         >     >     > ROM) and something like 150 bytes RAM for comm protocol.
>         >     >     >
>         >     >     > Could you implement a compatible device to this with PLC4X and modern
>         >     >     > hardware (i.e. no 8031/32 co-processor)? Possibly, but bit-banging is needed
>         >     >     > to support the 9bit data (+start and stop bits) and an awful lot of CPU
>         >     >     > cycles on something that was automatic on one of the slowest long-lived
>         >     >     > microcontrollers ever.
>         >     >     > </anecdotal-rant>
>         >     >     >
>         >     >     > My point was only to highlight that some of the strange things you see in
>         >     >     > protocols today have their roots in pre-standardization days. Today no one
>         >     >     > would go down that route, because the hardware costs nothing now (8031 +
>         >     >     > 8kB EPROM + 2kB static RAM + battery backup => ~$50 in 1983's currency) and
>         >     >     > longevity of software is more important.
>         >     >     >
>         >     >     > Cheers
>         >     >     > Niclas
>         >     >     >
>         >     >     >
>         >     >     > On Sun, Apr 12, 2020 at 10:10 PM Christofer Dutz <christofer.d...@c-ware.de <mailto:christofer.d...@c-ware.de>>
>         >     >     > wrote:
>         >     >     >
>         >     >     >> Hi Lukasz,
>         >     >     >>
>         >     >     >> I think it really gets tricky when using BE and having some byte-odd sizes
>         >     >     >> ... I remember in the Firmata protocol there were some bitmasks and then a 10
>         >     >     >> bit uint as BE ... now it really got tricky as the specs were written from
>         >     >     >> the point of view: you read 16 bits BE and then the first 6 bits mean XYZ,
>         >     >     >> instead of describing how the bits actually travel over the wire.
>         >     >     >>
>         >     >     >> Chris
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >> On 11.04.20, 01:21, "Łukasz Dywicki" <l...@code-house.org <mailto:l...@code-house.org>> wrote:
>         >     >     >>
>         >     >     >>     I've made some progress on the topic by modifying mspec and allowing a
>         >     >     >>     'little endian' flag on fields. This moved me on to the next issue -
>         >     >     >>     which is a whole type encoded little endian.
>         >     >     >>
>         >     >     >>     In the ADS driver such a type is State, which has 2 bytes and uses 8 bits for
>         >     >     >>     various flags.
>         >     >     >>     There are two cases which require a different approach - reading and
>         >     >     >>     writing. For reading we need to swap N bytes based on the type length.
>         >     >     >>     For writing we need to allocate a buffer for N bytes and swap them before
>         >     >     >>     writing.
>         >     >     >>
>         >     >     >>     I am stuck now with FreeMarker templates and bit-io.
>         >     >     >>
>         >     >     >>     Cheers,
>         >     >     >>     Łukasz
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >>     On 10.04.2020 17:57, Łukasz Dywicki wrote:
>         >     >     >>     > I am doing some tests of ADS serialization.
>         >     >     >>     >
>         >     >     >>     > I've run into some trouble with the payload which is generated with the new
>         >     >     >>     > driver. I'm not sure if that's my fault or the generated code.
>         >     >     >>     >
>         >     >     >>     > I did a verification of what Wireshark shows and how ads structures are
>         >     >     >>     > parsed. There is a gap, I think. For example ams port number 10000
>         >     >     >>     > (bytes 0x10 0x27 on the wire) is read as 4135.
>         >     >     >>     >
>         >     >     >>     > Obviously I used the wrong structures while implementing the protocol logic in
>         >     >     >>     > the first place, but now I am uncertain of how fields are encoded. How do we
>         >     >     >>     > mark a field as little endian when the rest of the payload is big endian? Do we
>         >     >     >>     > have `uint_le`?
>         >     >     >>     >
>         >     >     >>     > As far as I remember, the route creation logic I was tracking last week used
>         >     >     >>     > a combination of LE and BE.
>         >     >     >>     >
>         >     >     >>     > Best regards,
>         >     >     >>     > Łukasz
>         >     >     >>     >
>         >     >     >>
>         >     >     >>
>         >     >     >>
>         >     >     >
>         >     >     
>         >     >
>         >     
>         >
>
