Last week Steve described how to identify the fields in a data format (such as Link-16 or VMF) with bitOrder=leastSignificantBitFirst, when the data is viewed in a hex editor:
> read each *byte* left-to-right, but read the *bits within each byte* right-to-left … tricky when fields overlap byte boundaries (which is very common in VMF/Link-16)

That is really complicated.

Question: If I write the binary on a piece of paper and manually process the bits right-to-left (the first field is the rightmost two bits, the second field is the next five bits, the third field is the next two bits, etc.), then each field is sequential, one after the other, right?

In one of the emails, Mike mentioned that Link-16 is Little Endian. Does Endianness impact the strategy described in the previous paragraph?

I imagine the Endianness would impact this situation: when processing the fields right-to-left, I encounter a 16-bit unsigned integer field. As I recall, Little Endian means this: the most significant byte is at the highest memory address. In the 16-bit unsigned integer field, which byte is at the highest memory address? Suppose the 16 bits are these:

[cid:image001.png@01DA2C0C.8256B080]

What is the (decimal) value of that field?

From: Steve Lawrence <slawre...@apache.org>
Sent: Thursday, December 7, 2023 3:28 PM
To: users@daffodil.apache.org
Subject: Re: Bits are streamed into an application ... the bits are put in memory ... is the first bit received at the lowest memory address?

Link16/VMF is definitely confusing. A hex editor will actually display the data like this:

    4E ..... 21

Another way to think about how to read Link16/VMF data (which is more useful when using hex editors which don't understand the idea of right-to-left data), is to read each *byte* left-to-right, but read the *bits within each byte* right-to-left.
So if you look at your example data in a hex editor you'll see something like this:

    low memory address         high memory address
    4    E       .        .        .     2    1
    0100 1110 ........ ........ ........ 0010 0001

If we read the bytes left-to-right like a hex editor would show, our first byte is 01001110 (0x4E). But within that byte, we read the bits right-to-left, which means our first 2-bit field would be "10" (taken from the right of the 0xE, not left of the 0x4). As a further example, if there was a 3-bit field following our 2-bit field, it would be 011. Note that once we consume the entire first byte, we start reading from the right-most bit of the 2nd byte, and so on.

This can make things tricky when fields overlap byte boundaries (which is very common in VMF/Link16). For example, let's say we've already consumed 5 bits in the first byte (marked with x's below), and the next field is 6 bits long (marked from 1-6 in the order to read the bits below). This is where the bits would come from for that field:

    321x xxxx .... .654

So this field crosses a byte boundary, and some bits come from the left of the first byte and others from the right of the second byte. In this example, the value of this 6-bit field is 654321.

It can definitely be confusing to read *bytes* left-to-right but *bits* right-to-left. So it is sometimes helpful to mentally (or even manually) reorder all the bytes so they are reversed (without reversing the bits), and then just read the bits from the end of the file, right-to-left and bottom-to-top.

In fact, I believe the data viewer in the Daffodil VS Code extension is planning (or already has?) the ability to show bytes in the reverse order at bit resolution, which would make this a little easier so you wouldn't have to do it yourself.

On 2023-12-07 02:38 PM, Roger L Costello wrote:
> Thanks Steve.
>
> As it turns out, I am working with the Link-16 data format.
> You are saying that the stream of received bits (0 then 1 then 1 then 1 then 0 then 0 then 1 then 0 then …) are viewed in this fashion:
>
>     High memory address                                    Low memory address
>     0010 0001 ................................................ 0100 1110
>     Hex: 21 ................................................ 4E
>
> That is consistent with the Link-16 specification.
>
> Here’s where I get confused. Suppose I store those bits into a file. Then I open the file in a hex editor. How will the hex editor display the data?
>
> Will the hex editor display the four bits at the lowest memory address (1110) in hex digit form (E), followed by the four bits at the next-to-lowest memory address (0100) in hex digit form (4):
>
>     E4 ................................................ 12
>
> Or will the hex editor see the bits in reverse order:
>
>     Low memory address                                     High memory address
>     0111 0010 ................................................ 1000 0100
>     Hex: 72 ................................................ 84
>
> And display this:
>
>     72 ................................................ 84
>
> That is wildly different!
>
> Eek! I am so confused! Help!
>
> *From:* Steve Lawrence <slawre...@apache.org>
> *Sent:* Thursday, December 7, 2023 1:16 PM
> *To:* users@daffodil.apache.org
> *Subject:* [EXT] Re: Bits are streamed into an application ... the bits are put in memory ... is the first bit received at the lowest memory address?
>
> > The most likely interpretation is that the 2-bit field will have a value of "01", with a decimal value of 1. This is what DFDL calls dfdl:bitOrder="mostSignificantBitFirst" and is what most data formats use.
> >
> > DFDL also has dfdl:bitOrder="leastSignificantBitFirst", which is common in some old military formats like VMF and Link16.
> > One way to imagine this is as if the memory addresses and bits were ordered and read right-to-left. So, your example would look like this:
> >
> >     High memory address                                    Low memory address
> >     0 0 ................................................ 1 1 0
> >
> > So we have the same data, and you still read the low memory address bits first, it's just all backwards. Because we read right-to-left, the first two bits are now "10". Note that interpreting the value of the bits is the same as normal, so "10" evaluates to 2; all that changes is the order in which we read bits.
> >
> > I haven't seen any formats where the high memory address bits would be read first and would lead to the 2-bit field being "00". DFDL doesn't have a way to model this.
> >
> > On 2023-12-07 12:28 PM, Roger L Costello wrote:
> >> Hi Folks,
> >>
> >> A basic question about bits.
> >>
> >> Scenario: an application is receiving a message. The message arrives as a stream of bits. The first bit received is 0. The second bit received is 1. The third bit received is another 1. ... The second-to-last bit received is 0. The last bit received is 0.
> >>
> >> The application stores the message in memory.
> >>
> >> Will the first bit received be in the lowest memory address and the last bit received in the highest memory address? I.e.,
> >>
> >>     Low memory address                                     High memory address
> >>     0 1 1 ............................................................. 0 0
> >>
> >> The message contains a series of bit fields. The specification for the message says the first field is two bits.
> >>
> >> Is the first field the bits 0 1 or the bits 00?
> >>
> >> /Roger
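
The reading rule discussed in this thread (bytes left-to-right, bits within each byte right-to-left) can be sketched in a few lines of Python. This is only an illustration, not how Daffodil implements it. The function name `read_fields_lsb_first` is hypothetical, the two-byte buffer assumes 0x4E and 0x21 are adjacent (in the thread there are unknown bytes between them), and the final 5-bit field width is made up to use up the remaining bits. The sketch also assumes little-endian byte order, so that the first byte in the file supplies the lowest-order bits.

```python
def read_fields_lsb_first(data: bytes, widths):
    """Read consecutive unsigned bit fields, least significant bit first.

    Sketch only: assumes little-endian byte order, so byte 0 of `data`
    holds the first (lowest-order) 8 bits of the stream.
    """
    # Viewing the buffer as one little-endian integer is the same as
    # reversing the bytes (not the bits) and reading right-to-left.
    stream = int.from_bytes(data, "little")
    values = []
    position = 0  # number of bits consumed so far
    for width in widths:
        mask = (1 << width) - 1
        values.append((stream >> position) & mask)
        position += width
    return values


# Hypothetical buffer: 0x4E followed directly by 0x21.
# Fields: 2 bits, 3 bits, 6 bits (crosses the byte boundary), 5 bits.
print(read_fields_lsb_first(bytes([0x4E, 0x21]), [2, 3, 6, 5]))
# -> [2, 3, 10, 4]
```

The first two results match the values given in the thread: the 2-bit field is "10" (2) and the following 3-bit field is "011" (3). The 6-bit field takes its three low bits from the left of the first byte and its three high bits from the right of the second byte.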