Hi Folks,

I think DFDL (Data Format Description Language) is awesome. Think about it: DFDL is a standard language for describing (describe, not parse) just about any data format. Again, I emphasize that it's not about how to parse the data format, it's about describing the data format. Given a description, a DFDL processor can figure out how to parse instances of the data format. Wow!
But there's another reason that DFDL is awesome: it forces you to be very precise in your description. It forces you to think very logically. It forces you to understand the implications of your description decisions. Let me give you an example of the latter.

I am dealing with a data format that consists of a sequence of lines. Here's a sample instance:

    John Doe
    OPER/XRAY//
    Sally Smith

The first and last lines are just strings. Not interesting. The second line is the interesting one. Here's another instance:

    John Doe
    EXER/TANGO//
    Sally Smith

As you can see, the second line starts with either OPER or EXER and terminates with //. The second line is also optional. That is, the second line is an OPER line, an EXER line, or absent. That leads one to this description:

    choice
        OPER (optional)
        EXER (optional)

However, DFDL doesn't allow branches of a choice to be optional. So, the obvious workaround is to wrap each branch in a sequence:

    choice
        sequence
            OPER (optional)
        sequence
            EXER (optional)

Slick, eh? But not correct. Let's think about this. Suppose the input is this:

    John Doe
    EXER/TANGO//
    Sally Smith

While processing the second line, you would think that the DFDL processor would find that the first branch of the choice (the OPER branch) doesn't match and therefore the processor would process the line using the second branch. Ha! Not correct! The first branch is optional. That is key! Since the second line doesn't start with OPER, the DFDL processor thinks, "Oh, there must be no occurrences of the OPER line." The first branch has therefore succeeded (with zero occurrences of OPER), so the choice is resolved and the EXER branch is never tried. The processor moves on to the description following the choice, with the EXER/TANGO// line still sitting there unconsumed.

Do you see it? Do you see the problem? I hope so. This is wicked cool. As I worked through this example, it forced me to think very, very clearly about the implication of an optional OPER line. So, what's the solution?
Make OPER and EXER mandatory:

    choice
        sequence
            OPER (mandatory)
        sequence
            EXER (mandatory)

And place the choice inside an optional wrapper element:

    OPER-EXER-wrapper (optional)
        choice
            sequence
                OPER (mandatory)
            sequence
                EXER (mandatory)

Now, with this input:

    John Doe
    EXER/TANGO//
    Sally Smith

the processor tries the first branch of the choice, which fails, so it tries the second branch, which succeeds.

With this input:

    John Doe
    Sally Smith

the processor tries the first branch of the choice, which fails, then tries the second branch, which also fails. The choice as a whole fails, so there is no occurrence of the optional wrapper element, and the processor carries on with the last line.

This blows my mind. I feel like this example alone boosted my IQ by 10 points.

/Roger
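
P.S. For anyone who wants to poke at this without a DFDL processor handy, here is a toy Python model of the reasoning above. It is only a sketch: the names (tagged_line, any_line, optional, choice, sequence, run, BROKEN, FIXED) are made up for illustration, it is not how any real DFDL processor is implemented, and it assumes two things: a choice takes the first branch that succeeds, and an optional item "succeeds" with zero occurrences when its content doesn't match.

```python
class ParseError(Exception):
    """Raised when a parser cannot match the input at the current position."""


def tagged_line(tag):
    """One mandatory line that starts with `tag/` and ends with `//`."""
    def parse(lines, pos):
        if pos < len(lines) and lines[pos].startswith(tag + "/") and lines[pos].endswith("//"):
            return {tag: lines[pos]}, pos + 1
        raise ParseError(f"expected a {tag} line at line {pos + 1}")
    return parse


def any_line(name):
    """One mandatory line holding an arbitrary string."""
    def parse(lines, pos):
        if pos < len(lines):
            return {name: lines[pos]}, pos + 1
        raise ParseError(f"expected a line for {name}, found end of data")
    return parse


def optional(parser):
    """Zero or one occurrence: a failure becomes a success that consumes nothing."""
    def parse(lines, pos):
        try:
            return parser(lines, pos)
        except ParseError:
            return {}, pos
    return parse


def choice(*branches):
    """Try branches in order; the first success wins and later branches are never tried."""
    def parse(lines, pos):
        for branch in branches:
            try:
                return branch(lines, pos)
            except ParseError:
                continue  # back up to `pos` and try the next branch
        raise ParseError("no branch of the choice matched")
    return parse


def sequence(*parsers):
    """Run each parser in turn; any failure fails the whole sequence."""
    def parse(lines, pos):
        result = {}
        for parser in parsers:
            value, pos = parser(lines, pos)
            result.update(value)
        return result, pos
    return parse


# The tempting description: optional content inside each branch of the choice.
BROKEN = sequence(
    any_line("first"),
    choice(sequence(optional(tagged_line("OPER"))),
           sequence(optional(tagged_line("EXER")))),
    any_line("last"),
)

# The fix: mandatory content in each branch, inside an optional wrapper.
FIXED = sequence(
    any_line("first"),
    optional(choice(sequence(tagged_line("OPER")),
                    sequence(tagged_line("EXER")))),
    any_line("last"),
)


def run(description, text):
    """Parse `text` and return (parsed fields, lines left unconsumed)."""
    lines = text.splitlines()
    result, pos = description(lines, 0)
    return result, lines[pos:]
```

Calling run(BROKEN, "John Doe\nEXER/TANGO//\nSally Smith") in this model never reaches the EXER branch: the EXER line ends up as the "last" string and Sally Smith is left over. Calling run(FIXED, ...) on the same input picks up the EXER line, and on "John Doe\nSally Smith" the wrapper is simply absent.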