I agree that every bad error message is a bug, and any error message that is unhelpful should be reported as one.
The left-over data error you are seeing is a bit tricky. When Daffodil is invoked to consume data from a stream, this situation is not an error at all: it is perfectly normal for a parser to parse one message from a stream and stop, leaving the stream positioned for the next parse call. Only when Daffodil is invoked in a context where it is clearly intended to consume the entire input is this error detected at all. What this means is that the parse ended normally and produced an infoset, but then it was discovered that there was data left over. To me, what can be improved here is the error message text. It should say "parse ended normally", indicate that an infoset was created (and display all or part of it), and indicate that the parse ended without consuming all the data, giving all positions in bytes plus an optional 0..7 bits when not on a byte boundary.

On Wed, Apr 6, 2022 at 7:53 AM Roger L Costello <coste...@mitre.org> wrote:

> Hi Folks,
>
> I ran Daffodil on my DFDL schema and got this error message:
>
> [error] Left over data. Consumed 1504 bit(s) with at least 3040 bit(s)
> remaining.
> Left over data (Hex) starting at byte 189 is: (0x0d0a47454e544558...)
> Left over data (UTF-8) starting at byte 189 is: (??GENTEX...)
>
> That is a really bad error message. Why did Daffodil stop consuming the
> input? No idea. What is in my DFDL schema that caused the generation of the
> error? No idea.
>
> No disrespect intended, but Daffodil has the worst error messages of any
> tool that I have ever encountered.
>
> Good error messages are important. In a recent podcast Michael Kay
> (creator of Saxon) talks about his emphasis on good error messages:
>
> What makes a good product? Users must be able to understand the error
> messages. People will tell you, one thing I like about Saxon is the error
> messages. To me, a bad error message is something that really needs to be
> fixed. Error messages are what users are dealing with every day. They are
> reading my error messages. If those glare out as being unhelpful, as being
> badly spelled, then that's their experience with the product, so it's
> important to get it right. I put a lot of effort into those sorts of little
> details. Getting good error messages it really quite an art. Do you phrase
> the error message in terms of the proper terminology of the spec, or do you
> use the terminology that the users are using (which might be quite wrong)?
> For example, what many users call a "tag" isn't what the spec calls a tag.
> They'll use "tag" to mean "element." So which word am I going to use in an
> error message? It's quite hard to get that sort of thing right. Getting a
> balance between a message that is technically correct and a message that
> users understand, sometimes requires a fair bit of thought. And then you've
> got to phrase the error message in terms of what the user was trying to do,
> not what was going on internally. That again gives you a significant
> challenge. So you have to think about those sorts of things.
>
> /Roger
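
To make the suggested wording a little more concrete, here is a rough sketch of how such a message could be assembled from the two bit counts in the report above. This is only an illustration in Python, not Daffodil code: the function names `format_bit_position` and `left_over_data_message` and the exact message text are made up for this example, and it leaves out displaying the infoset.

```python
def format_bit_position(bit_count: int) -> str:
    """Render a bit count as whole bytes plus an optional 0..7 extra bits."""
    whole_bytes, extra_bits = divmod(bit_count, 8)
    if extra_bits == 0:
        return f"{whole_bytes} byte(s)"
    return f"{whole_bytes} byte(s) + {extra_bits} bit(s)"


def left_over_data_message(consumed_bits: int, remaining_bits: int) -> str:
    """Build a hypothetical 'left over data' message.

    The wording is an assumption for illustration only; it is not the
    text that Daffodil actually emits.
    """
    return (
        "Parse ended normally and an infoset was created, "
        "but not all of the data was consumed. "
        f"Consumed {format_bit_position(consumed_bits)}; "
        f"at least {format_bit_position(remaining_bits)} remain."
    )


# Bit counts taken from the error message quoted above.
print(left_over_data_message(1504, 3040))
```

With the numbers from the report, 1504 bits is exactly 188 bytes and 3040 bits is exactly 380 bytes, so no extra bits are shown. A count such as 1507 bits would be rendered as "188 byte(s) + 3 bit(s)".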