Discussion regarding libwebvtt unit tests -- Forwarded to dev-media per Dave's request.

________________________________
From: Caitlin Potter [[email protected]]
Sent: Monday, February 04, 2013 11:05 AM
To: Kyle Barnhart; webvtt-dev
Subject: RE: We Need Everyone To Weigh In To Move Forward

I've answered these questions on IRC in slightly more detail, but I'll bring them into the mailing list so that they can be referred back to by everyone. Feel free to ask me for further clarification if something is unclear. Note, I am not trying to sound "toxic" or mean or anything, and if it comes off that way, I apologize.

1. "if we are implementing a validator, where are we implementing the interpreter?"

Firstly, we aren't implementing a validator. We're implementing a general purpose framework for working with WebVTT streams, which can serve a variety of purposes.

The rules of a parser come from the syntax or grammar. The parser spec isn't actually required for interpreting data in the files, because that all comes from the grammar, and additional information about the markup elements. So, to implement a parser, we don't actually have to care at all about the guidelines that they provide.

However, the business rules that they provide should be easy to enable (or disable) at the choice of the user. The most flexible place to do that is in the cue callback, which would allow the user maximum control over what they want to do with the cue.

You see that we are interpreting the input, we have an idea of the cue's parameters, and we produce a tree of cuetext markup. This is where the interpretation happens.

2. "How do you determine if a test is written correctly?"

Kyle is referring to the tests concerned with the syntax/grammar spec (as opposed to the parser spec) -- which should be all of them because that is what the test parser assumes.

Based on the syntax specification, we have an idea what makes an input valid or erroneous. We also understand how to interpret the input. We don't actually have strict rules laid out in the spec about what kinds of syntax errors should be returned, so these are made up in the way that seems to make the most sense, or is the simplest to implement.
The reported errors should be the ones that make the most sense for a given syntax problem, and should point to a relatively appropriate location in the text.

We are assuming a test is correct if it returns the appropriate number of cues (which means the maximum possible number of cues, because we do not discard based on minor errors as we would for the browser), the appropriate number of errors, and if the errors are correct and/or the interpreted data is read as expected.

__________________________________________________________________________________

It is true that not also writing tests to confirm that we can operate in accordance with the parser spec is a problem. We do need a second test parser for this, and separate unit tests to perform these tests.

A further issue is the way you (Kyle) have been going about reviewing these tests. It's understandable why you have a misunderstanding about what the tests are actually testing, and I will be the first to admit that there are things that are incorrect about these tests, a few of which have been seen in your reviews.

However, when you find an issue with a specific test, it would be a lot simpler to create a single pull request or issue addressing that single test, or group of related tests. This way, it's a much smaller change, and is a lot easier for us to look at and determine if your changes have been correct or fully thought out. It's much harder to do that when it's 66 changed files or what have you.

So in order to keep things simple, and as correct as possible, smaller pull requests/issues would be a good idea.

________________________________
From: Kyle Barnhart [[email protected]]
Sent: Friday, February 01, 2013 10:55 PM
To: webvtt-dev
Subject: Re: We Need Everyone To Weigh In To Move Forward

In case I wasn't clear: this is about the tests, not about rewriting the code. The only changes to the library that this affects are changes made to make it pass the tests.

"the parser is modeled after the syntax, because the parser's job is to process input according to that syntax or grammar"

If the parser is actually only a validator, when are we writing an interpreter? We cannot send anything to a browser to process without one.

On Fri, Feb 1, 2013 at 10:18 PM, Kyle Barnhart <[email protected]> wrote:

I ask you one question. How do you determine if a test is written correctly?

On 2013-02-01 9:07 PM, "Kyle Barnhart" <[email protected]> wrote:

We need a consensus on the issue I will cover shortly. To do that we need to get everyone to reply so we can move things along, even if it is only a few words. Thank you.

Mailto: [email protected]<http://senecacollege.ca>

Problem

Our parser must at some stage be as standards compliant as possible. That includes whatever we implement into Mozilla. The problem stems from the tests for the parser library. It is this: there is disagreement as to whether the parser is subject to the parser section in the specifications.

Why It Matters

"It should follow what implementations do as well. So if there's something strange there it might be a bug in the spec." - Velmont

"yeah, what Velmont said. If there's a reason to implement something other than the spec, we should change the spec." - Ian Hickson

1. If the parser does not follow the specifications it will not be used. Speak up if you want people to actually use our work.

"If the code is to end up in Gecko, it had better follow the specification to the letter" - Ms2ger

2. The purpose of specifications is to make sure that things will work the same across applications and platforms. What you get if there is no standard, or if it is not followed, is the same problems you get when trying to make a webpage work on IE6 or any non-conforming browser.
No one wants the complexity and extra work to make sure the stuff (in this case webvtt) behaves the same, such as having two versions of the same file and making sure each one only goes to the appropriate browser.

"'It's a bit unusual for a standard to specify the parsing algorithm, but I can understand why.' It's not unusual for modern specs. It's a much more dependable way of getting to consistent behavior than only specifying a format." - Glenn Maynard

3. The specification is not a suggestion or a guide. The parser rules are written so that all implementations behave the same, and to do that they must be adhered to.

"When writing something that reads WebVTT files, be very sure to parse it as specified by the parser--*not* by reading the syntax and coming up with your own parsing algorithm." - Glenn Maynard

"[Syntax rules] are requirements for writing, not for parsing. Requirements in that section don't apply to you." - Simon Pieters

"what do they think the spec is for, if not following" - Ian Hickson
"inspiration" - MikeSmith
"Suggestions" - Ms2ger

What You Need To Know

It would seem that some are not very familiar with the specifications. This is understandable as it is complex. So I will outline what the different sections are for. The specification has four parts: syntax, parser, DOM construction, and rendering.

Syntax
* These rules apply to someone writing a WebVTT file. One can have a validator that checks if a file was written correctly.

Parser
* These rules apply to interpreters. These are the only rules a parser/interpreter needs to comply with; this was the deliberate intention of having a parser section. It is important to note that the interpreter can be written in any way so long as it arrives at the same results as in the parsing specifications. There should be tests to ensure compliance.

DOM Construction
* These rules apply to the creation of DOM objects. The results of the parser are to be converted by these rules.
For the WebVTT specifications it is isomorphic. However since our parser does not have objects like the parsing specification does, our parsing results need to be converted to the specification's DOM objects. This is important so that the behavior is the same across browsers. (Developers can write code to manipulate the DOM.)

Rendering Rules
* These rules determine how WebVTT text tracks are to be displayed. They are very thorough and it is important that they are followed to make sure WebVTT files look the same when they are used.

Solutions

I see four possibilities for the parser tests.

1. The current unit tests are the ones that should ensure compliance with the specification.
2. There should be a second set of tests for compliance for the current parser.
3. There is another parser that uses the current parser library, with a second set of tests for the second parser.
4. Make compliance at the browser level.

One. The solution I propose. I will cover the other three options first.

Two. A second set of tests on the same library would not only be redundant, but would actually be in direct conflict with the tests as they stand now. So this is not a viable option.

Three. Having a second parser is redundant not just because it is two of the same thing. If there is a real parser that our parser library is part of, and it is not standards compliant because of a problem in our parser library, then our library is going to have to change anyway. This is a very real case when you accept input that the specifications do not allow, and modify it so that the parser creates a cue. For example, changing the timestamp "00:00.1000" to "00:01.000". (The specification states that a cue with such timestamps should be discarded.) Additionally, why have a parser that is doing things wrong in the first place?

Four. We cannot make the parser compliant via the implementation into the browser because that runs completely counter to the specifications.
In addition there are all the same problems as option number two.

Back to one. If the library and tests are not subject to the specification, then by what standard are the tests to be judged for correctness? The only real option is that the tests be judged by the specification. Not only would this make it clear what the tests need to be and what the parser needs to do, it would also significantly reduce development effort (e.g. two sets of tests, two parsers) and ensure our work can be used.

Summary

This is an important topic that needs a clear decision soon. A lot of work needs to be done to make the parser pass the unit tests. Whether those tests are standards compliant or not is critically important and will change the way the parser works.

Thank You,
Kyle Barnhart

_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media

