Discussion regarding libwebvtt unit tests --

Forwarded to dev-media per Dave's request.
________________________________
From: Caitlin Potter [[email protected]]
Sent: Monday, February 04, 2013 11:05 AM
To: Kyle Barnhart; webvtt-dev
Subject: RE: We Need Everyone To Weigh In To Move Forward

I've answered these questions on IRC in slightly more detail, but I'll bring 
them into the mailing list so that they can be referred back to by everyone. 
Feel free to ask me for further clarification if something is unclear.

Note, I am not trying to sound "toxic" or mean or anything, and if it comes off 
that way, I apologize.

1. "if we are implementing a validator, where are we implementing the 
interpreter?"
Firstly, we aren't implementing a validator. We're implementing a general 
purpose framework for working with WebVTT streams, which can serve a variety of 
purposes.

The rules of a parser come from the syntax or grammar. The parser spec isn't 
actually required for interpreting data in the files, because that all comes 
from the grammar and from additional information about the markup elements.

So, to implement a parser, we don't actually have to care at all about the 
guidelines the parser spec provides. However, the business rules it provides 
should be easy to enable (or disable) at the user's choice.

The most flexible place to do that is in the cue callback, which would allow 
the user maximum control over what they want to do with the cue.

You can see that we are interpreting the input: we have an idea of the cue's 
parameters, and we produce a tree of cuetext markup. This is where the 
interpretation happens.
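
Concretely, I imagine the hook looking something like the sketch below. The 
type names and callback signature here are made up for illustration (this is 
not the current libwebvtt API); the point is only to show where the user's 
choice would live:

  #include <stdio.h>

  /* Hypothetical cue structure and callback signature, for illustration
   * only -- not the actual libwebvtt API. */
  typedef struct {
    double start_time;  /* seconds */
    double end_time;    /* seconds */
    const char *text;   /* raw cue text; the parser also builds a node tree */
  } demo_cue;

  /* The cue callback is where the user decides what to do with each cue:
   * keep it, drop it, or apply whatever extra validation rules they want. */
  static int on_cue( void *userdata, const demo_cue *cue )
  {
    (void)userdata;

    /* Example of an optional "business rule": reject cues whose end time
     * does not come after the start time. A different user might choose to
     * keep such cues and only log a warning. */
    if( cue->end_time <= cue->start_time ) {
      return 0; /* discard */
    }

    printf( "cue %.3f --> %.3f: %s\n",
            cue->start_time, cue->end_time, cue->text );
    return 1; /* keep */
  }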

2. "How do you determine if a test is written correctly?"
Kyle is referring to the tests concerned with the syntax/grammar spec (as 
opposed to the parser spec) -- which should be all of them because that is what 
the test parser assumes.

Based on the syntax specification, we have an idea what makes an input valid or 
erroneous. We also understand how to interpret the input.

We don't actually have strict rules laid out in the spec about what kinds of 
syntax errors should be returned, so these are chosen in whatever way seems 
most sensible or simplest to implement. The error we report should be the one 
that best describes a given problem, and it should point to a reasonably 
accurate location in the text.

We assume a test is correct if it returns the appropriate number of cues 
(meaning the maximum possible number of cues, because we do not discard cues 
for minor errors as we would for the browser), the appropriate number of 
errors, the correct errors, and/or interpreted data that is read as expected.
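
Put differently, a unit test for a given fixture file boils down to assertions 
like the ones below. The helper and the fixture name are invented here (this 
is only a sketch, not our actual test harness), but the shape of the checks is 
the same:

  #include <assert.h>
  #include <string.h>

  /* Invented result structure standing in for whatever the test parser
   * collects while it parses a fixture file. */
  typedef struct {
    int num_cues;
    int num_errors;
    const char *first_cue_text;
  } parse_result;

  /* Hypothetical helper that runs the test parser over a .vtt fixture. */
  parse_result run_test_parser( const char *path );

  static void test_cue_with_bad_setting( void )
  {
    parse_result r = run_test_parser( "cue-with-bad-setting.vtt" );

    /* The maximum recoverable number of cues should still come through... */
    assert( r.num_cues == 2 );
    /* ...the expected number of errors should be reported... */
    assert( r.num_errors == 1 );
    /* ...and the interpreted data should match what we expect. */
    assert( strcmp( r.first_cue_text, "Hello world" ) == 0 );
  }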

__________________________________________________________________________________

It is true that it is a problem that we are not also writing tests to make 
sure we can operate in accordance with the parser spec.

We do need a second test parser for this, along with a separate set of unit 
tests to drive it.

A further issue is the way you (Kyle) have been going about reviewing these 
tests. It's understandable that there is some misunderstanding about what the 
tests are actually testing, and I will be the first to admit that there are 
things that are incorrect about these tests, a few of which you have pointed 
out in your reviews.

However, when you find an issue with a specific test, it would be a lot simpler 
to create a single pull request or issue addressing that single test, or group 
of related tests. That way it's a much smaller change, and it is a lot easier 
for us to look at and determine whether your changes are correct and fully 
thought out. It's much harder to do that when it's 66 changed files, or what 
have you.

So in order to keep things simple, and as correct as possible, smaller pull 
requests/issues would be a good idea.

________________________________
From: Kyle Barnhart [[email protected]]
Sent: Friday, February 01, 2013 10:55 PM
To: webvtt-dev
Subject: Re: We Need Everyone To Weigh In To Move Forward

In case I wasn't clear. This is about the tests. Not rewriting the code. The 
only changes to the library that this affects are changes made to make it pass 
the tests.

"the parser is modeled after the syntax, because the parser's job is to process 
input according to that syntax or grammar"
If the parser is actually only a validator, when are we writing an interpreter? 
We cannot send anything to a browser to process without one.


On Fri, Feb 1, 2013 at 10:18 PM, Kyle Barnhart 
<[email protected]<mailto:[email protected]>> wrote:

I ask you one question. How do you determine if a test is written correctly?

On 2013-02-01 9:07 PM, "Kyle Barnhart" 
<[email protected]<mailto:[email protected]>> wrote:
We need a consensus on the issue I will cover shortly. To do that we need to 
get everyone to reply so we can move things along, even if it is only a few 
words. Thank you.

Mailto: [email protected]<http://senecacollege.ca>


Problem
Our parser must at some stage be as standards compliant as possible. That 
includes whatever we implement in Mozilla. The problem stems from the tests for 
the parser library: there is disagreement as to whether the parser is subject 
to the parser section of the specifications.


Why It Matters

"It should follow what implementations do as well. So if there's something 
strange there it might be a bug in the spec."
- Velmont

"yeah, what Velmont said. If there's a reason to implement something other than 
the spec, we should change the spec."
- Ian Hickson


1. If the parser does not follow the specifications it will not be used. Speak 
up if you want people to actually use our work.

"If the code is to end up in Gecko, it had better follow the specification to 
the letter"
- Ms2ger


2. The purpose of specifications is to make sure that things work the same 
across applications and platforms. What you get if there is no standard, or if 
it is not followed, are the same problems you get when trying to make a webpage 
work on IE6 or any other non-conforming browser. No one wants the complexity 
and extra work of making sure the content (in this case WebVTT) behaves the 
same everywhere, such as keeping two versions of the same file and making sure 
each one only goes to the appropriate browser.

"'It's a bit unusual for a standard to specify the parsing algorithm, but I can 
understand why.'
It's not unusual for modern specs.  It's a much more dependable way of getting 
to consistent behavior than only specifying a format."
- Glenn Maynard


3. The specification is not a suggestion or a guide. The parser rules exist so 
that all implementations behave the same, and for that to happen they must be 
adhered to.

"When writing something that reads WebVTT files, be very sure to parse it as 
specified by the parser--*not* by reading the syntax and coming up with your 
own parsing algorithm."
- Glenn Maynard

"[Syntax rules] are requirements for writing, not for parsing. Requirements in 
that section don't apply to you."
- Simon Pieters

"what do they think the spec is for, if not following" - Ian Hickson
"inspiration" - MikeSmith
"Suggestions" - Ms2ger



What You Need To Know
It would seem that some are not very familiar with the specifications. This is 
understandable, as they are complex. So I will outline what the different 
sections are for.

The specification has four parts: syntax, parser, DOM construction, and 
rendering.

Syntax

  *   These rules apply to someone writing a WebVTT file. One can have a 
validator that checks if a file was written correctly.
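
For reference, this is roughly what a minimal, well-formed WebVTT file looks 
like (the kind of input these syntax rules describe):

  WEBVTT

  00:00:01.000 --> 00:00:04.000
  Hello world

  00:00:05.000 --> 00:00:07.500
  This is a second cue.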

Parser

  *   These rules apply to interpreters. They are the only rules a 
parser/interpreter needs to comply with; that was the deliberate intention of 
having a parser section. It is important to note that the interpreter can be 
written in any way, so long as it arrives at the same results as the parsing 
specification. There should be tests to ensure compliance.

DOM Construction

  *   These rules apply to the creation of DOM objects. The results of the 
parser are to be converted by these rules; for the WebVTT specification the 
mapping is essentially one-to-one. However, since our parser does not have 
objects like the parsing specification does, our parsing results need to be 
converted to the specification's DOM objects. This is important so that the 
behavior is the same across browsers. (Developers can write code to manipulate 
the DOM.)

Rendering Rules

  *   These rules determine how WebVTT text tracks are to be displayed. They 
are very thorough, and it is important that they are followed to make sure 
WebVTT files look the same wherever they are used.


Solutions

I see four possibilities for the parser tests.

  1.  The current unit tests are the ones that should ensure compliance with 
the specification.
  2.  There should be a second set of tests for compliance, run against the 
current parser.
  3.  There should be another parser that uses the current parser library, with 
a second set of tests for that second parser.
  4.  Compliance should be ensured at the browser level.

One. The solution I propose. I will cover the other three options first.

Two. A second set of tests on the same library would not only be redundant, but 
would actually be in direct conflict with the tests as they stand now. So this 
is not a viable option.

Three. Having a second parser is redundant, and not just because it means two 
of the same thing. If there is a real parser that our parser library is part 
of, and it is not standards compliant because of a problem in our parser 
library, then our library is going to have to change anyway. This is a very 
real case when you accept input that the specifications do not allow and modify 
it so that the parser creates a cue, for example changing the timestamp 
"00:00.1000" to "00:01.000". (The specification states that a cue with such a 
timestamp should be discarded.) Additionally, why have a parser that is doing 
things wrong in the first place?
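
To be concrete about that timestamp example: the timestamp grammar requires 
exactly three digits after the '.', and the parser rules say that when a 
timestamp cannot be parsed the whole cue is dropped, not repaired. A rough 
sketch of that check (the function name is just made up for illustration, this 
is not our actual code):

  #include <ctype.h>

  /* Sketch of the fractional-seconds rule from the timestamp grammar:
   * exactly three digits must follow the '.'.
   *   "00:00.100"  -> accepted
   *   "00:00.1000" -> rejected, and per the parser rules the whole cue is
   *                   then discarded rather than "fixed up" to 00:01.000. */
  static int valid_fraction( const char *frac )
  {
    return isdigit( (unsigned char)frac[0] ) &&
           isdigit( (unsigned char)frac[1] ) &&
           isdigit( (unsigned char)frac[2] ) &&
           !isdigit( (unsigned char)frac[3] ); /* a fourth digit is an error */
  }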

Four. We cannot make the parser compliant via its integration into the browser, 
because that runs completely counter to the specifications. In addition, it has 
all the same problems as option two.

Back to one. If the library and tests are not subject to the specification, 
then by what standard are the tests to be judged for correctness? The only real 
option is that the tests be judged by the specification. Not only would this 
make it clear what the tests need to be and what the parser needs to do, it 
would also significantly reduce development effort (e.g. avoiding two sets of 
tests and two parsers), and ensure our work can be used.


Summary

This is an important topic that needs a clear decision soon. A lot of work 
needs to be done to make the parser pass the unit tests. Whether those tests 
are standards compliant or not is critically important and will change the way 
the parser works.


Thank You,

Kyle Barnhart

_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
