Re: [whatwg] SRT research: timestamps

2011-10-11 Thread Ralph Giles
On 10/10/11 12:19 AM, Simon Pieters wrote: 0 negative intervals; 0 cues skipped because field counts were different. That will teach me to proofread after posting. The real counts should be: 2227 negative intervals; 6822 cues skipped because field counts were different. From which I conclude

Re: [whatwg] SRT research: timestamps

2011-10-07 Thread Ralph Giles
On 06/10/11 01:58 AM, Simon Pieters wrote: I don't know how many have negative interval, I'd need to run a new script over the 52,000,000 lines to figure out. (If you want me to check this, please contact me with details about what you want to count as negative interval.) I had in mind
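
The message above leaves open what should count as a negative interval. As a minimal sketch, assuming it means a cue whose end time is earlier than its start time, a count over raw SRT lines could look like the following. The function names and the timing pattern are illustrative assumptions, not the script used in the thread.

```python
import re

# Assumed shape of an SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm".
# A dot is accepted as well as a comma, since the thread reports that error as common.
TIMING = re.compile(
    r'(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)'
)

def to_ms(hours, minutes, seconds, fraction):
    """Convert timestamp fields to milliseconds (assumes a 3-digit fraction)."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(fraction)

def count_negative_intervals(lines):
    """Count timing lines whose end time is earlier than the start time."""
    count = 0
    for line in lines:
        match = TIMING.search(line)
        if match is None:
            continue  # not a timing line (cue number, text, blank line)
        fields = match.groups()
        if to_ms(*fields[4:]) < to_ms(*fields[:4]):
            count += 1
    return count
```

Whether a zero-length cue (end equal to start) should also be counted is a separate choice; this sketch does not count it.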

Re: [whatwg] SRT research: timestamps

2011-10-06 Thread Philip Jägenstedt
On Thu, 06 Oct 2011 07:36:00 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Thu, Oct 6, 2011 at 10:51 AM, Ralph Giles gi...@mozilla.com wrote: On 05/10/11 04:36 PM, Glenn Maynard wrote: If the files don't work in VTT in any major implementation, then probably not many. It's

Re: [whatwg] SRT research: timestamps

2011-10-06 Thread Simon Pieters
On Wed, 05 Oct 2011 23:07:17 +0200, Silvia Pfeiffer silviapfeiff...@gmail.com wrote: On Thu, Oct 6, 2011 at 4:22 AM, Simon Pieters sim...@opera.com wrote: I did some research on authoring errors in SRT timestamps to inform whether WebVTT parsing of timestamps should be changed. Our

Re: [whatwg] SRT research: timestamps

2011-10-06 Thread Philip Jägenstedt
On Thu, 06 Oct 2011 05:46:15 +0200, Glenn Maynard gl...@zewt.org wrote: On Wed, Oct 5, 2011 at 7:51 PM, Ralph Giles gi...@mozilla.com wrote: A point Philip Jägenstedt has made is that it's sufficiently tedious to verify correct subtitle playback that authors are unlikely to do so with any

Re: [whatwg] SRT research: timestamps

2011-10-06 Thread Ralph Giles
This is all I meant as well. Of course we should all implement the parser as spec'd. My comments were with respect to amending the spec to be more forgiving of common errors. -r Philip Jägenstedt phil...@opera.com wrote: On Thu, 06 Oct 2011 07:36:00 +0200, Silvia Pfeiffer

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread Silvia Pfeiffer
On Thu, Oct 6, 2011 at 4:22 AM, Simon Pieters sim...@opera.com wrote: I did some research on authoring errors in SRT timestamps to inform whether WebVTT parsing of timestamps should be changed. Our starting point was 70,000 files provided to Opera (for research purposes) by opensubtitles.org

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread David Singer
On Oct 5, 2011, at 14:07, Silvia Pfeiffer wrote: On Thu, Oct 6, 2011 at 4:22 AM, Simon Pieters sim...@opera.com wrote: The most common error is to use a dot instead of a comma. They're WebVTT files already. ;-) which rather raises the question of how many people will write comma instead

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread Glenn Maynard
On Wed, Oct 5, 2011 at 7:17 PM, David Singer sin...@apple.com wrote: which rather raises the question of how many people will write comma instead of dot in VTT, given a European view or SRT habits. If the files don't work in VTT in any major implementation, then probably not many. It's the

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread David Singer
On Oct 5, 2011, at 16:36, Glenn Maynard wrote: On Wed, Oct 5, 2011 at 7:17 PM, David Singer sin...@apple.com wrote: which rather raises the question of how many people will write comma instead of dot in VTT, given a European view or SRT habits. If the files don't work in VTT in any major

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread Ralph Giles
On 05/10/11 10:22 AM, Simon Pieters wrote: I did some research on authoring errors in SRT timestamps to inform whether WebVTT parsing of timestamps should be changed. This is completely awesome, thanks for doing it. hours too many '(^|\s|)\d{3,}[:\.,]\d+[:\.,]\d+' 834 As Silvia mentioned,
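
To make the quoted "hours too many" pattern concrete, here is a small sketch that applies it with Python's re module. The pattern is copied from the message above; the helper name is an assumption for illustration. Note that the leading group `(^|\s|)` contains an empty alternative, so it always matches and the pattern is effectively unanchored: any run of three or more digits followed by two separator-delimited numbers will match.

```python
import re

# Pattern as quoted in the message above: three or more digits in the first field.
HOURS_TOO_MANY = re.compile(r'(^|\s|)\d{3,}[:\.,]\d+[:\.,]\d+')

def has_too_many_hour_digits(line: str) -> bool:
    """Return True if the line contains a timestamp-like run matching the pattern."""
    return HOURS_TOO_MANY.search(line) is not None
```

For example, a timing line beginning with `100:01:02,345` is flagged, while `00:01:02,345 --> 00:01:04,000` is not.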

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread Ralph Giles
On 05/10/11 04:36 PM, Glenn Maynard wrote: If the files don't work in VTT in any major implementation, then probably not many. It's the fault of overly-lenient parsers that these things happen in the first place. A point Philip Jägenstedt has made is that it's sufficiently tedious to verify

Re: [whatwg] SRT research: timestamps

2011-10-05 Thread Glenn Maynard
On Wed, Oct 5, 2011 at 7:51 PM, Ralph Giles gi...@mozilla.com wrote: A point Philip Jägenstedt has made is that it's sufficiently tedious to verify correct subtitle playback that authors are unlikely to do so with any vigilance. Therefore the better trade-off is to make the parser forgiving,
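
As an illustration of the forgiving-parser trade-off discussed above, the sketch below accepts either a comma or a dot before the fractional part of a timestamp. It is a hypothetical example, not the WebVTT parsing algorithm from the spec; the function name and the exact field widths are assumptions.

```python
import re

# Lenient timestamp: optional hours, then MM:SS, then comma OR dot, then 3-digit fraction.
LENIENT_TIMESTAMP = re.compile(r'^(?:(\d{2,}):)?(\d{2}):(\d{2})[.,](\d{3})$')

def parse_timestamp_ms(text):
    """Return the timestamp in milliseconds, or None if it cannot be parsed."""
    match = LENIENT_TIMESTAMP.match(text.strip())
    if match is None:
        return None
    hours, minutes, seconds, millis = match.groups()
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)
```

A strict parser would drop the comma from the character class; the cost of leniency, as argued in the thread, is that authors never learn their files are malformed.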