Thanks for the summary, Caitlin and Richard. Moving the thread here where it's more generally visible.
We've been working through build system issues for the past two days, and
are very close to having it pass 'make distcheck' with most of the tests
disabled. I'd like to get as many of those tests re-enabled as possible
before we merge into mozilla/webvtt.

My plan is for Chris Pearce and me to read all the code in the
humphd/seneca branch, asking questions and filing issues as we go.
Hopefully the authors can also respond and/or fix as we go along. When
we're happy we at least understand the state of the code, we'll merge into
the mozilla/webvtt repo and work can continue there, with ongoing review
of smaller pull requests. Until then I'd like us to keep feeding into the
seneca branch as the best version to merge.

One thing I'm concerned about is parser security. It's hard to write
secure string parsing in C (or C++), but this code will be fed files from
the wild, wild web, and it's important that we do all we can to not become
a pwnage path. Unit tests, scan-build and fuzzing are all important parts
of that, but I think it's still going to take some time to mature.

Because of that I propose we merge with mozilla's github tree and import
into m-c after filing, but not necessarily resolving, all the issues we
find in review, relying on the pref to protect users. I fear anything else
will take too long to get the integration done.

Chris and David, what do you think?

 -r

On 13-01-17 3:08 PM, Caitlin Potter wrote:
> Thank you for subscribing to the mailing list, Chris and Ralph. We
> haven't done a very good job of using the tool as of yet, but certainly
> for communications with people many timezones away it should be
> invaluable and very helpful.
>
> As you likely are aware, our implementation of WebVTT will soon be
> undergoing a review and thorough grilling in the near future, and I
> wanted to give an indication of the current state of it so that we might
> get some feedback on things to focus on before the initial review.
>
> If you have not seen the source tree, our current work can be seen here
> <https://github.com/humphd/webvtt/tree/seneca>
>
> In summary:
>
>   * Autotools build system
>     /which may not be desirable for Mozilla integration/
>   * Lexical scanner ( src/libwebvtt/lexer.c )
>     /Simple state machine for scanning symbols from the main file, such
>     as keywords like "NOTE" or "WEBVTT"/. /Reentrant./
>   * Main parser ( src/libwebvtt/parser.c )
>     /Larger, buggier state machine. Currently does not actually call the
>     cuetext parser when a cue is available. Nor does it do a very good
>     job of reading multiple cues./ /Reentrant./
>   * Cuetext parser ( src/libwebvtt/cuetext.c )
>     /Currently operates on UTF16 strings collected from the "main" parser./
>   * Unit tests ( test/unit )
>     /Most of these will fail because of bugs in the parser which cause
>     segfaults and leaks./
>   * Not thread-safe at all.
>
> That's a very simplistic look at the tree, but those are the things
> which stand out a lot to me personally, and these are problems that I'm
> hoping to solve in the next few weeks. It hasn't been much of a tour,
> but I would appreciate any and all comments and suggestions that will
> assist me in solving some of these problems in a timely manner.
>
> Some current goals that myself and Rick Eyre are working on:
>
>   * Call cuetext parser from main parser, once a cue has been read.
>   * Remove UTF16 string object, deal exclusively with UTF8/8bit encodings.
>   * Separate main parser into smaller (but still reentrant) routines.
>   * Resolve as many memory issues as possible, as quickly as possible.
>
> I'll leave it at that for now, hopefully it gives a good idea of what
> those of us working closely with the parser itself are contending with
> at this time. If there are any questions or comments regarding the code
> and decisions, we are all looking forward to your comments on the
> mailing list and IRC.
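
As a rough illustration of the "reentrant" scanning described in the
summary, and of the bounds checking that the parser security concern calls
for, here is a minimal sketch in C. Every name in it (sig_scanner,
sig_scanner_feed and so on) is hypothetical and is not taken from
src/libwebvtt/lexer.c. It only recognizes the "WEBVTT" keyword at the start
of a stream and deliberately ignores the rest of the header rules (for
example an optional byte order mark), so treat it as a sketch under those
assumptions rather than as the project's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is taken from libwebvtt. */

typedef enum {
    SIG_NEED_MORE, /* input so far is a proper prefix of "WEBVTT" */
    SIG_MATCH,     /* the full keyword has been seen */
    SIG_MISMATCH   /* a byte differed from the keyword */
} sig_status;

typedef struct {
    size_t matched;    /* keyword bytes matched so far */
    sig_status status; /* sticky once it leaves SIG_NEED_MORE */
} sig_scanner;

static const char SIGNATURE[] = "WEBVTT";
#define SIGNATURE_LEN (sizeof SIGNATURE - 1)

void sig_scanner_init(sig_scanner *s)
{
    assert(s != NULL);
    s->matched = 0;
    s->status = SIG_NEED_MORE;
}

/*
 * Feed one chunk of input. All state lives in *s, so the caller can stop
 * after any chunk and resume later with the next one. Reads never go past
 * buf[len - 1] and no NUL terminator is assumed. *consumed receives the
 * number of bytes accepted from this chunk.
 */
sig_status sig_scanner_feed(sig_scanner *s, const unsigned char *buf,
                            size_t len, size_t *consumed)
{
    size_t i = 0;

    assert(s != NULL && consumed != NULL);
    assert(buf != NULL || len == 0);

    /* While status is SIG_NEED_MORE, matched < SIGNATURE_LEN holds,
     * so the index into SIGNATURE below is always in range. */
    while (i < len && s->status == SIG_NEED_MORE) {
        if (buf[i] != (unsigned char)SIGNATURE[s->matched]) {
            s->status = SIG_MISMATCH;
            break;
        }
        i++;
        s->matched++;
        if (s->matched == SIGNATURE_LEN) {
            s->status = SIG_MATCH;
        }
    }

    *consumed = i;
    return s->status;
}
```

Feeding "WEB" and then "VTT\n" as two separate chunks ends in SIG_MATCH,
which is the property "reentrant" refers to here: the scanner can be
interrupted at any byte boundary without losing its place. Taking an
explicit length instead of relying on a terminator is also what makes a
routine like this straightforward to drive from a fuzzer.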
_______________________________________________
dev-media mailing list
https://lists.mozilla.org/listinfo/dev-media

