On Fri, Sep 19, 2014 at 4:43 PM, Kuchta, Tomasz <[email protected]> wrote:
> Hi All,
>
> I’m a PhD student at Imperial College London, where I work on techniques
> for recovering broken documents, i.e. files that either
> crash a program or do not load correctly.
>
> One of the problems that I came across is the availability of good
> benchmarks. As far as I know, OpenOffice has a bug reporting feature,
> and I was wondering whether it might be possible to use some of these
> reports and associated documents to try our recovery mechanism on them.
>

Hi Tomasz,

The crash reporting feature existed in older versions of OpenOffice.org
and sent reports to servers controlled by Sun Microsystems. We stripped
out this feature from OpenOffice when it came to Apache. It raised data
privacy issues that we did not want to deal with.

Two alternative approaches you might want to consider:

1) Our Bugzilla data contains many documents submitted by users that
illustrate the bugs they were reporting. Not all were crashes or load
issues, but many were.

http://issues.apache.org/ooo/

2) Using fuzzing automation techniques, you can randomly modify the bits
of a file, looking for crashes. We've done a good deal of this to search
for crashes with security ramifications. You can see my presentation on
this technique here:

http://www.robweir.com/blog/publications/AOOFuzzing.pdf

As you'll see from that presentation, I seeded the fuzzing campaign with
documents from our bug database, which helps since it starts with
documents that are already, in some way, problematic.

Regards,

-Rob

> Thank you,
> Tomasz Kuchta
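
For anyone who wants to try the fuzzing idea in point 2, here is a minimal Python sketch of the "randomly modify the bits of a file" step. It is not the tooling described in the presentation; the function names (`flip_random_bits`, `mutants`), the default of four flips per file, and the fixed random seed are all assumptions made for illustration.

```python
import random


def flip_random_bits(data: bytes, n_flips: int, rng: random.Random) -> bytes:
    """Return a copy of `data` with `n_flips` randomly chosen bits inverted."""
    if not data:
        return data
    mutated = bytearray(data)
    for _ in range(n_flips):
        # Pick a bit position anywhere in the file and invert it.
        pos = rng.randrange(len(mutated) * 8)
        mutated[pos // 8] ^= 1 << (pos % 8)
    return bytes(mutated)


def mutants(seed_doc: bytes, count: int, n_flips: int = 4, seed: int = 0):
    """Yield `count` mutated variants of one seed document (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so a crashing mutant can be regenerated
    for _ in range(count):
        yield flip_random_bits(seed_doc, n_flips, rng)
```

Each mutant would then be written to disk and opened in the application under test, with some harness watching for crashes or failed loads. That harness is not shown here, since it depends on the platform and on how the application is launched. Seeding `mutants` with a document taken from a bug report matches the approach of starting from files that are already problematic.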
