Hi,

On Mon, 20 Feb 2012 11:29:41 +0200
Yedidyah Bar-David <linux...@didi.bardavid.org> wrote:
> On Mon, Feb 20, 2012 at 10:40:58AM +0200, Nadav Har'El wrote:
> > On Sun, Feb 19, 2012, Dotan Cohen wrote about "Re: Preparing to convince to
> > shift to non-propriety documents formats":
> > > Undocumented? Which file format is that? All the .doc and .docx
> > > formats are documented, even the older binary formats.
> >
> > Where is the ".doc" format documented?
> >
> > I once wrote a tool to extract the text in MS Office files (for a search
> > engine). It was a really annoying reverse-engineering-like
> > trial-and-error process, and I could hardly find any documentation.
> > The PowerPoint format (.ppt) was particularly odd.
> >
> > What documentation do you refer to?
>
> According to Wikipedia, it's partially documented. I did not follow the
> links inside:
> http://en.wikipedia.org/wiki/DOC_(computing)#Specification

there's also this (with a link at the top):

http://www.joelonsoftware.com/items/2008/02/19.html

The licence may be problematic.

Regards,

Shlomi Fish

-- 
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
My Public Domain Photos - http://www.flickr.com/photos/shlomif/

Larry Wall dreams in Perl.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il