On Thu, 7 Aug 2008, Dmitry Goldenberg wrote:
Does anyone know of an API to get at the metadata and content of MS Publisher files (.pub)?

I'm not aware of any, other than using OLE automation to drive the application itself, which is single-threaded and very prone to failures (most Office apps "support" this).

Are any of the POI contributors currently working on .pub support?

I don't think so, but it might not be too hard to get something very basic going (e.g. most of the text in the file, plus lots of stuff that isn't quite text...). Any chance you could open a new Bugzilla entry and attach a few sample files?

Ideally these would be fairly simple files, ranging from a very simple one-page file up to a simple three-page one. Each file should come with a textual description of what's in it.

Armed with that, it should be possible for someone to take a look at the files and try to figure out the structure. If it's like Excel or PowerPoint, a basic extractor could be done in something like 5-10 hours. If it's more like Visio or Project, then in the absence of any docs it could take much, much longer :/
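
As a very first step when looking at the samples, something like the rough sketch below could confirm whether a file is an OLE2 compound document at all. It assumes .pub files use the same OLE2 container as the other legacy Office formats, and it only checks the 8-byte OLE2 signature at the start of the file. The class name Ole2Sniffer is made up for illustration and is not part of POI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a POI class: checks for the OLE2 header signature.
class Ole2Sniffer {
    // The 8-byte signature found at the start of an OLE2 compound document.
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the given bytes start with the OLE2 signature.
    static boolean looksLikeOle2(byte[] header) {
        if (header == null || header.length < OLE2_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads the first 8 bytes of the file and checks them.
    static boolean looksLikeOle2(Path file) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    return false; // file is shorter than the signature
                }
                total += n;
            }
        }
        return looksLikeOle2(header);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2Sniffer <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + (looksLikeOle2(file)
                ? " has an OLE2 header"
                : " does not have an OLE2 header"));
    }
}
```

If the check passes, the next step would be to list the streams inside the container and compare them across the sample files. This says nothing yet about how the text itself is stored.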

Nick
