I may be jumping in with both feet here ... I haven't fully followed the thread ... but ...
I don't know whether this works with the new OO format (v2+), but for
earlier versions, to get the text out of the "content.xml" files I
would run:

    cat content.xml | perl -p -e "s/<[^>]*>//g;s/ +//;"

Depending on what I wanted to do next, I could redirect the output to
a file, append it to an existing file, etc.

This works very nicely for OO documents. I just tried SXIs and it
*seems* to work okay. Test it to see whether it works as is, or have
fun tweaking it to suit your specific needs.

regards,

farmerdude

--- JC Helary <[EMAIL PROTECTED]> wrote:

> > Having tried various combinations of 'strings' and 'sed', I have
> > concluded that the text cannot be reliably extracted without some
> > more intelligent parsing of the PPT format. OO obviously performs
> > this parsing since all the PPT files open flawlessly in
> > OpenOffice.org Impress.
> >
> > Is there any way I can, using OpenOffice.org, create a macro to
> > extract the text from all of these files? There must be something
> > better than 1500 copy/paste operations!
>
> Greg,
>
> 1) there is no save-to-text in OOo for presentation files.
> 2) all the content is there in the converted OD file, in the xml
> 3) there was recently an announcement about an OOo batch conversion
>    utility
>
> with 3) you transform the PPT files to OD format, since 1) you can't
> use that directly but thanks to 2) and smart XML parsers/conversion
> tools you can readily access the textual data by removing _all_ the
> xml tags.
>
> I have never tried that because I never _had_ to dump to text but my
> feeling is that what you ask, although a little unorthodox, is
> possible with a few tricks.
>
> Jean-Christophe
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
