I may be jumping in with both feet here, since I haven't
fully followed the thread, but ...

I don't know if this works for the new OO format
(v2+), but for earlier versions, to get the text from
the "content.xml" files, I would run:

cat content.xml | perl -p -e "s/<[^>]*>//g;s/ +//;"

Depending upon what I wanted to do next, I could
redirect to a file, append to an existing file, etc.

This works very nicely for OO documents.  I just tried
SXIs and it *seems* to work okay.  Test it to see if
it works as is, or have fun tweaking it to suit your
specific needs.

regards,

farmerdude


--- JC Helary <[EMAIL PROTECTED]> wrote:

> > Having tried various combinations of 'strings' and 'sed', I have
> > concluded that the text cannot be reliably extracted without some
> > more intelligent parsing of the PPT format.  OO obviously performs
> > this parsing since all the PPT files open flawlessly in
> > OpenOffice.org Impress.
> >
> > Is there any way I can, using OpenOffice.org, create a macro to
> > extract the text from all of these files?  There must be something
> > better than 1500 copy/paste operations!
> 
> Greg,
> 
> 1) there is no save-to-text option in OOo for presentation files.
> 2) all the content is there in the converted OD file, in the xml
> 3) there was recently an announcement about an OOo batch conversion
> utility
> 
> with 3) you transform the PPT files to OD format, since 1) you can't
> use that directly but thanks to 2) and smart XML parsers/conversion
> tools you can readily access the textual data by removing _all_ the
> xml tags.
> 
> I have never tried that because I never _had_ to dump to text but my
> feeling is that what you ask, although a little unorthodox, is
> possible with a few tricks.
> 
> Jean-Christophe
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
