JC

No worries.  Your mileage may vary, but
I hope it works for you.  I have a script that unzips the
OO files and then does the Perl mojo below.  I'm sure
someone could improve upon it :)
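
For anyone who wants the unzip and tag-stripping steps in one place, here is a rough sketch in Python. It is not the script mentioned above, only an illustration under a few assumptions: the file is a zip archive with a UTF-8 "content.xml" member, and the names strip_tags and extract_text are made up for this example. It reuses the same <[^>]*> pattern as the Perl one-liner, but replaces each tag with a space so that neighbouring paragraphs do not run together, and it collapses every run of whitespace (the s/ +// in the one-liner has no /g, so it only removes the first run on each line).

```python
import html
import re
import zipfile

TAG = re.compile(r"<[^>]*>")   # same tag pattern as the Perl one-liner
BLANKS = re.compile(r"\s+")    # any run of whitespace


def strip_tags(xml_text):
    """Swap every tag for a space, collapse whitespace, decode entities."""
    text = TAG.sub(" ", xml_text)
    text = BLANKS.sub(" ", text).strip()
    # Decode entities such as &amp; only after the tags are gone, so a
    # decoded "<" cannot be mistaken for the start of a tag.
    return html.unescape(text)


def extract_text(path):
    """Pull content.xml out of the archive at `path` and return its text."""
    with zipfile.ZipFile(path) as archive:
        xml_text = archive.read("content.xml").decode("utf-8")
    return strip_tags(xml_text)
```

Calling print(extract_text("slides.sxi")) on a file of your own ("slides.sxi" is only a placeholder name) should dump the text to the terminal, and from there you can redirect or append it as usual. A regex is a blunt tool for XML, so treat the output as a rough dump and check it against a few documents before trusting it on a big batch.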

farmerdude


--- JC Helary <[EMAIL PROTECTED]> wrote:

> > I don't know if this works for the new format for OO
> > (v2+), but for earlier versions, to get the text from
> > the "content.xml" files I would run:
> >
> > cat content.xml | perl -p -e "s/<[^>]*>//g;s/ +//;"
> 
> Hey, thanks, that is exactly what I was looking for!!!
> 
> JC
> 
> > Depending upon what I wanted to do next, I could
> > redirect to a file, append to an existing file, etc.
> >
> > This works very nicely for OO documents.  I just tried
> > SXIs and it *seems* to work okay.  Test it to see if
> > it works as is, or have fun tweaking it to suit your
> > specific needs.
> >
> > regards,
> >
> > farmerdude
> >
> >
> > --- JC Helary <[EMAIL PROTECTED]> wrote:
> >
> >>> Having tried various combinations of 'strings' and 'sed', I have
> >>> concluded that the text cannot be reliably extracted without some
> >>> more intelligent parsing of the PPT format.  OO obviously performs
> >>> this parsing since all the PPT files open flawlessly in
> >>> OpenOffice.org Impress.
> >>>
> >>> Is there any way I can, using OpenOffice.org, create a macro to
> >>> extract the text from all of these files?  There must be something
> >>> better than 1500 copy/paste operations!
> >>
> >> Greg,
> >>
> >> 1) there is no "save as text" in OOo for presentation files.
> >> 2) all the content is there in the converted OD file, in the XML
> >> 3) there was recently an announcement about an OOo batch conversion
> >>    utility
> >>
> >> With 3) you transform the PPT files to OD format. Because of 1) you
> >> can't use that directly, but thanks to 2) and smart XML parsers/
> >> conversion tools you can readily access the textual data by
> >> removing _all_ the XML tags.
> >>
> >> I have never tried that because I never _had_ to dump to text, but
> >> my feeling is that what you ask, although a little unorthodox, is
> >> possible with a few tricks.
> >>
> >> Jean-Christophe
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 

