Hi Bastien,

ODT files from OpenOffice/LibreOffice are just Zip archives containing a 
set of XML files, plus folders for any images or media you've inserted 
into the document. The text itself is in a file called "content.xml" 
inside the archive.
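
To make that concrete, here is a rough sketch of reading "content.xml" 
straight out of the archive using only the JDK. The class and method 
names (OdtText, extract) are made up for illustration, and it assumes the 
text you want sits in the standard ODF paragraph and heading elements 
(text:p and text:h). It ignores things like text:tab, text:s and 
text:line-break, so treat it as a starting point rather than a complete 
converter.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, not taken from any library.
class OdtText {
    // XML namespace bound to the "text:" prefix in OpenDocument files.
    private static final String TEXT_NS =
            "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Returns the text of each paragraph and heading, one per line. */
    static String extract(String odtPath) throws Exception {
        try (ZipFile zip = new ZipFile(odtPath)) {
            ZipEntry entry = zip.getEntry("content.xml");
            if (entry == null) {
                throw new IllegalArgumentException("no content.xml in " + odtPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true); // needed to match on TEXT_NS
                Document doc = factory.newDocumentBuilder().parse(in);
                StringBuilder out = new StringBuilder();
                collect(doc, out);
                return out.toString();
            }
        }
    }

    // Walks the tree in document order; emits text:p and text:h as lines.
    private static void collect(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && TEXT_NS.equals(node.getNamespaceURI())
                && ("p".equals(node.getLocalName()) || "h".equals(node.getLocalName()))) {
            // getTextContent() concatenates the text of all nested spans.
            out.append(node.getTextContent()).append('\n');
            return; // do not descend further, to avoid emitting text twice
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.print(extract(args[0]));
    }
}
```

On Java 11 or later this can be tried without a separate compile step: 
java OdtText.java yourfile.odt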

A very old post on one of the Oracle blogs has a plain Java parser for 
ODT files, which may be handy:
https://blogs.oracle.com/prasanna/entry/openoffice_parser_extracting_text_from

Regards,

Denis



On Tuesday, June 3, 2014 at 19:27:40 UTC-4, Bastien Guerry wrote:
>
> Hi all, 
>
> I'm trying to get the content of an ODT file as plain text. 
>
> I've found Pantomime, but I don't understand how to use it. 
>
> Can anyone put me on the right track with a minimal working 
> example? 
>
> Thanks in advance! 
>
> -- 
>  Bastien 
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
