Re: [lingu-dev] Help needed - bulk extraction of words

F Wolff Fri, 08 Feb 2008 02:41:58 -0800

On Thursday 2008-02-07, Leif Lodahl wrote:
> Hi all,
> The Danish project has been fortunate enough to receive a bunch of 
> articles from a news magazine. These are odt files and we would like to 
> extract the words from these documents. We have programs for this 
> purpose, but we usually get donations one document at a time. This time 
> we have several thousand documents and I believe it would take about a 
> year to load these documents one by one.
> 
> Does any of you have a program that can extract words from several 
> documents at once?
> 
> The words will be loaded into our workflow for linguistic processing and 
> will eventually become part of the Danish spelling dictionary.
> 
> Thanks in advance.
>


Hello Leif

My system has a command called sxw2txt which can simply print out the
plain text from a file. I also found a website with a tool called
odt2txt which might help:
http://stosberg.net/odt2txt/
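
If neither tool is at hand, the same idea can be sketched with nothing
but the Python standard library, since an odt file is a zip archive
whose text lives in content.xml. This is only a rough sketch and not
how sxw2txt or odt2txt work internally; the function names and the
definition of a "word" (a run of letters, optionally joined by an
apostrophe or hyphen) are my own assumptions and would need adjusting
to your workflow.

```python
import re
import sys
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumption: a "word" is a run of Unicode letters, optionally joined
# by apostrophes or hyphens (e.g. "e-mail"). Digits are ignored.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def odt_text(path):
    """Return the plain text of one .odt file."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # Join text nodes with spaces so adjacent paragraphs do not merge.
    # Limitation: a word that changes formatting halfway through is
    # stored in two nodes and will therefore be split in two.
    return " ".join(root.itertext())


def count_words(folder):
    """Count the words in every .odt file below folder (recursively)."""
    counts = Counter()
    for path in sorted(Path(folder).rglob("*.odt")):
        try:
            counts.update(WORD_RE.findall(odt_text(path)))
        except (zipfile.BadZipFile, KeyError, ET.ParseError) as err:
            # Report damaged files and carry on with the rest.
            print(f"skipped {path}: {err}", file=sys.stderr)
    return counts


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print one "word<TAB>count" line per distinct word.
    for word, n in sorted(count_words(sys.argv[1]).items()):
        print(f"{word}\t{n}")
```

Saved under a name of your choosing (say words.py, a made-up name), it
could be run as "python3 words.py /path/to/articles > words.txt" to
process the whole donation in one go. Case is preserved on purpose, so
that proper nouns stay distinguishable from ordinary words.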

Keep well
Friedel

