Well, as I'm developing LanguageTool, I know too well that this is not
so trivial - as we painfully found out :( You would have to traverse the
whole document (including tables and footnotes) to find the text. We
didn't find a proper way to do it and were happy to abandon this as soon
as the new API was available. Actually, exporting to a text document and
running a script would be an easier option from the developer's point of
view. Or even using a standalone Java program that parses the ODF as an
XML file.
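For illustration, the "parse the ODF as XML" route can be sketched in a few lines. This is only a rough sketch, written in Python rather than Java to keep it short. It relies on an .odt file being a zip archive whose text lives in content.xml. The function names (collect_text, odt_word_frequencies), the definition of a "word" as a run of letters, and the folding to lower case are all assumptions made for the example, not anything OOo itself does.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Elements that end a word: paragraphs, headings, footnotes and the
# ODF elements that stand for whitespace.
BREAKING = {
    f"{{{TEXT_NS}}}{name}"
    for name in ("p", "h", "note", "s", "tab", "line-break")
}

# Assumption: a word is a run of letters, optionally joined by apostrophes.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def collect_text(elem, parts):
    """Walk the tree depth-first, so text nested in tables and footnotes is seen."""
    breaking = elem.tag in BREAKING
    if breaking:
        parts.append(" ")
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        collect_text(child, parts)
        if child.tail:
            parts.append(child.tail)
    if breaking:
        parts.append(" ")


def odt_word_frequencies(odt_file):
    """Return a Counter of lower-cased words found in content.xml."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    body = root.find(f"{{{OFFICE_NS}}}body")
    parts = []
    collect_text(body if body is not None else root, parts)
    return Counter(WORD.findall("".join(parts).lower()))
```

Calling odt_word_frequencies("report.odt") (a hypothetical file name) returns a Counter, and its most_common() method lists the words by frequency. Headers and footers are stored in styles.xml, so this sketch does not see them.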
If you have to create frequency lists very often, then maybe it
could make some sense to create the kind of extension you describe.
What would be the use of the frequency list? I simply cannot see a
realistic usage scenario for a non-scripting environment.
Regards
Marcin
Harold Fuchs wrote:
Thanks but it's not exactly what I had in mind. As far as I know
extensions to OOo can be written in Java which, again as far as I know,
can handle the associative array you used in your awk example. So, for
someone familiar with the ODF structure and API, writing such an
extension should be quite simple. Or ???
In addition, OOo can already produce a word *count* so it knows what a
"word" is ...
Harold Fuchs
London, England
Please reply *only* to [email protected]
On 06/01/2009 02:18, Marcin Miłkowski wrote:
Save as a text file, and run this awk script on it from the command line
(gawk -f <scriptfile> <filename.txt>):
----------
# Print list of word frequencies
{
    # count each whitespace-separated field ($i), not the whole line ($0)
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    for (word in freq)
        printf "%s\t%d\n", word, freq[word]
}
--------------
To get better results you could remove all punctuation with a simple
search and replace before saving as a text file. An extension would be
easy to write in principle, but a nightmare in a language without hash
tables like the ones awk uses.
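As a rough sketch of the same idea with the punctuation step folded into the script, here is a version in Python, where a Counter plays the role of awk's associative array. The function names (word_frequencies, format_table) are made up for the example. Treating a word as a run of letters and folding everything to lower case are assumptions too; the awk script above does neither.

```python
import re
from collections import Counter

# Assumption: a word is a run of letters, optionally joined by apostrophes,
# so commas, full stops and digits never become part of a word.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def word_frequencies(text):
    """Count lower-cased words; Counter is a hash table keyed by word."""
    return Counter(WORD.findall(text.lower()))


def format_table(freq):
    """One 'word<TAB>count' line per word, most frequent first."""
    return "\n".join(f"{word}\t{count}" for word, count in freq.most_common())
```

To use it on the exported text file, read the file into a string (for example with open(path, encoding="utf-8").read(), where the encoding is whatever was chosen when saving) and print format_table(word_frequencies(text)).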
Best
Marcin
Harold Fuchs wrote:
Is there an extension (or other software) that will produce a word
frequency table in Writer (2.4.1 or 3.x)? Where, please?
Note: I do not mean a word count but a list of the number of times
each word is used in a document.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]