https://bugs.freedesktop.org/show_bug.cgi?id=55707
Priority: medium
Bug ID: 55707
Assignee: [email protected]
Summary: Word count incorrect if language is set to Finnish
Severity: normal
Classification: Unclassified
OS: All
Reporter: [email protected]
Hardware: Other
Status: UNCONFIRMED
Version: 3.5.4 release
Component: Writer
Product: LibreOffice
Created attachment 68181
--> https://bugs.freedesktop.org/attachment.cgi?id=68181&action=edit
Different word counts for corresponding Finnish and English texts.
Word count does not work correctly if the language of the text is set to
Finnish. Any punctuation mark or sequence of punctuation marks that is not
immediately preceded by a letter is counted as the end of a previous word (even
if the punctuation occurs at the beginning of a paragraph, so that there is no
previous word). Furthermore, some special symbols are counted together with the
previous word (typically, but not necessarily, a number written in digits,
e.g. "10 %") as if the two were a single word, even though they are separated
by a space.
If the language is set to another language, the word count seems to function
correctly.
Found in 3.5.4 (backported for Debian Squeeze); also present in 3.6.2
(Windows).
The issue may be related to bug 33774 (it seems that the fix does not affect
Finnish text for some reason).
Steps to reproduce:
1) Open the attached test file. The first column has some samples set in
Finnish, whereas the second column has corresponding samples set in English.
2) Open the word-count dialog box (if using 3.5.4).
3) Select the first sample line (including the punctuation) in the Finnish
column. According to the word count, the current selection contains two words,
the opening quotation mark being counted as a separate word. The corresponding
English sample is correctly reported to contain only one word.
4) On the second line, select each of the Finnish sample words one at a time.
The string "USA:n" is correctly counted as a single word (since the colon is
preceded by a letter), but "90:n" and "%:n" are each counted as two separate
words. Further experimentation shows that the string "n, %:" is counted as a
single word, mixing the ending of one word, the intervening punctuation, and
the stem of the following word. In the corresponding English column, "USA's",
"90's" and "%'s" are each counted as a single word.
5) Select the entire third line ("10 %, 10 €") in the Finnish column. The
current selection is counted as two words, whereas the identical string in the
English column is counted as four words.
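For reference, the counts expected in steps 4 and 5 match a plain
whitespace-delimited count. The following Python sketch is only a baseline for
comparison, not LibreOffice's word-count code; the function name
whitespace_word_count and the samples table are made up for this
illustration, and the expected values are the ones reported above for the
English column.

```python
def whitespace_word_count(text: str) -> int:
    """Count maximal runs of non-whitespace characters.

    Hypothetical helper for illustration only; LibreOffice's own
    word-boundary logic is language-dependent and is not reproduced here.
    """
    return len(text.split())


# Sample strings quoted in the steps above, paired with the count that the
# report treats as correct (the result seen in the English column).
samples = {
    "USA:n": 1,
    "90:n": 1,
    "%:n": 1,
    "10 %, 10 €": 4,
}

for text, expected in samples.items():
    counted = whitespace_word_count(text)
    print(f"{text!r}: baseline count {counted}, expected {expected}")
```

With the language set to Finnish, Writer instead reports two words each for
"90:n" and "%:n", and two words for "10 %, 10 €".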
If the language setting of the Finnish column is changed to English, the word
count works correctly; conversely, setting the English column to Finnish
reproduces the problem. The word count also works correctly if the language
setting is changed to French, German, or Swedish, so the issue seems to affect
only Finnish. (Could this somehow be connected to the fact that the spell
checker for Finnish is not Hunspell but Voikko?)