Hi,
I need some help here.
It concerns the following data files in the folder
i18npool/source/breakiterator/data/
-- char_in.txt
-- count_word*.txt
-- dict_word*.txt
-- edit_word*.txt
-- line.txt
-- sent.txt
(A) I did not find the original sources of these data files on [2].
Does somebody know the original source for these data files?
(B) The data files count_word*.txt, dict_word*.txt and edit_word*.txt do not
differ much. I assume that they were adapted from the original source for
certain usages and languages.
Can someone confirm this?
(C) I have found files at [3] which correspond to these data files. The files I
found are named char.txt, line.txt, sent.txt and word.txt. Thus, it looks like
the original source of these data files is ICU. This would mean that these files
are covered by the ICU license.
Can someone confirm this?
Note: Eike Rathke stated in a posting made in June 2011 that these data files
were taken from ICU and had been adapted for OOo.
So again, can somebody help here?
Best regards, Oliver.
[3]
http://www.opensource.apple.com/source/ICU/ICU-400.39/icuSources/data/brkitr/ and
http://www.opensource.apple.com/source/ICU/ICU-400.42/icuSources/data/brkitr/
On 01.12.2011 14:48, Oliver-Rainer Wittmann wrote:
Hi,
a look at our IP clearance wiki page showed that there is an entry for which I
had volunteered, but which dropped out of my focus. It now has my attention
again.
It is the issue regarding the license headers for the data files in module
i18npool - see [1].
Status update:
- Most data files are covered by Oracle's SGA
- The data files in folder i18npool/source/breakiterator/data/ which have an IBM
copyright do not have a proper license header.
I will look at ICU [2] for an appropriate replacement.
[1] https://cwiki.apache.org/confluence/display/OOOUSERS/IP_Clearance
[2] http://site.icu-project.org/