Hi,

I need some help here.

It is about the following data files in folder 
i18npool/source/breakiterator/data/
-- char_in.txt
-- count_word*.txt
-- dict_word*.txt
-- edit_word*.txt
-- line.txt
-- sent.txt

(A) I did not find the original sources of these data files on [2].
Does somebody know the original source for these data files?

(B) The data files count_word*.txt, dict_word*.txt and edit_word*.txt do not differ much. I assume that they were adapted from the original source for specific uses and languages.
Can someone confirm this?

(C) I have found files at [3] which correspond to these data files. They are named char.txt, line.txt, sent.txt and word.txt. Thus, it looks like the original source of these data files is ICU. This would mean that these files are covered by the ICU license.
Can someone confirm this?

Note: Eike Rathke stated in a posting made in June 2011 that these data files were taken from ICU and had been adapted for OOo.

Thus again, can somebody help here?

Best regards, Oliver.


[3] http://www.opensource.apple.com/source/ICU/ICU-400.39/icuSources/data/brkitr/ and
http://www.opensource.apple.com/source/ICU/ICU-400.42/icuSources/data/brkitr/

On 01.12.2011 14:48, Oliver-Rainer Wittmann wrote:
Hi,

looking at our IP clearance wiki page showed that there is an entry for which I
volunteered, but which slipped out of my focus. Now it has my attention
again.

It is the issue regarding the license headers for the data files in module
i18npool - see [1].

Status update:
- Most data files are covered by Oracle's SGA
- The data files in folder i18npool/source/breakiterator/data/ which have an IBM
copyright do not have a proper license header.

I will look at ICU [2] for an appropriate replacement.

[1] https://cwiki.apache.org/confluence/display/OOOUSERS/IP_Clearance
[2] http://site.icu-project.org/
