FYI, here's how you can create a list of all available text
encodings in the JVM you're running in. This can lead to a
very long combo box, though :-)
Map<String, Charset> charsetMap = Charset.availableCharsets();
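
In case it helps, here is a slightly fuller sketch of the same idea. The
class and method names (CharsetList, availableNames) are made up for
illustration and are not taken from the CasEditor code. It only relies on
Charset.availableCharsets(), which returns a sorted map keyed by canonical
charset name, so the keys can go straight into a combo box.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the CasEditor sources.
final class CharsetList {

    // Canonical names of every charset this JVM supports, in the
    // (already sorted) order of the map from availableCharsets().
    static List<String> availableNames() {
        return new ArrayList<String>(Charset.availableCharsets().keySet());
    }

    public static void main(String[] args) {
        List<String> names = availableNames();
        System.out.println(names.size() + " charsets available, default is "
                + Charset.defaultCharset().name());
        for (String name : names) {
            System.out.println(name);
        }
    }
}
```

The exact number of entries depends on the JVM and platform, which is why
the resulting combo box can get long.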
--Thilo
On 5/18/2010 01:40, Jörn Kottmann (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/UIMA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868448#action_12868448
> ]
>
> Jörn Kottmann commented on UIMA-1782:
> -------------------------------------
>
> There is now an option to specify the encoding of the text import files. It
> is always preset to the default platform encoding. The combo box displays the
> Java standard charsets (see here:
> http://java.sun.com/j2se/1.4.2/docs/api/java/nio/charset/Charset.html).
> If the user wants a charset that is not one of the Java standard charsets
> (such charsets are usually available), he has to type in its name. While the
> name is typed in, it is validated: if the charset is available, he can
> proceed with the import; otherwise the "Apply" button remains disabled.
>
> It would be nice to add a warning to tell the user that the "Apply" button is
> disabled because of an invalid charset name or unsupported charset.
>
>> Encoding of text files during import should be configurable
>> -----------------------------------------------------------
>>
>> Key: UIMA-1782
>> URL: https://issues.apache.org/jira/browse/UIMA-1782
>> Project: UIMA
>> Issue Type: Improvement
>> Components: CasEditor
>> Affects Versions: 2.3
>> Reporter: Thomas Hampp
>> Assignee: Jörn Kottmann
>> Fix For: 2.3.1
>>
>>
>> During import of text files into a corpus it seems to be impossible to
>> control the encoding used. Looks like the default platform encoding is used
>> (Latin 1 on Western Windows systems). The Eclipse default encoding settings
>> for text files don't seem to affect import encoding. That makes it
>> impossible to import documents with international characters in UTF-8.
>> Ideally the encoding should be selectable in a drop down field in the import
>> wizard.
>
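
Regarding the validation of a typed-in charset name described in the quoted
comment: I have not looked at how the CasEditor actually does it, but a
minimal sketch could look like the following. CharsetCheck and isUsable are
hypothetical names. Charset.isSupported() throws
IllegalCharsetNameException for syntactically illegal names, so the sketch
treats that case the same as "not supported".

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of the CasEditor sources.
final class CharsetCheck {

    // True if the typed-in name refers to a charset this JVM can use.
    // Empty, syntactically illegal, and unknown names all yield false.
    static boolean isUsable(String name) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        try {
            return Charset.isSupported(name.trim());
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed.
            return false;
        }
    }
}
```

The "Apply" button could then be enabled only while isUsable() returns
true for the current text, and the two failure cases could be told apart
if a warning message is wanted.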