https://bugs.documentfoundation.org/show_bug.cgi?id=146429

            Bug ID: 146429
           Summary: Fallback to other character encodings detected by ICU
                    above a certain confidence threshold
           Product: LibreOffice
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: medium
         Component: LibreOffice
          Assignee: [email protected]
          Reporter: [email protected]

Description:
As discussed in Bug 92161, we should modify SwIoSystem::IsDetectableText so
that, if none of the encodings we explicitly check for matches, it falls back
to whatever ucsdet_getName (from the ICU library) returns, provided LO
supports that encoding. We can use ucsdet_getConfidence to filter out any
match below a certain confidence threshold.
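
A minimal sketch of the fallback decision, not a patch against
SwIoSystem::IsDetectableText. To keep it self-contained, the ICU result is
modelled as plain data: CharsetGuess, pickFallbackCharset, kMinConfidence and
the "supported" set are hypothetical names, and the threshold of 50 is an
assumed value (this report does not propose one). In real code the name and
confidence would come from ucsdet_getName and ucsdet_getConfidence.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for one ICU match: the values that ucsdet_getName
// and ucsdet_getConfidence would report for the best UCharsetMatch.
struct CharsetGuess {
    std::string name;  // e.g. "EUC-KR"
    int confidence;    // ICU reports confidence on a 0..100 scale
};

// Assumed threshold; the bug report does not specify a value.
constexpr int kMinConfidence = 50;

// Returns the charset name to fall back to, or an empty string when the
// guess is below the threshold or names an encoding we cannot handle.
// Intended to run only after all explicit encoding checks have failed.
std::string pickFallbackCharset(const CharsetGuess& guess,
                                const std::set<std::string>& supported,
                                int minConfidence = kMinConfidence)
{
    if (guess.confidence < minConfidence)
        return std::string();  // too uncertain, keep the existing default
    if (supported.count(guess.name) == 0)
        return std::string();  // detected, but not an encoding we support
    return guess.name;
}
```

With this shape, a guess of EUC-KR at confidence 80 would be accepted if
EUC-KR is in the supported set, while the same guess at confidence 20 would
be rejected and the current default behaviour kept.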

Steps to Reproduce:
1. Save a text file in one of the encodings we don't detect, e.g. EUC-KR with
some Korean text pasted in.
2. Open the file in LO Writer.

Actual Results:
The text displays incorrectly because the file is assumed to be Unicode or
some other encoding.

Expected Results:
The file's encoding is correctly determined and the text displays properly.


Reproducible: Always


User Profile Reset: No



Additional Info:
See the description above.
