I’m using JXLS to generate a report in Excel and am having a hard time with 
non-ASCII text, such as the following: 

𝑦 = 𝑚𝑥 + 𝑏, 𝐴𝑥 + 𝐵𝑦 = 𝐶, and 𝑦 - 𝑦₁ = 𝑚(𝑥 - 𝑥₁)

The above is rendered to the sharedStrings.xml file as:

<sst count="1" uniqueCount="1"
     xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"><si><t>?? = ???? + ??, ???? + ???? = ??, and ?? - ??₁ = ??(?? - ??₁)</t></si></sst>
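
One observation that may be relevant: every character that comes out as "??" 
above is a mathematical italic letter outside the Basic Multilingual Plane, so 
Java stores it as a surrogate pair (two chars), while the subscript ₁ (U+2081, 
inside the BMP) survives. The sketch below only illustrates that difference 
using the JDK; it does not touch POI or XmlBeans, and the class name 
SupplementaryCheck is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class SupplementaryCheck {

    // Counts code points above U+FFFF, i.e. those a Java String
    // stores as a surrogate pair (two UTF-16 code units).
    static int countSupplementary(String s) {
        return (int) s.codePoints()
                      .filter(Character::isSupplementaryCodePoint)
                      .count();
    }

    public static void main(String[] args) {
        // U+1D466 MATHEMATICAL ITALIC SMALL Y, written with escapes so the
        // result does not depend on the source file encoding.
        String y = "\uD835\uDC66";
        // U+2081 SUBSCRIPT ONE, a BMP character.
        String subscriptOne = "\u2081";

        System.out.println(y.length());                                // 2
        System.out.println(y.codePointCount(0, y.length()));           // 1
        System.out.println(y.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(subscriptOne.length());                     // 1

        // "y - y₁" contains two supplementary code points.
        System.out.println(countSupplementary(y + " - " + y + subscriptOne)); // 2
    }
}
```

The two chars per affected character line up with the two question marks per 
character in the output, which suggests the surrogate pairs are being handled 
one char at a time somewhere during serialization. That is an inference from 
the output, not something I have confirmed.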

I believe I’ve narrowed it down to 
org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRst. My testing shows 
that it’s storing the string correctly internally, but when writing to the 
sharedStrings.xml, the text isn’t being handled correctly. I’m not sure if this 
is something I’m doing wrong, or if this is a bug somewhere in POI or XmlBeans. 
I don’t believe the issue is in the JXLS library as I’ve isolated the issue to 
the code below:

        String text = "𝑦 = 𝑚𝑥 + 𝑏, 𝐴𝑥 + 𝐵𝑦 = 𝐶, and 𝑦 - 𝑦₁ = 𝑚(𝑥 - 𝑥₁)";
        SharedStringsTable table = new SharedStringsTable();
        CTRst st = CTRst.Factory.newInstance();
        st.setT(text);
        table.addEntry(st);

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        table.writeTo(baos);
        String output = baos.toString("UTF-8");

        // This assertion passes
        Assert.assertEquals(st.getT(), text);

        // This assertion fails
        Assert.assertEquals(output,
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sst count=\"1\" uniqueCount=\"1\" " +
                "xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">" +
                "<si><t>𝑦 = 𝑚𝑥 + 𝑏, 𝐴𝑥 + 𝐵𝑦 = 𝐶, and 𝑦 - 𝑦₁ = 𝑚(𝑥 - 𝑥₁)</t></si></sst>");


Here’s another snippet that reproduces the issue when creating an xlsx 
workbook:

        // try-with-resources closes the workbook and the stream,
        // even if write() throws
        try (XSSFWorkbook workbook = new XSSFWorkbook();
             FileOutputStream outputStream = new FileOutputStream(FILE_NAME)) {
            XSSFSheet sheet = workbook.createSheet();

            Row row = sheet.createRow(0);
            Cell cell = row.createCell(0);
            cell.setCellValue(TEXT);

            workbook.write(outputStream);
        }


I’m assuming it’s something I’m doing wrong, but I have been unable to find a 
solution. I created a GitHub repo with the above code in the hope that it 
helps in finding one.

https://github.com/JohnBrainard/poi-utf8-debugging

Thank you for your help!

John

