[ https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016857#comment-16016857 ]
James R Doyle commented on VELOCITY-880:
----------------------------------------

Further attempts to get getBinaryStream() to work as the solution yield even more intrigue. Most interesting is that the HSQLDB and Oracle drivers have quite different semantics. If we wanted to support BLOB columns, I think we would be OK. However, the getCharacterStream() approach works with both VARCHAR and CLOB, and on both Oracle and HSQLDB. See further below for sample output.

Your original ask: extracting the raw bytes from getBinaryStream() using Oracle shows that the UTF-8 code for the Euro symbol IS present. Have a look and observe that 0x20ac is there. However, the conversion problem is going to be another bug inside ResourceLoader::buildReader.

546865204575726f2043757272656e63792053796d626f6c20ac20697320612074776f2d62797465205554462d38206368617261637465722e00000000000000

<pre>
HSQLDB, VARCHAR, getBinaryStream : Fault due to JDBC driver not supporting getBinaryStream()
================================
incompatible data type in conversion
java.sql.SQLSyntaxErrorException: incompatible data type in conversion
	at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
	.....
	at org.apache.velocity.runtime.resource.loader.DataSourceResourceLoader.getResourceReader

Oracle12, VARCHAR, getBinaryStream: Test failure due to charset coercion problem.
==============================
org.junit.ComparisonFailure: Unicode test failed.
Expected :The Euro Currency Symbol € is a two-byte UTF-8 character.
Actual   :The Euro Currency Symbol � is a two-byte UTF-8 character.
</pre>

How would you like to proceed? I believe we should take the changes for CLOB, because that is the primary use case; VARCHAR should also work as a requirement.
The JavaDoc changes for CLOB should also be done, because that is how we expect people to learn about and try this resource loader. And which databases really support this 'TEXT' column type anyway? Should we move to ApacheDB, while we are at it, as the reference database for embedded unit tests?

This resource loader does not work, and I'm sure people are either abandoning the approach altogether (which is sad), or building workarounds like I did if they are able to.

> DataSourceResourceLoader corrupts UTF-8 encoded characters in template
> ----------------------------------------------------------------------
>
>                 Key: VELOCITY-880
>                 URL: https://issues.apache.org/jira/browse/VELOCITY-880
>             Project: Velocity
>          Issue Type: Bug
>    Affects Versions: 2.1.x
>         Environment: Oracle12c and HSQLDB 2.3.4, JDK 1.8
>            Reporter: James R Doyle
>         Attachments: velocity-880.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A long-standing bug in the DataSourceResourceLoader corrupts UTF-8
> templates retrieved from the database. The Unit Test suite for this resource
> loader has deficiencies that hide the bug.
> The cause of the problem is this:
> {code}
> InputStream rawStream = rs.getAsciiStream(templateColumn);
> {code}
> The resolution of the problem is simply:
> {code}
> Reader r = rs.getCharacterStream(templateColumn);
> InputStream rawStream = null;
> try {
>     rawStream = IOUtils.toInputStream(IOUtils.toString(r), encoding);
> } catch (IOException ioe) {}
> {code}
> Once done, the test failure vanishes:
> org.junit.ComparisonFailure: Unicode test failed.
> Expected :The Euro Currency Symbol € is a two-byte UTF-8 encoded character.
> Actual   :The Euro Currency Symbol ? is a two-byte UTF-8 encoded character.
> The bug was verified and the fix was tested against Oracle12c and HSQLDB
> 2.3.4 using a CLOB column to store the template data.
> The Unit Tests for this resource loader need attention.
> Please see VELOCITY-599 ; long standing problem, which has been erroneously
> marked as resolved but has been in the codebase for a long time.
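
For illustration only, the difference between the two retrieval paths described in the issue can be sketched without a database. This is a minimal sketch under stated assumptions, not the patch itself: the class `CharsetRoundTrip` and its methods are hypothetical names, a `StringReader` stands in for `rs.getCharacterStream(templateColumn)`, and decoding UTF-8 bytes as US-ASCII only approximates what a driver's `getAsciiStream()` might do (actual driver behavior varies, as the Oracle and HSQLDB results above suggest). It uses only the JDK in place of `IOUtils`, and it rethrows an `IOException` rather than swallowing it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Velocity or of velocity-880.patch.
final class CharsetRoundTrip {

    // Drains a Reader into a String (plain-JDK stand-in for IOUtils.toString(Reader)).
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Surface the failure instead of silently leaving the result empty.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Character path: read characters, then encode them once in the target encoding.
    static byte[] reencode(Reader reader, Charset encoding) {
        return readAll(reader).getBytes(encoding);
    }

    // Simulated lossy path (assumption): UTF-8 bytes interpreted as US-ASCII.
    static String decodeAsAscii(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String template = "The Euro Currency Symbol \u20ac is a two-byte UTF-8 character.";
        byte[] stored = template.getBytes(StandardCharsets.UTF_8);

        String lossy = decodeAsAscii(stored);
        // StringReader stands in for rs.getCharacterStream(templateColumn).
        byte[] viaReader = reencode(new StringReader(template), StandardCharsets.UTF_8);
        String intact = new String(viaReader, StandardCharsets.UTF_8);

        System.out.println("ASCII path intact:  " + template.equals(lossy));
        System.out.println("Reader path intact: " + template.equals(intact));
    }
}
```

In a real loader, the byte array returned by `reencode` could be wrapped in a `java.io.ByteArrayInputStream` to obtain the `InputStream` the snippet in the issue produces. Whether to buffer the whole template in memory this way is a design choice; it is simple, but a streaming conversion may be preferable for very large CLOBs.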