[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-26 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063787#comment-16063787
 ] 

Michael Osipov commented on VELOCITY-880:
-

From {{public synchronized Reader getResourceReader(final String name, String encoding)}}.

> DataSourceResourceLoader corrupts UTF-8 encoded characters in template
> --
>
> Key: VELOCITY-880
> URL: https://issues.apache.org/jira/browse/VELOCITY-880
> Project: Velocity
>  Issue Type: Bug
>Affects Versions: 2.1.x
> Environment: Oracle12c and HSQLDB 2.3.4, JDK 1.8
>Reporter: James R Doyle
>Assignee: Claude Brisson
> Fix For: 2.0
>
> Attachments: project.tar.gz, velocity-880.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A long-standing bug in the DataSourceResourceLoader corrupts UTF-8 
> templates retrieved from the database.  The Unit Test suite for this resource 
> loader has deficiencies that hide the bug. 
> The cause of the problem is this:
> {code}
>   InputStream rawStream = rs.getAsciiStream(templateColumn);
> {code}
> The resolution of the problem is simply:
> {code}
>   Reader r = rs.getCharacterStream(templateColumn);
>   InputStream rawStream = null;
>   try {
>     rawStream = IOUtils.toInputStream(IOUtils.toString(r), encoding);
>   } catch (IOException ioe) {}
> {code}
> Once done, the following test failure vanishes:
> org.junit.ComparisonFailure: Unicode test failed.  
> Expected :The Euro Currency Symbol € is a two-byte UTF-8 encoded 
> character.
> Actual   :The Euro Currency Symbol ? is a two-byte UTF-8 encoded 
> character.
> The bug was verified and the fix was tested against Oracle12c and HSQLDB 
> 2.3.4 using a CLOB column to store the template data.
> The Unit Tests for this resource loader need attention.
> Please see VELOCITY-599: a long-standing problem that was erroneously 
> marked as resolved but has remained in the codebase for a long time.
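
A minimal sketch of why an ASCII-only read path would turn the Euro sign into `?`, as in the "Actual" line of the failure quoted above. It uses only the JDK and does not touch JDBC; the class name `AsciiLossDemo` is hypothetical, and whether a particular driver's getAsciiStream() performs exactly this substitution is an assumption.

```java
import java.nio.charset.StandardCharsets;

final class AsciiLossDemo {
    /** Encodes text as US-ASCII and decodes it again, as an ASCII-only path would. */
    static String throughAscii(String text) {
        // Characters outside US-ASCII cannot be encoded; getBytes substitutes '?'.
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    /** Encodes text as UTF-8 and decodes it again; the Euro sign survives. */
    static String throughUtf8(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String template = "Symbol \u20AC";
        System.out.println(throughAscii(template));
        System.out.println(throughUtf8(template));
    }
}
```

The substitution happens at encoding time, so no later choice of charset can recover the character once the bytes have passed through an ASCII-only step.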



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@velocity.apache.org
For additional commands, e-mail: dev-h...@velocity.apache.org



[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-26 Thread Claude Brisson (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063780#comment-16063780
 ] 

Claude Brisson commented on VELOCITY-880:
-

The method takes the name of the column, so the user can potentially use it to 
build the reader with the encoding they want. Other than that, where would 
that encoding come from?
Also, storing the template in a BLOB column rather than in a CLOB column is 
obviously a design mistake that I don't think we should care about.




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-26 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063610#comment-16063610
 ] 

Michael Osipov commented on VELOCITY-880:
-

Just checked the changes. They look good to me, but I am uncertain about the 
{{DataSourceResourceLoader#getReader()}} method. The protected one is not 
passed the encoding. Given that a template could also be stored in a BLOB, 
a custom loader wouldn't be able to return a proper reader. WDYT?




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Claude Brisson (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036262#comment-16036262
 ] 

Claude Brisson commented on VELOCITY-880:
-

I totally agree that getAsciiStream() is to be avoided, and I'd say the same 
of getBinaryStream(). With getBinaryStream(), which is more targeted towards 
BLOBs than CLOBs, you take the risk of exposing an internal database encoding 
you're unaware of. The natural method is getCharacterStream(), and if it 
doesn't work for a specific JDBC driver, it means the driver is broken. At the 
minimum, we can make getReader() protected so that it can be overridden in a 
subclass.

But I totally agree that we should validate it against the widest number of db 
vendors. Since I was involved in a broad DB benchmarking job some time ago, I 
also have access to the ones you name plus several others like PostgreSQL and 
Vertica.

James, you say:
> The reason I use IOUtils to convert the Reader to an InputStream is that the 
> base class method expects InputStream.
This isn't true anymore for 2.0. I think the only needed change is to switch to 
getCharacterStream(). 
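
A minimal sketch of what such a protected, overridable hook could look like. The class name `CharacterStreamTemplateReader` and the exact method signature are assumptions for illustration, not the actual Velocity code; only `ResultSet.getCharacterStream(String)` is a real JDBC call.

```java
import java.io.Reader;
import java.sql.ResultSet;
import java.sql.SQLException;

class CharacterStreamTemplateReader {
    /**
     * Hypothetical hook: returns the template column as characters.
     * The JDBC driver converts from the database encoding, so no charset
     * name is needed here. Subclasses may override it for other column types.
     */
    protected Reader getReader(ResultSet rs, String templateColumn) throws SQLException {
        Reader reader = rs.getCharacterStream(templateColumn);
        if (reader == null) {
            // getCharacterStream returns null when the column value is SQL NULL.
            throw new SQLException("Template column '" + templateColumn + "' is NULL");
        }
        return reader;
    }
}
```

Keeping the hook protected leaves room for a subclass to use a different getter when a driver misbehaves.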





[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Claude Brisson (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036253#comment-16036253
 ] 

Claude Brisson commented on VELOCITY-880:
-

You said: "This entire DBResourceLoader thing seems to be pure foobar." I 
probably misinterpreted this phrase as a remark about the code itself, as if 
you were implying that the whole loader needed a full re-engineering, something 
like that...




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036248#comment-16036248
 ] 

Michael Osipov commented on VELOCITY-880:
-

Please cite exactly what you weren't able to understand. I don't want to 
interpret either :-D




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Claude Brisson (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036247#comment-16036247
 ] 

Claude Brisson commented on VELOCITY-880:
-

I did understand that, but I didn't know how to interpret the "pure foobar" 
part :-)




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036241#comment-16036241
 ] 

Michael Osipov commented on VELOCITY-880:
-

Salut Claude, I'd like to have a minimal example for the issue creator to 
validate against Oracle 11.2g and MySQL. If this is all a huge fuss and cannot 
be fixed portably across databases, I am considering turning this loader into 
an abstract one where the client has to provide the input stream, or we provide 
an enum option selecting which getter will be called on the result set.
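
A rough sketch of the enum idea, under stated assumptions: the enum name `TemplateColumnAccess`, its constants and the `open` method are hypothetical and not part of Velocity. Only the `ResultSet` getters are real JDBC calls.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical option naming which ResultSet getter the loader should call. */
enum TemplateColumnAccess {
    /** Character columns (CLOB, VARCHAR): the driver decodes the characters. */
    CHARACTER_STREAM {
        @Override
        Reader open(ResultSet rs, String column, Charset encoding) throws SQLException {
            return rs.getCharacterStream(column);
        }
    },
    /** Binary columns (BLOB): raw bytes decoded with the caller-supplied encoding. */
    BINARY_STREAM {
        @Override
        Reader open(ResultSet rs, String column, Charset encoding) throws SQLException {
            InputStream in = rs.getBinaryStream(column);
            return in == null ? null : new InputStreamReader(in, encoding);
        }
    };

    /** Opens the template column as a Reader; returns null for SQL NULL. */
    abstract Reader open(ResultSet rs, String column, Charset encoding) throws SQLException;
}
```

With such an option the encoding only matters for the binary path, which is where a custom loader would otherwise have no way to build a proper reader.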




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-06-04 Thread Claude Brisson (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036236#comment-16036236
 ] 

Claude Brisson commented on VELOCITY-880:
-

Michael, for the record, do you want to share your comments about the code, so 
that we can move forward?




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-05-29 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028263#comment-16028263
 ] 

Michael Osipov commented on VELOCITY-880:
-

James, please provide a minimal/isolated test project rather than a full-blown 
modified working copy.




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-05-22 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020154#comment-16020154
 ] 

Michael Osipov commented on VELOCITY-880:
-

Can you assemble a small sample project for me? I'd like to try this at work 
with 11.2.0.4.0. This entire DBResourceLoader thing seems to be pure foobar. I 
will retry with MySQL too.




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-05-18 Thread James R Doyle (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016857#comment-16016857
 ] 

James R Doyle commented on VELOCITY-880:


Further attempts to get getBinaryStream() to work as the solution yield even 
more intrigue. Most interesting is that the HSQLDB
and Oracle drivers have quite different semantics.  If we were to want to 
support BLOB columns, then I think we would be OK.
However, the getCharacterStream() approach is working with both VARCHAR and 
CLOB as well as between Oracle and HSQLDB.
See further below for sample output.

Your original ask: Extracting the raw bytes from getBinaryStream() using Oracle 
shows that the UTF-8 code for Euro symbol IS present.  Have a look and observe 
that 0x20ac is there. However, the conversion problem is going to be another 
bug inside ResourceLoader::buildReader

546865204575726f2043757272656e63792053796d626f6c20ac20697320612074776f2d62797465205554462d38206368617261637465722e00
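
For comparison with a dump like the one above, a small JDK-only sketch that prints the UTF-8 bytes of the Euro sign (code point U+20AC, encoded as the three bytes e2 82 ac) and shows how a lone 0xac byte decodes under UTF-8. The class name `EuroBytesDemo` and helper `toHex` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class EuroBytesDemo {
    /** Formats bytes as lowercase hex, two digits per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Decodes raw bytes as UTF-8; malformed input becomes U+FFFD. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+20AC (Euro sign).
        System.out.println(toHex("\u20AC".getBytes(StandardCharsets.UTF_8)));
        // A lone 0xAC byte is not valid UTF-8, so the decoder substitutes U+FFFD.
        System.out.println(decodeUtf8(new byte[] {(byte) 0xAC}).equals("\uFFFD"));
    }
}
```

Checking a raw dump for the byte sequence e282ac, rather than for the code point digits, is one way to tell whether the stored bytes are UTF-8.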


HSQLDB, VARCHAR, getBinaryStream  : Fault due to JDBC driver not supporting 
getBinaryStream()

incompatible data type in conversion
java.sql.SQLSyntaxErrorException: incompatible data type in conversion
  at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
  .
  at org.apache.velocity.runtime.resource.loader.DataSourceResourceLoader.getResourceReader

Oracle12, VARCHAR, getBinaryStream:  Test failure due to charset coercion 
problem. 
==
 org.junit.ComparisonFailure: Unicode test failed.
 Expected :The Euro Currency Symbol € is a two-byte UTF-8 character.
 Actual   :The Euro Currency Symbol � is a two-byte UTF-8 character.


How would you like to proceed? I believe we should take the changes for CLOB, 
because that is the primary use case; VARCHAR should also work as a requirement. 
The JavaDoc changes for CLOB should also be made, because that is how we expect 
people to learn and try to use this resource loader, and what databases 
really support this 'TEXT' column type anyway?   Should we move to ApacheDB 
as the reference database for embedded unit tests while at it?  This resource 
loader does not work, and I'm sure people are either abandoning the approach 
altogether (which is sad), or building workarounds like I did if they are able 
to. 




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-05-18 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016452#comment-16016452
 ] 

Michael Osipov commented on VELOCITY-880:
-

Have you tried to write out the raw bytes returned by {{getBinaryStream()}}? 
The result looks awkward to me.




[jira] [Commented] (VELOCITY-880) DataSourceResourceLoader corrupts UTF-8 encoded characters in template

2017-05-17 Thread James R Doyle (JIRA)

[ 
https://issues.apache.org/jira/browse/VELOCITY-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014900#comment-16014900
 ] 

James R Doyle commented on VELOCITY-880:


As suggested above, I tried:
  InputStream rawStream = rs.getBinaryStream(templateColumn);
All tests pass except the testUnicode test - which results in mangled output
  org.junit.ComparisonFailure: Unicode test failed.  
  Expected :The Euro Currency Symbol € is a two-byte UTF-8 character.
  Actual   :The Euro Currency Symbol � is a two-byte UTF-8 character.

getCharacterStream() is the safest, and it works with both VARCHAR and CLOB 
columns. 
The reason I use IOUtils to convert the Reader to an InputStream is that the 
base class method expects InputStream:
return buildReader(rawStream, encoding);
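
For readers without Commons IO at hand, a JDK-only sketch of the same Reader-to-InputStream conversion. The class `ReaderToStream` and its method are hypothetical names; the sketch buffers the whole template in memory, which is assumed to be acceptable for template-sized data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringWriter;
import java.nio.charset.Charset;

final class ReaderToStream {
    /** Drains the reader, then re-encodes its characters with the given charset. */
    static InputStream toInputStream(Reader reader, Charset encoding) throws IOException {
        StringWriter buffer = new StringWriter();
        char[] chunk = new char[4096];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        // The stream carries the bytes of the requested encoding, so a consumer
        // that decodes with the same encoding sees the original characters.
        return new ByteArrayInputStream(buffer.toString().getBytes(encoding));
    }
}
```

The round trip is only lossless when the consumer decodes with the same encoding that was used here.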

Given what we now know about #getBinaryStream(), the scope of this bug now 
includes the 1.7.x branch.  Again, the broken unit test fails to reveal this.
I'd love to fix this problem for you, I've been using a workaround (custom 
Resource Loader) for a few years. Being a good citizen sharing the problem.
Database hosted Velocity Templates are in production on two large clients of 
mine - one being a prestigious University and another a US Federal agency.
I'd love to help fix this bug in the main release pool.   How shall I proceed?
