[
https://issues.apache.org/jira/browse/DERBY-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032532#comment-13032532
]
Knut Anders Hatlen commented on DERBY-5068:
-------------------------------------------
Thanks for looking at the patch, Dag. I'm still learning the API myself. :)
You're probably right that we should handle those conditions. I'm not sure how
unmappable-character errors can happen with UTF-8, but malformed-input errors
seem to be raised for characters in the range \uD800 to \uDFFF.
We have two alternatives:
1) Make the CharsetEncoder replace problematic characters with '?' instead of
reporting an error. (By calling onMalformedInput() and onUnmappableCharacter()
with CodingErrorAction.REPLACE.)
2) Detect and report the conditions. (By checking the CoderResult and raising
an exception.)
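
For illustration, the two alternatives could be sketched roughly as below. This is not the code from the attached patches; the class and method names (Utf8EncodeSketch, encodeReplacing, encodeStrict) are hypothetical, and StandardCharsets assumes Java 7 or later. Only the java.nio.charset calls themselves are real API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8EncodeSketch {

    // Option 1: let the encoder substitute its replacement bytes
    // (by default '?') for anything it cannot encode.
    static byte[] encodeReplacing(String s) throws CharacterCodingException {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        ByteBuffer bb = enc.encode(CharBuffer.wrap(s));
        byte[] out = new byte[bb.remaining()];
        bb.get(out);
        return out;
    }

    // Option 2: keep the default REPORT action, inspect the CoderResult
    // and turn an error result into an exception.
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        int capacity = (int) Math.ceil(s.length() * (double) enc.maxBytesPerChar());
        ByteBuffer bb = ByteBuffer.allocate(capacity);

        CoderResult result = enc.encode(CharBuffer.wrap(s), bb, true);
        if (result.isError()) {
            // Throws MalformedInputException or UnmappableCharacterException.
            result.throwException();
        }
        result = enc.flush(bb);
        if (result.isError()) {
            result.throwException();
        }

        bb.flip();
        byte[] out = new byte[bb.remaining()];
        bb.get(out);
        return out;
    }
}
```

With an unpaired surrogate such as "a\uD800b", encodeReplacing would return the bytes for "a?b", whereas encodeStrict would throw a CharacterCodingException.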
Option 2 sounds like the right thing to do. However, the original code used
String.getBytes(String) to do the encoding, which implements option 1 (the API
javadoc says that it's unspecified what it does when it cannot encode the
string, but its actual behaviour matches option 1). Also, we still have the
convertFromJavaString(String,Agent) method which matches option 1.
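
As a small check of the String.getBytes(String) behaviour described above, something like the following could be used. The class and method names (GetBytesSketch, legacyEncode) are made up for this example and are not part of Derby; the observed replacement with '?' is what current JDKs do in practice, not something the javadoc guarantees.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, for illustration only.
final class GetBytesSketch {

    // Encodes the way the original code did, via String.getBytes(String).
    // The javadoc leaves the behaviour unspecified for strings that cannot
    // be encoded; in practice the offending char is replaced.
    static byte[] legacyEncode(String s) throws UnsupportedEncodingException {
        return s.getBytes("UTF-8");
    }
}
```

Calling legacyEncode("a\uD800b") yields three bytes, with 0x3F ('?') in place of the unpaired surrogate, and no exception is raised.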
On the other hand, all the encoding methods in EbcdicCcsidManager do raise an
exception if the string contains characters not in the EBCDIC range, so there's
no clear precedent. I guess no matter what we choose to do, we should make all
these methods consistent. I think my preference would be option 2.
> Investigate increased CPU usage on client after introduction of UTF-8
> CcsidManager
> ----------------------------------------------------------------------------------
>
> Key: DERBY-5068
> URL: https://issues.apache.org/jira/browse/DERBY-5068
> Project: Derby
> Issue Type: Task
> Affects Versions: 10.7.1.1
> Reporter: Knut Anders Hatlen
> Attachments: d5068-1a.diff, d5068-2a.diff, d5068-2a.stat
>
>
> While looking at the performance graphs for the single-record select test
> during the last year -
> http://home.online.no/~olmsan/derby/perf/select_1y.html - I noticed that
> there was a significant increase (10-20%) in CPU usage per transaction on the
> client early in October 2010. To be precise, the increase seems to have
> happened between revision 1004381 and revision 1004794. In that period, there
> were three commits: two related to DERBY-4757, and one related to DERBY-4825
> (tests only).
> We should try to find out what's causing the increased CPU usage and see if
> there's some way to reduce it.