[ https://issues.apache.org/jira/browse/CAMEL-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496159#comment-13496159 ]
Francois Kritzinger commented on CAMEL-5718:
--------------------------------------------
I have had a look and I am of the opinion that leaving 8-bit bodies as byte[]
is the only solution.
It is not possible to store a byte[] inside a String (using e.g.
new String(byte[])) and read those bytes out again later (using e.g.
String.getBytes()) without involving the system's default charset. And as patch
[1] shows, this round trip fails on at least one of my system's supported
charsets (Big5, in this case). In a nutshell, storing 8-bit data inside a
String is just plain wrong.
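A minimal sketch of the round-trip problem, not taken from either patch. The
class and method names (RoundTripDemo, roundTrip, preservesAllBytes) are made
up for illustration, and explicit standard charsets stand in for whatever the
system default happens to be on a given machine (Big5 on mine):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of camel-smpp or of the attached patches.
class RoundTripDemo {

    // Decode the bytes into a String, then encode that String back to bytes
    // with the same charset. This mimics new String(byte[]) followed by
    // String.getBytes(), with the charset made explicit.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // True only if every possible byte value survives the String round trip.
    static boolean preservesAllBytes(Charset charset) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return Arrays.equals(all, roundTrip(all, charset));
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8,
            StandardCharsets.ISO_8859_1
        };
        for (Charset cs : charsets) {
            System.out.println(cs + " preserves all bytes: " + preservesAllBytes(cs));
        }
    }
}
```

US-ASCII and UTF-8 do not preserve all 256 byte values, while ISO-8859-1
happens to. Since the default charset differs from machine to machine, code
that relies on the no-argument calls cannot count on getting its bytes back.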
I don't think consistency (String vs. byte[] bodies) is an issue after all
because users will still be able to get a String version of the body by calling
SmppMessage.getBody(String.class) (although this involves a charset conversion
that will alter the data on some systems). I think it's actually quite
intuitive to retrieve 8-bit data using getBody(byte[].class) or
(byte[]) getBody().
I have attached two patches. *Note that these patches are not compatible*:
patch [1] exists solely to show how I reproduced the CI failures on my machine.
The real fix and its unit tests are in patch [2].
[1] _ci_failures_reproduced.diff_: For informational purposes only; modifies
the tests that were causing the CI machine to fail so that they fail on my
machine. (They should now fail on any machine which supports a reasonable set
of character encodings.)
[2] _ci_failures_fixed_and_tested.diff_: The real fix and its unit tests.
> Bodies of SMs with 8-bit data_coding are mangled
> ------------------------------------------------
>
> Key: CAMEL-5718
> URL: https://issues.apache.org/jira/browse/CAMEL-5718
> Project: Camel
> Issue Type: Bug
> Components: camel-smpp
> Reporter: Francois Kritzinger
> Assignee: Christian Müller
> Fix For: 2.9.5, 2.10.3, 2.11.0
>
> Attachments: 8bit_deliver_sm_bodies_mangled.diff,
> camel_smpp_8bit_messages.diff, ci_failures_fixed_and_tested.diff,
> ci_failures_reproduced.diff
>
>
> Bytes in the body of 8-bit SUBMIT_SMs which do not fall within the chosen
> charset's range are set to '?', which is obviously wrong because 8-bit/binary
> data should not be modified in any way.
> EDIT: Turns out the RX SMs (DELIVER_SM, etc.) were also affected.