[
https://issues.apache.org/jira/browse/COMPRESS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698727#action_12698727
]
Sebb commented on COMPRESS-63:
------------------------------
For example:
final byte[] expected = ArArchiveEntry.HEADER.getBytes();
and
final byte[] expected = ArArchiveEntry.TRAILER.getBytes();
both depend on the default encoding.
For "magic" strings - such as HEADER and TRAILER - I think we can assume that
ASCII is OK to use.
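A minimal sketch of what an explicit encoding for the magic strings could look like. The class name, the helper asciiBytes and the literal values are assumptions made for illustration (they are not copied from ArArchiveEntry), and StandardCharsets needs Java 7 or later; on older JDKs getBytes("US-ASCII") would be the rough equivalent.

```java
import java.nio.charset.StandardCharsets;

class MagicBytes {
    // Hypothetical stand-ins for ArArchiveEntry.HEADER / TRAILER; values are assumed.
    static final String HEADER = "!<arch>\n";
    static final String TRAILER = "`\n";

    // Encode with an explicit charset so the result does not vary with
    // the platform default encoding.
    static byte[] asciiBytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] expected = asciiBytes(HEADER);
        System.out.println("header length in bytes: " + expected.length);
    }
}
```

Since every character in these strings is plain ASCII, the explicit charset yields one byte per character regardless of the JVM's default encoding.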
If there are any other conversions to/from String, then the correct encoding may
depend on the archive type, or indeed on the archive itself if the format allows
different encodings.
These need to be fixed and documented.
Note that the Turkish locale in particular has some unexpected features,
e.g. the upper case of "i" is a special character ("İ", U+0130), which is not the same as "I".
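The Turkish behaviour can be reproduced with a small sketch like the following. The class and method names are hypothetical; the point is only that case conversion, like getBytes(), picks up a platform default unless a Locale is passed explicitly.

```java
import java.util.Locale;

class TurkishCase {
    // Locale-independent upper-casing: "i" becomes "I".
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    // Upper-casing under Turkish rules: "i" becomes dotted capital I (U+0130).
    static String upperTurkish(String s) {
        return s.toUpperCase(Locale.forLanguageTag("tr-TR"));
    }

    public static void main(String[] args) {
        System.out.println(upperRoot("i").equals("I"));
        System.out.println(upperTurkish("i").equals("\u0130"));
    }
}
```

Code that compares or normalises header fields with toUpperCase() or toLowerCase() and no Locale argument could therefore behave differently on a machine whose default locale is Turkish.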
==
As to repeated encoding of the same strings - byte arrays are tricky to protect
against malicious/accidental changes, so it may be best to ignore the overhead
of the repeated conversions for now.
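For comparison, here is a rough sketch of what caching the encoded bytes safely might involve. The class name, field and header value are assumptions for illustration only. Because a Java array is always mutable, the cached copy has to stay private and callers receive a clone, which gives back part of the saving.

```java
import java.nio.charset.StandardCharsets;

class CachedMagic {
    // Encoded once; kept private because array contents cannot be made read-only.
    private static final byte[] HEADER_BYTES =
            "!<arch>\n".getBytes(StandardCharsets.US_ASCII); // assumed value

    // Hand out a defensive copy so callers cannot alter the cached bytes.
    static byte[] headerBytes() {
        return HEADER_BYTES.clone();
    }

    public static void main(String[] args) {
        byte[] copy = headerBytes();
        copy[0] = 0; // modifies only the caller's copy
        System.out.println(headerBytes()[0] == '!');
    }
}
```

Whether the clone is cheaper than re-encoding a short string each time is not obvious without measuring, which is consistent with leaving the repeated conversions alone for now.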
> String#getBytes() is platform dependent
> ---------------------------------------
>
> Key: COMPRESS-63
> URL: https://issues.apache.org/jira/browse/COMPRESS-63
> Project: Commons Compress
> Issue Type: Bug
> Reporter: Sebb
>
> Many methods use the getBytes() method on Strings, however getBytes() uses
> the platform default encoding, which may not be suitable.
> It's also a bit inefficient to keep encoding the same strings.