ppkarwasz commented on code in PR #710:
URL: https://github.com/apache/commons-compress/pull/710#discussion_r2378378137
##########
src/main/java/org/apache/commons/compress/archivers/sevenz/SevenZFile.java:
##########
@@ -1353,7 +1365,10 @@ private void readFilesInfo(final ByteBuffer header, final Archive archive) throw
         for (int i = 0; i < namesLength; i += 2) {
             if (names[i] == 0 && names[i + 1] == 0) {
                 computeIfAbsent(fileMap, nextFile);
-                fileMap.get(nextFile).setName(new String(names, nextName, i - nextName, UTF_16LE));
+                // Entry name length in UTF-16LE characters (not bytes)
+                final int entryNameLength = ArchiveUtils.checkEntryNameLength((i - nextName) / 2, maxEntryNameLength, "7z");
+                fileMap.get(nextFile).setName(new String(names, nextName, 2 * entryNameLength, UTF_16LE));
Review Comment:
I documented `maxEntryNameLength` as limiting the path length in **bytes**,
as encoded with the character set provided through `setCharset()`.
In this case, however, the 7z format mandates **UTF-16LE** for file names,
so the limit is enforced in *characters* (UTF-16 code units) rather than
raw bytes.
This prevents the effective maximum length from being cut in half: UTF-16LE
uses 2 bytes per code unit, whereas most other archive formats rely on
encodings where ASCII fits in 1 byte. Applying the documented byte limit
directly could therefore surprise users, while recalculating the exact byte
size with `getCharset()` would add unnecessary overhead.
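
To make the difference concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: `EntryNameLimitSketch`, `checkLength`, and `readNames` are hypothetical names used only for illustration, and `checkLength` is only assumed to behave like the check in the diff (return the length when it is within the limit, throw otherwise). The sketch splits a buffer of NUL-terminated UTF-16LE names and applies the limit to the number of 2-byte code units rather than to the byte count.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Commons Compress implementation.
final class EntryNameLimitSketch {

    // Assumed behavior: returns the length if it is within the limit, otherwise throws.
    static int checkLength(final int length, final int maxLength, final String format) {
        if (length > maxLength) {
            throw new IllegalArgumentException(
                    format + " entry name has " + length + " code units, limit is " + maxLength);
        }
        return length;
    }

    // Splits a buffer of NUL-terminated UTF-16LE names.
    // The limit is applied per name, counted in UTF-16 code units (2 bytes each).
    static List<String> readNames(final byte[] names, final int maxNameLength) {
        final List<String> result = new ArrayList<>();
        int nextName = 0;
        for (int i = 0; i + 1 < names.length; i += 2) {
            if (names[i] == 0 && names[i + 1] == 0) {
                // (i - nextName) is the name's size in bytes; halve it to get code units.
                final int units = checkLength((i - nextName) / 2, maxNameLength, "7z");
                result.add(new String(names, nextName, 2 * units, StandardCharsets.UTF_16LE));
                nextName = i + 2;
            }
        }
        return result;
    }
}
```

With this sketch, a buffer holding the names `abc` and `de` passes a limit of 3, even though `abc` occupies 6 bytes in UTF-16LE; checking the raw byte count against the same limit would reject it.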