[ https://issues.apache.org/jira/browse/COMPRESS-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16301746#comment-16301746 ]
Stefan Bodewig commented on COMPRESS-429:
-----------------------------------------

Thanks [~dalbani] and sorry for the delay. I will find time to look into your patch in the coming week (but not during Christmas :-).

> Expose whether ZIP entry name & comment come from Unicode extra field
> ---------------------------------------------------------------------
>
>                 Key: COMPRESS-429
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-429
>             Project: Commons Compress
>          Issue Type: Improvement
>            Reporter: Damiano Albani
>            Priority: Minor
>              Labels: Unicode, ZIP
>
> It is a known fact that detecting the encoding of the name/comment of ZIP
> entries is a messy process, and that general purpose bit 11 is often
> unreliable.
> Only the so-called Unicode extra field (if present) can be trusted to
> reliably determine a ZIP entry name & comment, as far as I understand.
> But the current API of Commons Compress doesn't (easily) expose which
> situation the ZIP archive reader is in.
> That's why I propose to add a couple of new getter/setter-exposed fields to
> {{ZipArchiveEntry}}, e.g.:
> {noformat}
> boolean hasUnicodeName
> boolean hasUnicodeComment
> {noformat}
> This way it can be easily determined whether the value returned by
> {{ZipArchiveEntry::getName}} or {{ZipArchiveEntry::getComment}} can be
> trusted, or whether it needs some "character encoding sniffing" of sorts.
> What do you think?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
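
For illustration only, the check requested in the issue can be approximated by scanning an entry's raw extra-field bytes for the Info-ZIP Unicode Path (header ID 0x7075) and Unicode Comment (header ID 0x6375) fields. This is a minimal sketch, not the patch under discussion and not the Commons Compress API: the class name `UnicodeExtraFieldProbe` and its method are hypothetical, and it assumes the extra data is laid out as a sequence of little-endian (header ID, data size, data) records.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Commons Compress.
public class UnicodeExtraFieldProbe {
    // Info-ZIP Unicode Path extra field header ID ("up").
    public static final int UNICODE_PATH_ID = 0x7075;
    // Info-ZIP Unicode Comment extra field header ID ("uc").
    public static final int UNICODE_COMMENT_ID = 0x6375;

    // Returns true if the raw extra data contains a record with the given header ID.
    public static boolean hasExtraField(byte[] extra, int headerId) {
        if (extra == null) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(extra).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.remaining() >= 4) {
            int id = buf.getShort() & 0xFFFF;    // 2-byte header ID
            int size = buf.getShort() & 0xFFFF;  // 2-byte data size
            if (size > buf.remaining()) {
                return false;                    // truncated record: stop scanning
            }
            if (id == headerId) {
                return true;
            }
            buf.position(buf.position() + size); // skip this record's data
        }
        return false;
    }
}
```

With the JDK's own `java.util.zip.ZipEntry`, this could be called as `UnicodeExtraFieldProbe.hasExtraField(entry.getExtra(), UnicodeExtraFieldProbe.UNICODE_PATH_ID)`. Presence of the field alone is a weaker signal than what the issue asks for, since a reader may still ignore the field if it does not match the name stored in the header.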