DamnedElric commented on code in PR #378:
URL: https://github.com/apache/commons-compress/pull/378#discussion_r1408469557


##########
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java:
##########
@@ -368,9 +369,14 @@ public static void closeQuietly(final ZipFile zipFile) {
     private final List<ZipArchiveEntry> entries = new LinkedList<>();
 
     /**
-     * Maps String to list of ZipArchiveEntrys, name -> actual entries.
+     * Maps a string to the first entry named it.
      */
-    private final Map<String, LinkedList<ZipArchiveEntry>> nameMap = new 
HashMap<>(HASH_SIZE);
+    private final Map<String, ZipArchiveEntry> nameMap = new 
HashMap<>(HASH_SIZE);
+
+    /**
+     * If multiple entries have the same name, maps the name to entries named 
it.

Review Comment:
   More nitpicking: "If multiple entries have the same name, maps the name to 
entries by that name."



##########
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java:
##########
@@ -368,9 +369,14 @@ public static void closeQuietly(final ZipFile zipFile) {
     private final List<ZipArchiveEntry> entries = new LinkedList<>();
 
     /**
-     * Maps String to list of ZipArchiveEntrys, name -> actual entries.
+     * Maps a string to the first entry named it.

Review Comment:
   Nitpicking: "Maps a string to the first entry by that name" sounds clearer 
to me.



##########
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java:
##########
@@ -820,7 +839,13 @@ public Enumeration<ZipArchiveEntry> getEntries() {
      * @since 1.6
      */
     public Iterable<ZipArchiveEntry> getEntries(final String name) {
-        return nameMap.getOrDefault(name, ZipArchiveEntry.EMPTY_LINKED_LIST);

Review Comment:
   You mentioned a potential 30% slowdown in this method in return for 
additional safety, which may or may not be a fair tradeoff. But I'm a bit 
concerned that this change potentially breaks the current behaviour. As you 
say, this currently returns a modifiable LinkedList, which some people may 
(though I hope not) depend on? Because it requires an explicit cast, I'm not 
too worried.



##########
src/main/java/org/apache/commons/compress/archivers/zip/ZipFile.java:
##########
@@ -749,8 +755,21 @@ private void fillNameMap() {
             // entries are filled in populateFromCentralDirectory and
             // never modified
             final String name = ze.getName();
-            final LinkedList<ZipArchiveEntry> entriesOfThatName = 
nameMap.computeIfAbsent(name, k -> new LinkedList<>());
-            entriesOfThatName.addLast(ze);
+            final ZipArchiveEntry firstEntry = nameMap.putIfAbsent(name, ze);
+
+            if (firstEntry != null) {
+                if (duplicateNameMap == null) {
+                    duplicateNameMap = new HashMap<>();
+                }
+
+                final List<ZipArchiveEntry> entriesOfThatName = 
duplicateNameMap.computeIfAbsent(name, k -> {
+                    final ArrayList<ZipArchiveEntry> list = new ArrayList<>(2);

Review Comment:
   Small remark: `final List<ZipArchiveEntry> list = new ArrayList<>(2);` would 
have my preference here (List on the left hand side as opposed to ArrayList).
   
   As Gary already indicated, the magic number 2 is non-obvious. A simple 
comment (or constant) would make this clearer.
   The default initial capacity of ArrayList is 10, and I agree that you're 
unlikely to encounter many zips with duplicates, let alone 10-fold duplicates.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to