frankgh commented on code in PR #106:
URL: https://github.com/apache/cassandra-sidecar/pull/106#discussion_r1518878637


##########
src/main/java/org/apache/cassandra/sidecar/restore/RestoreJobUtil.java:
##########
@@ -34,23 +34,38 @@
 import org.slf4j.LoggerFactory;
 
 import com.datastax.driver.core.utils.UUIDs;
-import net.jpountz.xxhash.StreamingXXHash32;
-import net.jpountz.xxhash.XXHashFactory;
+import com.google.inject.Inject;
+import com.google.inject.Singleton;
+import com.google.inject.name.Named;
 import org.apache.cassandra.sidecar.exceptions.RestoreJobException;
 import org.apache.cassandra.sidecar.exceptions.RestoreJobFatalException;
+import org.apache.cassandra.sidecar.utils.DigestAlgorithm;
+import org.apache.cassandra.sidecar.utils.DigestAlgorithmProvider;
 
 /**
- * Utilities that only makes sense in the context of restore jobs. Avoid using it in the other scenarios.
+ * Utilities that only makes sense in the context of restore jobs.
+ *
+ * Note: Avoid using it in the other scenarios.
+ *
  */
+@Singleton
 public class RestoreJobUtil
 {
     private static final Logger LOGGER = LoggerFactory.getLogger(RestoreJobUtil.class);
-    private RestoreJobUtil() {}
     private static final int KB_512 = 512 * 1024;
     // it is part of upload id and get validated by
     // org.apache.cassandra.sidecar.utils.SSTableUploadsPathBuilder.UPLOAD_ID_PATTERN
     private static final String RESTORE_JOB_PREFIX = "c0ffee-";
     private static final int RESTORE_JOB_PREFIX_LEN = RESTORE_JOB_PREFIX.length();
+    private static final int RESTORE_JOB_DEFAULT_HASH_SEED = 0;
+
+    private DigestAlgorithmProvider digestAlgorithmProvider;
+
+    @Inject
+    public RestoreJobUtil(@Named("xxhash32") DigestAlgorithmProvider digestAlgorithmProvider)

Review Comment:
   Something to consider as a future improvement: I think the manifest file
should carry metadata that includes the hashing algorithm, and we should pick
the correct implementation based on that metadata.
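
A rough sketch of that idea, not a proposal for the actual API. Everything here is hypothetical: the class name, the notion of an algorithm name read from the manifest, and the use of JDK digests (`CRC32`, `MD5`) as stand-ins for whatever implementations the sidecar would register, such as XXHash32.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.zip.CRC32;

// Hypothetical sketch: none of these names exist in cassandra-sidecar.
public class ManifestDigestSelector
{
    // Maps the algorithm name recorded in the manifest to a digest function.
    // JDK digests are used as stand-ins so the example is self-contained.
    private static final Map<String, Function<byte[], String>> DIGESTS = Map.of(
        "crc32", ManifestDigestSelector::crc32Hex,
        "md5", ManifestDigestSelector::md5Hex);

    // Picks the implementation named by the manifest and hashes the content.
    public static String digest(String algorithmFromManifest, byte[] content)
    {
        Function<byte[], String> fn = DIGESTS.get(algorithmFromManifest.toLowerCase(Locale.ROOT));
        if (fn == null)
        {
            throw new IllegalArgumentException("Unsupported digest algorithm: " + algorithmFromManifest);
        }
        return fn.apply(content);
    }

    static String crc32Hex(byte[] content)
    {
        CRC32 crc = new CRC32();
        crc.update(content);
        return Long.toHexString(crc.getValue());
    }

    static String md5Hex(byte[] content)
    {
        try
        {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content))
            {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        }
        catch (NoSuchAlgorithmException e)
        {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

With something like this, an unknown algorithm name in the manifest fails fast instead of silently falling back to a default hash.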



##########
src/main/java/org/apache/cassandra/sidecar/restore/RestoreJobUtil.java:
##########
@@ -211,4 +167,60 @@ public static void cleanDirectory(Path path) throws IOException
             });
         }
     }
+
+    /**
+     * Create a file that is protected from zip slip attack (https://security.snyk.io/research/zip-slip-vulnerability)
+     * @param zipEntry zip entry to be extracted
+     * @param targetDir directory to keep the unzipped files
+     * @return a new file
+     * @throws IOException failed to resolving path
+     * @throws RestoreJobException if the zip file is malicious
+     */
+    private static File newProtectedTargetFile(ZipEntry zipEntry, File targetDir)
+    throws IOException, RestoreJobException
+    {
+        File targetFile = new File(targetDir, zipEntry.getName());
+
+        // Normalize the paths of both target dir and file
+        String targetDirPath = targetDir.getCanonicalPath();
+        String targetFilePath = targetFile.getCanonicalPath();
+
+        if (!targetFilePath.startsWith(targetDirPath))
+        {
+            throw new RestoreJobException("Bad zip entry: " + zipEntry.getName());
+        }
+
+        return targetFile;
+    }
+
+    /**
+     * @param file the file to use to perform the checksum
+     * @return the checksum hex string of the file's content. XXHash32 is employed as the hash algorithm.
+     */
+    public String checksum(File file) throws IOException
+    {
+        return checksum(file, RESTORE_JOB_DEFAULT_HASH_SEED);
+    }
+
+    /**
+     * @param file the file to use to perform the checksum
+     * @param seed the seed to use for the hasher
+     * @return the checksum hex string of the file's content. XXHash32 is employed as the hash algorithm.
+     */
+    public String checksum(File file, int seed) throws IOException
+    {
+        try (FileInputStream fis = new FileInputStream(file))
+        {
+            try (DigestAlgorithm digestAlgorithm = digestAlgorithmProvider.get(seed))

Review Comment:
   NIT: 
   ```suggestion
           try (InputStream fis = Files.newInputStream(file.toPath());
                DigestAlgorithm digestAlgorithm = digestAlgorithmProvider.get(seed))
   ```
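
For reference, a self-contained sketch of the single try-with-resources form with two resources. `Crc32Digest` is a hypothetical stand-in for the PR's `DigestAlgorithm` (the real one wraps XXHash32), and the class and method names are made up for illustration. Note that `Files.newInputStream` returns `InputStream`, so the variable is declared with that type.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical sketch: not the sidecar's RestoreJobUtil.
public class StreamingChecksum
{
    private static final int KB_512 = 512 * 1024;

    // Stand-in for DigestAlgorithm: a closeable digest backed by the JDK's CRC32.
    static final class Crc32Digest implements AutoCloseable
    {
        private final CRC32 crc = new CRC32();

        void update(byte[] buffer, int offset, int length)
        {
            crc.update(buffer, offset, length);
        }

        String hex()
        {
            return Long.toHexString(crc.getValue());
        }

        @Override
        public void close()
        {
            // Nothing to release for CRC32; a native-backed hasher would free resources here.
        }
    }

    public static String checksum(Path file) throws IOException
    {
        // Both resources are closed in reverse declaration order, even on failure.
        try (InputStream in = Files.newInputStream(file);
             Crc32Digest digest = new Crc32Digest())
        {
            byte[] buffer = new byte[KB_512];
            int read;
            while ((read = in.read(buffer)) != -1)
            {
                digest.update(buffer, 0, read);
            }
            return digest.hex();
        }
    }
}
```

Declaring both resources in one `try` removes a level of nesting without changing when either resource is closed.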



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

