github-advanced-security[bot] commented on code in PR #18544:
URL: https://github.com/apache/druid/pull/18544#discussion_r2361303351


##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentPuller.java:
##########
@@ -112,16 +99,37 @@
         );
         log.info("Loaded %d bytes from [%s] to [%s]", result.size(), s3Coords.toString(), outDir.getAbsolutePath());
         return result;
-      }
-      if (CompressionUtils.isGz(s3Coords.getPath())) {
+      } else if (CompressionUtils.isGz(s3Coords.getPath())) {
+        if (!isObjectInBucket(s3Coords)) {
+          throw new SegmentLoadingException("IndexFile[%s] does not exist.", s3Coords);
+        }
+        final URI uri = s3Coords.toUri(S3StorageDruidModule.SCHEME);
+        final ByteSource byteSource = getByteSource(uri);
         final String fname = Files.getNameWithoutExtension(uri.getPath());
         final File outFile = new File(outDir, fname);
-
         final FileUtils.FileCopyResult result = CompressionUtils.gunzip(byteSource, outFile, S3Utils.S3RETRY);
         log.info("Loaded %d bytes from [%s] to [%s]", result.size(), s3Coords.toString(), outFile.getAbsolutePath());
         return result;
+      } else if (s3Coords.getPath().endsWith("/")) {
+        // segment is not compressed, list objects and pull them all
+        final ListObjectsV2Result list = s3Client.listObjectsV2(
+            new ListObjectsV2Request().withBucketName(s3Coords.getBucket())
+                                      .withPrefix(s3Coords.getPath())
+        );
+        FileUtils.FileCopyResult copyResult = new FileUtils.FileCopyResult();
+        for (S3ObjectSummary objectSummary : list.getObjectSummaries()) {
+          final CloudObjectLocation objectLocation = S3Utils.summaryToCloudObjectLocation(objectSummary);
+          final URI uri = objectLocation.toUri(S3StorageDruidModule.SCHEME);
+          final ByteSource byteSource = getByteSource(uri);
+          final File outFile = new File(outDir, Paths.get(objectLocation.getPath()).getFileName().toString());
+          outFile.createNewFile();

Review Comment:
   ## Uncontrolled data used in path expression
   
   This path depends on a [user-provided value](1).
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10342)
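   A possible mitigation (a sketch only, not code from this PR; the helper name `resolveUnder` is hypothetical): canonicalize the candidate output file and verify it is still under `outDir` before writing, so an S3 key containing `..` segments cannot escape the segment directory.

   ```java
   import java.io.File;
   import java.io.IOException;

   public class SafePathResolver
   {
     // Resolves `name` under `outDir`, rejecting any result that escapes it.
     // Hypothetical helper; names are illustrative, not from the Druid codebase.
     public static File resolveUnder(File outDir, String name) throws IOException
     {
       final File candidate = new File(outDir, name);
       // getCanonicalPath() collapses "." and ".." without requiring the file to exist
       final String base = outDir.getCanonicalPath() + File.separator;
       if (!candidate.getCanonicalPath().startsWith(base)) {
         throw new IOException("Entry [" + name + "] escapes output directory");
       }
       return candidate;
     }
   }
   ```

   In the loop above, the flagged `new File(outDir, ...)` call would go through such a check before `createNewFile` or the copy runs.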



##########
extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentPullerTest.java:
##########
@@ -315,4 +316,74 @@
     EasyMock.verify(s3Client);
     Assert.assertEquals(0, modifiedDate);
   }
+
+  @Test
+  public void testGetNozip() throws IOException, SegmentLoadingException
+  {
+    final String bucket = "bucket";
+    final String keyPrefix = "prefix/dir/0/";
+    final ServerSideEncryptingAmazonS3 s3Client = EasyMock.createStrictMock(ServerSideEncryptingAmazonS3.class);
+    final byte[] value = bucket.getBytes(StandardCharsets.UTF_8);
+
+    final File tmpFile = temporaryFolder.newFile("meta.smoosh");
+    final File tmpFile2 = temporaryFolder.newFile("00000.smoosh");
+
+    try (OutputStream outputStream = new FileOutputStream(tmpFile)) {
+      outputStream.write(value);
+    }
+    try (OutputStream outputStream = new FileOutputStream(tmpFile2)) {
+      outputStream.write(value);
+    }
+
+    ListObjectsV2Result list = EasyMock.createMock(ListObjectsV2Result.class);
+    S3ObjectSummary objectSummary1 = EasyMock.createMock(S3ObjectSummary.class);
+    EasyMock.expect(objectSummary1.getBucketName()).andReturn(bucket).anyTimes();
+    EasyMock.expect(objectSummary1.getKey()).andReturn(keyPrefix + "meta.smoosh").anyTimes();
+    S3ObjectSummary objectSummary2 = EasyMock.createMock(S3ObjectSummary.class);
+    EasyMock.expect(objectSummary2.getBucketName()).andReturn(bucket).anyTimes();
+    EasyMock.expect(objectSummary2.getKey()).andReturn(keyPrefix + "00000.smoosh").anyTimes();
+    EasyMock.expect(list.getObjectSummaries()).andReturn(List.of(objectSummary1, objectSummary2)).once();
+    EasyMock.expect(s3Client.listObjectsV2(EasyMock.anyObject())).andReturn(list).once();
+
+    final S3Object object1 = new S3Object();
+    object1.setBucketName(bucket);
+    object1.setKey(keyPrefix + "meta.smoosh");
+    object1.getObjectMetadata().setLastModified(new Date(0));
+    object1.setObjectContent(new FileInputStream(tmpFile));
+
+    final S3Object object2 = new S3Object();
+    object2.setBucketName(bucket);
+    object2.setKey(keyPrefix + "00000.smoosh");
+    object2.getObjectMetadata().setLastModified(new Date(0));
+    object2.setObjectContent(new FileInputStream(tmpFile));

Review Comment:
   ## Potential input resource leak
   
   This FileInputStream is not always closed on method exit.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10345)
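   The usual remedy (illustrative code, not from the PR): hold each `FileInputStream` in a try-with-resources so it is closed on every exit path, including a mock-setup or assertion failure partway through the test. The pattern is shown below with a minimal, self-contained reader; in the test itself, the stream passed to `S3Object.setObjectContent` would be opened inside such a block, or closed in a `finally`/`@After`.

   ```java
   import java.io.File;
   import java.io.FileInputStream;
   import java.io.FileOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.OutputStream;
   import java.nio.charset.StandardCharsets;

   public class StreamScopeDemo
   {
     // try-with-resources guarantees close() even if the body throws,
     // which is exactly what the scanner says the test does not guarantee.
     public static int countBytes(File file) throws IOException
     {
       int total = 0;
       try (InputStream in = new FileInputStream(file)) {
         final byte[] buf = new byte[4096];
         for (int n = in.read(buf); n != -1; n = in.read(buf)) {
           total += n;
         }
       }
       return total;
     }

     public static void main(String[] args) throws IOException
     {
       final File f = File.createTempFile("bucket", ".smoosh");
       f.deleteOnExit();
       try (OutputStream out = new FileOutputStream(f)) {
         out.write("bucket".getBytes(StandardCharsets.UTF_8));
       }
       System.out.println(countBytes(f)); // "bucket" is 6 bytes
     }
   }
   ```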



##########
extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentKillerTest.java:
##########
@@ -86,6 +90,30 @@
       1
   );
 
+  private static final DataSegment DATA_SEGMENT_1_NO_ZIP = new DataSegment(
+      "test",
+      Intervals.of("2015-04-12/2015-04-13"),
+      "1",
+      ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_1 + "/"),
+      null,
+      null,
+      NoneShardSpec.instance(),
+      0,
+      1
+  );

Review Comment:
   ## Deprecated method or constructor invocation
   
   Invoking [DataSegment.DataSegment](1) should be avoided because it has been deprecated.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10346)
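   The usual replacement is `DataSegment.builder()`. A sketch of what the flagged constant might become, assuming builder setters named `dataSource`, `interval`, `version`, `loadSpec`, `shardSpec`, `binaryVersion`, and `size` (these names are from memory and should be checked against `DataSegment.Builder` in this source tree, not taken as verified against the PR branch):

   ```java
   // Sketch only: assumes DataSegment.builder() and these setter names exist as-is.
   private static final DataSegment DATA_SEGMENT_1_NO_ZIP =
       DataSegment.builder()
                  .dataSource("test")
                  .interval(Intervals.of("2015-04-12/2015-04-13"))
                  .version("1")
                  .loadSpec(ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_1 + "/"))
                  .shardSpec(NoneShardSpec.instance())
                  .binaryVersion(0)
                  .size(1)
                  .build();
   ```

   The same change would apply to `DATA_SEGMENT_2_NO_ZIP` below.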



##########
extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentKillerTest.java:
##########
@@ -86,6 +90,30 @@
       1
   );
 
+  private static final DataSegment DATA_SEGMENT_1_NO_ZIP = new DataSegment(
+      "test",
+      Intervals.of("2015-04-12/2015-04-13"),
+      "1",
+      ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_1 + "/"),
+      null,
+      null,
+      NoneShardSpec.instance(),
+      0,
+      1
+  );
+
+  private static final DataSegment DATA_SEGMENT_2_NO_ZIP = new DataSegment(
+      "test",
+      Intervals.of("2015-04-13/2015-04-14"),
+      "1",
+      ImmutableMap.of("bucket", TEST_BUCKET, "key", KEY_2 + "/"),
+      null,
+      null,
+      NoneShardSpec.instance(),
+      0,
+      1
+  );

Review Comment:
   ## Deprecated method or constructor invocation
   
   Invoking [DataSegment.DataSegment](1) should be avoided because it has been deprecated.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10347)



##########
extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentPuller.java:
##########
@@ -112,16 +99,37 @@
         );
         log.info("Loaded %d bytes from [%s] to [%s]", result.size(), s3Coords.toString(), outDir.getAbsolutePath());
         return result;
-      }
-      if (CompressionUtils.isGz(s3Coords.getPath())) {
+      } else if (CompressionUtils.isGz(s3Coords.getPath())) {
+        if (!isObjectInBucket(s3Coords)) {
+          throw new SegmentLoadingException("IndexFile[%s] does not exist.", s3Coords);
+        }
+        final URI uri = s3Coords.toUri(S3StorageDruidModule.SCHEME);
+        final ByteSource byteSource = getByteSource(uri);
         final String fname = Files.getNameWithoutExtension(uri.getPath());
         final File outFile = new File(outDir, fname);
-
         final FileUtils.FileCopyResult result = CompressionUtils.gunzip(byteSource, outFile, S3Utils.S3RETRY);
         log.info("Loaded %d bytes from [%s] to [%s]", result.size(), s3Coords.toString(), outFile.getAbsolutePath());
         return result;
+      } else if (s3Coords.getPath().endsWith("/")) {
+        // segment is not compressed, list objects and pull them all
+        final ListObjectsV2Result list = s3Client.listObjectsV2(
+            new ListObjectsV2Request().withBucketName(s3Coords.getBucket())
+                                      .withPrefix(s3Coords.getPath())
+        );
+        FileUtils.FileCopyResult copyResult = new FileUtils.FileCopyResult();
+        for (S3ObjectSummary objectSummary : list.getObjectSummaries()) {
+          final CloudObjectLocation objectLocation = S3Utils.summaryToCloudObjectLocation(objectSummary);
+          final URI uri = objectLocation.toUri(S3StorageDruidModule.SCHEME);
+          final ByteSource byteSource = getByteSource(uri);
+          final File outFile = new File(outDir, Paths.get(objectLocation.getPath()).getFileName().toString());
+          outFile.createNewFile();

Review Comment:
   ## Ignored error status of call
   
   Method getSegmentFiles ignores exceptional return value of 
File.createNewFile.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10343)
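   `createNewFile` reports failure through its boolean return (it returns `false` when the file already exists and throws only on I/O errors), so the result should be checked, or the call dropped entirely if the subsequent copy creates the file anyway. A self-contained sketch of the checked version (hypothetical helper, not code from the PR):

   ```java
   import java.io.File;
   import java.io.IOException;

   public class CreateFileChecked
   {
     // Creates the file, surfacing both I/O errors and the "already exists"
     // case that createNewFile reports only through its return value.
     public static File createFresh(File outDir, String name) throws IOException
     {
       final File outFile = new File(outDir, name);
       if (!outFile.createNewFile()) {
         throw new IOException("File [" + outFile + "] already exists");
       }
       return outFile;
     }
   }
   ```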



##########
extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentPullerTest.java:
##########
@@ -315,4 +316,74 @@
     EasyMock.verify(s3Client);
     Assert.assertEquals(0, modifiedDate);
   }
+
+  @Test
+  public void testGetNozip() throws IOException, SegmentLoadingException
+  {
+    final String bucket = "bucket";
+    final String keyPrefix = "prefix/dir/0/";
+    final ServerSideEncryptingAmazonS3 s3Client = EasyMock.createStrictMock(ServerSideEncryptingAmazonS3.class);
+    final byte[] value = bucket.getBytes(StandardCharsets.UTF_8);
+
+    final File tmpFile = temporaryFolder.newFile("meta.smoosh");
+    final File tmpFile2 = temporaryFolder.newFile("00000.smoosh");
+
+    try (OutputStream outputStream = new FileOutputStream(tmpFile)) {
+      outputStream.write(value);
+    }
+    try (OutputStream outputStream = new FileOutputStream(tmpFile2)) {
+      outputStream.write(value);
+    }
+
+    ListObjectsV2Result list = EasyMock.createMock(ListObjectsV2Result.class);
+    S3ObjectSummary objectSummary1 = EasyMock.createMock(S3ObjectSummary.class);
+    EasyMock.expect(objectSummary1.getBucketName()).andReturn(bucket).anyTimes();
+    EasyMock.expect(objectSummary1.getKey()).andReturn(keyPrefix + "meta.smoosh").anyTimes();
+    S3ObjectSummary objectSummary2 = EasyMock.createMock(S3ObjectSummary.class);
+    EasyMock.expect(objectSummary2.getBucketName()).andReturn(bucket).anyTimes();
+    EasyMock.expect(objectSummary2.getKey()).andReturn(keyPrefix + "00000.smoosh").anyTimes();
+    EasyMock.expect(list.getObjectSummaries()).andReturn(List.of(objectSummary1, objectSummary2)).once();
+    EasyMock.expect(s3Client.listObjectsV2(EasyMock.anyObject())).andReturn(list).once();
+
+    final S3Object object1 = new S3Object();
+    object1.setBucketName(bucket);
+    object1.setKey(keyPrefix + "meta.smoosh");
+    object1.getObjectMetadata().setLastModified(new Date(0));
+    object1.setObjectContent(new FileInputStream(tmpFile));

Review Comment:
   ## Potential input resource leak
   
   This FileInputStream is not always closed on method exit.
   
   [Show more details](https://github.com/apache/druid/security/code-scanning/10344)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

