[
https://issues.apache.org/jira/browse/BEAM-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Moritz Mack updated BEAM-11788:
-------------------------------
Resolution: Invalid
Status: Resolved (was: Open)
> TextIO.read() will ignore custom AwsServiceEndpoint
> ---------------------------------------------------
>
> Key: BEAM-11788
> URL: https://issues.apache.org/jira/browse/BEAM-11788
> Project: Beam
> Issue Type: Bug
> Components: io-java-aws
> Affects Versions: 2.27.0
> Reporter: Reza Safi
> Priority: P3
>
> For testing purposes we make use of S3Mock to mimic S3 when working with
> Beam, doing something similar to the S3FileSystem unit tests in the Apache
> Beam code base: first set up S3Mock, then create a test:
> {code:java}
> import com.amazonaws.auth.AWSStaticCredentialsProvider;
> import com.amazonaws.auth.AnonymousAWSCredentials;
> import com.amazonaws.client.builder.AwsClientBuilder;
> import com.amazonaws.services.s3.AmazonS3;
> import com.amazonaws.services.s3.AmazonS3ClientBuilder;
> import akka.http.scaladsl.Http;
> import com.google.common.io.Files;
> import io.findify.s3mock.S3Mock;
> import java.io.BufferedWriter;
> import java.io.File;
> import java.io.FileWriter;
> import java.io.IOException;
> import java.util.List;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.TextIO;
> import org.apache.beam.sdk.io.aws.options.S3Options;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.testing.PAssert;
> import org.apache.beam.sdk.values.PCollection;
> import org.junit.AfterClass;
> import org.junit.BeforeClass;
> import org.junit.Test;
>
> public class S3ReadTransformSuite {
>
>   private static S3Mock api;
>   private static AmazonS3 client;
>   private static S3Options opt;
>
>   @BeforeClass
>   public static void beforeClass() {
>     // Start an in-memory S3Mock server on a random local port.
>     api = new S3Mock.Builder().withInMemoryBackend().build();
>     Http.ServerBinding binding = api.start();
>     AwsClientBuilder.EndpointConfiguration endpoint =
>         new AwsClientBuilder.EndpointConfiguration(
>             "http://localhost:" + binding.localAddress().getPort(), "us-west-2");
>     client =
>         AmazonS3ClientBuilder.standard()
>             .withPathStyleAccessEnabled(true)
>             .withEndpointConfiguration(endpoint)
>             .withCredentials(
>                 new AWSStaticCredentialsProvider(new AnonymousAWSCredentials()))
>             .build();
>     opt = PipelineOptionsFactory.as(S3Options.class);
>     // ********** setting the custom endpoint for opt
>     opt.setAwsServiceEndpoint(endpoint.getServiceEndpoint());
>   }
>
>   @AfterClass
>   public static void afterClass() {
>     api.stop();
>   }
>
>   @Test
>   public void testS3ReadTransform() throws IOException {
>     // Create a text file locally (Files.createTempDir() is Guava's).
>     File tempDir = Files.createTempDir();
>     tempDir.deleteOnExit();
>     File output = new File(tempDir, "output.txt");
>     BufferedWriter bw = new BufferedWriter(new FileWriter(output));
>     bw.write("something");
>     bw.close();
>     // Create a bucket and upload the local file there (the File itself, not its path string).
>     client.createBucket("mytest");
>     client.putObject("mytest", "output.txt", output);
>     // Check that the bucket and the file in it exist.
>     List<com.amazonaws.services.s3.model.Bucket> l = client.listBuckets();
>     assert (l.get(0).getName().equals("mytest"));
>     assert (client.getObject("mytest", "output.txt").getKey().equals("output.txt"));
>     // Create the pipeline using the options with the custom endpoint set earlier.
>     Pipeline p = Pipeline.create(opt);
>     // The following read fails with a FileNotFoundException.
>     PCollection<String> sr = p.apply(TextIO.read().from("s3://mytest/output.txt"));
>     PAssert.that(sr).containsInAnyOrder("something");
>     p.run();
>   }
> }
> {code}
> The above test always fails with an error like:
> {code:java}
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.FileNotFoundException: No files matched spec: s3://mytest/output.txt
>     at test.mycomp.S3ReadTransformSuite.testS3ReadTransform(S3ReadTransformSuite.java:125)
> Caused by: java.io.FileNotFoundException: No files matched spec: s3://mytest/output.txt
> {code}
> (It should be mentioned that for this test we added "127.0.0.1
> mytest.localhost" to "/etc/hosts" to work around another known issue; see the
> entry below.)
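> Presumably that workaround is needed because virtual-hosted-style requests
> resolve the bucket name as a subdomain of the endpoint host (our reading, not
> confirmed); the entry we added is:
> {code}
> # /etc/hosts -- lets requests for the "mytest" bucket resolve locally
> 127.0.0.1   mytest.localhost
> {code}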
> It seems that TextIO.read().from() ignores the endpoint that is set in the
> pipeline options: it effectively tries to read from the real S3 service, and
> as a result it fails.
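> As a diagnostic, one thing to try (a sketch of ours, not a confirmed fix; it
> assumes the file match goes through Beam's FileSystems registry) is to
> register the options explicitly and run the match outside the pipeline, to
> see whether the custom endpoint is honored:
> {code:java}
> // (drop into testS3ReadTransform(); needs org.apache.beam.sdk.io.FileSystems
> // and org.apache.beam.sdk.io.fs.MatchResult)
> FileSystems.setDefaultPipelineOptions(opt);  // register the S3Options carrying the custom endpoint
> MatchResult result = FileSystems.match("s3://mytest/output.txt");
> System.out.println(result.status());  // Status.OK would mean the mock endpoint was used for matching
> {code}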
> This issue restricts our ability to write unit tests for custom things such
> as new PTransforms, since we need to mock S3 for those tests.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)