[
https://issues.apache.org/jira/browse/BEAM-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Moritz Mack updated BEAM-11788:
-------------------------------
Resolution: Invalid
Status: Resolved (was: Open)
> TextIO.read() will ignore custom AwsServiceEndpoint
> ---------------------------------------------------
>
> Key: BEAM-11788
> URL: https://issues.apache.org/jira/browse/BEAM-11788
> Project: Beam
> Issue Type: Bug
> Components: io-java-aws
> Affects Versions: 2.27.0
> Reporter: Reza Safi
> Priority: P3
>
> For testing purposes we make use of S3Mock to mimic S3 when working with
> Beam, doing something similar to the S3FileSystem unit tests in the Apache
> Beam code base: first set up S3Mock, then create a test:
> {code:java}
> import com.amazonaws.auth.AWSStaticCredentialsProvider;
> import com.amazonaws.auth.AnonymousAWSCredentials;
> import com.amazonaws.client.builder.AwsClientBuilder;
> import com.amazonaws.services.s3.AmazonS3;
> import com.amazonaws.services.s3.AmazonS3ClientBuilder;
> import akka.http.scaladsl.Http;
> import com.google.common.io.Files;
> import io.findify.s3mock.S3Mock;
> import java.io.BufferedWriter;
> import java.io.File;
> import java.io.FileWriter;
> import java.io.IOException;
> import java.util.List;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.TextIO;
> import org.apache.beam.sdk.io.aws.options.S3Options;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.testing.PAssert;
> import org.apache.beam.sdk.values.PCollection;
> import org.junit.AfterClass;
> import org.junit.BeforeClass;
> import org.junit.Test;
>
> public class S3ReadTransformSuite {
>
>   private static S3Mock api;
>   private static AmazonS3 client;
>   private static S3Options opt;
>
>   @BeforeClass
>   public static void beforeClass() {
>     // Start an in-memory S3Mock server on a random local port.
>     api = new S3Mock.Builder().withInMemoryBackend().build();
>     Http.ServerBinding binding = api.start();
>     AwsClientBuilder.EndpointConfiguration endpoint =
>         new AwsClientBuilder.EndpointConfiguration(
>             "http://localhost:" + binding.localAddress().getPort(), "us-west-2");
>     client =
>         AmazonS3ClientBuilder.standard()
>             .withPathStyleAccessEnabled(true)
>             .withEndpointConfiguration(endpoint)
>             .withCredentials(
>                 new AWSStaticCredentialsProvider(new AnonymousAWSCredentials()))
>             .build();
>     opt = PipelineOptionsFactory.as(S3Options.class);
>     // ********** setting the custom endpoint for opt
>     opt.setAwsServiceEndpoint(endpoint.getServiceEndpoint());
>   }
>
>   @AfterClass
>   public static void afterClass() {
>     api.stop();
>   }
>
>   @Test
>   public void testS3ReadTransform() throws IOException {
>     // Create a text file locally (Files.createTempDir() is Guava's).
>     File tempDir = Files.createTempDir();
>     tempDir.deleteOnExit();
>     File output = new File(tempDir, "output.txt");
>     BufferedWriter bw = new BufferedWriter(new FileWriter(output));
>     bw.write("something");
>     bw.close();
>     // Create a bucket and upload the local file there (the File itself, not its path string).
>     client.createBucket("mytest");
>     client.putObject("mytest", "output.txt", output);
>     // Check that the bucket and the file in it exist.
>     List<com.amazonaws.services.s3.model.Bucket> l = client.listBuckets();
>     assert (l.get(0).getName().equals("mytest"));
>     assert (client.getObject("mytest", "output.txt").getKey().equals("output.txt"));
>     // Create the pipeline using the options with the custom endpoint set earlier.
>     Pipeline p = Pipeline.create(opt);
>     // The following read fails with a FileNotFoundException.
>     PCollection<String> sr = p.apply(TextIO.read().from("s3://mytest/output.txt"));
>     PAssert.that(sr).containsInAnyOrder("something");
>     p.run();
>   }
> }
> {code}
> The above test always fails with an error like:
> {code:java}
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.FileNotFoundException: No files matched spec: s3://mytest/output.txt
>     at test.mycomp.S3ReadTransformSuite.testS3ReadTransform(S3ReadTransformSuite.java:125)
> Caused by: java.io.FileNotFoundException: No files matched spec: s3://mytest/output.txt
> {code}
> (It should be mentioned that for this test we added "127.0.0.1
> mytest.localhost" to "/etc/hosts" to work around another known issue; see the
> entry below.)
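> Presumably that workaround is needed because virtual-hosted-style requests
> resolve the bucket name as a subdomain of the endpoint host (our reading, not
> confirmed); the entry we added is:
> {code}
> # /etc/hosts -- lets requests for the "mytest" bucket resolve locally
> 127.0.0.1   mytest.localhost
> {code}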
> It seems that TextIO.read().from() ignores the endpoint that is set in the
> pipeline options: it effectively tries to read from the real S3 service, and
> as a result it fails.
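> As a diagnostic, one thing to try (a sketch of ours, not a confirmed fix; it
> assumes the file match goes through Beam's FileSystems registry) is to
> register the options explicitly and run the match outside the pipeline, to
> see whether the custom endpoint is honored:
> {code:java}
> // (drop into testS3ReadTransform(); needs org.apache.beam.sdk.io.FileSystems
> // and org.apache.beam.sdk.io.fs.MatchResult)
> FileSystems.setDefaultPipelineOptions(opt);  // register the S3Options carrying the custom endpoint
> MatchResult result = FileSystems.match("s3://mytest/output.txt");
> System.out.println(result.status());  // Status.OK would mean the mock endpoint was used for matching
> {code}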
> This issue restricts our ability to write unit tests for custom things such
> as new PTransforms, since we need to mock S3 for those tests.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)