[ https://issues.apache.org/jira/browse/BEAM-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916053#comment-16916053 ]
Harshit Dwivedi commented on BEAM-8089:
---------------------------------------

The data we ingest into GCS is around 250 GB per day, so we are incurring significant network charges. I wanted to avoid this charge by storing everything on the Dataflow persistent disk (PD) instead of GCS.

> Error while using customGcsTempLocation() with Dataflow
> -------------------------------------------------------
>
>                 Key: BEAM-8089
>                 URL: https://issues.apache.org/jira/browse/BEAM-8089
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.13.0
>            Reporter: Harshit Dwivedi
>            Assignee: Chamikara Jayalath
>            Priority: Major
>
> I have the following code snippet, which writes content to BigQuery via file loads.
> Currently the files are written to a GCS bucket, but I want to write them to Dataflow's local file storage instead and have BigQuery load the data from there.
>
> {code:java}
> BigQueryIO
>     .writeTableRows()
>     .withNumFileShards(100)
>     .withTriggeringFrequency(Duration.standardSeconds(90))
>     .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>     .withSchema(getSchema())
>     .withoutValidation()
>     .withCustomGcsTempLocation(new ValueProvider<String>() {
>         @Override
>         public String get() {
>             return "/home/harshit/testFiles";
>         }
>
>         @Override
>         public boolean isAccessible() {
>             return true;
>         }
>     })
>     .withTimePartitioning(new TimePartitioning().setType("DAY"))
>     .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>     .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>     .to(tableName);
> {code}
>
> On running this, I don't see any files being written to the provided path, and the BQ load jobs fail with an IOException.
> I looked at the docs but was unable to find a working example for this.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
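For context, a likely reason the load jobs fail: with `FILE_LOADS`, Beam stages files at the temp location and then asks the BigQuery service to load them, so the location must be a `gs://` URI that BigQuery can read; a worker's local persistent-disk path is not visible to the BigQuery service. The sketch below is a hypothetical, self-contained illustration of that constraint (the class and helper name are made up for this example, not Beam internals):

```java
public class TempLocationCheck {

    // Hypothetical helper: BigQuery load jobs can only read staged files
    // from GCS, so a usable custom temp location must be a gs:// URI.
    // A worker-local path such as /home/harshit/testFiles cannot work.
    static boolean isUsableTempLocation(String path) {
        return path != null && path.startsWith("gs://");
    }

    public static void main(String[] args) {
        System.out.println(isUsableTempLocation("gs://my-bucket/bq-temp"));   // true
        System.out.println(isUsableTempLocation("/home/harshit/testFiles"));  // false
    }
}
```

When a GCS path is used, the anonymous `ValueProvider` in the snippet can also be replaced with the stock `ValueProvider.StaticValueProvider.of("gs://my-bucket/bq-temp")` from `org.apache.beam.sdk.options`.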