Fallon created BEAM-2761:

             Summary: Write to empty BigQuery partition fails with "No schema specified on job or table." despite having provided schema
                 Key: BEAM-2761
                 URL: https://issues.apache.org/jira/browse/BEAM-2761
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.1.0, 2.2.0
            Reporter: Fallon
            Assignee: Thomas Groh
            Priority: Minor

In 2.1.0-SNAPSHOT and 2.2.0-SNAPSHOT, jobs writing an empty PCollection to a 
BigQuery partition fail with "java.lang.RuntimeException: Failed to create load 
job with id prefix". This is associated with a message "No schema specified on 
job or table" even though a schema is provided.

Command to run job:
mvn compile exec:java \
     -Dexec.mainClass=org.apache.beam.examples.EmptyPCollection \
     -Dexec.args="--runner=DataflowRunner --project=<GCP project> \
                  --gcpTempLocation=<tmp location>"

Code to reproduce the problem:
public class EmptyPCollection {

  public static void main(String[] args) {

    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    options.setTempLocation("<your tmp location>");
    Pipeline pipeline = Pipeline.create(options);

    String schema = "{\"fields\": [{\"name\": \"pet\", \"type\": \"string\", "
        + "\"mode\": \"required\"}]}";
    String table = "mydataset.pets";
    List<String> pets = Arrays.asList("Dog", "Cat", "Goldfish");
    PCollection<String> inputText = 
    PCollection<TableRow> rows = inputText.apply(ParDo.of(new DoFn<String, TableRow>() {
      @ProcessElement
      public void processElement(ProcessContext c) {
        String text = c.element();
        if (text.startsWith("X")) {  // change to (D)og and works fine
          TableRow row = new TableRow();
          row.set("pet", text);
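
To see why the reproduction yields an empty PCollection, the prefix check can be looked at on its own. The following is a minimal plain-Java sketch with no Beam dependency; the class EmptyFilterSketch and the helper keepWithPrefix are hypothetical names used only for illustration and are not part of the reported code. It assumes the DoFn emits a row only when the startsWith check passes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class EmptyFilterSketch {

  // Hypothetical helper: applies the same startsWith check as the DoFn in the report.
  static List<String> keepWithPrefix(List<String> pets, String prefix) {
    return pets.stream()
        .filter(text -> text.startsWith(prefix))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> pets = Arrays.asList("Dog", "Cat", "Goldfish");
    // No pet starts with "X", so nothing would be emitted (the failing case).
    System.out.println(keepWithPrefix(pets, "X")); // prints []
    // "Dog" starts with "D", so one element would be emitted (the working case).
    System.out.println(keepWithPrefix(pets, "D")); // prints [Dog]
  }
}
```

With the "X" prefix the input to the BigQuery write would be empty, which is the condition under which the failure is reported.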
