[jira] [Commented] (BEAM-4486) BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing schema exception

2022-06-03 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17546972#comment-17546972
 ] 

Kenneth Knowles commented on BEAM-4486:
---

This issue has been migrated to https://github.com/apache/beam/issues/18778

> BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing 
> schema exception
> --
>
> Key: BEAM-4486
> URL: https://issues.apache.org/jira/browse/BEAM-4486
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Glenn Ammons
>Priority: P3
>
> Our pipeline gets this error from BigQuery when using 
> BigQueryIO.Write.Method.FILE_LOADS, 
> BigQueryIO.Write.CreateDisposition.CREATE_NEVER, and field-based time 
> partitioning (full exception at the bottom of this note):
>     Table with field based partitioning must have a schema.
> We do supply a schema when we create the pipeline by calling 
> BigQuery.Write.withSchema, but this schema is ignored because the 
> processElement method here:
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java]
> always provides a null schema when using CREATE_NEVER.
> I would expect Beam to use the provided schema no matter what setting we are 
> using for the CreateDisposition.
>  
> Full exception:
> java.io.IOException: Unable to insert job: 
> 078646f70a664daaa1ed96832b233036_19e873cd24cf1968559515e49b3d868d_1_0-0,
>  aborting after 9 . 
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:236)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  Caused by: 
> com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad 
> Request \{ "code" : 400, "errors" : [ { "domain" : "global", "message" : 
> "Table with field based partitioning must have a schema.", "reason" : 
> "invalid" } ], "message" : "Table with field based partitioning must have a 
> schema." } 
> com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
>  com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065) 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:218)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn$DoFnInvoker.invokeProcessElement(Unknown
>  Source) 
> org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
>  
> org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
>  
> com.google.cloud.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
>  
> com.google.cloud.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:300)
>  
> com.google.cloud.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:226)
>  
> com.google.cloud.dataflow.worker.util.common.

[jira] [Commented] (BEAM-4486) BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing schema exception

2020-06-16 Thread Beam JIRA Bot (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137759#comment-17137759
 ] 

Beam JIRA Bot commented on BEAM-4486:
-

This issue was marked "stale-P2" and has not received a public comment in 14 
days. It is now automatically moved to P3. If you are still affected by it, you 
can comment and move it back to P2.

> BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing 
> schema exception
> --
>
> Key: BEAM-4486
> URL: https://issues.apache.org/jira/browse/BEAM-4486
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Glenn Ammons
>Priority: P3
>
> Our pipeline gets this error from BigQuery when using 
> BigQueryIO.Write.Method.FILE_LOADS, 
> BigQueryIO.Write.CreateDisposition.CREATE_NEVER, and field-based time 
> partitioning (full exception at the bottom of this note):
>     Table with field based partitioning must have a schema.
> We do supply a schema when we create the pipeline by calling 
> BigQuery.Write.withSchema, but this schema is ignored because the 
> processElement method here:
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java]
> always provides a null schema when using CREATE_NEVER.
> I would expect Beam to use the provided schema no matter what setting we are 
> using for the CreateDisposition.
>  
> Full exception:
> java.io.IOException: Unable to insert job: 
> 078646f70a664daaa1ed96832b233036_19e873cd24cf1968559515e49b3d868d_1_0-0,
>  aborting after 9 . 
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:236)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  Caused by: 
> com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad 
> Request \{ "code" : 400, "errors" : [ { "domain" : "global", "message" : 
> "Table with field based partitioning must have a schema.", "reason" : 
> "invalid" } ], "message" : "Table with field based partitioning must have a 
> schema." } 
> com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
>  com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065) 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:218)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn$DoFnInvoker.invokeProcessElement(Unknown
>  Source) 
> org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
>  
> org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
>  
> com.google.cloud.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
>  
> com.google.cloud.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:300)
>  
> com.google.cloud.da

[jira] [Commented] (BEAM-4486) BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing schema exception

2020-06-01 Thread Beam JIRA Bot (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123322#comment-17123322
 ] 

Beam JIRA Bot commented on BEAM-4486:
-

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing 
> schema exception
> --
>
> Key: BEAM-4486
> URL: https://issues.apache.org/jira/browse/BEAM-4486
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Glenn Ammons
>Priority: P2
>  Labels: stale-P2
>
> Our pipeline gets this error from BigQuery when using 
> BigQueryIO.Write.Method.FILE_LOADS, 
> BigQueryIO.Write.CreateDisposition.CREATE_NEVER, and field-based time 
> partitioning (full exception at the bottom of this note):
>     Table with field based partitioning must have a schema.
> We do supply a schema when we create the pipeline by calling 
> BigQuery.Write.withSchema, but this schema is ignored because the 
> processElement method here:
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java]
> always provides a null schema when using CREATE_NEVER.
> I would expect Beam to use the provided schema no matter what setting we are 
> using for the CreateDisposition.
>  
> Full exception:
> java.io.IOException: Unable to insert job: 
> 078646f70a664daaa1ed96832b233036_19e873cd24cf1968559515e49b3d868d_1_0-0,
>  aborting after 9 . 
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:236)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  Caused by: 
> com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad 
> Request \{ "code" : 400, "errors" : [ { "domain" : "global", "message" : 
> "Table with field based partitioning must have a schema.", "reason" : 
> "invalid" } ], "message" : "Table with field based partitioning must have a 
> schema." } 
> com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
>  
> com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
>  com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065) 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
>  
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:218)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>  
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>  org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259) 
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>  
> org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn$DoFnInvoker.invokeProcessElement(Unknown
>  Source) 
> org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
>  
> org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
>  
> com.google.cloud.datafl