[jira] [Commented] (BEAM-4486) BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing schema exception
[ https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17546972#comment-17546972 ]

Kenneth Knowles commented on BEAM-4486:
---------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/18778

> BigQuery: FILE_LOADS + CREATE_NEVER + field-based partitioning => missing schema exception
> ------------------------------------------------------------------------------------------
>
>              Key: BEAM-4486
>              URL: https://issues.apache.org/jira/browse/BEAM-4486
>          Project: Beam
>       Issue Type: Bug
>       Components: io-java-gcp
> Affects Versions: 2.4.0
>         Reporter: Glenn Ammons
>         Priority: P3
>
> Our pipeline gets this error from BigQuery when using BigQueryIO.Write.Method.FILE_LOADS, BigQueryIO.Write.CreateDisposition.CREATE_NEVER, and field-based time partitioning (full exception at the bottom of this note):
>
>     Table with field based partitioning must have a schema.
>
> We do supply a schema when we create the pipeline by calling BigQueryIO.Write.withSchema, but this schema is ignored because the processElement method here:
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java
> always provides a null schema when using CREATE_NEVER. I would expect Beam to use the provided schema no matter what setting we are using for the CreateDisposition.
>
> Full exception:
>
> java.io.IOException: Unable to insert job: 078646f70a664daaa1ed96832b233036_19e873cd24cf1968559515e49b3d868d_1_0-0, aborting after 9 .
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:236)
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
> Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
> { "code" : 400, "errors" : [ { "domain" : "global", "message" : "Table with field based partitioning must have a schema.", "reason" : "invalid" } ], "message" : "Table with field based partitioning must have a schema." }
>     com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
>     com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
>     com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
>     com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
>     com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1065)
>     com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
>     com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
>     com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:218)
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:204)
>     org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startLoadJob(BigQueryServicesImpl.java:144)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables.load(WriteTables.java:259)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables.access$600(WriteTables.java:77)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn.processElement(WriteTables.java:155)
>     org.apache.beam.sdk.io.gcp.bigquery.WriteTables$WriteTablesDoFn$DoFnInvoker.invokeProcessElement(Unknown Source)
>     org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
>     org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
>     com.google.cloud.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
>     com.google.cloud.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:300)
>     com.google.cloud.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:226)
>     com.google.cloud.dataflow.worker.util.common.
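The combination described in the report can be sketched as the following pipeline fragment. This is an illustrative reconstruction, not code from the report: the project, dataset, table, and field names (`my-project:my_dataset.events`, `event_ts`, `payload`) are hypothetical. The schema passed to `withSchema` is what the reporter expects to reach the load job, but with `CREATE_NEVER` the `WriteTables` step passes a null schema instead, so BigQuery rejects the load.

```java
// Hypothetical write configuration matching the report: FILE_LOADS +
// CREATE_NEVER + field-based time partitioning. All names are illustrative.
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.api.services.bigquery.model.TimePartitioning;
import java.util.Arrays;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;

public class FieldPartitioningRepro {
  static BigQueryIO.Write<TableRow> write() {
    // Schema supplied by the user; per the report it is dropped by
    // WriteTables when the create disposition is CREATE_NEVER.
    TableSchema schema = new TableSchema().setFields(Arrays.asList(
        new TableFieldSchema().setName("event_ts").setType("TIMESTAMP"),
        new TableFieldSchema().setName("payload").setType("STRING")));
    return BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.events")  // table must already exist
        .withSchema(schema)
        .withTimePartitioning(
            new TimePartitioning().setType("DAY").setField("event_ts"))
        .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
        .withCreateDisposition(CreateDisposition.CREATE_NEVER);
  }
}
```

Because the destination table already exists under `CREATE_NEVER`, one might expect the load job to need no schema at all; the error shows that BigQuery still requires one whenever field-based partitioning is set on the load configuration.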
[ https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137759#comment-17137759 ]

Beam JIRA Bot commented on BEAM-4486:
-------------------------------------

This issue was marked "stale-P2" and has not received a public comment in 14 days. It is now automatically moved to P3. If you are still affected by it, you can comment and move it back to P2.
[ https://issues.apache.org/jira/browse/BEAM-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123322#comment-17123322 ]

Beam JIRA Bot commented on BEAM-4486:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days, so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3. Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.