Ismaël,

The latest postcommit failure, https://builds.apache.org/job/beam_PostCommit_SQL/3951/, was at 4:15:45 PM. Brian's successful run, https://builds.apache.org/job/beam_PostCommit_SQL_PR/243/, started at 4:29:57 PM, so the failure predates it. I hope the next SQL postcommit will succeed.

On Thu, Feb 6, 2020 at 11:57 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
> This one is still broken, maybe there are two different data sources, one for
> the '_PR' version and the normal one, can you please confirm?
> https://builds.apache.org/job/beam_PostCommit_SQL/
>
> On Thu, Feb 6, 2020 at 5:44 PM Brian Hulette <bhule...@google.com> wrote:
>>
>> Sorry for the delay. I had some issues updating the schema, I ended up
>> having to drop it and re-create for some reason. Looks like SQL PostCommit
>> is green on https://github.com/apache/beam/pull/10765 now.
>>
>> > setting up from scratch is a good idea.
>> +1, I filed https://issues.apache.org/jira/browse/BEAM-9260 for this and
>> assigned it to Kenn for now.
>>
>> > I’m still feeling it’s ok to read an Integer as Long.
>> In this case the issue is that we were reading data with a schema (id:
>> INT64), and then passing it to a PAssert that checked against Rows with a
>> different schema (id: INT32). So I think it's a legitimate error because we
>> were asserting that a PCollection contained rows with one schema but it
>> actually had a different schema. The message should definitely be better in
>> this case though.
>>
>> Brian
>>
>> On Wed, Feb 5, 2020 at 9:28 PM Tomo Suzuki <suzt...@google.com> wrote:
>>>
>>> (My understanding)
>>> The test ensures the CSV data stored in GCS should be readable through
>>> Datacatalog. It fails because an Integer value in the CSV was read as Long
>>> as per Datacatalog.
>>>
>>> > setting up from scratch is a good idea.
>>>
>>> I agree. Furthermore, it would be nice if it can test different type-cast
>>> behaviors. I’m still feeling it’s ok to read an Integer as Long. (If this
>>> is the case, how about Long to Integer? What if the long is small enough to
>>> fit in 32 bits? and so on)
>>>
>>> On Wed, Feb 5, 2020 at 23:15 Kenneth Knowles <k...@apache.org> wrote:
>>>>
>>>> I think that was me... sorry!
>>>>
>>>> Is this a test where it is important that the data is pre-existing?
>>>> Otherwise I would say that setting up from scratch is a good idea. Does
>>>> anyone have context on it? I am happy to take on the small bit of coding,
>>>> since I broke it.
>>>>
>>>> Kenn
>>>>
>>>> On Wed, Feb 5, 2020 at 1:22 PM Brian Hulette <bhule...@google.com> wrote:
>>>>>
>>>>> So it looks like the schema for `integ_test_small_csv_test_1` was updated
>>>>> yesterday around the same time that PR#10563 went in, and it no longer
>>>>> matches the schema we expect in the test.
>>>>>
>>>>> I'm just going to change it back for now. I am curious who changed it and
>>>>> why, if the perpetrator is on this list please let us know :)
>>>>>
>>>>> Note the updateTime:
>>>>> ```
>>>>> ❯ gcloud beta data-catalog entries lookup '`datacatalog`.`entry`.`apache-beam-testing`.`us-central1`.`samples`.`integ_test_small_csv_test_1`'``
>>>>> gcsFilesetSpec:
>>>>>   filePatterns:
>>>>>   - gs://apache-beam-samples/integration_test_small_csv/test.csv
>>>>> linkedResource: //datacatalog.googleapis.com/projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>>> name: projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>>> schema:
>>>>>   columns:
>>>>>   - column: id
>>>>>     mode: NULLABLE
>>>>>     type: INT64
>>>>>   - column: name
>>>>>     mode: NULLABLE
>>>>>     type: STRING
>>>>>   - column: type
>>>>>     mode: NULLABLE
>>>>>     type: STRING
>>>>> sourceSystemTimestamps:
>>>>>   createTime: '2019-08-16T01:49:06.235Z'
>>>>>   updateTime: '2020-02-04T17:18:17.671Z'
>>>>> type: FILESET
>>>>> ```
>>>
>>> --
>>> Regards,
>>> Tomo

--
Regards,
Tomo
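
To illustrate the schema mismatch Brian describes in the quoted thread (data read as id: INT64, then asserted against rows declared as id: INT32), here is a minimal Python sketch. It does not use Beam's API: `Field`, `Row`, and `describe_mismatch` are hypothetical names and the sample values are made up. It only models the idea that rows with identical values but different declared schemas compare as unequal, and that a clearer failure message could name the differing field.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical schema field: a column name plus a declared type name."""
    name: str
    type: str  # e.g. "INT32", "INT64", "STRING"


@dataclass(frozen=True)
class Row:
    """Hypothetical row: equality covers both the schema and the values."""
    schema: Tuple[Field, ...]
    values: Tuple[object, ...]


def describe_mismatch(expected: Row, actual: Row) -> str:
    """Return a readable reason why two rows differ, or "" if they are equal."""
    if expected == actual:
        return ""
    # Report the first field whose name or declared type differs.
    for want, got in zip(expected.schema, actual.schema):
        if want != got:
            return (f"schema mismatch: expected {want.name}: {want.type}, "
                    f"got {got.name}: {got.type}")
    if len(expected.schema) != len(actual.schema):
        return "schema mismatch: different number of fields"
    return f"value mismatch: expected {expected.values}, got {actual.values}"


# Made-up sample data: same values, but the id column has a different declared type.
expected = Row((Field("id", "INT32"), Field("name", "STRING")), (1, "a"))
actual = Row((Field("id", "INT64"), Field("name", "STRING")), (1, "a"))

print(expected == actual)
print(describe_mismatch(expected, actual))
```

Under this model the comparison fails even though the values match, which is the sense in which the assertion failure was legitimate; whether an INT32 value should be readable as INT64 is a separate type-cast question, as Tomo raised.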