This one is still broken. Maybe there are two different data sources, one for the '_PR' version and one for the normal job. Can you please confirm?
https://builds.apache.org/job/beam_PostCommit_SQL/

On Thu, Feb 6, 2020 at 5:44 PM Brian Hulette <bhule...@google.com> wrote:

> Sorry for the delay. I had some issues updating the schema, I ended up
> having to drop it and re-create for some reason. Looks like SQL PostCommit
> is green on https://github.com/apache/beam/pull/10765 now.
>
> > setting up from scratch is a good idea.
> +1, I filed https://issues.apache.org/jira/browse/BEAM-9260 for this and
> assigned it to Kenn for now.
>
> > I’m still feeling it’s ok to read an Integer as Long.
> In this case the issue is that we were reading data with a schema (id:
> INT64), and then passing it to a PAssert that checked against Rows with a
> different schema (id: INT32). So I think it's a legitimate error because we
> were asserting that a PCollection contained rows with one schema but it
> actually had a different schema. The message should definitely be better in
> this case though.
>
> Brian
>
> On Wed, Feb 5, 2020 at 9:28 PM Tomo Suzuki <suzt...@google.com> wrote:
>
>> (My understanding)
>> The test ensures the CSV data stored in GCS should be readable through
>> Datacatalog. It fails because an Integer value in the CSV was read as Long
>> as per Datacatalog.
>>
>> > setting up from scratch is a good idea.
>>
>> I agree. Furthermore, it would be nice if it can test different type-cast
>> behaviors. I’m still feeling it’s ok to read an Integer as Long. (If this
>> is the case, how about Long to Integer? What if the long is small enough to
>> fit in 32 bits? and so on)
>>
>> On Wed, Feb 5, 2020 at 23:15 Kenneth Knowles <k...@apache.org> wrote:
>>
>>> I think that was me... sorry!
>>>
>>> Is this a test where it is important that the data is pre-existing?
>>> Otherwise I would say that setting up from scratch is a good idea. Does
>>> anyone have context on it? I am happy to take on the small bit of coding,
>>> since I broke it.
>>>
>>> Kenn
>>>
>>> On Wed, Feb 5, 2020 at 1:22 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> So it looks like the schema for `integ_test_small_csv_test_1` was
>>>> updated yesterday around the same time that PR#10563 went in, and it no
>>>> longer matches the schema we expect in the test.
>>>>
>>>> I'm just going to change it back for now. I am curious who changed it
>>>> and why, if the perpetrator is on this list please let us know :)
>>>>
>>>> Note the updateTime:
>>>> ```
>>>> ❯ gcloud beta data-catalog entries lookup '`datacatalog`.`entry`.`apache-beam-testing`.`us-central1`.`samples`.`integ_test_small_csv_test_1`'``
>>>> gcsFilesetSpec:
>>>>   filePatterns:
>>>>   - gs://apache-beam-samples/integration_test_small_csv/test.csv
>>>> linkedResource: //datacatalog.googleapis.com/projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>> name: projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>> schema:
>>>>   columns:
>>>>   - column: id
>>>>     mode: NULLABLE
>>>>     type: INT64
>>>>   - column: name
>>>>     mode: NULLABLE
>>>>     type: STRING
>>>>   - column: type
>>>>     mode: NULLABLE
>>>>     type: STRING
>>>> sourceSystemTimestamps:
>>>>   createTime: '2019-08-16T01:49:06.235Z'
>>>>   updateTime: '2020-02-04T17:18:17.671Z'
>>>> type: FILESET
>>>> ```
>>>>
>>> --
>> Regards,
>> Tomo
>>
>
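
For anyone skimming the thread, the mismatch Brian describes (rows read as id: INT64, asserted against rows declared as id: INT32) can be sketched with a toy model. This is not Beam's Row or PAssert API: the `Field` and `Row` classes, the `fits_in_int32` helper, and the sample values are all hypothetical and only illustrate why rows with equal values can still compare unequal when the schema is part of the comparison.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field: a name plus a type tag."""
    name: str
    type: str  # e.g. "INT32", "INT64", "STRING"


@dataclass(frozen=True)
class Row:
    """Hypothetical row that carries its schema, so equality covers both."""
    schema: Tuple[Field, ...]
    values: Tuple[Any, ...]


def fits_in_int32(value: int) -> bool:
    """True if the value could be narrowed to a signed 32-bit integer."""
    return -2**31 <= value <= 2**31 - 1


# Made-up sample data; only the id field's type differs between the two rows.
expected = Row((Field("id", "INT32"), Field("name", "STRING")), (1, "a"))
actual = Row((Field("id", "INT64"), Field("name", "STRING")), (1, "a"))

print(expected.values == actual.values)  # values alone match
print(expected == actual)                # schema-aware comparison does not
print(fits_in_int32(actual.values[0]))   # narrowing would be lossless here
```

Under these assumptions the comparison fails on the schema, not the data, which is the case Brian calls a legitimate error. Tomo's question about reading a Long as an Integer corresponds to the `fits_in_int32` check: narrowing is only safe when the value is in the signed 32-bit range.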