This one is still broken. Maybe there are two different data sources, one
for the '_PR' version and one for the normal job. Can you please confirm?
https://builds.apache.org/job/beam_PostCommit_SQL/


On Thu, Feb 6, 2020 at 5:44 PM Brian Hulette <bhule...@google.com> wrote:

> Sorry for the delay. I had some issues updating the schema and ended up
> having to drop and re-create it for some reason. Looks like SQL PostCommit
> is green on https://github.com/apache/beam/pull/10765 now.
>
> > setting up from scratch is a good idea.
> +1, I filed https://issues.apache.org/jira/browse/BEAM-9260 for this and
> assigned it to Kenn for now.
>
> > I’m still feeling it’s ok to read an Integer as Long.
> In this case the issue is that we were reading data with a schema (id:
> INT64), and then passing it to a PAssert that checked against Rows with a
> different schema (id: INT32). So I think it's a legitimate error because we
> were asserting that a PCollection contained rows with one schema but it
> actually had a different schema. The message should definitely be better in
> this case though.
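
To make the point above concrete, here is a minimal sketch in plain Python (not Beam code) of why two rows with identical values still compare unequal when their schemas differ, and what a clearer failure message could look like. The `Field` and `Row` classes, the `describe_mismatch` helper, and the sample values are all hypothetical, chosen only for illustration; only the column names and the INT64/INT32 types come from this thread.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field: a column name plus a type name."""
    name: str
    type: str


@dataclass(frozen=True)
class Row:
    """Hypothetical row: equality covers both the schema and the values."""
    schema: Tuple[Field, ...]
    values: Tuple[object, ...]


def describe_mismatch(expected: Row, actual: Row) -> str:
    """Return an empty string if the rows are equal, else a readable reason."""
    if expected == actual:
        return ""
    if expected.schema != actual.schema:
        # Report only the fields that differ (assumes both schemas have the same length).
        diffs = [
            f"{e.name}: expected {e.type}, got {a.type}"
            for e, a in zip(expected.schema, actual.schema)
            if e != a
        ]
        return "schema mismatch (" + "; ".join(diffs) + ")"
    return "value mismatch"


# Schema as read from the catalog entry (id: INT64) versus the schema the test expects (id: INT32).
read_schema = (Field("id", "INT64"), Field("name", "STRING"), Field("type", "STRING"))
expected_schema = (Field("id", "INT32"), Field("name", "STRING"), Field("type", "STRING"))

# Made-up sample values; they are identical in both rows on purpose.
actual = Row(read_schema, (1, "a", "b"))
expected = Row(expected_schema, (1, "a", "b"))

print(describe_mismatch(expected, actual))
```

The values match, so the only difference reported is the type of `id`, which is the kind of message that would have made this failure easier to diagnose.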
>
> Brian
>
> On Wed, Feb 5, 2020 at 9:28 PM Tomo Suzuki <suzt...@google.com> wrote:
>
>> (My understanding)
>> The test ensures that CSV data stored in GCS is readable through
>> Datacatalog. It fails because an Integer value in the CSV was read as a
>> Long, per the Datacatalog schema.
>>
>>
>> > setting up from scratch is a good idea.
>>
>> I agree. Furthermore, it would be nice if it can test different type-cast
>> behaviors. I’m still feeling it’s ok to read an Integer as Long. (If this
>> is the case, how about Long to Integer? What if the long is small enough to
>> fit in 32 bits? and so on)
>>
>> On Wed, Feb 5, 2020 at 23:15 Kenneth Knowles <k...@apache.org> wrote:
>>
>>> I think that was me... sorry!
>>>
>>> Is this a test where it is important that the data is pre-existing?
>>> Otherwise I would say that setting up from scratch is a good idea. Does
>>> anyone have context on it? I am happy to take on the small bit of coding,
>>> since I broke it.
>>>
>>> Kenn
>>>
>>> On Wed, Feb 5, 2020 at 1:22 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> So it looks like the schema for `integ_test_small_csv_test_1` was
>>>> updated yesterday around the same time that PR#10563 went in, and it no
>>>> longer matches the schema we expect in the test.
>>>>
>>>> I'm just going to change it back for now. I am curious who changed it
>>>> and why; if the perpetrator is on this list, please let us know :)
>>>>
>>>>
>>>> Note the updateTime:
>>>> ```
>>>> ❯ gcloud beta data-catalog entries lookup \
>>>>   '`datacatalog`.`entry`.`apache-beam-testing`.`us-central1`.`samples`.`integ_test_small_csv_test_1`'
>>>> gcsFilesetSpec:
>>>>   filePatterns:
>>>>   - gs://apache-beam-samples/integration_test_small_csv/test.csv
>>>> linkedResource: //datacatalog.googleapis.com/projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>> name: projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>> schema:
>>>>   columns:
>>>>   - column: id
>>>>     mode: NULLABLE
>>>>     type: INT64
>>>>   - column: name
>>>>     mode: NULLABLE
>>>>     type: STRING
>>>>   - column: type
>>>>     mode: NULLABLE
>>>>     type: STRING
>>>> sourceSystemTimestamps:
>>>>   createTime: '2019-08-16T01:49:06.235Z'
>>>>   updateTime: '2020-02-04T17:18:17.671Z'
>>>> type: FILESET
>>>> ```
>>>>
>>> --
>> Regards,
>> Tomo
>>
>
