(My understanding)
The test checks that CSV data stored in GCS is readable through
Datacatalog. It fails because an Integer value in the CSV is read as a Long,
following the schema in Datacatalog.


> setting up from scratch is a good idea.

I agree. It would also be nice if the test could cover different type-cast
behaviors. I still feel it's OK to read an Integer as a Long. (If so, what
about Long to Integer? What if the Long is small enough to fit in 32 bits?
And so on.)
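
To make the question concrete, here is a minimal sketch in plain Java of the
two directions. This is not Beam or Datacatalog code; the class and method
names (CastSketch, widen, narrowChecked, fitsInInt) are hypothetical and only
illustrate the language-level behavior I have in mind.

```java
class CastSketch {
    // Widening: every 32-bit int fits in a 64-bit long, so no data is lost.
    public static long widen(int value) {
        return value;
    }

    // Range check for the narrowing direction.
    public static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    // Narrowing: Math.toIntExact throws ArithmeticException on overflow
    // instead of silently truncating the way a plain (int) cast does.
    public static int narrowChecked(long value) {
        return Math.toIntExact(value);
    }

    public static void main(String[] args) {
        System.out.println(widen(42));            // Integer read as Long
        System.out.println(fitsInInt(42L));       // small Long, fits in 32 bits
        System.out.println(fitsInInt(1L << 40));  // too large for 32 bits
        System.out.println(narrowChecked(42L));   // safe narrowing
    }
}
```

Integer to Long is always safe, while Long to Integer needs either a range
check or a checked conversion, so a test covering both directions would have
to decide which behavior is expected for out-of-range values.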

On Wed, Feb 5, 2020 at 23:15 Kenneth Knowles <k...@apache.org> wrote:

> I think that was me... sorry!
>
> Is this a test where it is important that the data is pre-existing?
> Otherwise I would say that setting up from scratch is a good idea. Does
> anyone have context on it? I am happy to take on the small bit of coding,
> since I broke it.
>
> Kenn
>
> On Wed, Feb 5, 2020 at 1:22 PM Brian Hulette <bhule...@google.com> wrote:
>
>> So it looks like the schema for `integ_test_small_csv_test_1` was updated
>> yesterday around the same time that PR#10563 went in, and it no longer
>> matches the schema we expect in the test.
>>
>> I'm just going to change it back for now. I am curious who changed it and
>> why, if the perpetrator is on this list please let us know :)
>>
>>
>> Note the updateTime:
>> ```
>> ❯ gcloud beta data-catalog entries lookup '`datacatalog`.`entry`.`apache-beam-testing`.`us-central1`.`samples`.`integ_test_small_csv_test_1`'
>> gcsFilesetSpec:
>>   filePatterns:
>>   - gs://apache-beam-samples/integration_test_small_csv/test.csv
>> linkedResource: //datacatalog.googleapis.com/projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>> name: projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>> schema:
>>   columns:
>>   - column: id
>>     mode: NULLABLE
>>     type: INT64
>>   - column: name
>>     mode: NULLABLE
>>     type: STRING
>>   - column: type
>>     mode: NULLABLE
>>     type: STRING
>> sourceSystemTimestamps:
>>   createTime: '2019-08-16T01:49:06.235Z'
>>   updateTime: '2020-02-04T17:18:17.671Z'
>> type: FILESET
>> ```
>>
--
Regards,
Tomo
