Ismaël,

The latest postcommit failure
https://builds.apache.org/job/beam_PostCommit_SQL/3951/ was at 4:15:45 PM.
Brian's successful run
https://builds.apache.org/job/beam_PostCommit_SQL_PR/243/ started later, at
4:29:57 PM, so I hope the next SQL postcommit will succeed.
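
For reference, the schema mismatch Brian describes below (rows read with
id: INT64, compared against expected rows with id: INT32) can be sketched in
plain Python. This is only an illustration, not Beam's Row or PAssert API; the
Field and Row classes, the schema_diff helper, and the sample values are all
hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Field:
    name: str
    type: str  # e.g. "INT32", "INT64", "STRING"


@dataclass(frozen=True)
class Row:
    schema: Tuple[Field, ...]
    values: Tuple[Any, ...]


def schema_diff(expected: Tuple[Field, ...], actual: Tuple[Field, ...]) -> List[str]:
    """Describe field-by-field differences between two schemas."""
    diffs = []
    for e, a in zip(expected, actual):
        if e != a:
            diffs.append(f"expected {e.name} {e.type}, got {a.name} {a.type}")
    if len(expected) != len(actual):
        diffs.append(f"expected {len(expected)} fields, got {len(actual)}")
    return diffs


# Schema the test asserts against (hypothetical reconstruction).
expected_schema = (Field("id", "INT32"), Field("name", "STRING"), Field("type", "STRING"))
# Schema as reported by the catalog entry after the update.
catalog_schema = (Field("id", "INT64"), Field("name", "STRING"), Field("type", "STRING"))

# Made-up sample values; the point is that they are identical in both rows.
expected_row = Row(expected_schema, (1, "customer1", "test"))
actual_row = Row(catalog_schema, (1, "customer1", "test"))

# The rows differ only because their schemas differ.
rows_match = expected_row == actual_row
differences = schema_diff(expected_schema, catalog_schema)
```

Here rows_match is False even though the values are identical, and differences
names the offending field, which is the kind of message that would have made
the failure easier to read.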

On Thu, Feb 6, 2020 at 11:57 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
> This one is still broken. Maybe there are two different data sources, one for
> the '_PR' version and one for the normal job. Can you please confirm?
> https://builds.apache.org/job/beam_PostCommit_SQL/
>
>
> On Thu, Feb 6, 2020 at 5:44 PM Brian Hulette <bhule...@google.com> wrote:
>>
>> Sorry for the delay. I had some issues updating the schema and ended up
>> having to drop and re-create it for some reason. Looks like SQL PostCommit
>> is green on https://github.com/apache/beam/pull/10765 now.
>>
>> > setting up from scratch is a good idea.
>> +1, I filed https://issues.apache.org/jira/browse/BEAM-9260 for this and 
>> assigned it to Kenn for now.
>>
>> > I’m still feeling it’s ok to read an Integer as Long.
>> In this case the issue is that we were reading data with a schema (id: 
>> INT64), and then passing it to a PAssert that checked against Rows with a 
>> different schema (id: INT32). So I think it's a legitimate error because we 
>> were asserting that a PCollection contained rows with one schema but it 
>> actually had a different schema. The message should definitely be better in 
>> this case though.
>>
>> Brian
>>
>> On Wed, Feb 5, 2020 at 9:28 PM Tomo Suzuki <suzt...@google.com> wrote:
>>>
>>> (My understanding)
>>> The test ensures that the CSV data stored in GCS is readable through
>>> Datacatalog. It fails because an Integer value in the CSV was read as a
>>> Long, following the Datacatalog schema.
>>>
>>>
>>> > setting up from scratch is a good idea.
>>>
>>> I agree. Furthermore, it would be nice if it could test different type-cast
>>> behaviors. I’m still feeling it’s ok to read an Integer as Long. (If this
>>> is the case, how about Long to Integer? What if the Long is small enough to
>>> fit in 32 bits? And so on.)
>>>
>>> On Wed, Feb 5, 2020 at 23:15 Kenneth Knowles <k...@apache.org> wrote:
>>>>
>>>> I think that was me... sorry!
>>>>
>>>> Is this a test where it is important that the data is pre-existing? 
>>>> Otherwise I would say that setting up from scratch is a good idea. Does 
>>>> anyone have context on it? I am happy to take on the small bit of coding, 
>>>> since I broke it.
>>>>
>>>> Kenn
>>>>
>>>> On Wed, Feb 5, 2020 at 1:22 PM Brian Hulette <bhule...@google.com> wrote:
>>>>>
>>>>> So it looks like the schema for `integ_test_small_csv_test_1` was updated 
>>>>> yesterday around the same time that PR#10563 went in, and it no longer 
>>>>> matches the schema we expect in the test.
>>>>>
>>>>> I'm just going to change it back for now. I am curious who changed it and
>>>>> why. If the perpetrator is on this list, please let us know :)
>>>>>
>>>>>
>>>>> Note the updateTime:
>>>>> ```
>>>>> ❯ gcloud beta data-catalog entries lookup \
>>>>>   '`datacatalog`.`entry`.`apache-beam-testing`.`us-central1`.`samples`.`integ_test_small_csv_test_1`'
>>>>> gcsFilesetSpec:
>>>>>   filePatterns:
>>>>>   - gs://apache-beam-samples/integration_test_small_csv/test.csv
>>>>> linkedResource: //datacatalog.googleapis.com/projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>>> name: projects/apache-beam-testing/locations/us-central1/entryGroups/samples/entries/integ_test_small_csv_test_1
>>>>> schema:
>>>>>   columns:
>>>>>   - column: id
>>>>>     mode: NULLABLE
>>>>>     type: INT64
>>>>>   - column: name
>>>>>     mode: NULLABLE
>>>>>     type: STRING
>>>>>   - column: type
>>>>>     mode: NULLABLE
>>>>>     type: STRING
>>>>> sourceSystemTimestamps:
>>>>>   createTime: '2019-08-16T01:49:06.235Z'
>>>>>   updateTime: '2020-02-04T17:18:17.671Z'
>>>>> type: FILESET
>>>>> ```
>>>
>>> --
>>> Regards,
>>> Tomo



-- 
Regards,
Tomo
