The test runner intentionally does some ugly things in order to expose
problems which might otherwise be missed. In particular, I believe the test
runner enforces encoding and decoding between each transform and scrambles
the order of elements, whereas production pipelines will often have many
transforms fused together without serializing data.

> I've tried TableRowJsonCoder, but it seems like it converts all objects
inside TableRow to LinkedHashMaps

This is likely intended. I would expect only the top-level container to be
a TableRow and that nested maps would be some other map type.
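If that is what's happening, one workaround is to cast nested objects to
Map<String, Object> rather than TableRow. TableRow itself implements
Map<String, Object>, so the same access code works whether the runner hands
you a TableRow or a LinkedHashMap. Here is a minimal sketch using only
standard Java types to simulate a row after a coder round trip (the row
shape is hypothetical, loosely modeled on the "device.isMobile" field from
the thread):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedRowAccess {
    public static void main(String[] args) {
        // Simulate a row as it might look after a JSON-based coder
        // round trip: nested records come back as plain Maps.
        Map<String, Object> device = new LinkedHashMap<>();
        device.put("isMobile", true);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("device", device);

        // Cast to Map<String, Object> instead of TableRow. Since
        // TableRow also implements Map, this cast succeeds both in
        // production and under the test runner.
        @SuppressWarnings("unchecked")
        Map<String, Object> d = (Map<String, Object>) row.get("device");
        Boolean isMobile = (Boolean) d.get("isMobile");
        System.out.println(isMobile); // prints "true"
    }
}
```

The same idea applies to the repeated "hits" field: cast to
List<Map<String, Object>> rather than List<TableRow> before mapping each
element.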

On Wed, Jul 8, 2020 at 10:52 AM Kirill Zhdanovich <[email protected]>
wrote:

> Hi Jeff,
> It's a simple pipeline that takes PCollection of TableRow which is
> selected from Google Analytics export to BigQuery. So each TableRow follows
> this scheme https://support.google.com/analytics/answer/3437719?hl=en
> I have part of the code doing casting to TableRow like this:
>
> Boolean isMobile = (Boolean) (((TableRow) row.get("device")).get("isMobile"));
>
> or
>
> List<Hit> hits = ((List<TableRow>) 
> row.get("hits")).stream().map(Hit::new).collect(Collectors.toList());
>
> I don't have issues running this pipeline in production. I only have this
> issue when I try to write an end-to-end test.
> Do you know if there are existing coders for TableRow that I can use? I've
> tried TableRowJsonCoder, but it seems like it converts all objects inside
> TableRow to LinkedHashMaps
>
> On Wed, 8 Jul 2020 at 17:30, Jeff Klukas <[email protected]> wrote:
>
>> Kirill - Can you tell us more about what Job.runJob is doing? I would not
>> expect the Beam SDK itself to do any casting to TableRow, so is there a
>> line in your code where you're explicitly casting to TableRow? There may be
>> a point where you need to explicitly set the coder on a PCollection to
>> deserialize back to TableRow objects.
>>
>> On Wed, Jul 8, 2020 at 10:11 AM Kirill Zhdanovich <[email protected]>
>> wrote:
>>
>>> Here is a code example:
>>>
>>> List<TableRow> ss = Arrays.asList(session1, session2);
>>> PCollection<TableRow> sessions = p.apply(Create.of(ss));
>>> PCollection<MetricsWithDimension> res = Job.runJob(sessions, "20200614", 
>>> false, new ProductCatalog());
>>> p.run();
>>>
>>>
>>> On Wed, 8 Jul 2020 at 17:07, Kirill Zhdanovich <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>> I want to test a pipeline whose input is a PCollection of
>>>> TableRows. I've created a test, and when I run it I get an error:
>>>>
>>>> java.lang.ClassCastException: class java.util.LinkedHashMap cannot be
>>>> cast to class com.google.api.services.bigquery.model.TableRow
>>>>
>>>> Is it a known issue? Thank you in advance
>>>>
>>>> --
>>>> Best Regards,
>>>> Kirill
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Kirill
>>>
>>
>
> --
> Best Regards,
> Kirill
>
