Re: Scala Code Generation

Ufuk Celebi Wed, 14 Oct 2015 02:31:14 -0700

> On 13 Oct 2015, at 16:06, [email protected] wrote:
> 
> Hello,
> 
> I am currently working on a compilation unit translating AsterixDB's AQL
> into runnable Scala code for Flink's Scala API. During code generation I
> discovered some things that are quite hard to work around. I am still
> working with Flink version 0.8, so some of the problems I have might
> already be fixed in 0.9 and if so please tell me.
> 
> First, whenever a record gets projected down to only a single field (e.g.
> by a map or reduce function) it is no longer considered a record, but a
> variable of the type of that field. If afterwards I want to apply
> additional functions like .sum(0) I get an error message like


A workaround is to return a Tuple1<X> here. Then you can run the aggregation.
I think the Tuple0 class was only added after 0.8, though.

> "Aggregating on field positions is only possible on tuple data types."
> 
> This is the same for all functions (like write or join) as the "record" is
> no longer considered a dataset.

What do you mean? At least in the current versions, the join projections return 
a Tuple type as well.

> Second, I found that records longer than 22 fields are not supported.
> Whenever I have a record that is longer than that I receive a build error
> as

Flink’s Java Tuple classes go up to Tuple25 (Scala's own tuples stop at
Tuple22). You can work around this by using a custom POJO type, e.g.

// The class must be public for Flink to treat it as a POJO.
public class TPCHRecord {
    public int f0;
    ...
    public int f99;
}
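
To make the POJO suggestion concrete, here is a minimal sketch of what such a
type could look like in full. The names (WideRecordDemo, WideRecord, orderKey,
comment, price) are made up for illustration, and the three fields stand in for
however many your record needs. As far as I know, Flink treats a class as a POJO
when it is public, has a public no-argument constructor, and every field is
either public or reachable through a getter and setter.

```java
// Hypothetical holder class so that the sketch fits in a single file.
final class WideRecordDemo {

    // Hypothetical record type; the field count is not tied to any tuple arity.
    public static class WideRecord {
        public int orderKey;
        public String comment;
        public double price;

        // Public no-argument constructor, so the type can be created reflectively.
        public WideRecord() {}

        // Convenience constructor for hand-written or generated code.
        public WideRecord(int orderKey, String comment, double price) {
            this.orderKey = orderKey;
            this.comment = comment;
            this.price = price;
        }

        @Override
        public String toString() {
            return orderKey + "," + comment + "," + price;
        }
    }

    public static void main(String[] args) {
        WideRecord r = new WideRecord(1, "first order", 9.5);
        System.out.println(r);
    }
}
```

With a type like this, fields are addressed by name instead of by position,
e.g. in a key expression such as groupBy("orderKey"). Positional calls like
.sum(0) remain limited to tuple types.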

If possible, I would suggest updating to the latest 0.9 or the upcoming 0.10
release. A lot of stuff has been fixed since 0.8. I think it will be worth it.
If you encounter any problems while doing this, feel free to ask here. :)

– Ufuk
