Comments inlined.
pi song wrote:
I came across a couple more issues:-
1) Currently we don't allow specifying only a data type with no alias in
a schema declaration. I work around this by using the "null" keyword:
null:int, null:long (null means no alias specified)
This is obviously not the right solution. We again need a discussion
on schema declaration for different cases:-
- Specify both type and alias (Currently supported)
- Specify only alias, no type (Currently supported)
- Specify only type, no alias (Currently not supported)
As specified, there isn't support for giving a field's type without
giving it an alias. I don't know that we need to allow this.
2) The current cogroup implementation doesn't output Bag fields with
tuples wrapped inside, but Bags with a schema instead. This is apparently
inconsistent with the schema definition. I don't know which one is right.
We've discussed this before but didn't reach a consensus.
BTW, a quick way to work around this would be altering (hacking) the schema
loading in LogicalPlanLoader.createLOCogroup()
Santhosh, can you comment on this? My understanding was that bags could
have schemas too, as that implied that they contained tuples with that
schema.
<snip>
Alan.