I was referring to 1 below.  I think making this mandatory instead of allowed 
is sufficient (obviously over time, so we don't break compatibility).  

Alan.

On Apr 9, 2012, at 10:21 AM, Dmitriy Ryaboy wrote:

> Alan, which idea are you +1 on? I think (int) D is the current syntax.
> 
> There are a few problems that people hit in the current scalar
> implementation, all of which I think can be fixed without introducing
> new syntax:
> 
> 1) Require the cast, don't do it implicitly. This was actually in the
> design doc but didn't get implemented for some reason.
> 
> 2) Throw an error on the frontend if the scalar relation is the
> relation being iterated on. Meaning:
> 
> foreach foo generate (int) foo.id; -- this will cause the second "foo"
> to be interpreted as a scalar invocation, although clearly it's just a
> bug, and the programmer meant to say "generate (int) id"
> 
> We can just detect this error case and throw during compilation.
> 
> 3) Improve MR-side logging to make it clear that a relation is being
> loaded from the side, what the relation is, etc.
> 
> I believe we have jiras open for all of these.
> 
> D
> 
> On Mon, Apr 9, 2012 at 10:15 AM, Alan Gates <[email protected]> wrote:
>> I'm +1 on this idea, since it's been a problem since the beginning.  Why not 
>> use regular casting notation though, rather than develop another notation?  
>> That's what we discussed originally when we were deciding whether to require 
>> casting or do it silently.  So instead of D->a or SCALAR(D) it would be 
>> (int)D.
>> 
>> Alan.
>> 
>> On Apr 8, 2012, at 7:42 AM, Jonathan Coveney wrote:
>> 
>>> I like this idea, and I think we should deprecate the old syntax, and we
>>> can discuss later when it'd get deleted (and when that would be worth it...
>>> if we have a new syntax, it seems pretty painless to have the other one
>>> float around for backwards compatibility, and if anyone uses it it's a sort
>>> of "caveat emptor").
>>> 
>>> 2012/4/8 Aniket Mokashi <[email protected]>
>>> 
>>>> Hi,
>>>> 
>>>> I have noticed early users of pig often hit issues because of confusing
>>>> syntax between scalars and projections. I think scalar syntax should be
>>>> made more explicit for users to use in order to avoid these problems. For
>>>> example: D = foreach C generate B->count; etc.
>>>> I am sure we might break some backward compatibility but we can at least
>>>> deprecate the syntax for a few versions and eventually move to new syntax.
>>>> 
>>>> Thoughts?
>>>> 
>>>> Thanks,
>>>> Aniket
>>>> 
>> 
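
For reference, the two cases discussed in this thread can be sketched in
Pig Latin roughly as follows. This is a minimal illustration, not taken
from the thread: the relation names, field names, and the 'events' path are
hypothetical, and the comments describe the behavior as the thread reports it.

```pig
-- Hypothetical input; names and path are assumptions for illustration.
A = LOAD 'events' AS (id:int, amount:int);
G = GROUP A ALL;
C = FOREACH G GENERATE COUNT(A) AS total;   -- C holds a single row

-- Scalar use with an explicit cast (item 1 in the thread):
-- C is not the relation being iterated, so C.total is read as a scalar.
D = FOREACH A GENERATE id, amount / (long) C.total;

-- Likely bug (item 2 in the thread): A is the relation being iterated,
-- so A.id is interpreted as a scalar reference rather than the field id.
E = FOREACH A GENERATE (int) A.id;

-- What the programmer most likely meant:
F = FOREACH A GENERATE (int) id;
```

Under the proposal in the thread, the statement defining E would be
rejected at compile time instead of being run as a scalar lookup.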
