I understand how to use it with a window, but most other window operators
also work as aggregate operators, and in this case the JIRA issue
specifically mentions it as well. I asked on the dev list rather than the
user list because it was already mentioned in the issue.
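
To make the windowed form concrete, here is a minimal sketch in plain Python
(not Spark) of what I understand the query suggested in the quoted reply,
lead(max(expenses)) over (order by customerId) with group by customerId, to
compute. The function name and the sample rows are made up for illustration,
and customerId and the per-customer max are returned next to the lead value
only for readability:

```python
def lead_of_max_expenses(rows):
    """Emulate: SELECT lead(max(expenses)) OVER (ORDER BY customerId)
                FROM tbl GROUP BY customerId

    rows: iterable of (customer_id, expenses) pairs (hypothetical layout).
    Returns a list of (customer_id, max_expenses, lead_value) tuples.
    """
    # Step 1: GROUP BY customerId, keeping max(expenses) per customer.
    max_by_customer = {}
    for customer_id, expenses in rows:
        current = max_by_customer.get(customer_id)
        if current is None or expenses > current:
            max_by_customer[customer_id] = expenses

    # Step 2: ORDER BY customerId. Without this ordering, "the next row"
    # is not well defined, so lead has nothing to look ahead to.
    ordered = sorted(max_by_customer.items())

    # Step 3: lead(...) takes the value from the following row, or NULL
    # (None here) for the last row.
    result = []
    for i, (customer_id, max_expenses) in enumerate(ordered):
        lead_value = ordered[i + 1][1] if i + 1 < len(ordered) else None
        result.append((customer_id, max_expenses, lead_value))
    return result


if __name__ == "__main__":
    # Illustrative data only, not taken from my actual table.
    sample = [(1, 10.0), (1, 25.5), (2, 7.0), (3, 40.0), (3, 12.0)]
    for row in lead_of_max_expenses(sample):
        print(row)
```

The sort in step 2 is the part the plain GROUP BY query has no equivalent
for, which seems to be why lead needs the over (order by ...) clause.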

On Mon, Nov 2, 2015 at 4:15 PM, Herman van Hövell tot Westerflier <
hvanhov...@questtec.nl> wrote:

> Hi,
>
> This is more a question for the User list.
>
> Lead and Lag imply ordering of the whole dataset, and this is not
> supported. You can use Lead/Lag in an ordered window function and you'll be
> fine:
>
> *select lead(max(expenses)) over (order by customerId) from tbl group by
> customerId*
>
> HTH
>
> Met vriendelijke groet/Kind regards,
>
> Herman van Hövell tot Westerflier
>
> QuestTec B.V.
> Torenwacht 98
> 2353 DC Leiderdorp
> hvanhov...@questtec.nl
> +31 6 420 590 27
>
>
> 2015-11-02 11:33 GMT+01:00 Shagun Sodhani <sshagunsodh...@gmail.com>:
>
>> Hi! I was trying out window functions in Spark SQL (using HiveContext)
>> and I noticed that while this
>> <https://issues.apache.org/jira/browse/TAJO-919?jql=text%20~%20%22lag%20window%22>
>> mentions that *lead* is implemented as an aggregate operator, it seems
>> not to be the case.
>>
>> I am using the following configuration:
>>
>> Query : SELECT lead(max(`expenses`)) FROM `table` GROUP BY `customerId`
>> Spark Version: 10.4
>> SparkSql Version: 1.5.1
>>
>> I am using the standard example of a (`customerId`, `expenses`) schema
>> where each customer has multiple values for expenses (though I am setting
>> age as Double and not Int as I am trying out maths functions).
>>
>>
>> *java.lang.NullPointerException at
>> org.apache.hadoop.hive.ql.udf.generic.GenericUDFLeadLag.evaluate(GenericUDFLeadLag.java:57)*
>>
>> The entire error stack can be found here <http://pastebin.com/jTRR4Ubx>.
>>
>> Can someone confirm if this is an actual issue or some oversight on my
>> part?
>>
>> Thanks!
>>
>
>
