Hi,

that error indicates the query itself is not well-formed. If you group by
just id, Spark must collapse all the time values into a single value per id,
so you can't sort by time directly. That's why the analyzer suggests wrapping
time in an aggregate function: then there is exactly one value per id and the
sort is well-defined.
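In SQL terms, that means aggregating time before ordering, e.g. something like `select id, min(time) as t from sample group by id order by t`. The same group-then-sort shape can be sketched in plain Scala collections (the sample data and `Row` case class here are made up for illustration; on an RDD the equivalent operations would be `groupByKey`/`mapValues` followed by `sortBy`):

```scala
// Hypothetical sample data standing in for the "sample" table.
case class Row(id: Int, time: Long)

val sample = Seq(Row(1, 30L), Row(2, 10L), Row(1, 20L))

// One value per id (here min), so the result can be sorted by it --
// this is what the analyzer's error message is asking for.
val minTimePerId: Seq[(Int, Long)] = sample
  .groupBy(_.id)                                       // Map[Int, Seq[Row]]
  .map { case (id, rows) => (id, rows.map(_.time).min) } // one time per id
  .toSeq
  .sortBy(_._2)                                        // now sorting is valid

// Alternative: keep every time value, sorted within each id's group.
val sortedWithinGroup: Map[Int, Seq[Long]] = sample
  .groupBy(_.id)
  .map { case (id, rows) => (id, rows.map(_.time).sorted) }
```

The first variant matches what the error message suggests (an aggregate per group); the second is the usual answer when you actually want all rows ordered within each group rather than one row per group.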

On Mon, Apr 27, 2015 at 3:07 PM, Ulanov, Alexander <alexander.ula...@hp.com>
wrote:

>  Hi,
>
>
>
> Could you suggest what is the best way to do “group by x order by y” in
> Spark?
>
>
>
> When I try to perform it with Spark SQL I get the following error (Spark
> 1.3):
>
>
>
> val results = sqlContext.sql("select * from sample group by id order by
> time")
>
> org.apache.spark.sql.AnalysisException: expression 'time' is neither
> present in the group by, nor is it an aggregate function. Add to group by
> or wrap in first() if you don't care which value you get.;
>
>         at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:37)
>
>
>
> Is there a way to do it with just RDD?
>
>
>
> Best regards, Alexander
>