Re: Facing issue with floor function in spark SQL query

ayan guha Fri, 04 Mar 2016 02:23:23 -0800

Most likely you are missing the import of org.apache.spark.sql.functions.

In any case, you can write your own floor function and register it as a UDF.
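
As a rough sketch of what such a function could look like, here is the bucketing logic in plain Java, with no Spark dependency so it can be tried on its own. The class and method names (TimeBuckets, bucketIndex, bucketStart) are made up for illustration. It assumes the timestamp is epoch milliseconds, as described in the question, so a 5-minute bucket is 300000 ms wide; note that this differs from the 72000 divisor in the quoted query.

```java
// Illustrative sketch only: names are hypothetical, not part of Spark.
final class TimeBuckets {

    // Assumption: timestamps are epoch milliseconds; 5 minutes = 300000 ms.
    static final long FIVE_MINUTES_MS = 5L * 60L * 1000L;

    // Index of the 5-minute bucket containing the timestamp.
    // Math.floorDiv rounds toward negative infinity, like floor().
    static long bucketIndex(long timestampMs) {
        return Math.floorDiv(timestampMs, FIVE_MINUTES_MS);
    }

    // Start of the bucket, in epoch milliseconds.
    static long bucketStart(long timestampMs) {
        return bucketIndex(timestampMs) * FIVE_MINUTES_MS;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 299_999L, 300_000L, 600_001L};
        for (long ts : samples) {
            System.out.println(ts + " -> bucket " + bucketIndex(ts)
                    + " (starts at " + bucketStart(ts) + ")");
        }
    }
}
```

In Spark, a function like this could then be registered through sqlContext.udf().register(...) with a UDF1<Long, Long> and DataTypes.LongType under a name of your choosing, and called by that name in the select clause. I have not run that part against your data, so treat it as a starting point.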


On Fri, Mar 4, 2016 at 7:34 PM, ashokkumar rajendran <
ashokkumar.rajend...@gmail.com> wrote:

> Hi,
>
> I load a JSON file that has a timestamp (as a long, in milliseconds) and several
> other attributes. I would like to group the records into 5-minute buckets and
> store each bucket as a separate file.
>
> I am facing a couple of problems here:
> 1. Using the floor function in the select clause (to bucket by 5 minutes) gives
> me an error saying "java.util.NoSuchElementException: key not found: floor". How
> do I use the floor function in the select clause? I see that a floor method is
> available in org.apache.spark.sql.functions, but I am not sure why it is not
> working here.
> 2. Can I use the same in the group by clause?
> 3. How do I store them as separate files after grouping them?
>
>         String logPath = "my-json.gz";
>         DataFrame logdf = sqlContext.read().json(logPath);
>         logdf.registerTempTable("logs");
>         DataFrame bucketLogs = sqlContext.sql(
>                 "Select `user.timestamp` as rawTimeStamp, `user.requestId` as requestId, "
>                 + "floor(`user.timestamp`/72000) as timeBucket FROM logs");
>         bucketLogs.toJSON().saveAsTextFile("target_file");
>
> Regards
> Ashok
>



-- 
Best Regards,
Ayan Guha
