I've looked at several approaches, but I can't find an efficient way to
compute many window functions (either optimized computation or efficient
parallelism).
Am I the only one interested in this?
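
One approach that may be worth trying is to build all the window expressions up front and add them in a single select over one shared window spec, rather than chaining withColumn calls. As far as I understand, Spark's planner can group window expressions that share the same partitioning and ordering into one Window operator, but this should be checked with explain() on the Spark version in use. The sketch below is only illustrative: the toy data, the column names col_a and col_b, and the object name SinglePassWindows are hypothetical, and "date" is assumed to hold epoch milliseconds.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, col, max, min}

object SinglePassWindows {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("single-pass-windows")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy data; `date` is assumed to be epoch milliseconds.
    val x: DataFrame = Seq(
      (1, 0L, 1.0, 10.0),
      (1, 1000L, 3.0, 30.0),
      (2, 0L, 5.0, 50.0)
    ).toDF("id", "date", "col_a", "col_b")

    // One shared window spec: same look-back range as in the original post.
    val tw = Window
      .partitionBy("id")
      .orderBy("date")
      .rangeBetween(-8035200000L, 0L)

    // Aggregates to apply, keyed by the suffix used in the output column name.
    val aggs: Seq[(String, Column => Column)] = Seq(
      ("max", c => max(c)),
      ("min", c => min(c)),
      ("avg", c => avg(c))
    )
    val valueCols = Seq("col_a", "col_b")

    // Build every (column, aggregate) window expression up front ...
    val windowExprs: Seq[Column] = for {
      name <- valueCols
      (suffix, f) <- aggs
    } yield f(col(name)).over(tw).as(s"${name}_$suffix")

    // ... and add them all in one select instead of chained withColumn calls.
    val result = x.select(col("*") +: windowExprs: _*)

    // Inspect the physical plan to see how many Window operators it contains.
    result.explain()
    result.show()

    spark.stop()
  }
}
```

If the plan shows a single Window operator, the sort and shuffle for the partitioning are paid once for all aggregates. Whether that is fast enough for around 100 columns is something only a measurement on the real data can tell.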


Regards,

Julien

On Fri, Dec 15, 2017 at 21:34, Julien CHAMP <jch...@tellmeplus.com> wrote:

> Maybe I should consider something like Impala?
>
> On Fri, Dec 15, 2017 at 11:32, Julien CHAMP <jch...@tellmeplus.com>
> wrote:
>
>> Hi Spark community members!
>>
>> I want to compute several (from 1 to 10) aggregate functions using window
>> functions on about 100 columns.
>>
>> Instead of making several passes over the data to compute each aggregate
>> function, is there a way to do this efficiently?
>>
>>
>>
>> Currently it seems that doing
>>
>>
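>> import org.apache.spark.sql.expressions.Window
>>
>> // Look back 8035200000 ms (93 days), assuming "date" holds epoch milliseconds
>> val tw =
>>   Window
>>     .partitionBy("id")
>>     .orderBy("date")
>>     .rangeBetween(-8035200000L, 0)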
>>
>> and then
>>
>> import org.apache.spark.sql.functions.{avg, max, min}
>>
>> x
>>   .withColumn("agg1", max("col").over(tw))
>>   .withColumn("agg2", min("col").over(tw))
>>   .withColumn("aggX", avg("col").over(tw))
>>
>>
>> is not really efficient :/
>> It seems that it iterates over the whole column for each aggregation. Am I
>> right?
>>
>> Is there a way to compute all the required operations on a column in a
>> single pass?
>> Even better, to compute all the required operations on ALL columns in
>> a single pass?
>>
>> Thx for your Future[Answers]
>>
>> Julien
>>
>>
>>
>>
>>
>> --
>>
>>
>> Julien CHAMP — Data Scientist
>>
>>
>> *Web : **www.tellmeplus.com* <http://tellmeplus.com/> — *Email : 
>> **jch...@tellmeplus.com
>> <jch...@tellmeplus.com>*
>>
>> *Phone ** : **06 89 35 01 89 <0689350189> * — *LinkedIn* :  *here*
>> <https://www.linkedin.com/in/julienchamp>
>>
>> TellMePlus S.A — Predictive Objects
>>
>> *Paris* : 7 rue des Pommerots, 78400 Chatou
>> *Montpellier* : 51 impasse des églantiers, 34980 St Clément de Rivière
>>