Enrico,
The solution below works, but there is a small glitch.
It works fine in spark-shell but fails for skewed keys
when run via spark-submit.
Looking into the execution plan, the partitioning value is the same
for both repartition and groupByKey and is driven by the value
I believe I logged an issue first and should have received a response first.
I was ignored.
Regards
Did you know there are 8 million people in Kashmir locked up in their homes
by the Hindutva (Indians)
for 8 months?
Now the whole planet is locked up in their homes.
You didn't take notice of them either.
Abhinav,
you can repartition by your key, then sortWithinPartitions, and then
groupByKey. Since the data are already hash-partitioned by key, Spark should
not shuffle them again and hence not change the sort within each partition:
ds.repartition($"key").sortWithinPartitions($"code").groupBy($"key")
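To make the idea behind that one-liner concrete, here is a small plain-Scala sketch (no Spark dependency; the `Row` case class, the `sortedGroups` helper, and the partition count are illustrative inventions, not Spark API): rows are hash-partitioned by key, each partition is sorted locally, and because all rows of a given key land in the same partition, grouping afterwards preserves the per-key sort order without another shuffle.

```scala
object SecondarySortSketch {
  // Illustrative row shape matching the dataframe in the question
  case class Row(key: Int, code: String, codeValue: Int)

  // Mimics repartition($"key") + sortWithinPartitions($"code") + groupBy($"key"):
  // returns, per key, the codes in within-partition sort order.
  def sortedGroups(rows: Seq[Row], numPartitions: Int): Map[Int, Seq[String]] = {
    // "repartition($"key")": hash-partition rows by key
    val partitions: Map[Int, Seq[Row]] =
      rows.groupBy(r => math.abs(r.key.hashCode) % numPartitions)

    // "sortWithinPartitions($"code")": sort each partition locally
    val locallySorted: Seq[Row] =
      partitions.values.toSeq.flatMap(_.sortBy(_.code))

    // "groupBy($"key")": a key never straddles partitions, so the
    // local sort order survives the grouping
    locallySorted.groupBy(_.key).map { case (k, rs) => k -> rs.map(_.code) }
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(1, "c2", 12), Row(1, "c1", 11), Row(1, "c2", 9), Row(2, "c3", 7))
    sortedGroups(rows, 4).toSeq.sortBy(_._1).foreach { case (k, codes) =>
      println(s"$k -> ${codes.mkString(",")}")
    }
  }
}
```

Note that this also hints at the skew issue reported above: with hash partitioning, every row of a hot key goes to a single partition, so one skewed key means one oversized partition.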
Enrico
Hi,
I have a dataframe which has data like:
key | code | code_value
1 | c1 | 11
1 | c2 | 12
1 | c2 | 9
1 | c3