[
https://issues.apache.org/jira/browse/SPARK-25512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon resolved SPARK-25512.
----------------------------------
Resolution: Invalid
Questions should go to the mailing list (see
https://spark.apache.org/community.html). Please ask there first and file an
issue once it's clear that it's an issue in Spark.
> Using RowNumbers in SparkR Dataframe
> ------------------------------------
>
> Key: SPARK-25512
> URL: https://issues.apache.org/jira/browse/SPARK-25512
> Project: Spark
> Issue Type: Question
> Components: SparkR
> Affects Versions: 2.3.1
> Reporter: Asif Khan
> Priority: Critical
>
> Hi,
> I have a use case where I have a SparkR DataFrame and I want to iterate
> over the DataFrame in a for loop using the row numbers of the DataFrame. Is
> that possible?
> The only solution I have now is to collect() the SparkR DataFrame into an R
> data.frame, which brings the entire DataFrame onto the driver node, and then
> iterate over it using row numbers. But since the for loop executes only on
> the driver node, I lose the parallel processing in Spark, which was the whole
> point of using Spark. Please help.
> Thank you,
> Asif Khan
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]