Re: Do existing R packages work with SparkR data frames
Hi,

SparkR has some support for machine learning algorithms, such as glm. To use an existing R package, you currently need to call collect to convert the SparkR DataFrame into a local R data.frame. This assumes the data fits in the memory of the driver node, though that would be required to work with an R package in any case.

_____
From: Lan
Sent: Tuesday, December 22, 2015 4:50 PM
Subject: Do existing R packages work with SparkR data frames
To:

Hello,

Is it possible for existing R Machine Learning packages (which work with R data frames), such as bnlearn, to work with SparkR data frames? Or do I need to convert SparkR data frames to R data frames? Is "collect" the function to do the conversion, or how else to do that?

Many Thanks,
Lan

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Do-existing-R-packages-work-with-SparkR-data-frames-tp25772.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
RE: Do existing R packages work with SparkR data frames
Hi Lan,

Generally, it is hard to make existing R packages that work with R data frames work transparently with SparkR DataFrames. Typically the algorithms have to be rewritten to use the SparkR DataFrame API.

collect gathers the data from a SparkR DataFrame into a local data.frame. Since a SparkR DataFrame is a distributed data set, you typically call methods of the SparkR DataFrame API to manipulate its data in a distributed fashion, and once the result is small enough to fit in the memory of the local machine, you can collect it for local processing.

From: Duy Lan Nguyen [mailto:ndla...@gmail.com]
Sent: Wednesday, December 23, 2015 5:50 AM
To: user@spark.apache.org
Subject: Do existing R packages work with SparkR data frames

Hello,

Is it possible for existing R Machine Learning packages (which work with R data frames), such as bnlearn, to work with SparkR data frames? Or do I need to convert SparkR data frames to R data frames? Is "collect" the function to do the conversion, or how else to do that?

Many Thanks,
Lan
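
For anyone finding this thread later, the workflow described in the replies (reduce the data with the SparkR DataFrame API, then collect it for a local R package) might look roughly like the sketch below. It is only an illustration under several assumptions that are not from the thread: it uses the Spark 2.x+ entry point sparkR.session() (on the Spark 1.x releases current when this thread was written, the equivalents were sparkR.init() and sparkRSQL.init(), and createDataFrame took the sqlContext as its first argument), it runs against a local master, it uses R's built-in faithful dataset as stand-in data, and the filter threshold and the choice of bnlearn::hc are arbitrary.

```r
# A sketch only: assumes Spark 2.x or later with the SparkR package on the
# library path. bnlearn is optional and only used if it is installed.
library(SparkR)

sparkR.session(master = "local[*]")   # local session, purely for illustration

# Build a distributed SparkR DataFrame from R's built-in `faithful` dataset
# (stand-in data; not from the thread).
df <- createDataFrame(faithful)

# Reduce the data with the SparkR DataFrame API while it is still distributed.
# The threshold of 50 is an arbitrary example.
small <- select(filter(df, df$waiting > 50), "eruptions", "waiting")

# collect() brings the (now smaller) result to the driver as a local data.frame.
local_df <- collect(small)

# From here on, any existing R package can work on local_df, e.g. bnlearn.
if (requireNamespace("bnlearn", quietly = TRUE)) {
  net <- bnlearn::hc(local_df)   # hill-climbing structure learning
  print(net)
}

sparkR.session.stop()
```

The key point is that everything before collect() runs on the cluster, and everything after it runs in the driver's R process, so the collected result has to fit in the driver's memory.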