I am working on a custom ML pipeline model / estimator to impute
missing values; for example, I want to fill each gap with the last known good value.
Using a window function is slow, as it will put the data into a single partition.
I wrote some sample code using the RDD API; however, it has some None / null
problems with empty partitions.

How should this be implemented properly to handle such empty partitions?
http://stackoverflow.com/questions/41474175/spark-mappartitionswithindex-handling-empty-partitions
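
To make the question concrete, here is a minimal sketch of the kind of
two-pass fill I have in mind. It is not the code from the linked question,
and it uses plain Python lists as stand-ins for RDD partitions so that it
runs without Spark. All function names (last_known, carry_in,
fill_partition, forward_fill) are hypothetical. In Spark, I assume the
per-partition steps would run inside something like mapPartitionsWithIndex,
with only the small list of per-partition last values collected to the driver.

```python
from typing import List, Optional, Sequence

Value = Optional[float]


def last_known(partition: Sequence[Value]) -> Value:
    """Pass 1: last non-null value of a partition, or None if there is none.

    An empty partition yields None instead of raising an error.
    """
    last = None
    for v in partition:
        if v is not None:
            last = v
    return last


def carry_in(lasts: Sequence[Value]) -> List[Value]:
    """For each partition, the value carried in from earlier partitions.

    Empty or all-null partitions (None entries) are skipped, so the
    carried value passes through them unchanged.
    """
    carries: List[Value] = []
    current = None
    for last in lasts:
        carries.append(current)
        if last is not None:
            current = last
    return carries


def fill_partition(partition: Sequence[Value], carry: Value) -> List[Value]:
    """Pass 2: forward-fill one partition, seeded with its carry-in value."""
    filled: List[Value] = []
    current = carry
    for v in partition:
        if v is not None:
            current = v
        filled.append(current)
    return filled


def forward_fill(partitions: Sequence[Sequence[Value]]) -> List[List[Value]]:
    """Run both passes over a list of partitions (stand-in for an RDD)."""
    lasts = [last_known(p) for p in partitions]
    carries = carry_in(lasts)
    return [fill_partition(p, c) for p, c in zip(partitions, carries)]


# Second partition is empty, third holds only nulls.
partitions = [[1.0, None], [], [None, None], [None, 4.0, None]]
result = forward_fill(partitions)
print(result)
```

The point of the sketch is that an empty partition contributes None in the
first pass and is simply skipped when the carry-in values are computed, so
no partition ever has to look at the first or last element of an empty
iterator.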

Kind regards,
Georg


