Hi all,

Apache DataFu <http://datafu.apache.org/> is an Apache project of general-purpose Hadoop utilities, and *datafu-spark* is a new module in this project with general utilities and UDFs that can be useful to Spark developers.
This is a blog post I wrote introducing some of the APIs in DataFu-Spark: https://medium.com/paypal-tech/introducing-datafu-spark-ba67faf1933a

Cheers,
Eyal