I've almost completed a library for speeding up the serving of existing Spark models: https://github.com/Hydrospheredata/fastserving. It still depends on Spark, but it takes the logical plan that Spark builds for a sample DataFrame passed through a pipeline/transformer and turns it into an alternative transformer that works on a local data structure and gives a significant performance speedup.
Looking ahead, I think that introducing some DataFrame-like structure with an exposed Catalyst-like AST, and allowing different ways of interpreting it (local/Spark), could solve the current problems with "minimal" rewriting.
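
To make the idea of one AST with several interpreters a bit more concrete, here is a minimal sketch in plain Python. It is not code from fastserving or from Spark: the node classes (Col, Lit, Add, Mul), the function names, and the sample column names are all hypothetical and only illustrate the shape of the approach. The same expression tree is evaluated directly on local rows and also rendered as a SQL expression string, which I assume could be handed to Spark (for example through selectExpr) for distributed execution.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical, Catalyst-like expression nodes (not Spark classes).
@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: float


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Lit, Add, Mul]


def eval_local(expr, row):
    """Local interpreter: evaluate the AST against a single dict row."""
    if isinstance(expr, Col):
        return row[expr.name]
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Add):
        return eval_local(expr.left, row) + eval_local(expr.right, row)
    if isinstance(expr, Mul):
        return eval_local(expr.left, row) * eval_local(expr.right, row)
    raise TypeError(f"unsupported node: {expr!r}")


def to_sql(expr):
    """Second interpreter: render the same AST as a SQL expression string."""
    if isinstance(expr, Col):
        return f"`{expr.name}`"
    if isinstance(expr, Lit):
        return repr(expr.value)
    if isinstance(expr, Add):
        return f"({to_sql(expr.left)} + {to_sql(expr.right)})"
    if isinstance(expr, Mul):
        return f"({to_sql(expr.left)} * {to_sql(expr.right)})"
    raise TypeError(f"unsupported node: {expr!r}")


def project_local(rows, projections):
    """Apply named expressions to every row of a local 'DataFrame' (list of dicts)."""
    return [
        {name: eval_local(expr, row) for name, expr in projections.items()}
        for row in rows
    ]


# Hypothetical feature transformation: scaled = x * 2.0 + bias
scaled = Add(Mul(Col("x"), Lit(2.0)), Col("bias"))
rows = [{"x": 1.0, "bias": 0.5}, {"x": 3.0, "bias": -1.0}]

result = project_local(rows, {"scaled": scaled})
sql = to_sql(scaled)
```

The point of the split is that a transformer would only build the tree once; whether it runs locally for low-latency serving or on Spark for batch work is decided by which interpreter is applied to it.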