Hi All,

Is there any way to save the input schema along with an ML PipelineModel
object? This would be really helpful when loading the model and running
transform, because the user could read the schema back and prepare the
dataset for model.transform without having to remember it.

I see that the JIRA below mentions this as one of its updates, but I cannot
find any sub-task for it (and the issue itself is marked as resolved).
https://issues.apache.org/jira/browse/SPARK-6725


"*UPDATE*: In spark.ml, we could save feature metadata using DataFrames.
Other libraries and formats can support this, and it would be great if we
could too. We could do either of the following:

   - save() optionally takes a dataset (or schema), and load will return a
   (model, schema) pair.
   - Models themselves save the input schema.

Both options would mean inheriting from new Saveable, Loadable types."
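
Until something like this exists in Spark, the second option can be roughly approximated outside the model itself by writing the schema JSON to a sidecar file next to the saved model. The following is only a minimal sketch under stated assumptions: the helper names (save_input_schema, load_input_schema), the file name input_schema.json and the "bundle directory" layout are all hypothetical, and it assumes a local filesystem rather than HDFS or S3. The PySpark calls mentioned in the comments (DataFrame.schema.json(), StructType.fromJson, PipelineModel.save/load) are existing APIs, but they are left as comments so the sketch runs without a Spark installation.

```python
import json
import os

# Hypothetical sidecar file name; not a Spark convention.
SCHEMA_FILE = "input_schema.json"


def save_input_schema(bundle_dir, schema_json):
    """Write a schema JSON string next to a saved model.

    `schema_json` is expected to be what `df.schema.json()` returns in PySpark.
    Assumes `bundle_dir` is on the local filesystem.
    """
    json.loads(schema_json)  # fail early if the string is not valid JSON
    os.makedirs(bundle_dir, exist_ok=True)
    path = os.path.join(bundle_dir, SCHEMA_FILE)
    with open(path, "w", encoding="utf-8") as f:
        f.write(schema_json)
    return path


def load_input_schema(bundle_dir):
    """Read the sidecar file back as a dict.

    In PySpark the dict can be turned into a schema again with
    `StructType.fromJson(...)`.
    """
    with open(os.path.join(bundle_dir, SCHEMA_FILE), encoding="utf-8") as f:
        return json.load(f)


# Intended use with Spark (commented out; requires pyspark):
#
#   from pyspark.ml import PipelineModel
#   from pyspark.sql.types import StructType
#
#   model.save(os.path.join(bundle_dir, "model"))
#   save_input_schema(bundle_dir, train_df.schema.json())
#
#   model = PipelineModel.load(os.path.join(bundle_dir, "model"))
#   schema = StructType.fromJson(load_input_schema(bundle_dir))
```

Keeping the model in a sub-directory of the bundle avoids writing extra files into the directory that PipelineModel.save manages. This only records the column names, types and nullability of the training DataFrame; it does not enforce them at transform time.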

Please let me know if there is any update or JIRA on this.


Thanks,
Satya
