you're right, serialization works.
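
for reference, here is a minimal sketch of the plain java serialization round trip, written in java.  ToyWordVectors and ModelIO are made-up names for illustration, not mllib classes, and the sketch assumes the trained model object implements java.io.Serializable.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

// Hypothetical stand-in for a trained model (word -> vector).
// Not an MLlib class; any Serializable object is handled the same way.
class ToyWordVectors implements Serializable {
    private static final long serialVersionUID = 1L;
    final HashMap<String, float[]> vectors = new HashMap<>();
}

class ModelIO {
    // Write any Serializable object to the given file.
    static void save(Serializable model, File file) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // Read the object back; the caller casts it to the expected type.
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        }
    }
}
```

usage would look like ModelIO.save(model, new File("model.bin")) after training, then a cast on the result of ModelIO.load(...) later.  the class definition has to be on the classpath when loading, and the file is only readable by a compatible version of that class.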

what would you suggest for saving a "distributed" model, where one part of
the model lives in one cluster and other parts live in other clusters?  at
runtime, these sub-models run independently in their own clusters (load,
train, save).  at some point during runtime the sub-models are merged into
the master model, which also loads, trains, and saves at the master level.
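
to make the workflow concrete, here is a rough java sketch of what i mean.  everything in it is an assumption for illustration: SubModel and MasterMerge are made-up names, each sub-model is reduced to a plain weight vector, each one is saved to its own file with java serialization, and the merge rule is simple element-wise averaging (a real model may need a different rule).

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

// Hypothetical sub-model: a plain weight vector trained in one cluster.
class SubModel implements Serializable {
    private static final long serialVersionUID = 1L;
    final double[] weights;

    SubModel(double[] weights) {
        this.weights = weights;
    }
}

class MasterMerge {
    // Each cluster saves its own sub-model to its own file.
    static void save(SubModel model, File file) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // The master (or the same cluster, later) loads a saved sub-model.
    static SubModel load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(file))) {
            return (SubModel) in.readObject();
        }
    }

    // Merge by element-wise averaging. The averaging rule is an assumption
    // made for this sketch, not something taken from MLlib.
    static SubModel merge(List<SubModel> parts) {
        if (parts.isEmpty()) {
            throw new IllegalArgumentException("no sub-models to merge");
        }
        int n = parts.get(0).weights.length;
        double[] avg = new double[n];
        for (SubModel p : parts) {
            if (p.weights.length != n) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            for (int i = 0; i < n; i++) {
                avg[i] += p.weights[i] / parts.size();
            }
        }
        return new SubModel(avg);
    }
}
```

the merged SubModel can be saved with the same save() call at the master level.  what i don't know is whether there is a better way to do this in spark.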

much appreciated.



On Fri, Nov 7, 2014 at 2:53 AM, Evan R. Sparks <evan.spa...@gmail.com>
wrote:

> There's some work going on to support PMML -
> https://issues.apache.org/jira/browse/SPARK-1406 - but it's not yet been
> merged into master.
>
> What are you used to doing in other environments? In R I'm used to running
> save(), same with MATLAB. In Python, either pickling things or dumping to
> JSON seems pretty common (even the scikit-learn docs recommend pickling:
> http://scikit-learn.org/stable/modules/model_persistence.html). These all
> seem basically equivalent to Java serialization to me.
>
> Would some helper functions (in, say, mllib.util.modelpersistence or
> something) make sense to add?
>
> On Thu, Nov 6, 2014 at 11:36 PM, Duy Huynh <duy.huynh....@gmail.com>
> wrote:
>
>> that works.  is there a better way in spark?  this seems like the most
>> common feature for any machine learning work - to be able to save your
>> model after training it and load it later.
>>
>> On Fri, Nov 7, 2014 at 2:30 AM, Evan R. Sparks <evan.spa...@gmail.com>
>> wrote:
>>
>>> Plain old java serialization is one straightforward approach if you're
>>> in java/scala.
>>>
>>> On Thu, Nov 6, 2014 at 11:26 PM, ll <duy.huynh....@gmail.com> wrote:
>>>
>>>> what is the best way to save an mllib model that you just trained and
>>>> reload it in the future?  specifically, i'm using the mllib word2vec
>>>> model...  thanks.
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/word2vec-how-to-save-an-mllib-model-and-reload-it-tp18329.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>>
>>>
>>
>
