Here is the exception I discovered:

java.lang.RuntimeException: error reading Scala signature of org.apache.spark.mllib.tree.model.DecisionTreeModel: scala.reflect.internal.Symbols$PackageClassSymbol cannot be cast to scala.reflect.internal.Constants$Constant
    at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:45) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.runtime.JavaMirrors$JavaMirror.unpickleClass(JavaMirrors.scala:565) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.runtime.SymbolLoaders$TopClassCompleter.complete(SymbolLoaders.scala:32) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1231) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:43) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:161) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:21) ~[scala-reflect-2.10.4.jar:na]
    at org.apache.spark.mllib.tree.model.TreeEnsembleModel$SaveLoadV1_0$$typecreator1$1.apply(treeEnsembleModels.scala:450) ~[spark-mllib_2.10-1.6.0.jar:1.6.0]
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) ~[scala-reflect-2.10.4.jar:na]
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:642) ~[spark-catalyst_2.10-1.6.0.jar:1.6.0]
--
Be well!
Jean Morozov

On Fri, Feb 12, 2016 at 5:57 PM, Eugene Morozov <evgeny.a.moro...@gmail.com> wrote:
> Hello,
>
> I'm building a simple web service that works with Spark and allows users to
> train a random forest model (MLlib API) and use it for prediction. Trained
> models are stored on the local file system (the web service and a Spark
> cluster of just one worker run on the same machine).
> I'm concerned about prediction performance, so I set up a small load test
> to measure prediction latency. That's just initially; later I will set up
> HDFS and a bigger Spark cluster.
>
> First, I train 5 really small models (all of them finish within 30
> seconds).
> Next, my perf testing framework waits for a minute and starts calling the
> prediction method.
>
> Sometimes I see that not all of the 5 models were saved to disk. There is
> a metadata folder for them, but not the data directory that actually
> contains the parquet files of the models.
>
> I've looked through Spark's JIRA, but haven't found anything similar.
> Has anyone experienced something like this?
> Could you recommend where to look?
> Might it be something with flushing it to disk immediately (just a wild
> idea...)?
>
> Thanks in advance.
> --
> Be well!
> Jean Morozov
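Since the symptom is a `metadata` folder without the `data` directory, one way to narrow this down is to verify each saved model is complete before the load test starts predicting. Below is a minimal sketch in plain Python (no Spark required); it assumes models are saved under a local `models_root` directory, each with Spark's usual `metadata/` and `data/` layout. The function name and directory layout are just illustrative.

```python
import os

def incomplete_models(models_root):
    """Return names of model directories missing 'metadata' or 'data'.

    Spark's model save() writes a 'metadata' folder and a 'data' folder
    (containing the parquet files) under each model path; a model
    missing either part will fail to load.
    """
    broken = []
    for name in sorted(os.listdir(models_root)):
        path = os.path.join(models_root, name)
        if not os.path.isdir(path):
            continue  # skip stray files next to the model directories
        has_metadata = os.path.isdir(os.path.join(path, "metadata"))
        has_data = os.path.isdir(os.path.join(path, "data"))
        if not (has_metadata and has_data):
            broken.append(name)
    return broken
```

Running this right after training (and again just before the prediction phase) would tell you whether the data directory was never written or was written late, which is the "flushing to disk" question in a checkable form.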