Here is the exception I discovered:

java.lang.RuntimeException: error reading Scala signature of org.apache.spark.mllib.tree.model.DecisionTreeModel: scala.reflect.internal.Symbols$PackageClassSymbol cannot be cast to scala.reflect.internal.Constants$Constant
    at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:45) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.runtime.JavaMirrors$JavaMirror.unpickleClass(JavaMirrors.scala:565) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.runtime.SymbolLoaders$TopClassCompleter.complete(SymbolLoaders.scala:32) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1231) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:43) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:161) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.internal.Mirrors$RootsBase.staticModule(Mirrors.scala:21) ~[scala-reflect-2.10.4.jar:na]
    at org.apache.spark.mllib.tree.model.TreeEnsembleModel$SaveLoadV1_0$$typecreator1$1.apply(treeEnsembleModels.scala:450) ~[spark-mllib_2.10-1.6.0.jar:1.6.0]
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231) ~[scala-reflect-2.10.4.jar:na]
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231) ~[scala-reflect-2.10.4.jar:na]
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:642) ~[spark-catalyst_2.10-1.6.0.jar:1.6.0]
--
Be well!
Jean Morozov

On Fri, Feb 12, 2016 at 5:57 PM, Eugene Morozov <evgeny.a.moro...@gmail.com> wrote:
> Hello,
>
> I'm building a simple web service that works with Spark and allows users to
> train a random forest model (MLlib API) and use it for prediction. Trained
> models are stored on the local file system (the web service and a Spark
> cluster of just one worker run on the same machine).
> I'm concerned about prediction performance, so I set up a small load test
> to measure prediction latency. That's just initially; later I will set up
> HDFS and a bigger Spark cluster.
>
> First, I train 5 really small models (all of them finish within 30
> seconds).
> Next, my perf testing framework waits for a minute and starts calling the
> prediction method.
>
> Sometimes I see that not all of the 5 models were saved to disk. There is
> a metadata folder for them, but not the data directory that actually
> contains the parquet files of the models.
>
> I've looked through Spark's JIRA, but haven't found anything similar.
> Has anyone experienced something like this?
> Could you recommend where to look?
> Might it be something with flushing it to disk immediately (just a wild
> idea...)?
>
> Thanks in advance.
> --
> Be well!
> Jean Morozov
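Since the symptom is a `metadata` folder without the `data` directory, one way to narrow this down is to verify each saved model is complete before the load test starts predicting. Below is a minimal sketch in plain Python (no Spark required); it assumes models are saved under a local `models_root` directory, each with Spark's usual `metadata/` and `data/` layout. The function name and directory layout are just illustrative.

```python
import os

def incomplete_models(models_root):
    """Return names of model directories missing 'metadata' or 'data'.

    Spark's model save() writes a 'metadata' folder and a 'data' folder
    (containing the parquet files) under each model path; a model
    missing either part will fail to load.
    """
    broken = []
    for name in sorted(os.listdir(models_root)):
        path = os.path.join(models_root, name)
        if not os.path.isdir(path):
            continue  # skip stray files next to the model directories
        has_metadata = os.path.isdir(os.path.join(path, "metadata"))
        has_data = os.path.isdir(os.path.join(path, "data"))
        if not (has_metadata and has_data):
            broken.append(name)
    return broken
```

Running this right after training (and again just before the prediction phase) would tell you whether the data directory was never written or was written late, which is the "flushing to disk" question in a checkable form.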