[ https://issues.apache.org/jira/browse/PIO-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980326#comment-16980326 ]
eduard commented on PIO-213:
----------------------------

It does not matter which PIO, Spark, or Elasticsearch versions are used; pio train always fails at org.apache.predictionio.data.storage.elasticsearch.ESPEvents.delete(ESPEvents.scala:111) due to:
Exception in thread "main" org.apache.spark.SparkException: Task not serializable

> Elastic search as event server does not work
> --------------------------------------------
>
>                 Key: PIO-213
>                 URL: https://issues.apache.org/jira/browse/PIO-213
>             Project: PredictionIO
>          Issue Type: Bug
>          Components: Build, Core, Documentation
>    Affects Versions: 0.14.0, 0.15.0
>            Reporter: eduard
>            Priority: Critical
>
> The docs say that Elasticsearch can be used as the event store instead of HBase, but we tried PIO 0.14.0 and 0.15.0 with different versions of Elasticsearch (5.9, 6.8.1) and Spark (2.1.3, 2.4.0), and training always fails in the json4s library because Spark cannot serialize a json4s object.
> We also tried upgrading json4s to the newest version, but that did not help either, so we have given up: we cannot use Elasticsearch instead of HBase without code changes.
> Here is the stack trace we are struggling with (PIO 0.15.0 with Spark 2.4):
>
> Exception in thread "main" org.apache.spark.SparkException: Task not serializable
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
>     at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
>     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
>     at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
>     at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:934)
>     at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:933)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
>     at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:933)
>     at org.apache.predictionio.data.storage.elasticsearch.ESPEvents.delete(ESPEvents.scala:111)
>     at org.apache.predictionio.core.SelfCleaningDataSource$class.removePEvents(SelfCleaningDataSource.scala:198)
>     at co.unreel.DataSource.removePEvents(DataSource.scala:13)
>     at org.apache.predictionio.core.SelfCleaningDataSource$class.wipePEvents(SelfCleaningDataSource.scala:184)
>     at co.unreel.DataSource.wipePEvents(DataSource.scala:13)
>     at co.unreel.DataSource.cleanPersistedPEvents(DataSource.scala:39)
>     at co.unreel.DataSource.readTraining(DataSource.scala:48)
>     at co.unreel.DataSource.readTraining(DataSource.scala:13)
>     at org.apache.predictionio.controller.PDataSource.readTrainingBase(PDataSource.scala:40)
>     at org.apache.predictionio.controller.Engine$.train(Engine.scala:642)
>     at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
>     at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
>     at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
>     at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.io.NotSerializableException: org.json4s.ext.IntervalSerializer$$anon$1
> Serialization stack:
>     - object not serializable (class: org.json4s.ext.IntervalSerializer$$anon$1, value: org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3)
>     - field (class: org.json4s.ext.ClassSerializer, name: t, type: interface org.json4s.ext.ClassType)
>     - object (class org.json4s.ext.ClassSerializer, ClassSerializer(org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3))
>     - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
>     - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@6106dfb6)
>     - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
>     - object (class scala.collection.immutable.$colon$colon, List(DurationSerializer, InstantSerializer, DateTimeSerializer, DateMidnightSerializer, ClassSerializer(org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3), ClassSerializer(org.json4s.ext.LocalDateSerializer$$anon$2@7dddfc35), ClassSerializer(org.json4s.ext.LocalTimeSerializer$$anon$3@71316cd7), PeriodSerializer))
>     - field (class: org.json4s.Formats$$anon$3, name: wCustomSerializers$1, type: class scala.collection.immutable.List)
>     - object (class org.json4s.Formats$$anon$3, org.json4s.Formats$$anon$3@7a730479)
>     - field (class: org.apache.predictionio.data.storage.elasticsearch.ESPEvents, name: formats, type: interface org.json4s.Formats)
>     - object (class org.apache.predictionio.data.storage.elasticsearch.ESPEvents, org.apache.predictionio.data.storage.elasticsearch.ESPEvents@3f45dfec)
>     - field (class: org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$delete$1, name: $outer, type: class org.apache.predictionio.data.storage.elasticsearch.ESPEvents)
>     - object (class org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$delete$1, <function1>)
>     at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
>     at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
>     at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
>     ... 35 more
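For context on what the serialization stack is saying: the closure passed to foreachPartition in ESPEvents.delete captures its $outer reference, i.e. the ESPEvents instance itself, and that instance's formats field (a json4s Formats built with the Joda-Time serializers) contains anonymous serializer classes such as IntervalSerializer$$anon$1 that are not java.io.Serializable. Below is a minimal sketch of the failure pattern and one common workaround; BrokenEvents and FixedEvents are hypothetical stand-ins for illustration, not the actual PredictionIO source, and the real fix would have to land inside ESPEvents (which is why this cannot be worked around without code changes).

{code:scala}
import org.apache.spark.rdd.RDD
import org.json4s.{DefaultFormats, Formats}
import org.json4s.ext.JodaTimeSerializers

// Hypothetical stand-in for ESPEvents. The class itself is Serializable,
// but its `formats` field holds anonymous Joda serializers (e.g.
// IntervalSerializer$$anon$1) that are not java.io.Serializable.
class BrokenEvents extends Serializable {
  val formats: Formats = DefaultFormats ++ JodaTimeSerializers.all

  def delete(ids: RDD[String]): Unit =
    ids.foreachPartition { part =>
      // Referencing `formats` captures `this` as the closure's $outer,
      // so Spark serializes the whole instance, reaches the formats
      // field, and fails with "Task not serializable".
      part.foreach(id => println(s"deleting $id using $formats"))
    }
}

// Common workaround: keep the non-serializable Formats out of the
// serialized task entirely.
class FixedEvents extends Serializable {
  // @transient excludes the field from serialization; lazy val rebuilds
  // it on first use inside each executor JVM.
  @transient lazy val formats: Formats = DefaultFormats ++ JodaTimeSerializers.all

  def delete(ids: RDD[String]): Unit =
    ids.foreachPartition { part =>
      val f = formats // re-evaluated on the executor, never shipped from the driver
      part.foreach(id => println(s"deleting $id using $f"))
    }
}
{code}

An equivalent alternative is to construct the Formats locally inside the partition function so the closure never touches the enclosing instance at all; either way the point is that the Formats object is recreated on the executors instead of being serialized from the driver.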