eduard created PIO-213:
--------------------------

             Summary: Elasticsearch as event server does not work
                 Key: PIO-213
                 URL: https://issues.apache.org/jira/browse/PIO-213
             Project: PredictionIO
          Issue Type: Bug
          Components: Build, Core, Documentation
    Affects Versions: 0.14.0, 0.15.0
            Reporter: eduard


The docs state that Elasticsearch can be used as the event store instead of HBase. We tried PredictionIO 0.14.0 and 0.15.0 with several combinations of Elasticsearch (5.9, 6.8.1) and Spark (2.1.3, 2.4.0), but training fails every time because Spark cannot serialize an object from the json4s library. Upgrading json4s to the newest version did not help either, so we have given up: Elasticsearch cannot be used instead of HBase without code changes.

Here is the stack trace we are struggling with (PredictionIO 0.15.0 with Spark 2.4):



Exception in thread "main" org.apache.spark.SparkException: Task not serializable
 at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
 at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
 at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
 at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:934)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:933)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
 at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:933)
 at org.apache.predictionio.data.storage.elasticsearch.ESPEvents.delete(ESPEvents.scala:111)
 at org.apache.predictionio.core.SelfCleaningDataSource$class.removePEvents(SelfCleaningDataSource.scala:198)
 at co.unreel.DataSource.removePEvents(DataSource.scala:13)
 at org.apache.predictionio.core.SelfCleaningDataSource$class.wipePEvents(SelfCleaningDataSource.scala:184)
 at co.unreel.DataSource.wipePEvents(DataSource.scala:13)
 at co.unreel.DataSource.cleanPersistedPEvents(DataSource.scala:39)
 at co.unreel.DataSource.readTraining(DataSource.scala:48)
 at co.unreel.DataSource.readTraining(DataSource.scala:13)
 at org.apache.predictionio.controller.PDataSource.readTrainingBase(PDataSource.scala:40)
 at org.apache.predictionio.controller.Engine$.train(Engine.scala:642)
 at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
 at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
 at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
 at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
 at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
 at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
 at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.NotSerializableException: org.json4s.ext.IntervalSerializer$$anon$1
Serialization stack:
 - object not serializable (class: org.json4s.ext.IntervalSerializer$$anon$1, value: org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3)
 - field (class: org.json4s.ext.ClassSerializer, name: t, type: interface org.json4s.ext.ClassType)
 - object (class org.json4s.ext.ClassSerializer, ClassSerializer(org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3))
 - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
 - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@6106dfb6)
 - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
 - object (class scala.collection.immutable.$colon$colon, List(DurationSerializer, InstantSerializer, DateTimeSerializer, DateMidnightSerializer, ClassSerializer(org.json4s.ext.IntervalSerializer$$anon$1@6d9428f3), ClassSerializer(org.json4s.ext.LocalDateSerializer$$anon$2@7dddfc35), ClassSerializer(org.json4s.ext.LocalTimeSerializer$$anon$3@71316cd7), PeriodSerializer))
 - field (class: org.json4s.Formats$$anon$3, name: wCustomSerializers$1, type: class scala.collection.immutable.List)
 - object (class org.json4s.Formats$$anon$3, org.json4s.Formats$$anon$3@7a730479)
 - field (class: org.apache.predictionio.data.storage.elasticsearch.ESPEvents, name: formats, type: interface org.json4s.Formats)
 - object (class org.apache.predictionio.data.storage.elasticsearch.ESPEvents, org.apache.predictionio.data.storage.elasticsearch.ESPEvents@3f45dfec)
 - field (class: org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$delete$1, name: $outer, type: class org.apache.predictionio.data.storage.elasticsearch.ESPEvents)
 - object (class org.apache.predictionio.data.storage.elasticsearch.ESPEvents$$anonfun$delete$1, <function1>)
 at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
 at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
 at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
 at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:400)
 ... 35 more
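For context, the failure mode in the trace is the standard JVM serialization rule: the closure passed to RDD.foreachPartition captures the ESPEvents instance through its $outer reference, and the instance's formats field (a json4s Formats carrying custom serializers) is not java-serializable. Below is a minimal, Spark-free sketch of that rule; FakeFormats, BadHolder, and GoodHolder are our hypothetical stand-ins, not PredictionIO classes. Marking such a field @transient lazy val (so it is skipped when serializing and rebuilt on the executor) is a common Spark workaround, which would be exactly the kind of code change mentioned above:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for a json4s Formats with custom serializers: not java-serializable.
class FakeFormats

// Mirrors the failing shape: a Serializable class holding a non-serializable
// field. Java serialization walks into `formats` and throws
// NotSerializableException, as in the trace above.
class BadHolder extends Serializable {
  val formats = new FakeFormats
}

// Common workaround sketch: a @transient lazy val is skipped during
// serialization and recomputed on first access after deserialization.
class GoodHolder extends Serializable {
  @transient lazy val formats = new FakeFormats
}

object SerializationSketch {
  // Returns true if the object survives plain Java serialization.
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println(serializes(new BadHolder))  // false: same failure as the trace
    println(serializes(new GoodHolder)) // true: transient field is skipped
  }
}
```

This is only an illustration of the mechanism; the actual fix would have to land in ESPEvents itself (or in how its delete closure is built), since users cannot apply the idiom without modifying PredictionIO source.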



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
