Hi

So I managed to fix this … it took me a while to figure out, so in case anyone cares:

...
In my code I had something like this:

def predict(model: ECommModel, query: Query): PredictedResult = {

    val userFeatures = model.userFeatures
    val productModels = model.productModels
…
}

val unavailableItems: Set[String] = try {
  val constr = LEventStore.findByEntity(
    appName = ap.sharedApp,
    entityType = "constraint",
    entityId = "unavailableItems"
…
}

So the idea was that the unavailable items only get populated once during 
deployment (and therefore, to my understanding, during instantiation of the 
ECommAlgorithm class). Pulling the unavailable products on every incoming 
request turned out to be too slow …
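
A fuller sketch of that pattern, with my assumptions marked: the "items" 
property name and the grizzled-slf4j `logger` come from the stock ECommerce 
template, not from my snippet above, and declaring the val lazy is one way to 
keep the lookup out of the constructor path that the test runner hits. Adjust 
to your own data.

    import org.apache.predictionio.data.store.LEventStore
    import scala.concurrent.duration._

    // Runs once per instantiation of the algorithm class (i.e. at deploy
    // time) rather than on every predict() call. Declaring it lazy defers
    // the event-store lookup until first use, so it does not fire while
    // the test runner constructs the class.
    lazy val unavailableItems: Set[String] = try {
      val constr = LEventStore.findByEntity(
        appName = ap.sharedApp,
        entityType = "constraint",
        entityId = "unavailableItems",
        limit = Some(1),
        latest = true,
        timeout = Duration(200, "millis")
      )
      if (constr.hasNext)
        constr.next.properties.get[Set[String]]("items") // property name assumed
      else
        Set.empty[String]
    } catch {
      case e: Exception =>
        logger.error(s"Error reading unavailable items: $e") // logger assumed
        Set.empty[String]
    }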

This worked in 0.10 but in 0.11 I was getting the "env vars not set" errors.

Apparently something changed in how the env vars are scoped inside the 
engines during testing.

Bests

Florian



On 22 May 2017 at 13:58:05, Florian Krause ([email protected]) 
wrote:

Hi Chan

thanks a lot for reaching out to me ... 

pio@predict-io:/opt/reco-engine$ /opt/PredictionIO-0.11.0-incubating/bin/pio 
status
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/pio-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.11.0-incubating is installed at 
/opt/PredictionIO-0.11.0-incubating
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at 
/opt/PredictionIO-0.11.0-incubating/vendors/spark-2.1.1-bin-hadoop2.7
[INFO] [Management$] Apache Spark 2.1.1 detected (meets minimum requirement of 
1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: PGSQL)...
[INFO] [Storage$] Verifying Model Data Backend (Source: PGSQL)...
[INFO] [Storage$] Verifying Event Data Backend (Source: PGSQL)...
[INFO] [Storage$] Test writing to Event Store (App Id 0)...
[INFO] [Management$] Your system is all ready to go.

---
pio@predict-io:/opt/reco-engine/MatrixProduct2$ 
/opt/PredictionIO-0.11.0-incubating/bin/pio status --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/pio-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.11.0-incubating is installed at 
/opt/PredictionIO-0.11.0-incubating
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at 
/opt/PredictionIO-0.11.0-incubating/vendors/spark-2.1.1-bin-hadoop2.7
[INFO] [Management$] Apache Spark 2.1.1 detected (meets minimum requirement of 
1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: PGSQL)...
[DEBUG] [ConnectionPool$] Registered connection pool : 
ConnectionPool(url:jdbc:postgresql://localhost/pio, user:pio) using factory : 
<default>
[DEBUG] [ConnectionPool$] Registered singleton connection pool : 
ConnectionPool(url:jdbc:postgresql://localhost/pio, user:pio)
[DEBUG] [StatementExecutor$$anon$1] SQL execution completed

  [SQL Execution]
   create table if not exists pio_meta_engineinstances ( id varchar(100) not null primary key, status text not null, startTime timestamp DEFAULT CURRENT_TIMESTAMP, endTime timestamp DEFAULT CURRENT_TIMESTAMP, engineId text not null, engineVersion text not null, engineVariant text not null, engineFactory text not null, batch text not null, env text not null, sparkConf text not null, datasourceParams text not null, preparatorParams text not null, algorithmsParams text not null, servingParams text not null); (3 ms)

  [Stack Trace]
    ...
    
org.apache.predictionio.data.storage.jdbc.JDBCEngineInstances$$anonfun$1.apply(JDBCEngineInstances.scala:49)
    
org.apache.predictionio.data.storage.jdbc.JDBCEngineInstances$$anonfun$1.apply(JDBCEngineInstances.scala:32)
    scalikejdbc.DBConnection$class.autoCommit(DBConnection.scala:222)
    scalikejdbc.DB.autoCommit(DB.scala:60)
    scalikejdbc.DB$$anonfun$autoCommit$1.apply(DB.scala:215)
    scalikejdbc.DB$$anonfun$autoCommit$1.apply(DB.scala:214)
    scalikejdbc.LoanPattern$class.using(LoanPattern.scala:18)
    scalikejdbc.DB$.using(DB.scala:138)
-- 
So this works; building with tests enabled doesn't.

---
/opt/PredictionIO-0.11.0-incubating/bin/pio build --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/opt/PredictionIO-0.11.0-incubating/lib/pio-assembly-0.11.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[INFO] [Engine$] Using command '/opt/PredictionIO-0.11.0-incubating/sbt/sbt' at 
/opt/reco-engine/MatrixProduct2 to build.
[INFO] [Engine$] If the path above is incorrect, this process will fail.
[INFO] [Engine$] Uber JAR disabled. Making sure 
lib/pio-assembly-0.11.0-incubating.jar is absent.
[INFO] [Engine$] Going to run: /opt/PredictionIO-0.11.0-incubating/sbt/sbt  
package assemblyPackageDependency in /opt/reco-engine/MatrixProduct2
[INFO] [Engine$] [info] Loading project definition from 
/opt/reco-engine/MatrixProduct2/project
[INFO] [Engine$] [info] Set current project to MatrixProduct2 (in build 
file:/opt/reco-engine/MatrixProduct2/)
[INFO] [Engine$] [success] Total time: 0 s, completed May 22, 2017 11:52:26 AM
[INFO] [Engine$] [info] Including from cache: shared_2.11.jar
[INFO] [Engine$] [info] Including from cache: snappy-java-1.1.1.7.jar
[INFO] [Engine$] [info] Including from cache: scala-library-2.11.8.jar
[ERROR] [Engine$] log4j:WARN No appenders could be found for logger 
(org.apache.predictionio.data.storage.Storage$).
[ERROR] [Engine$] log4j:WARN Please initialize the log4j system properly.
[ERROR] [Engine$] log4j:WARN See 
http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
[INFO] [Engine$] org.apache.predictionio.data.storage.StorageClientException: 
Data source PGSQL was not properly initialized.
[INFO] [Engine$]        at 
org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
[INFO] [Engine$]        at 
org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
[INFO] [Engine$]        at scala.Option.getOrElse(Option.scala:121)
[INFO] [Engine$]        at 
org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:284)
[INFO] [Engine$]        at 
org.apache.predictionio.data.storage.Storage$.getDataObjectFromRepo(Storage.scala:269)
[INFO] [Engine$]        at 
org.apache.predictionio.data.storage.Storage$.getMetaDataApps(Storage.scala:387)
[INFO] [Engine$]        at 
org.apache.predictionio.data.store.Common$.appsDb$lzycompute(Common.scala:27)
[INFO] [Engine$]        at 
org.apache.predictionio.data.store.Common$.appsDb(Common.scala:27)
[INFO] [Engine$]        at 
org.apache.predictionio.data.store.Common$.appNameToId(Common.scala:32)
[INFO] [Engine$]        at 
org.apache.predictionio.data.store.LEventStore$.findByEntity(LEventStore.scala:75)
[INFO] [Engine$]        at 
com.rebelle.MatrixProduct2.ECommAlgorithm.liftedTree1$1(ECommAlgorithm.scala:516)
[INFO] [Engine$]        at 
com.rebelle.MatrixProduct2.ECommAlgorithm.<init>(ECommAlgorithm.scala:515)
[INFO] [Engine$]        at 
com.rebelle.MatrixProduct2.ECommAlgorithmTest.<init>(ECommAlgorithmTest.scala:31)
[INFO] [Engine$]        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[INFO] [Engine$]        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
[INFO] [Engine$]        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[INFO] [Engine$]        at 
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[INFO] [Engine$]        at java.lang.Class.newInstance(Class.java:442)
[INFO] [Engine$]        at 
org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:641)
[INFO] [Engine$]        at sbt.TestRunner.runTest$1(TestFramework.scala:76)
[INFO] [Engine$]        at sbt.TestRunner.run(TestFramework.scala:85)

I am using the EventStore in my recommender (to pull in products that are no 
longer available). The test runner seems to instantiate the algorithm, but 
then barfs because it can't get the storage configuration from the env.
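
For context, the failing line in the test is essentially just constructing 
the algorithm in the class body. A hypothetical reduction (the param values 
are elided, so this is an illustration rather than a runnable test):

    import org.scalatest.{FlatSpec, Matchers}

    class ECommAlgorithmTest extends FlatSpec with Matchers {
      // Constructing the algorithm executes the constructor-time
      // LEventStore.findByEntity call; sbt's test JVM does not have the
      // PIO_STORAGE_* env vars, hence the StorageClientException above.
      val algorithm = new ECommAlgorithm(algorithmParams) // params elided
      ...
    }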

Exactly the same engine compiles just fine under 0.10. When I disable the tests 
with

test in assembly := {}

in the build.sbt file, compile, train and deploy all work fine.
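
For reference, a sketch of the relevant build.sbt; the dependency lines are 
what the standard 0.11 template ships with, everything else is elided:

    // build.sbt: skip the test phase when pio build invokes sbt-assembly
    name := "MatrixProduct2"

    libraryDependencies ++= Seq(
      "org.apache.predictionio" %% "apache-predictionio-core" % "0.11.0-incubating" % "provided",
      "org.apache.spark"        %% "spark-mllib"              % "2.1.1" % "provided"
    )

    // Standard sbt-assembly idiom for disabling tests during assembly.
    test in assembly := {}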

Bests


2017-05-22 12:49 GMT+02:00 Chan Lee <[email protected]>:
Hi Florian,

Can you tell me the output for `pio status`? Does the postgres driver match the 
argument sent to spark-submit?

Best,
Chan

On Mon, May 22, 2017 at 1:53 AM, Florian Krause <[email protected]> 
wrote:
Hi all

I have been unsuccessful at building my two engines with 0.11. I have described 
my attempts here -> 
https://stackoverflow.com/questions/43941915/predictionio-0-11-building-an-engine-fails-with-java-lang-classnotfoundexceptio

It appears that during the pio build phase, the env vars from pio-env.sh are 
not set correctly. 

I have managed to get around this by not running the tests; the compiled 
versions of the engine work flawlessly, so the database connection itself is fine.

Now what confuses me a bit is the usage of the --env command line param in the 
CreateWorkflow jar.

This is the command pio sends to Spark:

/opt/PredictionIO-0.11.0-incubating/vendors/spark-2.1.1-bin-hadoop2.7/bin/spark-submit
 --driver-memory 80G --executor-memory 80G --class 
org.apache.predictionio.workflow.CreateWorkflow --jars 
file:/opt/PredictionIO-0.11.0-incubating/lib/postgresql-42.1.1.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/mysql-connector-java-5.1.40-bin.jar,file:/opt/reco-engine/MatrixProduct2/target/scala-2.11/matrixproduct2_2.11-0.1-SNAPSHOT.jar,file:/opt/reco-engine/MatrixProduct2/target/scala-2.11/MatrixProduct2-assembly-0.1-SNAPSHOT-deps.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-localfs-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-jdbc-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-elasticsearch-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hbase-assembly-0.11.0-incubating.jar
 --files file:/opt/PredictionIO-0.11.0-incubating/conf/log4j.properties 
--driver-class-path 
/opt/PredictionIO-0.11.0-incubating/conf:/opt/PredictionIO-0.11.0-incubating/lib/postgresql-42.1.1.jar:/opt/PredictionIO-0.11.0-incubating/lib/mysql-connector-java-5.1.40-bin.jar
 --driver-java-options -Dpio.log.dir=/home/pio 
file:/opt/PredictionIO-0.11.0-incubating/lib/pio-assembly-0.11.0-incubating.jar 
--engine-id com.rebelle.MatrixProduct2.ECommerceRecommendationEngine 
--engine-version 23bea44eff1a8e08bc80e290e52dc9dc565d9bb7 --engine-variant 
file:/opt/reco-engine/MatrixProduct2/engine.json --verbosity 0 --json-extractor 
Both --env 
PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_HOME=/opt/PredictionIO-0.11.0-incubating,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=<password>,PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL,PIO_CONF_DIR=/opt/PredictionIO-0.11.0-incubating/conf


When I try to run this manually from the command line, I get

[ERROR] [Storage$] Error initializing storage client for source
Exception in thread "main" 
org.apache.predictionio.data.storage.StorageClientException: Data source  was 
not properly initialized.
        at 
org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
        at 
org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
        at scala.Option.getOrElse(Option.scala:121)
        at 
org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:284)


So even though all the needed params are set in --env, Spark cannot find them. I 
have to set them manually via export to make this work. What exactly is 
supposed to happen when these vars are set through --env?

Perhaps someone can give me some pointers on what might be worth trying.

Bests & thanks

Florian




--

Dr. Florian Krause
Chief Technical Officer
____________________

REBELLE - StyleRemains GmbH
Brooktorkai 4, D-20457 Hamburg

Tel.: +49 40 30 70 19 18
Fax: +49 40 30 70 19 29
E-Mail: [email protected]
Website: www.rebelle.com
Network: LinkedIn | Xing



