Hi, we are currently using PredictionIO and the Universal Recommender in a small production environment to test them out. The system is running well with our currently small request volume, but we are experiencing occasional latency spikes. Our biggest to-do is to split HBase and ES out onto separate machines (or switch to ES for event storage as well and move ES to separate machines). I think that running everything on one machine is probably the cause of these spikes, although we don't see much CPU usage or IO blocking on this machine.
That said, I had a look at the source code of PredictionIO and the UR and found something I wanted to ask about here.

The LEvents trait allows async calls, e.g. futureFind returns a Scala Future. But looking at concrete implementations like HBLEvents and ESLEvents, I saw that blocking calls/drivers are used even where async variants are available (for ES 5+, for example, performRequestAsync could be used instead of performRequest). These blocking calls are then "futurized" using the standard Scala execution context; I'll come back to this later.

Looking at the interface for predict algorithms:

def predictBase(bm: Any, q: Q): P
def predict(model: NullModel, query: Query): PredictedResult

I am wondering why it is not

def predictBase(bm: Any, q: Q): Future[P]
def predict(model: NullModel, query: Query): Future[PredictedResult]

to allow for async, non-blocking algorithm implementations. In LEventStore, for example, the synchronous signatures lead to Await.result(eventsDb.futureFind(...)), again with a standard import of scala.concurrent.ExecutionContext.Implicits.global, so that algorithms like URAlgorithm that cannot deal with async/Futures in their methods can simply call the synchronous code.

Having a look at the ServerActor for the query server, I see that it is implemented using spray, and in the queries.json route detach() is used to "futurize" synchronous calls inside the route, like the synchronous algo.predictBase() call. Again, the standard scala.concurrent.ExecutionContext.Implicits.global is used to make this async via its internal thread pool. The doc of scala.concurrent.ExecutionContext.Implicits.global says:

"The default ExecutionContext implementation is backed by a work-stealing thread pool. By default, the thread pool uses a target number of worker threads equal to the number of available processors."
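To make the difference concrete, here is a minimal sketch of the two styles. The names (futureFindBlocking, futureFindAsync) are mine, not PredictionIO's, and the ES listener wiring is only indicated in comments; the point is that the "futurized" version still parks a pool thread for the duration of the request, while the Promise-based version does not.

```scala
import scala.concurrent.{ExecutionContext, Future, Promise}

// Style 1: what the current implementations do — wrap a blocking
// driver call in Future { ... }. The blocking I/O still occupies a
// thread from the implicit execution context for its whole duration.
def futureFindBlocking(query: String)(implicit ec: ExecutionContext): Future[String] =
  Future {
    Thread.sleep(10) // stand-in for a blocking performRequest(...) call
    s"result for $query"
  }

// Style 2: a truly async variant — the driver's callback completes a
// Promise, so no pool thread is parked while the request is in flight.
def futureFindAsync(query: String): Future[String] = {
  val p = Promise[String]()
  // With the ES 5+ low-level client this would look roughly like:
  //   restClient.performRequestAsync(request, new ResponseListener {
  //     def onSuccess(r: Response)  = p.success(parse(r))
  //     def onFailure(e: Exception) = p.failure(e)
  //   })
  p.success(s"result for $query") // stand-in for the success callback
  p.future
}
```

The Promise-completing style is what a Future-returning predictBase could build on without ever calling Await.result.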
Especially the part "...target number of worker threads equal to the number of available processors." I wonder whether that may be a problem: our machine has 8 processors, so only 8 threads are available to do all the work described above, and (this may be the real issue) those few threads can be blocked by IO/network calls. What do you think about that? Did I make a mistake somewhere or misunderstand something? I have thought about forking and trying to support full async, at least for ES, and would contribute that as a PR. What do you think? Cheers Chris
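As a sketch of possible stop-gaps while the driver calls stay blocking: scala.concurrent.blocking marks a section so the default global pool can temporarily grow past its processor-count target (this only helps on the global ForkJoin pool, not on fixed pools), and alternatively the blocking I/O can be moved onto a dedicated, larger pool so the global pool stays free for CPU work. The pool sizing below is an illustrative guess, not a recommendation.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future, blocking}
import scala.concurrent.duration._

// The default global pool targets this many worker threads; if each
// of them parks in a blocking ES/HBase call, further queries queue up.
val procs = Runtime.getRuntime.availableProcessors

// Mitigation: run blocking I/O on a separate, deliberately larger pool
// (size procs * 4 is just an assumption for illustration).
implicit val ioPool: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(procs * 4))

def findBlocking(id: Int): Future[Int] = Future {
  blocking {                 // hint that this section blocks a thread
    Thread.sleep(50)         // stand-in for a blocking driver request
    id
  }
}

// 32 concurrent "queries" complete without starving the global pool.
val results: Seq[Int] =
  Await.result(Future.sequence((1 to 32).map(findBlocking)), 10.seconds)
```

This only contains the damage, of course; the clean fix is the fully async path via performRequestAsync.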
