Thanks, Pat, for your reply. I am doing Video on Demand e-commerce, in which realtime queries would be very helpful, but I want to minimize the risk of HDFS synchronization latency between datacenters. Do you have experience running PredictionIO + Universal Recommender in multiple DCs that you can share? Did you face any latency issues with the HBase cluster?
Thanks in advance

On Thu, Jun 1, 2017 at 2:53 PM, Pat Ferrel <[email protected]> wrote:

> First, I’m not sure this is a good idea. You lose the realtime nature of
> recommendations based on the up-to-the-second recording of user behavior.
> You get this with live user event input even without re-calculating the
> model in realtime.
>
> Second, no, you can’t disable queries for user history; it is the single
> most important key to personalized recommendations.
>
> I’d have to know more about your application, but the first line of cost
> cutting for us in custom installations (I work for ActionML, the maintainer
> of the UR Template) is to make the Spark cluster temporary, since it is not
> needed to serve queries and only needs to run during training. We start it
> up, train, then shut it down.
>
> If you really want to shut the entire system down and don’t want realtime
> user behavior, you can query for all users and put the results in your DB
> or an in-memory cache like a hashmap, then just serve from your DB or
> in-memory cache. This takes you back to the days of the old Mahout
> MapReduce recommenders (pre 2014), but maybe it fits your app.
>
> If you are doing e-commerce, think about a user’s shopping behavior. They
> shop, browse, then buy. Once they buy, that old shopping behavior is no
> longer indicative of realtime intent. If you miss using that behavior, you
> may miss the shopping session altogether. But again, your needs may vary.
>
>
> On Jun 1, 2017, at 6:19 AM, Martin Fernandez <[email protected]> wrote:
>
> Hello guys,
>
> we are trying to deploy Universal Recommender + PredictionIO in our
> infrastructure, but we don't want to distribute HBase across datacenters
> because of the latency. So the idea is to build and train the engine
> offline and then copy the model and Elasticsearch data to the PIO
> replicas. I noticed that when I deploy the engine, it always tries to
> connect to the HBase server, since it is used to query user history. Is
> there any way to disable those user history queries and avoid the
> connection to HBase?
>
> Thanks
>
> Martin
>

--
Saludos / Best Regards,
*Martin Gustavo Fernandez*
Mobile: +5491132837292
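
[Editor's note: a minimal sketch of Pat's "query for all users and cache the results" suggestion above. It assumes a deployed UR engine server on its default port 8000, a hypothetical list of user IDs, and the UR query/response fields ("user", "num", "itemScores"); adapt names and storage to your own setup.]

# Pre-compute recommendations for every known user by hitting the deployed
# engine's /queries.json endpoint once, then serve from a local cache so the
# serving path never needs a live HBase connection.
# ENGINE_URL and USER_IDS are illustrative assumptions, not from the thread.
import json
import urllib.request

ENGINE_URL = "http://localhost:8000/queries.json"  # PIO engine server default port
USER_IDS = ["u1", "u2", "u3"]                      # would come from your own user store

def fetch_recs(user_id, num=10):
    """POST a Universal Recommender query for one user and return the parsed result."""
    payload = json.dumps({"user": user_id, "num": num}).encode("utf-8")
    req = urllib.request.Request(
        ENGINE_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Simple in-memory "hashmap" cache; in practice this would be your DB or Redis.
recs_cache = {user_id: fetch_recs(user_id) for user_id in USER_IDS}

# Serving is then a plain lookup with no HBase or Elasticsearch round trip:
print(recs_cache["u1"].get("itemScores"))

As Pat notes, this trades away realtime personalization: the cached results only reflect behavior seen before the batch run, so they go stale until the next training-and-precompute cycle.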
