Re: installing environment (stops when compiling "compiler-interface" for Scala)

2017-10-18 Thread Pat Ferrel
Memory depends on your data and the engine you are using. Spark puts all data into memory across the Spark cluster, so if that is one machine, 4g will not allow more than toy or example data. Remember that PIO and Machine Learning in general work best with big data. BTW my laptop has 16g and

Upgrading to PredictionIO 0.12.0

2017-10-18 Thread Pat Ferrel
By default, PIO-0.12.0 compiles and runs expecting ES5. If you are upgrading (not installing from clean) you will have an issue because ES1 indexes are not upgradable in any simple way. The simplest way to upgrade to pio-0.12.0 and ES5 is to do `pio export` to back up BEFORE upgrading, so export

Re: PredictionIO Universal Recommender user rating

2017-10-09 Thread Pat Ferrel
Yes, this is a very important point. We have found that the % of video viewed is indeed a very important factor, but rather than sending some fraction to indicate the length viewed, we have previously taken the approach of determining the % that indicates the user liked the video. This we do by

Re: Universal Recommender and PredictionIO 0.12.0 incompatibility

2017-10-06 Thread Pat Ferrel
s.h...@salesforce.com> wrote: Hi Pat, On 4 October 2017 at 22:04, Pat Ferrel <p...@actionml.com <mailto:p...@actionml.com>> wrote: It looks like PIO 0.12.0 will require a code change in the UR. PIO changed ES1 support drastically when ES5 support was added and broke the UR code.

Re: [ERROR] Timeout when read recent events

2017-10-06 Thread Pat Ferrel
When you query for all users in batch, the system is easily overloaded. This is the worst-case query situation, where no caching applies (for instance). 1) run batch queries at times of low input load, because you are competing with input for access to HBase 2) throttle your query speed and/or
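A minimal sketch of the throttling idea in point 2, assuming a hypothetical `send_query` callable that issues one query per user. The `batch_size` and `pause_seconds` knobs are illustrative names, not PIO settings.

```python
import time
from typing import Callable, Iterable, List


def throttled_queries(user_ids: Iterable[str],
                      send_query: Callable[[str], dict],
                      batch_size: int = 10,
                      pause_seconds: float = 1.0) -> List[dict]:
    """Query users one at a time, sleeping after every `batch_size` queries."""
    results = []
    for count, user_id in enumerate(user_ids, start=1):
        results.append(send_query(user_id))
        if count % batch_size == 0:
            # Back off so live event input is not starved of storage access.
            time.sleep(pause_seconds)
    return results
```

Suitable values would have to be found by watching load on the event store; nothing here is tuned.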

Re: [ERROR] [TaskSetManager] Task 2.0 in stage 10.0 had a not serializable result

2017-10-04 Thread Pat Ferrel
What version of Scala, Spark, PIO, and UR are you using? On Oct 4, 2017, at 6:10 AM, Noelia Osés Fernández wrote: Hi all, I'm still trying to create a very simple app to learn to use PredictionIO and still having trouble. I have done pio build no problem. But when I do

Re: Eventserver API in an Engine?

2017-09-23 Thread Pat Ferrel
fy baseline PredictionIO deployment, both conceptually & technically. My vision with this thread is to: Enable single-process, single network-listener PredictionIO app deployment (i.e. Queries & Events APIs in the same process.) Attempting to address some previous questions & sta

Re: How to training and deploy on different machine?

2017-09-21 Thread Pat Ferrel
erver] gets a copy of the template from machine [TrainingServer] (only need to do this once) Then run `pio deploy` It is not a Spark driver or executor for training Write a cron job of `pio deploy` It is permanent. Thanks Brian On Wed, Sep 20, 2017 at 11:16 PM, Pat Ferrel <p...@occamsmach

Re: How to training and deploy on different machine?

2017-09-20 Thread Pat Ferrel
Yes, this is the recommended config (Postgres is not, but later). Spark is only needed during training but the `pio train` process creates a driver and executors in Spark. The driver will be the `pio train` machine so you must install pio on it. You should have 2 Spark machines at least because

Re: Unable to connect to all storage backends successfully

2017-09-20 Thread Pat Ferrel
meaning is “firstcluster” the cluster name in your Elasticsearch configuration? On Sep 19, 2017, at 8:54 PM, Vaghawan Ojha wrote: I think the problem is with Elasticsearch, are you sure the cluster exists in elasticsearch configuration? On Wed, Sep 20, 2017 at 8:17

Re: Universal Recommender - search by subtext/Unicode

2017-09-13 Thread Pat Ferrel
dation that contains some text in the “subtext”? Not sure what you mean by “subtext” - I mean if I look for an item like "ifone" instead of "iphone", i.e. make an error in the spelling, would it still work? On Wed, Sep 13, 2017 at 1:37 PM, Pat Ferrel <p...

Re: Universal Recommender : seasonality of product

2017-09-13 Thread Pat Ferrel
This is done with blacklisting. The default config blacklists all items in the training data that the user has taken the primary event on. So if your primary event is “buy” then once a user has bought a particular table they will not be recommended that table again until the “buy” event ages

Re: Universal Recommender - search by subtext/Unicode

2017-09-13 Thread Pat Ferrel
1) Yes, of course. Use UTF-8 encoding. 2) I don’t understand this question. The UR is not a search engine, what kind of recommendation are you looking for? The best recommendation from a list of items? The best recommendation that contains some text in the “subtext”? Not sure what you mean by

Re: Validate the built model

2017-09-06 Thread Pat Ferrel
We do cross-validation tests to see how well the model predicts actual behavior. As to the best data mix, cross-validation works with any engine tuning or data input. Typically this requires re-training between test runs so make sure you use exactly the same training/test split. If you want to
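One way to keep the training/test split identical between runs is to derive it from a hash of each event's id instead of a random draw. This is only a sketch under that assumption; `stable_split` and its parameters are illustrative names, not part of PIO or the UR.

```python
import hashlib
from typing import Iterable, List, Tuple


def stable_split(event_ids: Iterable[str],
                 test_fraction: float = 0.2) -> Tuple[List[str], List[str]]:
    """Assign each id to train or test from a hash of the id, so reruns agree."""
    train, test = [], []
    for event_id in event_ids:
        digest = hashlib.sha256(event_id.encode("utf-8")).digest()
        # Map the first 8 bytes of the digest to a number in [0, 1).
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        (test if bucket < test_fraction else train).append(event_id)
    return train, test
```

Because the assignment depends only on the id, retraining with different tuning sees the same held-out events regardless of input order.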

Re: Train a model without stopping

2017-09-06 Thread Pat Ferrel
The UR does this automatically. Once deployed you never have to deploy a second time. When a new `pio train` happens the new model is hot-swapped to replace the old, which is then erased, so there is no re-deploy and no downtime. Yes, it uses Elasticsearch aliases but most other Templates do

Re: Recommender for social media

2017-09-05 Thread Pat Ferrel
Actually IMO it is not more complex, it is just far better documented and more flexible. If you don’t need the features it is just as simple as the Apache PIO Templates. I could argue the UR is simpler since you don’t need to $set every item and user, they are determined automatically from the

Re: Securing Event Server on Heroku?

2017-09-01 Thread Pat Ferrel
TLS/SSL is required along with authentication of the HTTPS requests. I’m not familiar with Heroku but the Proxy must authenticate the incoming connections. Nginx has basic auth and is a fast proxy, for instance. A cheap and dirty approach, not recommended unless it is your only option, is to set your

Re: Error: Could not find or load main class org.apache.predictionio.tools.console.Console

2017-08-31 Thread Pat Ferrel
/passion8/2769147c5352df4dad610100226f3b66> system : Ubuntu 16.04.3 x64 -- Paritosh Piplewar Sent with Airmail On 30 August 2017 at 5:22:43 PM, Pat Ferrel (p...@occamsmachete.com <mailto:p...@occamsmachete.com>) wrote: > Can you explain how you installed and what your problem is? The link below &

Re: Error: Could not find or load main class org.apache.predictionio.tools.console.Console

2017-08-30 Thread Pat Ferrel
Can you explain how you installed and what your problem is? The link below doesn’t contain much information. On Aug 29, 2017, at 9:02 PM, Paritosh Piplewar wrote: Yes I'm running pio from inside bin directory. Sent from my iPhone On 30-Aug-2017, at 3:41 AM, Mars

Re: sbt.ResolveException: unresolved dependency: org.apache.predictionio#pio-build;0.11.0-incubrating

2017-08-22 Thread Pat Ferrel
Your template is linking to org.apache.predictionio#pio-build;0.10.0-incubrating. What do you have installed? org.apache.predictionio#pio-build;0.11.0-incubrating? Looks like you have to change your template's build.sbt to link to the artifact you have built. On Aug 22, 2017, at 3:52 AM,

Re: How to config a high availability eventserver for PredictionIO ?

2017-08-07 Thread Pat Ferrel
A truly HA cluster is often not required depending on what you use it for. Can you share what your application is? The EventServer in my experience (I wrote the pages referenced below) has never crashed because of input. I think the only crash modes I’ve seen involved disk full on some service

Re: Error when importing data

2017-08-03 Thread Pat Ferrel
splitting this apart for production. On Aug 3, 2017, at 8:32 AM, Pat Ferrel <p...@occamsmachete.com> wrote: It should be easy to try a smaller batch of data first since we are just guessing On Aug 2, 2017, at 11:22 PM, Carlos Vidal <carlos.vi...@beeva.com <mailto:carlos.vi...@beeva

Re: Error when importing data

2017-08-03 Thread Pat Ferrel
It should be easy to try a smaller batch of data first since we are just guessing On Aug 2, 2017, at 11:22 PM, Carlos Vidal <carlos.vi...@beeva.com> wrote: Hello Mahesh, Pat Thanks for your answers. I will try with a bigger EC2 instance. Carlos. 2017-08-02 18:42 GMT+02:00 Pat Fer

Re: Error when importing data

2017-08-02 Thread Pat Ferrel
Something is not configured correctly. `pio import` should work with any size of file but this may be an undersized instance for that much data. Spark needs memory; it keeps all data that it needs for a particular calculation spread across all cluster machines in memory. That includes derived

Re: Recommendation not taking new data

2017-07-30 Thread Pat Ferrel
Could be several things. Did you delete data and re-run the integration test? If you did this it will wipe your data and re-create handmade data. It’s supposed to be an example and test. It’s an example that must be modified to be useful for your data. To be safe create a new app with new

Re: Survival Regression Queries

2017-07-28 Thread Pat Ferrel
On the Template Gallery there is a GitHub link; you could try creating an issue there to get the author’s attention. I notice there is also a reference to a blog post under “support” https://github.com/goliasz/pio-template-sr The Gallery is for

Re: Batch recommender

2017-07-28 Thread Pat Ferrel
As it happens I just looked into the concurrency issue: how many connections can be made to the Prediction Server? The answer is that Spray HTTP, the earlier version of what is now merged with akka-http, uses something called tcp back-pressure to optimize the number of connections allowed and

Re: Batch recommender

2017-07-27 Thread Pat Ferrel
That feature requires a template to support the eval template APIs and many templates do not, including the UR. We do cross-validation tests with an integration test that is supplied with the recommender. I'd like to see the eval stuff removed from PIO. Cross-validation is too different for

Re: Pio build success with error, pio train is faililng.

2017-07-27 Thread Pat Ferrel
Yes, a great article but it and the tapster demo do not use the UR. On Jul 27, 2017, at 4:52 AM, Vaghawan Ojha wrote: Cool, Now every working template has engine.json and appName field as far as I know. Great it worked. Thanks Vaghawan On Thu, Jul 27, 2017 at 5:13

Re: Survival Regression Queries

2017-07-27 Thread Pat Ferrel
Templates are supported by their authors. The ones that were donated to Apache are supported by Apache committers here. Others have support links on their gallery entry. The UR has its own Google group as do other ActionML templates but often for a non-Apache non-ActionML template you may need

Re: Pio build success with error, pio train is faililng.

2017-07-25 Thread Pat Ferrel
You should take this to the Apache PredictionIO mailing list; this group is to support the ActionML applications that use PIO. Sign up here: http://predictionio.incubator.apache.org/support/ On Jul 25, 2017, at 1:45 PM, darshanan...@gmail.com

Re: Intellij Pio Train

2017-07-24 Thread Pat Ferrel
cubating", Further for “Intellij sbt refresh, is another problem altogether”: I instead added the project SBT dependencies to “pio-runtime-jars” and added pio-runtime-jars as the class path. This does not result in dependencies missing every time SBT refreshes.

Re: Intellij Pio Train

2017-07-20 Thread Pat Ferrel
1.0/storage%20to%20PredictionIO-0.11.0-incubating.Jar> <> From: Pat Ferrel [mailto:p...@occamsmachete.com] Sent: Thursday, July 20, 2017 5:27 PM To: user@predictionio.incubator.apache.org Cc: user-h...@predictionio.incubator.apache.org Subject: Re: Intellij Pio Train +1 This has been a cons

Re: Intellij Pio Train

2017-07-20 Thread Pat Ferrel
+1 This has been a constant problem with PIO due to several non-standard build and execution paths in the code. In this case the version of Elasticsearch you use determines which version of PIO client classes are used. I have given up on using a debugger with PIO after having contributed a fair

Re: UR Query Based on Simulated Events

2017-07-17 Thread Pat Ferrel
would be recommended for various use cases. This way we can avoid polluting the event store with dummy data and have to clean it up later as well. Thanks for the quick response, will look into other means to evaluate potential use cases. On Mon, Jul 17, 2017 at 11:00 AM Pat Ferrel <p...@occamsma

Re: UR Query Based on Simulated Events

2017-07-17 Thread Pat Ferrel
Personalization cannot work for users with no behavioral data. Your app should be sending events in real time to the EventServer. That way the events below will already be in HBase. Then you just query with the user-id. Item-sets are meant for shopping carts, not events that you know about but

Re: Does Universal Recommender need Spark for Serving?

2017-07-14 Thread Pat Ferrel
A Spark cluster is only needed for `pio train`. Spark must be installed on the machine that runs `pio deploy` but is only used for local client APIs and never needs to communicate with the cluster. However the last I checked EMR will not work. EMR was designed for Hadoop MapReduce and Spark

Re: Eventserver API in an Engine?

2017-07-12 Thread Pat Ferrel
rbage output." My concern of embedding event server in engine is - what problem are we solving by providing an illusion that events are only limited for one engine? On Wed, Jul 12, 2017 at 12:11 PM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote:

Re: Eventserver API in an Engine?

2017-07-11 Thread Pat Ferrel
system. On Jul 11, 2017, at 10:31 AM, Pat Ferrel <p...@occamsmachete.com> wrote: Understood, you have immediate practical reasons for 1 integrated deployment with the 2 endpoints. But Apache is a do-ology, meaning those who do something win the argument as long as they have enough consen

Re: Need help with select and setup template

2017-07-10 Thread Pat Ferrel
An ALS based recommender template like the E-Com one will handle ratings, but… Ratings are not a good way to make recommendations. Netflix found this out after it popularized the idea and does not use ratings now to recommend. After all, would you rate all the movies you like at 10? Of course

Re: Eventserver API in an Engine?

2017-07-09 Thread Pat Ferrel
can provide some API/option for people to store mutable objects and always overwrite. or use better storage structure to capture the changes of mutable object. On Fri, Jun 30, 2017 at 5:29 AM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: Actually I

Re: Elastic search with PIO

2017-07-09 Thread Pat Ferrel
What Template? On Jul 7, 2017, at 7:38 PM, Saarthak Chandra wrote: Hi, I am trying to use elastic search 5.5.0 with PIO 0.11.0, and am getting the following error - "None of

Re: Eventserver API in an Engine?

2017-07-09 Thread Pat Ferrel
I/option for people to store mutable objects and always overwrite. or use better storage structure to capture the changes of mutable object. On Fri, Jun 30, 2017 at 5:29 AM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: Actually I th

Re: Template Suggestions for Collaborative Recommendation

2017-07-09 Thread Pat Ferrel
The only Template designed for multi-modal input is The Universal Recommender. On Jul 9, 2017 6:21 AM, "Humantool Project" wrote: > Hello everyone, > > I'm new to the PredictionIO System and would like to set up a basic > Collaborative Recommendation Engine. I went

Re: Customers clustering

2017-07-07 Thread Pat Ferrel
nd category2 it would rank high for cluster_1, even if it's more interested in category10. Am I wrong? 2017-07-07 17:48 GMT+02:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: You'll have to work out the ES query JSON, use arrays of strings un-analysed. ES docs indexed

Re: Customers clustering

2017-07-07 Thread Pat Ferrel
-07-06 23:13 GMT+02:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: Actually it sounds like you already have clusters that are made up of categories and you want to know which cluster definition is most similar to what the user has bought? If so you don

Re: Customers clustering

2017-07-06 Thread Pat Ferrel
but if you already have user history, it may be overkill. On Jul 6, 2017, at 1:39 PM, Pat Ferrel <p...@occamsmachete.com> wrote: There are 2 clustering templates but it looks like they both need to be moved from Prediction.io <http://prediction.io/> to Apache PIO, which should b

Re: Kafka support as Event Store

2017-07-06 Thread Pat Ferrel
g websockets as an alternative to batch import? Regards, Thomas Pocreau On 5 Jul 2017 21:36, "Pat Ferrel" <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: No, we try not to fork :-) But it would be nice as you say. It can be done with a small intermed

Re: Exception - pio train - Caued by permission denied

2017-07-04 Thread Pat Ferrel
-Installation/Permission-denied-user-mapred-access-WRITE-inode-quot-quot-hdfs/td-p/16318 <https://community.cloudera.com/t5/CDH-Manual-Installation/Permission-denied-user-mapred-access-WRITE-inode-quot-quot-hdfs/td-p/16318> On Jul 4, 2017, at 2:31 PM, Pat Ferrel <p...@occamsmachete.c

Re: Eventserver API in an Engine?

2017-06-30 Thread Pat Ferrel
whole rather than per engine. Kenneth On Thu, Jun 29, 2017 at 10:22 AM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: Are you asking about the EventServer or PredictionServer? The EventServer is multi-tenant with access keys, not really pure REST. W

Re: Eventserver API in an Engine?

2017-06-29 Thread Pat Ferrel
*Mars ( <> .. <> ) > On Jun 28, 2017, at 18:01, Pat Ferrel <p...@occamsmachete.com> wrote: > > Ah, one of my favorite subjects. > > I’m working on a prototype server that handles online learning as well as > Lambda style. There is only one server with everything going t

Re: will property change events affect memory requirements?

2017-06-25 Thread Pat Ferrel
Sounds like the only way to remove these will be to export, process yourself to remove all events for deleted items and re-import them. On Jun 25, 2017, at 8:08 AM, Pat Ferrel <p...@occamsmachete.com> wrote: File a bug on PredictionIO with this info. the db-cleaner template jus

Re: UR Model Metrics Introspection

2017-06-21 Thread Pat Ferrel
If all recs come back score 0 then they are from the most popular items only, not from the Collaborative Filtering algorithm. If you truly mean “all” then you have a problem in config or nearly 0 data. Don’t mess with tuning until you solve this. On Jun 21, 2017, at 2:27 PM, Daniel Gabrieli

Re: Can I use one PIO engine for several website?

2017-06-20 Thread Pat Ferrel
So it's not a good idea to use PIO as a software as a service solution. Is that right? On Jun 20, 2017 10:07 PM, "Pat Ferrel" <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: Not really, the way to do this is have multiple engines on different ports. On Jun

Re: Can I use one PIO engine for several website?

2017-06-20 Thread Pat Ferrel
Not really, the way to do this is have multiple engines on different ports. On Jun 20, 2017, at 10:03 AM, Amir Jebelli wrote: Can I deploy one pio engine and use it for several websites with different recommendation algorithms?

Re: PredictionIO return array of objects

2017-06-17 Thread Pat Ferrel
ing to denormalize. If your items' metadata don't change, it would be fine to send items' metadata in as part of the event. If you cannot denormalize and your scale is huge, you might want to consider putting metadata in a KV store. On Fri, Jun 16, 2017 at 12:19 PM, Pat Ferrel <p...@occamsm

Re: PredictionIO return array of objects

2017-06-16 Thread Pat Ferrel
ly way to do this, then I'll follow down that path. On Fri, Jun 16, 2017 at 11:27 AM Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: What template? Generally this requires you take the id and make a query to your catalog DB. On Jun 16, 2017, at 9:50 AM,

Re: PredictionIO return array of objects

2017-06-16 Thread Pat Ferrel
What template? Generally this requires you take the id and make a query to your catalog DB. On Jun 16, 2017, at 9:50 AM, Cody Kimball wrote: Architectural Design Question: I have a model that performs as expected and returns an array of IDs with their

Re: Passing an array of objects to get predictions?

2017-06-16 Thread Pat Ferrel
array of objects, then modify the predict method in the algorithm class to handle them. I believe this is getting asked often enough that it might warrant framework-level support. It would be a great help if you could file a feature request on our JIRA. Regards, Donald On Fri, Jun 16, 2017 at 9:04 AM Pat Fer

Re: Passing an array of objects to get predictions?

2017-06-16 Thread Pat Ferrel
Arrays of events need to be posted to POST http://localhost:7070/batch/events.json?accessKey=… Notice the different URL. Look towards the bottom of this page: https://predictionio.incubator.apache.org/datacollection/eventmodel/
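A minimal sketch of building that request with the Python standard library, assuming the batch endpoint takes a JSON array as the POST body. The function name and the sample event fields are illustrative, and the request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request
from typing import List


def build_batch_request(events: List[dict], access_key: str,
                        base_url: str = "http://localhost:7070") -> urllib.request.Request:
    """Build (but do not send) a POST of a JSON array of events to the batch URL."""
    query = urllib.parse.urlencode({"accessKey": access_key})
    return urllib.request.Request(
        url=f"{base_url}/batch/events.json?{query}",
        data=json.dumps(events).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; check the event model page linked above for the fields each event needs and any limit on events per request.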

Re: Does predictionIO support pyspark

2017-06-16 Thread Pat Ferrel
No, PIO is a framework that calls your template code using JVM bindings, so writing a Template in a non-JVM language would be so difficult as to be not worth it. However inside your Scala or Java wrapper template you can use any libs in any language you wish. On Jun 15, 2017, at 11:13 PM, Ravi

Re: Update default build targets

2017-06-08 Thread Pat Ferrel
h as possible. > > In addition, the HBase version (0.98.5) looks very old. It's already EOM. > We should upgrade it to 1.2 at least. > > 2017-06-08 5:51 GMT+09:00 Pat Ferrel <p...@occamsmachete.com > <mailto:p...@occamsmachete.com>>: >> Supporting the latest and requ

Re: Update default build targets

2017-06-07 Thread Pat Ferrel
Supporting the latest and requiring them are 2 different things. Requiring them (except for ES) means PIO won’t run unless the clusters for every user are upgraded to match the client because only backward compatibility is supported. Last time I checked if you require HDFS 2.7, PIO won’t run on

Re: externalizing the spark's es.nodes property from the engine.json

2017-06-07 Thread Pat Ferrel
PIO version and template? For the UR, put them in the “sparkConf” section of engine.json named "es.nodes": "list,of,hosts", comma delimited with no spaces. If not the UR, you may not need this param since the ES master distributes queries in the cluster. On Jun 7, 2017, at 8:42 AM, Dan
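A small sketch of generating that fragment so the host list always comes out comma delimited with no spaces. The helper name and the host names are hypothetical; only the "sparkConf" section and the "es.nodes" key come from the message above.

```python
import json
from typing import List


def spark_conf_with_es_nodes(hosts: List[str]) -> dict:
    """Return a "sparkConf" fragment whose es.nodes value has no spaces."""
    cleaned = [host.strip() for host in hosts if host.strip()]
    return {"sparkConf": {"es.nodes": ",".join(cleaned)}}


# Hypothetical host names, printed as JSON ready to merge into engine.json.
print(json.dumps(spark_conf_with_es_nodes(["es-1", "es-2", "es-3"]), indent=2))
```

The fragment would be merged into the existing "sparkConf" section by hand; other keys in that section are left alone.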

Re: Error while pio status

2017-06-07 Thread Pat Ferrel
Sorry for the confusion over support, PIO has many components and the docker container you are using is of unknown origin (to me anyway). It seems to have misconfigured something. Please be sure to tell the author or create a PR for it so it can be fixed for other users, it’s one way to pay for

Re: Error while training : NegativeArraySizeException

2017-06-07 Thread Pat Ferrel
changing the first / primary / conversion event in eventNames changes what the algorithm will predict. CCO can predict anything in the data by changing this conversion event to the one you want. However that means that you must have good data for the primary/conversion event. Removing it will

Re: Error while pio status

2017-06-07 Thread Pat Ferrel
This group is for support of ActionML projects like the Universal Recommender. Please direct PIO questions to the Apache PIO mailing list. On Jun 7, 2017, at 6:32 AM, hed...@gmail.com wrote: I'm trying to setup PIO 0.11.0 in docker using this

GPUs with The Universal Recommender and PIO

2017-06-06 Thread Pat Ferrel
Recently I gave a talk about how Apache Mahout is moving to support GPUs at the lowest layer. This means that the generalized linear algebra of Mahout is GPU accelerated no matter the algorithm constructed with it. So Spark MLlib will have to wait for specialized work to get GPUs (IBM is

Re: Update default build targets

2017-06-06 Thread Pat Ferrel
both Python 2/3, I'll fix them. Regards, shinsuke 2017-06-06 4:57 GMT+09:00 Pat Ferrel <p...@occamsmachete.com>: > What is the policy driving dependency upgrades? > > I don’t run hadoop 2.7 locally and many users that have Cloudera or Horton > contracts may not either. Not sure why

Re: Update default build targets

2017-06-05 Thread Pat Ferrel
What is the policy driving dependency upgrades? I don’t run hadoop 2.7 locally and many users that have Cloudera or Horton contracts may not either. Not sure why this should be the default until it’s the most popular or we need some feature of it. I’d agree with most of what @Shinsuke suggests

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-04 Thread Pat Ferrel
ag use in the code. Is it SimilarityAnalysis.rowSimilarity() in Mahout that implements TT'? (just to confirm) 2017-06-04 22:06 GMT+04:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: No offense Marius but I wrote the slides and the equation so I do indeed know wh

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-04 Thread Pat Ferrel
g/pdf/1203.4487.pdf> (Section 1.4.5 Emerging new classification) p. 40 2017-06-04 8:14 GMT+04:00 Marius Rabenarivo <mariusrabenar...@gmail.com <mailto:mariusrabenar...@gmail.com>>: And what is the T in the slides for? How can we implement it if it is not implemented yet? 2017-

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-03 Thread Pat Ferrel
s some kind of tag we give to the item by some means (classification, LDA, etc) 2017-06-03 21:14 GMT+04:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: A = history of all purchases (in the e-com case) B = history of all tag preferences r = [A’A]h_a + [A’B]h_b The
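Typeset, the formula quoted above reads as follows, taking A’ to mean the transpose of A. Reading h_a and h_b as the querying user's own purchase and tag-preference history vectors is an assumption; the quoted text cuts off before defining them.

```latex
r = \left[A^{\top}A\right]h_a + \left[A^{\top}B\right]h_b
```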

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-03 Thread Pat Ferrel
enchilada formula afterwards. Thank you for your guidance Pat. 2017-06-02 21:35 GMT+04:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: Please refer to the documents. The “event” is the name of the type of event or indicator of preference, it implies the type of

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-02 Thread Pat Ferrel
srabenar...@gmail.com <mailto:mariusrabenar...@gmail.com>> wrote: So I have to send an event like category-preference for each tag associated to an item right? entityId: user-id event: category-preference targetEntityId : tag/token 2017-06-02 19:47 GMT+04:00 Pat Ferrel <

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-02 Thread Pat Ferrel
erence for each tag associated to an item right? entityId: user-id event: category-preference targetEntityId : tag/token 2017-06-02 19:47 GMT+04:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>>: When a user expresses a preference for a tag, word or term as

Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences

2017-06-02 Thread Pat Ferrel
When a user expresses a preference for a tag, word or term as in search or even in content like descriptions, these can be considered secondary events. The most useful are tags and search terms in our experience. Content can be used but each term/token needs to be sent as a separate preference
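A rough sketch of "one preference event per term", using the entityId / event / targetEntityId shape discussed elsewhere in this thread. The `entityType` and `targetEntityType` values and the helper name are assumptions for illustration only.

```python
from typing import Iterable, List


def tag_preference_events(user_id: str, tags: Iterable[str],
                          event_name: str = "category-preference") -> List[dict]:
    """Build one event per tag/token, since each is sent as a separate preference."""
    return [
        {
            "event": event_name,
            "entityType": "user",        # assumed value
            "entityId": user_id,
            "targetEntityType": "item",  # assumed value
            "targetEntityId": tag,
        }
        for tag in tags
    ]
```

For example, `tag_preference_events("u1", ["phones", "tablets"])` yields two events, one per tag, both with `entityId` set to `"u1"`.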

Re: Disable hbase user history queries

2017-06-01 Thread Pat Ferrel
making each service work across multiple datacenters that might end up being complex. On Jun 1, 2017, at 1:47 PM, Pat Ferrel <p...@occamsmachete.com> wrote: I would put PIO in one and query/input from either of your other datacenters. The latency issues with input/query are much much simple

Re: Disable hbase user history queries

2017-06-01 Thread Pat Ferrel
enters in an Active/active mode. So how can I manage to have a PIO instance running in each DC? Do I have to deploy HBASE as well? How can I maintain HBASE data? On Thu, Jun 1, 2017 at 5:23 PM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: I haven’t

Re: Disable hbase user history queries

2017-06-01 Thread Pat Ferrel
le DCs that you can share? Did you face any latency issue with the HBASE cluster? Thanks in advance On Thu, Jun 1, 2017 at 2:53 PM, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: First, I’m not sure this is a good idea. You lose the realtime nature of rec

Re: Disable hbase user history queries

2017-06-01 Thread Pat Ferrel
First, I’m not sure this is a good idea. You lose the realtime nature of recommendations based on the up-to-the-second recording of user behavior. You get this with live user event input even without re-calculating the model in realtime. Second, no you can’t disable queries for user history,

Re: UR optimizing results

2017-05-30 Thread Pat Ferrel
alues": [ "test", "test2", "test3" ], "bias": 0.01 }, { "name": "price", "values": [ "$10-$25", "$20-$50", "$10-$25"

Re: Are event's properties overwritten by new properties?

2017-05-28 Thread Pat Ferrel
AM, Marius Rabenarivo <mariusrabenar...@gmail.com> wrote: Sorry, it was a confusion from my part. The 2 events are there. I was thinking that the properties will be aggregated in the event server. 2017-05-26 19:47 GMT+04:00 Pat Ferrel <p...@occamsmachete.com <mailto:p...@occam

Re: make-distribution Error Upgrading to PIO 0.11.0

2017-05-27 Thread Pat Ferrel
We’ve run into this on Ubuntu 14.04 but it works on 16.04. The problem is that you are adding Oracle Java and there is a truststore problem. The only clue I’ve heard is https://stackoverflow.com/questions/6784463/error-trustanchors-parameter-must-be-non-empty#6788682 Look at all comments and

Re: UR optimizing results

2017-05-26 Thread Pat Ferrel
ImmutableMap.<String, Object>of( "name", "price", "values", "$10-$25", "bias", -1 ) ) )); Fields is hardcoded for testing. Is this the

Re: Are event's properties overwritten by new properties?

2017-05-26 Thread Pat Ferrel
No but if you are using the UR (I’m guessing from the lists you posted on) you would encode those as arrays with only one value. Can you share the entire 2 $set events json? On May 26, 2017, at 4:25 AM, Marius Rabenarivo wrote: Hello everyone, When sending new

Re: How to deploy same template and multiple models in predictionIO

2017-05-25 Thread Pat Ferrel
What you have outlined is exactly how PIO is designed to work. The EventServer is multi-tenant in that separate datasets and entities can store to it and can even be granted different permissions. But the PredictionServer serves one engine per process on a single port. This scales in a

Re: [actionml/universal-recommender] Pio build error (#21)

2017-05-24 Thread Pat Ferrel
0.6.0 is meant for PIO 0.11.0. Off hand I can’t think of a reason it wouldn’t work with 0.10.0. Try deleting your old jars in target/scala-2.10 and the manifest.json and build again. The better fix is to update pio and the ur to clean directories and proceed. You shouldn’t lose any data. On

Re: UR optimizing results

2017-05-24 Thread Pat Ferrel
On 24 May 2017 at 17:54, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: > I split answers in 2 since the config is a completely separate thing. > > increasing maxCorrelatorsPerEventType is usually the wrong thing to do. I

Re: Memory leak Executor pio train UR

2017-05-24 Thread Pat Ferrel
On 24 May 2017 at 16:14, Pat Ferrel <p...@occamsmachete.com <mailto:p...@occamsmachete.com>> wrote: > Can you give me more of the stack trace? This is on a Spark executor? > > Wild guess is that you need more memory available to the JVM on the executor &

Re: UR optimizing results

2017-05-24 Thread Pat Ferrel
I split answers in 2 since the config is a completely separate thing. Increasing maxCorrelatorsPerEventType is usually the wrong thing to do. It is making the model fuzzier, for lack of a better term. In fact we’d like to restrict the correlators to only the best and maxCorrelatorsPerEventType

Re: UR optimizing results

2017-05-24 Thread Pat Ferrel
Secondary events are hard to come by for “complementary purchases” because, as you point out, the entity being tracked is a cart, not a user. The cart has few possible actions or indicators that can be associated with it. The user has many. Also the cart does not have a brain we are trying to

Re: Memory leak Executor pio train UR

2017-05-24 Thread Pat Ferrel
Can you give me more of the stack trace? This is on a Spark executor? Wild guess is that you need more memory available to the JVM on the executor machine. On May 24, 2017, at 1:21 AM, Dennis Honders wrote: I receive the following error when training with UR 0.6.0

Re: Only one result Universal Recommender

2017-05-23 Thread Pat Ferrel
I assume you are using 0.6.0 so you’ll have to wait for docs in progress. Remember that you are substituting the cart id for the typical user id so do a user-based query with the cart id but passed in as “user”, or send a list of items as “item-set” in the query. curl -H "Content-Type:

The Universal Recommender v0.6.0

2017-05-23 Thread Pat Ferrel
This is a major release with several new features. Get tag 0.6.0 or pull from the master branch here: https://github.com/actionml/universal-recommender The AML doc site for the UR is being updated so be patient, most docs still apply with a

Re: Host PIO on Github (not ASF Git)

2017-05-19 Thread Pat Ferrel
https://issues.apache.org/jira/browse/INFRA-14191 <https://issues.apache.org/jira/browse/INFRA-14191> On May 18, 2017, at 4:56 PM, Pat Ferrel <p...@occamsmachete.com> wrote: Heard it from Mahout. I agree it’s not completely clear. Unless someone can illuminate us I’ll follow u

Re: mahout spark-rowsimilarity error

2017-05-19 Thread Pat Ferrel
and average items/user for the feature are also useful when compared to the same for conversion or primary feature/indicator. On May 19, 2017, at 9:20 AM, Pat Ferrel <p...@occamsmachete.com> wrote: Some ideas are left on the SO question On May 18, 2017, at 6:14 PM, Daniel Gabrieli &

Re: Host PIO on Github (not ASF Git)

2017-05-18 Thread Pat Ferrel
On Thu, May 18, 2017 at 3:26 PM Pat Ferrel <p...@occamsmachete.com> wrote: > BTW there seems now to be a way to host on Github (not ASF Git) by linking > our ASF accounts to our git accounts. This would make a bunch of things > much easier like PRs can be reviewed and merged dire

Host PIO on Github (not ASF Git)

2017-05-18 Thread Pat Ferrel
BTW there seems now to be a way to host on Github (not ASF Git) by linking our ASF accounts to our git accounts. This would make a bunch of things much easier like PRs can be reviewed and merged directly, direct merging of doc changes by users, lots of nice stuff that is GUI related. This

Re: UR v0.6.0m RC1

2017-05-17 Thread Pat Ferrel
This should be fixed in RC2, now in the UR develop branch. Thanks again Bolmo! On May 17, 2017, at 8:15 AM, Pat Ferrel <p...@occamsmachete.com> wrote: Indeed a bug. This is a blocker (no work around) in some conditions so tracking it down now. On May 15, 2017, at 12:02 PM, Pat Fer
