[jira] [Commented] (PIO-101) Document usage of Plug-in of event server and engine server

2017-07-14 Thread Kenneth Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIO-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16088478#comment-16088478
 ] 

Kenneth Chan commented on PIO-101:
--

agree

> Document usage of Plug-in of event server and engine server
> ---
>
> Key: PIO-101
> URL: https://issues.apache.org/jira/browse/PIO-101
> Project: PredictionIO
>  Issue Type: Task
>  Components: Documentation
>    Reporter: Kenneth Chan
>
> see 
> http://mail-archives.apache.org/mod_mbox/incubator-predictionio-dev/201706.mbox/%3CCAF_HxLtEonOVALSQgrCRGXctAbL7eypxwG0ErHpaBJJym15j5Q%40mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PIO-103) Document deploying multiple engine variants

2017-07-08 Thread Kenneth Chan (JIRA)
Kenneth Chan created PIO-103:


 Summary: Document deploying multiple engine variants
 Key: PIO-103
 URL: https://issues.apache.org/jira/browse/PIO-103
 Project: PredictionIO
  Issue Type: Task
  Components: Documentation
Reporter: Kenneth Chan


Add explanation, tutorial and example of how to deploy multiple engine variants 
in this page.
https://predictionio.incubator.apache.org/deploy/enginevariants/





[jira] [Created] (PIO-101) Document usage of Plug-in of event server and engine server

2017-07-08 Thread Kenneth Chan (JIRA)
Kenneth Chan created PIO-101:


 Summary: Document usage of Plug-in of event server and engine server
 Key: PIO-101
 URL: https://issues.apache.org/jira/browse/PIO-101
 Project: PredictionIO
  Issue Type: Task
  Components: Documentation
Reporter: Kenneth Chan


see 
http://mail-archives.apache.org/mod_mbox/incubator-predictionio-dev/201706.mbox/%3CCAF_HxLtEonOVALSQgrCRGXctAbL7eypxwG0ErHpaBJJym15j5Q%40mail.gmail.com%3E





[jira] [Commented] (PIO-96) Storage corrupted by sharing databases between engines with different storage configs

2017-07-08 Thread Kenneth Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIO-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079037#comment-16079037
 ] 

Kenneth Chan commented on PIO-96:
-

Actually, I don't understand why different engines would use different storage
configs.
Can the user use the same storage configuration as the Universal Recommender
(given that UR has a strict requirement)?



> Storage corrupted by sharing databases between engines with different storage 
> configs
> -
>
> Key: PIO-96
> URL: https://issues.apache.org/jira/browse/PIO-96
> Project: PredictionIO
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.11.0-incubating
>Reporter: Mars Hall
>
> When getting started with PredictionIO, it's no problem to spin up an engine 
> and see it work. Problems emerge when a developer tries running multiple 
> engines with different storage configs on the same underlying database, such 
> as:
> * a Classifier with *Postgres* meta, event, & model storage, and
> * the Universal Recommender with *Elasticsearch* meta plus *Postgres* event & 
> model storage.
> The database will become corrupt because the meta tables are stored in 
> different databases, but the dynamically created event & model tables may 
> mistakenly share the same name, like {{pio_event_1}}.
> We are directing folks to avoid this problem with the Heroku buildpack by 
> [isolating each engine's 
> database|https://github.com/heroku/predictionio-buildpack/blob/master/CUSTOM.md#provision-the-database]
>  and [optionally running an eventserver per 
> engine|https://github.com/heroku/predictionio-buildpack/blob/master/CUSTOM.md#user-content-eventserver].
>  It's still a problem with local development, though.
> It would be great if PredictionIO's management of the database schemas would 
> inherently avoid such conflicts, like by using random/UUIDs for dynamically 
> created table names, so that they will never conflict.
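
The collision described in the issue can be illustrated with a small sketch. It is a toy model, not PredictionIO code: the `pio_event_<id>` pattern follows the `pio_event_1` example in the report, while `MetaStore`, the assumption that each metadata store hands out app ids starting from 1, and the UUID-based naming function are hypothetical and only there to show the idea.

```python
import uuid


def sequential_table_name(app_id: int) -> str:
    """Name an event table from the app id, as in the `pio_event_1` example."""
    return f"pio_event_{app_id}"


def unique_table_name() -> str:
    """Hypothetical alternative: name the table from a random UUID."""
    return f"pio_event_{uuid.uuid4().hex}"


class MetaStore:
    """Toy stand-in for one engine's metadata store.

    Assumption for illustration: each store hands out app ids starting at 1.
    """

    def __init__(self) -> None:
        self._next_app_id = 1

    def new_app(self) -> int:
        app_id = self._next_app_id
        self._next_app_id += 1
        return app_id


# Two engines keep their metadata in different stores, so both start at 1.
classifier_meta = MetaStore()
recommender_meta = MetaStore()

classifier_table = sequential_table_name(classifier_meta.new_app())
recommender_table = sequential_table_name(recommender_meta.new_app())

# Both engines now refer to the same table name in the shared event database.
print(classifier_table, recommender_table, classifier_table == recommender_table)

# With random names, two engines would get distinct tables.
print(unique_table_name() != unique_table_name())
```

In this model the clash comes only from the two stores counting independently, which is why isolating each engine's database, or using names that do not depend on a per-store counter, would avoid it.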





Re: [jira] [Commented] (PIO-97) Fixes examples of the official templates

2017-07-08 Thread Kenneth Chan
Related to this: I'm wondering if we should move the official templates
into the PIO repo for easier maintenance as well, and then group all these
examples under the corresponding template inside PIO.

I know we have discussed this before, but I forgot why we didn't want to do it.
Thoughts again?

On Fri, Jul 7, 2017 at 9:22 AM, ASF GitHub Bot (JIRA) 
wrote:

>
> [ https://issues.apache.org/jira/browse/PIO-97?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078329#comment-16078329 ]
>
> ASF GitHub Bot commented on PIO-97:
> ---
>
> Github user takezoe commented on the issue:
>
> https://github.com/apache/incubator-predictionio/pull/403
>
> I checked fixed documents through test of examples in #400. LGTM!
>
>
> > Fixes examples of the official templates
> > 
> >
> > Key: PIO-97
> > URL: https://issues.apache.org/jira/browse/PIO-97
> > Project: PredictionIO
> >  Issue Type: Sub-task
> >  Components: Documentation, Templates
> >Affects Versions: 0.11.0-incubating
> >Reporter: Takako Shimamoto
> >Assignee: Takako Shimamoto
> >Priority: Minor
> >
> > First I will fix the following examples contained in the documentation.
> > * scala-parallel-classification
> > * scala-parallel-ecommercerecommendation
> > * scala-parallel-recommendation
> > * scala-parallel-similarproduct
> > If needed, fixes documentation as well.
>
>
>
>


[jira] [Resolved] (PIO-55) Fix missing official template detailed doc link

2017-02-18 Thread Kenneth Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIO-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Chan resolved PIO-55.
-
Resolution: Fixed

> Fix missing official template detailed doc link
> ---
>
> Key: PIO-55
> URL: https://issues.apache.org/jira/browse/PIO-55
> Project: PredictionIO
>  Issue Type: Task
>    Reporter: Kenneth Chan
>    Assignee: Kenneth Chan
>
> The detailed explanation of template and example is missing from the doc 
> site. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (PIO-55) Fix missing official template detailed doc link

2017-02-18 Thread Kenneth Chan (JIRA)
Kenneth Chan created PIO-55:
---

 Summary: Fix missing official template detailed doc link
 Key: PIO-55
 URL: https://issues.apache.org/jira/browse/PIO-55
 Project: PredictionIO
  Issue Type: Task
Reporter: Kenneth Chan
Assignee: Kenneth Chan


The detailed explanation of template and example is missing from the doc site. 





Re: Binary distribution with sbt native packager

2017-02-16 Thread Kenneth Chan
I see, cool! It would be great if the package could also handle upgrades when
there is a new PIO release.

On Thu, Feb 16, 2017 at 9:44 PM Shinsuke Sugaya <shins...@yahoo.co.jp>
wrote:

> Sbt packager can put an upgrade script or the like
> as post install script for RPM/DEB.
>
>
> 2017-02-17 13:28 GMT+09:00 Kenneth Chan <kenn...@apache.org>:
> > how does rpm/deb handle PIO version upgrade ?
> >
> >
> > On Thu, Feb 16, 2017 at 8:04 PM Shinsuke Sugaya <shins...@yahoo.co.jp>
> > wrote:
> >
> >> > How are you handling submitting PIO's artifact to Spark
> >>
> >> Not changed at the moment.
> >> To keep make-distribution.sh, ZIP distribution generated
> >> by sbt native packager is the same structure as PredictionIO-*.tar.gz.
> >> For RPM/DEB package, I think that we can use pio command
> >> by modifying log message handling.
> >>
> >> -shinsuke
> >>
> >>
> >> 2017-02-17 10:50 GMT+09:00 Donald Szeto <don...@apache.org>:
> >> > This would be a great addition! How are you handling submitting PIO's
> >> > artifact to Spark, specifically the CreateWorkflow class in tools?
> >> >
> >> > Regards,
> >> > Donald
> >> >
> >> > On Thu, Feb 16, 2017 at 4:10 PM Shinsuke Sugaya <shins...@yahoo.co.jp>
> >> > wrote:
> >> >
> >> >> Hi
> >> >>
> >> >> Do you have a plan to use sbt native packager?
> >> >> In our forked branch, I added it under assembly directory.
> >> >> https://github.com/jpioug/incubator-predictionio
> >> >> Currently, both make-distribution.sh and
> >> >> sbt assembly/universal:packageBin work to build a binary distribution.
> >> >> I'd like to create rpm/deb package in the future...
> >> >> I'll contribute this task if you don't have any concerns.
> >> >>
> >> >> Regards,
> >> >>  shinsuke
> >> >>
> >>
>


Re: Binary distribution with sbt native packager

2017-02-16 Thread Kenneth Chan
How does rpm/deb handle a PIO version upgrade?


On Thu, Feb 16, 2017 at 8:04 PM Shinsuke Sugaya 
wrote:

> > How are you handling submitting PIO's artifact to Spark
>
> Not changed at the moment.
> To keep make-distribution.sh, ZIP distribution generated
> by sbt native packager is the same structure as PredictionIO-*.tar.gz.
> For RPM/DEB package, I think that we can use pio command
> by modifying log message handling.
>
> -shinsuke
>
>
> 2017-02-17 10:50 GMT+09:00 Donald Szeto :
> > This would be a great addition! How are you handling submitting PIO's
> > artifact to Spark, specifically the CreateWorkflow class in tools?
> >
> > Regards,
> > Donald
> >
> > On Thu, Feb 16, 2017 at 4:10 PM Shinsuke Sugaya 
> > wrote:
> >
> >> Hi
> >>
> >> Do you have a plan to use sbt native packager?
> >> In our forked branch, I added it under assembly directory.
> >> https://github.com/jpioug/incubator-predictionio
> >> Currently, both make-distribution.sh and
> >> sbt assembly/universal:packageBin work to build a binary distribution.
> >> I'd like to create rpm/deb package in the future...
> >> I'll contribute this task if you don't have any concerns.
> >>
> >> Regards,
> >>  shinsuke
> >>
>


Re: Using persisted model

2017-02-15 Thread Kenneth Chan
If you want to deploy a specific version of a trained model, you can specify
the engine instance id, which you can obtain after training finishes:

pio deploy --engine-instance-id id
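
For scripting this, a minimal sketch of building that command line from a known engine instance id is shown below. The helper name `deploy_command` and the placeholder id are hypothetical; only the `pio deploy --engine-instance-id` form comes from the command above.

```python
import shlex
from typing import List


def deploy_command(engine_instance_id: str) -> List[str]:
    """Build the argv for deploying one specific trained engine instance."""
    if not engine_instance_id:
        raise ValueError("engine_instance_id must not be empty")
    return ["pio", "deploy", "--engine-instance-id", engine_instance_id]


# "EXAMPLE_INSTANCE_ID" is a placeholder, not a real engine instance id.
argv = deploy_command("EXAMPLE_INSTANCE_ID")
print(shlex.join(argv))
```

The resulting list could be passed to `subprocess.run` on a machine where `pio` is installed. Whether a second engine can load a model trained by the first still depends on both using the same storage configuration.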


On Tue, Feb 14, 2017 at 8:50 AM Zehao Zhang  wrote:

> Hi, I'm trying to do a project where there are two identical engines. They
> use the naive bayes model which is automatically persisted by predictionio.
> However, I would like to make it so that either engine can deploy directly
> with the model trained and persisted by the other engine. Is it possible to
> do that?
>
> Thanks
>


Re: [VOTE] Apache PredictionIO (incubating) 0.10.0 Release (RC5)

2016-10-01 Thread Kenneth Chan
+1

On Saturday, October 1, 2016, Pat Ferrel  wrote:

> +1 binding
>
> On Oct 1, 2016, at 10:20 AM, Suneel Marthi wrote:
>
> +1 binding
>
> On Sat, Oct 1, 2016 at 12:05 PM, Matthew Tovbin wrote:
>
> > +1
> >
> > - Matthew
> >
> > On Oct 1, 2016 00:18, "Donald Szeto" wrote:
> >
> >> This is the vote for 0.10.0 of Apache PredictionIO (incubating).
> >>
> >> The vote will run for at least 72 hours and will close on Oct 3rd, 2016.
> >>
> >> RC2 adds the "apache-" prefix to artifact filenames.
> >>
> >> RC3 adds on top of RC2 with proper licenses and notices embedded in the
> >> Maven artifacts. It also changes the license of the documentation from
> >> Creative Commons to APLv2.
> >>
> >> RC4 fixes a build error of RC3.
> >>
> >> RC5 fixes issues raised by the IPMC:
> >> - Removed 3rd party dependencies from documentation sources
> >> - Fixed incorrect licensing of semver.sh from ASF to BSD
> >> - Moved MySQL connector to optional scope
> >>
> >> The release candidate artifacts can be downloaded here:
> >> https://dist.apache.org/repos/dist/dev/incubator/predictionio/0.10.0-incubating-rc5/
> >>
> >> Test results of RC5 can be found here:
> >> https://travis-ci.org/apache/incubator-predictionio/builds/164221633
> >>
> >> Maven artifacts are built from the release candidate artifacts above, and
> >> are provided as convenience for testing with engine templates. The Maven
> >> artifacts are provided at the Maven staging repo here:
> >> https://repository.apache.org/content/repositories/orgapachepredictionio-1009/
> >>
> >> All JIRAs completed for this release are tagged with 'FixVersion = 0.10.0'.
> >> You can view them here:
> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320420=12337844
> >>
> >> The artifacts have been signed with Key : 8BF4ABEB
> >>
> >> Please vote accordingly:
> >>
> >> [ ] +1, accept RC as the official 0.10.0 release
> >> [ ] -1, do not accept RC as the official 0.10.0 release because...
> >>
> >
>
>


Re: HTTP Post to events.json with eventId key

2016-09-29 Thread Kenneth Chan
Not sure why.

- Are you testing with the HTTP API?
- Did you try using the LEvents client to insert the event, to make sure it
works before using HTTP?

To debug, you can use the LEvents client to insert the same event in a
pio-shell, or write a small test program, and see what happens:

https://github.com/apache/incubator-predictionio/blob/develop/data/src/main/scala/org/apache/predictionio/data/storage/hbase/HBLEvents.scala#L99

example tests:
https://github.com/apache/incubator-predictionio/blob/develop/data/src/test/scala/org/apache/predictionio/data/storage/LEventsSpec.scala

Kenneth


On Thu, Sep 29, 2016 at 1:34 PM, Hasan Can Saral <hasancansa...@gmail.com>
wrote:

> Yes, I am using the exact time stamp, yet I am creating an event with a
> brand new eventId. Any thoughts?
>
> On Wed, Sep 28, 2016 at 5:15 AM, Kenneth Chan <kenn...@apache.org> wrote:
>
> > when you update event with the same eventId, does the new event have the
> > same eventTime?
> >
> > the eventTime is also used as Hbase's cell timestamp (versions)
> > https://github.com/apache/incubator-predictionio/blob/develop/data/src/main/scala/org/apache/predictionio/data/storage/hbase/HBEventsUtil.scala#L164
> >
> >
> > On Tue, Sep 27, 2016 at 4:44 AM, Hasan Can Saral <hasancansa...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I have a question regarding the RowKey generation in eventToPut in
> > > HBEventsUtil. If that is the wrong place to ask, please correct me.
> > >
> > > I started browsing the source code to see if I can implement updating of
> > > events with HTTP put requests in EventServer.
> > >
> > > So basically in HBLEvents, eventsToPut in HBEventsUtil is called an within
> > > eventsToPut, which generates a unique RowKey object if it is None.
> > >
> > > But when the event.eventId.map is there, it simply proceeds with
> > > RowKey(id). Then regarding this <http://stackoverflow.com/a/13685752>, the
> > > value should be updated.
> > >
> > > Hence, if I include "eventId" key in the JSON I post to events.json, I
> > > understand that the document/value in HBase should be updated.
> > >
> > > However, I am receiving a new document/value with a different key than the
> > > one I post, which means I am missing something. I would appreciate if you
> > > could help me with it.
> > >
> > > Thanks,
> > > Hasan
> > >
> > > --
> > >
> > > Hasan Can Saral
> > > hasancansa...@gmail.com
> > >
> >
>
>
>
> --
>
> Hasan Can Saral
> hasancansa...@gmail.com
>


Re: HTTP Post to events.json with eventId key

2016-09-27 Thread Kenneth Chan
When you update an event with the same eventId, does the new event have the
same eventTime?

The eventTime is also used as the HBase cell timestamp (versions):
https://github.com/apache/incubator-predictionio/blob/develop/data/src/main/scala/org/apache/predictionio/data/storage/hbase/HBEventsUtil.scala#L164
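
To show why the eventTime matters here, below is a toy in-memory model of a versioned cell. It is not the HBase client API and not PIO code: `VersionedCell`, the row key "event-123" and the values are hypothetical. It only mimics the general behaviour that a write with the same row key and timestamp replaces the stored version, a write with a different timestamp adds another version, and a plain read returns the version with the newest timestamp.

```python
from collections import defaultdict


class VersionedCell:
    """Toy model of one cell: values are kept per timestamp (version)."""

    def __init__(self):
        self._versions = {}

    def put(self, timestamp, value):
        # Same timestamp: the existing version is overwritten.
        # New timestamp: an additional version is stored.
        self._versions[timestamp] = value

    def get(self):
        # A read without an explicit version returns the newest timestamp.
        return self._versions[max(self._versions)]

    def version_count(self):
        return len(self._versions)


table = defaultdict(VersionedCell)  # row key -> cell

row_key = "event-123"  # hypothetical eventId used as the row key
table[row_key].put(timestamp=1000, value={"rating": 3})

# Update with the same eventTime: the stored version is replaced.
table[row_key].put(timestamp=1000, value={"rating": 4})
print(table[row_key].version_count(), table[row_key].get())

# Update with an older eventTime: a second version appears, but a plain
# read still returns the value with the newest timestamp.
table[row_key].put(timestamp=500, value={"rating": 5})
print(table[row_key].version_count(), table[row_key].get())
```

So if an "update" is sent with an eventTime older than the stored one, a read can still return the earlier write, which can look as if the update was lost.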


On Tue, Sep 27, 2016 at 4:44 AM, Hasan Can Saral 
wrote:

> Hi all,
>
> I have a question regarding the RowKey generation in eventToPut in
> HBEventsUtil. If that is the wrong place to ask, please correct me.
>
> I started browsing the source code to see if I can implement updating of
> events with HTTP put requests in EventServer.
>
> So basically in HBLEvents, eventsToPut in HBEventsUtil is called an within
> eventsToPut, which generates a unique RowKey object if it is None.
>
> But when the event.eventId.map is there, it simply proceeds with
> RowKey(id). Then regarding this <http://stackoverflow.com/a/13685752>, the
> value should be updated.
>
> Hence, if I include "eventId" key in the JSON I post to events.json, I
> understand that the document/value in HBase should be updated.
>
> However, I am receiving a new document/value with a different key than the
> one I post, which means I am missing something. I would appreciate if you
> could help me with it.
>
> Thanks,
> Hasan
>
> --
>
> Hasan Can Saral
> hasancansa...@gmail.com
>


Re: Remove engine registration

2016-09-16 Thread Kenneth Chan
Pat, would you explain more about the 'instanceId', as in
`pio register --variant path/to/some-engine.json --instanceId
some-REST-compatible-resource-id`?

Currently PIO also has a concept of engineInstanceId, which is the output of
train. I think you are referring to a different thing, right?

Kenneth


On Fri, Sep 16, 2016 at 12:58 PM, Pat Ferrel  wrote:

> This is a great discussion topic and a great idea.
>
> However the cons must also be addressed, we will need to do this before
> multi-tenant deploys can happen and the benefits are just as large as
> removing `pio build`
>
> It would be great to get rid of manifest.json and put all metadata in the
> store with an externally visible id so all parts of the workflow on all
> machines will get the right metadata and any template specific commands
> will run from anywhere on any cluster machine and in any order. All we need
> is a global engine-instance id. This will make engine-instances behave more
> like datasets, which are given permanent ids with `pio app new …` This
> might be a new form of `pio register` and it implies a new optional param
> to pio template specific commands (the instance id) but removes a lot of
> misunderstandings people have and easy mistakes in workflow.
>
> So workflow would be:
> 1) build with SBT/mvn
> 2) register any time engine.json changes so make the json file an optional
> param to `pio register --variant path/to/some-engine.json --instanceId
> some-REST-compatible-resource-id` the instance could also be
> auto-generated and output or optionally in the engine.json. `pio engine
> list` lists registered instances with instanceId. The path to the binary
> would be put in the instanceId and would be expected to be the same on all
> cluster machines that need it.
> 3) `pio train --instanceId` optional if it’s in engine.json
> 4) `pio deploy --instanceId` optional if it’s in engine.json
> 5) with easily recognized exceptions all the above can happen in any order
> on any cluster machine and from any directory.
>
> This takes one big step to multi-tenancy since the instance data has an
> externally visible id—call it a REST resource id…
>
> I bring this up not to confuse the issue but because if we change the
> workflow commands we should avoid doing it often because of the disruption
> it brings.
>
>
> On Sep 16, 2016, at 10:42 AM, Donald Szeto  wrote:
>
> Hi all,
>
> I want to start the discussion of removing engine registration. How many
> people actually take advantage of being able to run pio commands everywhere
> outside of an engine template directory? This will be a nontrivial change
> on the operational side so I want to gauge the potential impact to existing
> users.
>
> Pros:
> - Stateless build. This would work well with many PaaS.
> - Eliminate the "pio build" command once and for all.
> - Ability to use your own build system, i.e. Maven, Ant, Gradle, etc.
> - Potentially better experience with IDE since engine templates no longer
> depends on an SBT plugin.
>
> Cons:
> - Inability to run pio engine training and deployment commands outside of
> engine template directory.
> - No automatic version matching of PIO binary distribution and artifacts
> version used in the engine template.
> - A less unified user experience: from pio-build-train-deploy to build,
> then pio-train-deploy.
>
> Regards,
> Donald
>
>