[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2017-01-17 Thread chanlee514
Github user chanlee514 commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@pferrel: Yes the hash of engine directory path is used as engineVersion 
(which kind of acts as the id). 

However, I'm working on making updates to this PR so that users can provide 
the directory path as CLI argument for `pio build/train/deploy/undeploy` 
commands. This will enable users to run pio commands anywhere, and hopefully 
make this issue clearer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2017-01-15 Thread pferrel
Github user pferrel commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
Actually the manifest may be gone but the requirement of the same absolute 
path is maintained, I believe. A hash of the path to the template code is used 
to create the id and is part of the metadata for the engine. So the "same path" 
remains but the manifest goes away. Correct me @chanlee514 if I'm wrong.

We are trying to create a roadmap that makes these hidden dependencies more 
clear and removing those that we can.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-13 Thread pferrel
Github user pferrel commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@chanlee514 said: 

If the absolute filepath is the same in the machines (since SHA-1 hash of
filepath is used as engineVersion), there would be no changes in the above
workflow. You can just copy the engine directory and deploy. If not, you
can also provide engine-id, engine-version as CLI arguments.

There's some additional ongoing work to make the process more robust and
clearer, such as enabling engine-directory as CLI input, help messages, and
adding tests to cover such scenarios. I don't think the PR will be merged
before then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-05 Thread dszeto
Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@chanlee514 @pferrel I have moved the discussion to dev@.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-02 Thread pferrel
Github user pferrel commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@dszeto asked: "Do you currently have any production deployments that rely 
heavily on engine IDs and versions? That would be a bigger immediate concern 
regarding this change."

Yes. That is why so many questions above. We create a pio driver machine 
that runs training and is then taken down. This is in sync with bringing up 
Spark and taking it down. The pio driver machine is really part of the cluster 
and for the UR (and other PIO templates we have) the predict part runs without 
Spark. So this requires we use some engine instance id across machines to find 
data in the metastore. We also have deployments with 2 EventServers and 2 
PredictionServers behind a load balancer. This requires that the same metadata 
is available to several different machines running, what I've come to see as 3 
independent processes, ES, PS, train. They are independent except for the 
metadata.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-02 Thread pferrel
Github user pferrel commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
ok, this looks like a good step

@dszeto we would not do `pio build` we would do `sbt build` in the 
directory that has the template and this is fine IMO. The confusion users have 
is over pio commands.

BTW @chanlee514 @dszeto Are we thinking of a new command, something like 
`pio register` that would add metadata to the metastore? This would need to be 
run every time the engine.json changed for instance? It would also do not 
compile? Is there an alternative? What state does this leave us in? After the 
push, what action create binary (I assume `pio build`) what action adds 
metadata to the metastore (I assume `pio build`) So this is only about changing 
the instance id and getting rid of manifest, right?

If so I'm totally cool with this, a good step I agree. I guess I'm using 
this PR as a forum for discussion and will move it to @dev 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-01 Thread dszeto
Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@pferrel For `pio build` we probably can enhance it such that it can take a 
working directory for compilation to run. If we look at some other build tools, 
they also rely on having certain files in the current working directory.

For `pio train` and `pio deploy`, I absolutely agree that it should be able 
to run from anywhere as the most advanced scenario. My personal take to this 
change would be something comparable to a local HBase and distributed HBase 
setup. It would be ideal if someone can start playing with PredictionIO without 
setting up any external services (not even a SQL database). I believe Chan's 
change here is a partial step to make the learning curve less steep, and path 
ways for us to untangle engine registration so that it could become a feature 
that can be more easily documented and understandable.

Do you currently have any production deployments that rely heavily on 
engine IDs and versions? That would be a bigger immediate concern regarding 
this change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-01 Thread chanlee514
Github user chanlee514 commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
@pferrel: I agree, and that would be possible with a general metadata 
registry server which would be the next step. The reason I started by removing 
manifest.json is because the previous database-dependent build was a roadblock 
for PredictionIO in PaaS platforms such as Heroku. 

Information in engine.json is written during `pio train` as engineInstance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-predictionio issue #328: [PIO-47] Eliminate enginemanifest for sta...

2016-12-01 Thread chanlee514
Github user chanlee514 commented on the issue:

https://github.com/apache/incubator-predictionio/pull/328
  
Right now, the commands throw java.io.FileNotFoundException regarding 
engine.json. May be better if we throw an error message saying the commands 
should be executed inside engine directory.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---