No, AFAIK this looks like an auth/SSL layer for HDFS.

What are you trying to do? If you connect to PIO from your app servers, which 
are behind your firewall, there may be little need for extra security and SSL is 
probably more than enough. But if you want to connect directly from a mobile 
app or web-page JavaScript you will need much more: at least a proxy, ideally 
with some kind of authentication. Think of PIO like a database; it is not meant 
to be opened directly to the wild internet. 
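The proxy idea above can be sketched with nginx terminating SSL and adding HTTP basic auth in front of the query server. This is only an illustrative config fragment, not part of PIO: the server name, certificate paths, htpasswd file, and the assumption that PIO serves queries on localhost:8000 are all made up for the example.

```
# Hypothetical nginx reverse proxy in front of a PIO PredictionServer.
# Assumes queries are served on 127.0.0.1:8000 and an htpasswd file exists.
server {
    listen 443 ssl;
    server_name predictions.example.com;

    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;

    location /queries.json {
        auth_basic           "PIO queries";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:8000;
    }
}
```

Clients then authenticate to the proxy, and PIO itself stays unreachable from the open internet.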

On Oct 21, 2016, at 10:39 AM, Georg Heiler <[email protected]> wrote:

Thanks again. Do you think https://knox.apache.org/ will be a good fit for that?

Gustavo Frederico <[email protected]> wrote on Fri., Oct 21, 2016 at 19:36:
Georg, if you are talking about having OAuth or some security token to 
authenticate/authorize the requests, that is not directly in the PIO stack. 
What PIO has is the application id, which is included in the requests. If you 
need to encrypt data or authenticate requests, you would need to build that 
logic before the requests arrive at PIO. That's how I see the architecture so 
far... 

Gustavo

On Fri, Oct 21, 2016 at 1:20 PM, Pat Ferrel <[email protected]> wrote:
SSL is supported on the Event and Prediction Servers, but someone else will have 
to answer how. There is a Jira issue to add instructions to the site; not sure if 
that has been cleared, but you might want to check and vote for it: 
https://issues.apache.org/jira/browse/PIO-7?jql=project%20%3D%20PIO

The key can be auto-generated or specified, and is really only an ID for the 
dataset to send events into. It is not used for queries.
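To make that concrete: events carry the key as the `accessKey` query parameter on the EventServer, while queries to the deployed engine carry no key at all. A sketch with curl, assuming the default ports (7070 for events, 8000 for queries); the key value is made up, and the query body shape depends on your engine template:

```shell
# Send an event to the EventServer; the accessKey identifies the target app/dataset.
curl -X POST "http://localhost:7070/events.json?accessKey=YOUR_ACCESS_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "view",
    "entityType": "user",
    "entityId": "u1",
    "targetEntityType": "item",
    "targetEntityId": "i42"
  }'

# Query the deployed PredictionServer; note that no key is involved.
curl -X POST http://localhost:8000/queries.json \
  -H "Content-Type: application/json" \
  -d '{"user": "u1", "num": 4}'
```

Since the key is sent in the clear and queries need none, it provides identification, not authentication, which is why any real security has to sit in front of PIO.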



On Oct 21, 2016, at 9:37 AM, Georg Heiler <[email protected]> wrote:

Thanks a lot for this great answer. 

May I add an additional question regarding the API:
I know PIO generates an API key. For which operations is this key required, and 
is it possible to use encryption and a key with the API in order to, in effect, 
require authentication to obtain a predicted result?


Cheers 
Georg 
Pat Ferrel <[email protected]> wrote on Fri., Oct 21, 2016 at 18:17:
The command line for any pio command that is launched on Spark can specify the 
master, so you can train on one cluster and deploy on another. This is typical 
when using the ALS recommenders, which use a big cluster to train but deploy 
with `pio deploy -- --master local[2]`, which uses a local context to load 
and serve the model. Beware of memory use: wherever the pio command is run, the 
Spark driver also runs, and it can have memory needs as large as the 
executors, which run on the cluster. If you run 2 contexts on the same machine, 
one with a local master and one with a cluster master, you will have 2 drivers 
and may have executors as well.
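The train-on-cluster, serve-locally split described above might look like this on the command line; the cluster master URL and driver memory are assumptions to adjust for your setup:

```shell
# Train on a Spark cluster (arguments after -- are passed through to spark-submit).
pio train -- --master spark://your-cluster:7077 --driver-memory 8g

# Deploy with a local Spark context, used only to load and serve the trained model.
pio deploy -- --master local[2]
```

Note that `pio train` run this way still hosts the Spark driver on the machine where you invoke it, which is the memory caveat mentioned above.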

Yarn allows you to run the driver on a cluster machine but is somewhat 
complicated to set up.



On Oct 21, 2016, at 4:53 AM, Georg Heiler <[email protected]> wrote:

Hi,
I am curious whether prediction.IO supports different environments, e.g. is it 
possible to define a separate Spark context for training and for serving the 
model in engine.json?

The idea is that a trained model, e.g. xgboost, could be evaluated very quickly 
outside of a cluster environment (no yarn, ... involved, only prediction.io 
in Docker with a database + model in the file system).

Cheers,
Georg



