[ https://issues.apache.org/jira/browse/PIO-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Naoki Takezoe resolved PIO-192.
-------------------------------
       Resolution: Done
    Fix Version/s: 0.14.0

> Enhance PySpark support
> -----------------------
>
>                 Key: PIO-192
>                 URL: https://issues.apache.org/jira/browse/PIO-192
>             Project: PredictionIO
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.13.0
>            Reporter: Takako Shimamoto
>            Assignee: Takako Shimamoto
>            Priority: Major
>             Fix For: 0.14.0
>
>
> h3. Summary
> Enhance pypio, the Python API for PIO.
> h3. Goals
> The current Python support requires developers to have access to sbt.
> This enhancement removes the build phase.
> h3. Description
> A Python engine requires nothing beyond the pypio module. Developers can use
> pypio from a Jupyter notebook or from plain Python code.
> First, import the necessary modules.
> {code:python}
> import pypio
> {code}
> Once the module is imported, the first step is to initialize it.
> {code:python}
> pypio.init()
> {code}
> Next, read events from the event store.
> {code:python}
> event_df = pypio.find_events('BHPApp')
> {code}
> Then train the model and save it.
> {code:python}
> from pyspark.ml import Pipeline
>
> # model is a PipelineModel, which is produced by a Pipeline's fit() method
> pipeline = Pipeline(...)
> model = pipeline.fit(train_df)
> engine_instance_id = pypio.save_model(model, ["prediction"])
> {code}
> h4. Run & Deploy
> h5. Run Jupyter
> {code:sh}
> pio-shell --with-pyspark
> {code}
> h5. Run on Spark
> {code:sh}
> pio train --main-py-file xxxx.py
> {code}
> h5. Deploy App
> {code:sh}
> pio deploy --engine-instance-id <engine_instance_id>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)