Donald,
It would be great to collaborate on that!
- Matthew
On Sat, Apr 14, 2018, 10:23 Pat Ferrel wrote:
> The need for Spark at query time depends on the engine. Which are you
> using? The Universal Recommender, which I maintain, does not require Spark
> for queries but uses PIO. We simply don’t use the Spark context, so it is
> ignored. To make PIO work you need to have the Spark code accessible, but
> that doesn’t mean there must be a Spark cluster: you can set the Spark
> master to “local”, and then no Spark resources are used in the deployed PIO
> PredictionServer.
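As a sketch of the “local” master approach (the exact flags depend on your PIO version; `--master local` here is an assumption based on pio’s pass-through of arguments after `--` to spark-submit):

```shell
# Deploy the engine without a Spark cluster.
# Arguments after "--" are passed through to spark-submit, so
# "--master local" keeps the (unused) SparkContext inside the driver JVM.
pio deploy -- --master local
```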
>
> We have infra code to spin up a Spark cluster for training and bring it
> back down afterward. This all works just fine. The UR PredictionServer also
> has no need to be re-deployed, since the model is hot-swapped after
> training: deploy once, run forever. And there is no real requirement for
> Spark to do queries.
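The train-then-tear-down cycle described above might look like this (a sketch: `spin-up-cluster.sh`, `tear-down-cluster.sh`, and the master URL are hypothetical placeholders for your own infra):

```shell
# Bring up a temporary Spark cluster, train, then shut the cluster down.
# The deployed PredictionServer keeps running and hot-swaps the new model.
./spin-up-cluster.sh                              # hypothetical infra script
pio train -- --master spark://spark-master:7077   # master URL is an assumption
./tear-down-cluster.sh                            # hypothetical infra script
```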
>
> So, depending on the engine, the requirement for Spark is code-level, not
> system-level.
>
>
> From: Donald Szeto
> Reply: user@predictionio.apache.org
>
> Date: April 13, 2018 at 4:48:15 PM
> To: user@predictionio.apache.org
>
> Subject: Re: pio deploy without spark context
>
> Hi George,
>
> This is unfortunately not possible now without modifying the source code,
> but we are planning to refactor PredictionIO to be runtime-agnostic,
> meaning the engine server would be independent and SparkContext would not
> be created if not necessary.
>
> We will start a discussion on the refactoring soon. You are very welcome
> to add your input then, and any subsequent contribution would be highly
> appreciated.
>
> Regards,
> Donald
>
> On Fri, Apr 13, 2018 at 3:51 PM George Yarish
> wrote:
>
>> Hi all,
>>
>> We use a PIO engine which doesn't require Apache Spark at serving time,
>> but from my understanding a SparkContext will be created by the "pio
>> deploy" process by default.
>> My question: is there any way to deploy an engine while avoiding the
>> creation of a Spark application if I don't need it?
>>
>> Thanks,
>> George
>>
>>