Re: Kappa vs Lambda. As far as I understand, at a high level, Kappa is more or less a subset of Lambda (i.e. keep only the real-time part): https://www.ericsson.com/research-blog/data-knowledge/data-processing-architectures-lambda-and-kappa/
Georg, would you be more specific about what you mean by "latency requirement"? 1. The latency of training a model with new data? 2. The latency of deploying a new model? Or 3. the latency of getting a predicted result from the previously trained model given a query? If you are talking about 3, it depends on how your model computes the prediction; it doesn't need Spark if the model fits into memory.

On Mon, Sep 26, 2016 at 9:41 PM, Georg Heiler <[email protected]> wrote:

> Hi Donald,
> For me it is more about stacking and meta learning. The selection of
> models could be performed offline.
>
> But:
> 1. I am concerned about keeping the model up to date - retraining.
> 2. Having some sort of reinforcement learning to improve / punish based on
> the correctness of new ground truth, which arrives about once a month.
> 3. Having very quick responses, especially something like an evaluation of a
> random forest / GBT / neural net without starting a YARN job.
>
> Thank you all for the feedback so far.
> Best regards,
> Georg
>
> Donald Szeto <[email protected]> wrote on Tue, Sep 27, 2016 at 06:34:
>
>> Sorry for side-tracking. I think the Kappa architecture is a promising
>> paradigm, but including batch processing from the canonical store to the
>> serving-layer store should still be necessary. I believe this somewhat
>> hybrid Kappa-Lambda architecture would be generic enough to handle many use
>> cases. If this is something that sounds good to everyone, we should drive
>> PredictionIO in that direction.
>>
>> Georg, are you talking about updating an existing model in different
>> ways, evaluating the variants, and selecting one within a time constraint,
>> say every 1 second?
>>
>> On Mon, Sep 26, 2016 at 4:11 PM, Pat Ferrel <[email protected]> wrote:
>>
>>> If you need the model updated in real time you are talking about a Kappa
>>> architecture, and PredictionIO does not support that. It does Lambda only.
>>>
>>> The MLlib-based recommenders use live contexts to serve from in-memory
>>> copies of the ALS models, but the models themselves are calculated in the
>>> background. There are several scaling issues with doing this, but it can
>>> be done.
>>>
>>> On Sep 25, 2016, at 10:23 AM, Georg Heiler <[email protected]> wrote:
>>>
>>> Wow, thanks. This is a great explanation.
>>>
>>> So when I think about writing a Spark template for fraud detection (a
>>> combination of Spark SQL and XGBoost) that would require <1 second latency,
>>> how should I store the model?
>>>
>>> As far as I know, the startup of YARN jobs, e.g. a Spark job, is too slow
>>> for that. So it would be great if the model could be evaluated without
>>> using the cluster, or at least with a hot Spark context similar to
>>> spark-jobserver or SnappyData.io <http://snappydata.io>. Is this possible
>>> with prediction.io?
>>>
>>> Regards,
>>> Georg
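A minimal sketch of the "hot context" serving Georg asks about above, assuming a
model has already been trained by a batch job: a long-running, local SparkContext
loads an MLlib random forest once, and every subsequent prediction is an in-memory
tree traversal rather than a new YARN application. The model path and feature
layout are hypothetical.

    // Sketch only: serve a previously trained MLlib model from a hot, local
    // SparkContext so a query never waits for YARN job startup.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.tree.model.RandomForestModel

    object HotModelServer {
      // Started once when the serving process boots; kept alive between queries.
      val sc = new SparkContext(
        new SparkConf().setMaster("local[*]").setAppName("hot-model-serving"))

      // Hypothetical path; loaded once, then held in memory.
      val model: RandomForestModel =
        RandomForestModel.load(sc, "hdfs:///models/fraud-rf")

      // Pure in-memory evaluation: no job is submitted per prediction.
      def score(features: Array[Double]): Double =
        model.predict(Vectors.dense(features))

      def main(args: Array[String]): Unit =
        println(score(Array(0.5, 120.0, 1.0)))
    }

The same pattern applies to the ALS case Pat mentions below: the model is computed
in the background while a live context holding the factors answers queries from
memory.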
>>> Pat Ferrel <[email protected]> wrote on Sun, Sep 25, 2016 at 18:19:
>>>
>>>> Gustavo is correct. To put it another way, both Oryx and PredictionIO are
>>>> based on what is called a Lambda architecture. Loosely speaking this means
>>>> that a potentially slow background task computes the predictive "model",
>>>> but this does not interfere with serving queries. Then, when the model is
>>>> ready (stored in HDFS or Elasticsearch depending on the template), it is
>>>> deployed and the switch happens in microseconds.
>>>>
>>>> In the case of the Universal Recommender the model is stored in
>>>> Elasticsearch. During `pio train` the new model is inserted into
>>>> Elasticsearch and indexed. Once the indexing is done, the index alias used
>>>> to serve queries is switched to the new index in one atomic action, so
>>>> there is no downtime and any slow operation happens in the background
>>>> without impeding queries.
>>>>
>>>> The answer will vary somewhat with the template. Templates that use
>>>> HDFS for storage may need to be re-deployed, but even then the switch from
>>>> using the old model to having the new one running takes microseconds.
>>>>
>>>> PMML is not relevant to the discussion above and is in any case useless
>>>> for many model types, including recommenders. If you look carefully at how
>>>> it is implemented in Oryx you will see that the PMML "models" for
>>>> recommenders are not actually stored as PMML; only a minimal description
>>>> of where the real data is stored is in PMML. Remember that PMML has all
>>>> the problems of XML, including no good way to read it in parallel.
>>>>
>>>> On Sep 25, 2016, at 7:47 AM, Gustavo Frederico <[email protected]> wrote:
>>>>
>>>> I understand that querying in PredictionIO is very fast, as if it
>>>> were an Elasticsearch query. Also recall that training happens at a
>>>> different moment and often takes a long time in most learning
>>>> systems, but as long as it is not ridiculously long, that doesn't matter
>>>> much.
>>>>
>>>> Gustavo
>>>>
>>>> On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler <[email protected]> wrote:
>>>> > Hi PredictionIO users,
>>>> > I wonder what the delay is when an engine evaluates a model in
>>>> > prediction.io. Are the models cached?
>>>> >
>>>> > Another project, http://oryx.io/, generates PMML which can be evaluated
>>>> > quickly from a production application.
>>>> >
>>>> > I believe the latency until the prediction happens is very often
>>>> > overlooked. How does PredictionIO handle this?
>>>> >
>>>> > Best regards,
>>>> > Georg
>>>>
>>>
>>
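For illustration, the atomic alias switch Pat describes can be pictured as a
single Elasticsearch `_aliases` request that removes the serving alias from the
old index and adds it to the new one in one action, so queries either hit the
complete old model or the complete new one. This is only a REST-level sketch of
the idea, not the Universal Recommender's actual code; the host, index, and alias
names are made up.

    // Sketch only: an atomic Elasticsearch alias swap of the kind Pat describes.
    // One _aliases call moves the serving alias from the old model index to the
    // freshly built one; queries never see a half-indexed model.
    import java.net.{HttpURLConnection, URL}
    import java.nio.charset.StandardCharsets

    object AliasSwap {
      def main(args: Array[String]): Unit = {
        // Hypothetical index and alias names.
        val body =
          """{
            |  "actions": [
            |    { "remove": { "index": "ur_model_v1", "alias": "ur_model" } },
            |    { "add":    { "index": "ur_model_v2", "alias": "ur_model" } }
            |  ]
            |}""".stripMargin

        val conn = new URL("http://localhost:9200/_aliases")
          .openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod("POST")
        conn.setRequestProperty("Content-Type", "application/json")
        conn.setDoOutput(true)
        conn.getOutputStream.write(body.getBytes(StandardCharsets.UTF_8))
        println(s"alias swap returned HTTP ${conn.getResponseCode}")
        conn.disconnect()
      }
    }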
