Indeed. I am mostly interested in 2).

Kenneth Chan <[email protected]> wrote on Tue, Sep 27, 2016 at 9:22 AM:
> Just want to clarify, since you mix "evaluation" and "prediction":
> 1. "evaluation" means evaluating the performance of the model
> 2. "prediction" means calculating the prediction (fraud or not, in your case)
> I understand that evaluation requires generating predictions in order to evaluate the accuracy of the model. But you are referring to low latency for 2, right?
>
> "Regarding the low volume data: some features will require some sort of SQL for extraction"
> So if this can be fast, and if using the model to calculate the likelihood of fraud can be done in memory (without RDDs), then the latency should be low.
>
> On Tue, Sep 27, 2016 at 12:04 AM, Georg Heiler <[email protected]> wrote:
>
>> For me, the latency of model evaluation is more important than training latency. This holds true for retraining / model updates as well. I would say that the "evaluation / prediction" latency is the most critical one.
>>
>> Your point regarding 3) is very interesting for me. I have 2 types of data:
>>
>> - low volume information about a customer
>> - high volume usage data
>>
>> The high-volume data will require aggregation (e.g. Spark SQL) before the model can be evaluated. Here, a higher latency would be OK. Regarding the low-volume data: some features will require some sort of SQL for extraction.
>>
>> Kenneth Chan <[email protected]> wrote on Tue, Sep 27, 2016 at 7:43 AM:
>>
>>> Re: Kappa vs. Lambda. As far as I understand, at a high level, Kappa is more like a subset of Lambda (i.e. only keeping the real-time part).
>>>
>>> https://www.ericsson.com/research-blog/data-knowledge/data-processing-architectures-lambda-and-kappa/
>>>
>>> Georg, could you be more specific about what you mean by "latency requirement"?
>>>
>>> 1. latency of training a model with new data?
>>> 2. latency of deploying a new model? or
>>> 3. latency of getting a predicted result for a query using the previously trained model?
>>>
>>> If you are talking about 3, it depends on how your model calculates the prediction. It doesn't need Spark if the model can fit into memory.
>>>
>>> On Mon, Sep 26, 2016 at 9:41 PM, Georg Heiler <[email protected]> wrote:
>>>
>>>> Hi Donald,
>>>> For me it is more about stacking and meta-learning. The selection of models could be performed offline. But:
>>>> 1. I am concerned about keeping the model up to date (retraining).
>>>> 2. I would like some sort of reinforcement learning to improve / punish based on the correctness of new ground truth arriving once per month.
>>>> 3. I need very quick responses, especially something like evaluating a random forest / GBT / nnet without starting a YARN job.
>>>>
>>>> Thank you all for the feedback so far.
>>>> Best regards,
>>>> Georg
>>>>
>>>> Donald Szeto <[email protected]> wrote on Tue, Sep 27, 2016 at 6:34 AM:
>>>>
>>>>> Sorry for side-tracking. I think the Kappa architecture is a promising paradigm, but including batch processing from the canonical store to the serving-layer store should still be necessary. I believe this somewhat hybrid Kappa-Lambda architecture would be generic enough to handle many use cases. If this sounds good to everyone, we should drive PredictionIO in that direction.
>>>>>
>>>>> Georg, are you talking about updating an existing model in different ways, evaluating the variants, and selecting one within a time constraint, say every 1 second?
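Kenneth's point that prediction itself does not need Spark if the model fits in memory can be made concrete: training stays a background cluster job, while the serving path only touches a small in-process object. Below is a minimal, hypothetical Scala sketch of such an in-memory scorer for a fraud probability; the weight-file format, path and feature values are invented for illustration and are not part of any PredictionIO template.

import scala.io.Source

// Hypothetical in-memory scorer: weights are trained offline (e.g. by a Spark job)
// and written to a plain text file; serving only needs this small object in memory.
class InMemoryFraudScorer(weightsFile: String) {
  // one "bias w1 w2 ..." line, produced by the offline training step (assumed format)
  private val weights: Array[Double] =
    Source.fromFile(weightsFile).getLines().next().split("\\s+").map(_.toDouble)

  private def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  /** Score one query; features must be in the same order used at training time. */
  def fraudProbability(features: Array[Double]): Double = {
    require(features.length == weights.length - 1, "feature/weight length mismatch")
    val z = weights(0) + features.indices.map(i => weights(i + 1) * features(i)).sum
    sigmoid(z)
  }
}

object ScoringExample extends App {
  val scorer = new InMemoryFraudScorer("/tmp/fraud-model-weights.txt") // hypothetical path
  val p = scorer.fraudProbability(Array(0.3, 12.0, 1.0))               // made-up features
  println(f"fraud probability: $p%.4f")
}

A long-running process holding such an object can answer individual queries in well under a second; only retraining needs the cluster.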
>>>>>
>>>>> On Mon, Sep 26, 2016 at 4:11 PM, Pat Ferrel <[email protected]> wrote:
>>>>>
>>>>>> If you need the model updated in real time you are talking about a Kappa architecture, and PredictionIO does not support that. It does Lambda only.
>>>>>>
>>>>>> The MLlib-based recommenders use live contexts to serve from in-memory copies of the ALS models, but the models themselves are calculated in the background. There are several scaling issues with doing this, but it can be done.
>>>>>>
>>>>>> On Sep 25, 2016, at 10:23 AM, Georg Heiler <[email protected]> wrote:
>>>>>>
>>>>>> Wow, thanks. This is a great explanation.
>>>>>>
>>>>>> So when I think about writing a Spark template for fraud detection (a combination of Spark SQL and XGBoost) that requires <1 second latency, how should I store the model?
>>>>>>
>>>>>> As far as I know, the startup of a YARN job, e.g. a Spark job, is too slow for that. So it would be great if the model could be evaluated without using the cluster, or at least with a hot Spark context similar to spark-jobserver or SnappyData.io <http://snappydata.io>. Is this possible with prediction.io?
>>>>>>
>>>>>> Regards,
>>>>>> Georg
>>>>>>
>>>>>> Pat Ferrel <[email protected]> wrote on Sun, Sep 25, 2016 at 6:19 PM:
>>>>>>
>>>>>>> Gustavo is correct. To put it another way, both Oryx and PredictionIO are based on what is called a Lambda architecture. Loosely speaking, this means a potentially slow background task computes the predictive "model", but this does not interfere with serving queries. Then, when the model is ready (stored in HDFS or Elasticsearch depending on the template), it is deployed and the switch happens in microseconds.
>>>>>>>
>>>>>>> In the case of the Universal Recommender the model is stored in Elasticsearch. During `pio train` the new model is inserted into Elasticsearch and indexed. Once the indexing is done, the index alias used to serve queries is switched to the new index in one atomic action, so there is no downtime and any slow operation happens in the background without impeding queries.
>>>>>>>
>>>>>>> The answer will vary somewhat with the template. Templates that use HDFS for storage may need to be re-deployed, but even then the switch from the old model to the new one takes microseconds.
>>>>>>>
>>>>>>> PMML is not relevant to the discussion above and is in any case useless for many model types, including recommenders. If you look carefully at how this is implemented in Oryx, you will see that the PMML "models" for recommenders are not actually stored as PMML; only a minimal description of where the real data is stored is in PMML. Remember that PMML has all the problems of XML, including no good way to read it in parallel.
>>>>>>>
>>>>>>> On Sep 25, 2016, at 7:47 AM, Gustavo Frederico <[email protected]> wrote:
>>>>>>>
>>>>>>> I understand that querying in PredictionIO is very fast, as if it were an Elasticsearch query. Also recall that training is a separate step that often takes a long time in most learning systems, but as long as it's not ridiculously long, it doesn't matter that much.
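The atomic switch Pat describes corresponds to Elasticsearch's alias API: the new model is written to a fresh index, and a single _aliases request removes the serving alias from the old index and adds it to the new one as one atomic action. A rough Scala sketch of that step follows; it uses only the standard library HTTP client, the host, index and alias names are made up, and the Universal Recommender's actual implementation may differ.

import java.io.OutputStreamWriter
import java.net.{HttpURLConnection, URL}

object AliasSwapSketch extends App {
  // Hypothetical names: the freshly built index and the alias that queries are served from.
  val oldIndex = "ur_model_20160924"
  val newIndex = "ur_model_20160925"
  val alias    = "ur_model"

  // One _aliases request: remove + add are applied atomically by Elasticsearch,
  // so queries never see a moment without a model behind the alias.
  val body =
    s"""{
       |  "actions": [
       |    { "remove": { "index": "$oldIndex", "alias": "$alias" } },
       |    { "add":    { "index": "$newIndex", "alias": "$alias" } }
       |  ]
       |}""".stripMargin

  val conn = new URL("http://localhost:9200/_aliases")
    .openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setDoOutput(true)
  conn.setRequestProperty("Content-Type", "application/json")
  val out = new OutputStreamWriter(conn.getOutputStream)
  out.write(body)
  out.close()
  println(s"alias swap returned HTTP ${conn.getResponseCode}")
  conn.disconnect()
}

Because the slow work (training and indexing) happens against the new index while the alias still points at the old one, the serving layer never blocks, which is exactly the Lambda behaviour described above.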
>>>>>>>
>>>>>>> Gustavo
>>>>>>>
>>>>>>> On Sun, Sep 25, 2016 at 2:30 AM, Georg Heiler <[email protected]> wrote:
>>>>>>> > Hi PredictionIO users,
>>>>>>> > I wonder what the delay is when an engine evaluates a model in prediction.io. Are the models cached?
>>>>>>> >
>>>>>>> > Another project, http://oryx.io/, generates PMML, which can be evaluated quickly from a production application.
>>>>>>> >
>>>>>>> > I believe that the latency until the prediction happens is very often overlooked. How does PredictionIO handle this topic?
>>>>>>> >
>>>>>>> > Best regards,
>>>>>>> > Georg
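On the PMML route Georg mentions (the Oryx approach): a PMML file produced by an offline training job can be scored inside an ordinary JVM service with no cluster involved, for example with the JPMML Evaluator library. The sketch below is only an assumption-laden illustration: the class and method names are the JPMML Evaluator API as I recall it from around this period and may differ between versions, and the model path and field names are invented.

import java.io.FileInputStream
import scala.collection.JavaConverters._

import org.dmg.pmml.FieldName
import org.jpmml.evaluator.{EvaluatorUtil, ModelEvaluatorFactory}
import org.jpmml.model.PMMLUtil

object PmmlScoringSketch extends App {
  // Load the PMML document once at service start-up; scoring then stays in memory.
  val pmml = {
    val in = new FileInputStream("/models/fraud.pmml") // hypothetical path
    try PMMLUtil.unmarshal(in) finally in.close()
  }
  val evaluator = ModelEvaluatorFactory.newInstance().newModelEvaluator(pmml)
  evaluator.verify()

  // A made-up query; keys must match the field names in the PMML DataDictionary.
  val query: Map[String, AnyRef] =
    Map("amount" -> Double.box(120.5), "num_tx_last_24h" -> Int.box(7))

  // Prepare each declared input field from the raw query value.
  val arguments = evaluator.getInputFields.asScala.map { field =>
    val name = field.getName
    name -> field.prepare(query.get(name.getValue).orNull)
  }.toMap.asJava

  val results = evaluator.evaluate(arguments)
  val target  = evaluator.getTargetFields.asScala.head.getName
  println(s"prediction: ${EvaluatorUtil.decode(results.get(target))}")
}

As Pat notes above, this only applies to model types that PMML can actually express; recommender models generally cannot be exported this way.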
