Thanks for the guidance, Daniel and Donald. A few follow-up items to check
my understanding:

Daniel, here is the link to the documentation I think you were referencing
on how to save a model that contains an RDD. I read through it and see a way
to train with an RDD, but I did not see a way to get the SparkContext in the
predict method. From what I can see, the predict method still takes only a
model and a query, not the SparkContext. Am I missing something?
https://predictionio.incubator.apache.org/templates/vanilla/dase/.
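
To check my reading of the compile error quoted further down: it reports the
expected signature as predict(model, query), so a predict that also takes sc
is a separate overload and the abstract method stays unimplemented. Below is
a minimal plain-Scala sketch of that. AlgorithmBase is a hypothetical
stand-in, not the PredictionIO class, and Model, Query, Result, and lookup
are made-up names. Whether a constructor argument like this fits how
PredictionIO instantiates algorithms is an assumption I have not verified.

```scala
// Hypothetical stand-in for the framework base class (NOT the PredictionIO API).
// Only the fixed two-argument predict signature matters for this illustration.
abstract class AlgorithmBase[M, Q, P] {
  def predict(model: M, query: Q): P
}

// Made-up types for the sketch.
final case class Model(bias: Double)
final case class Query(value: Double)
final case class Result(score: Double)

// The extra dependency (a hypothetical lookup function) arrives through the
// constructor, so predict keeps exactly the signature the base class declares.
// Declaring predict(sc, model, query) instead would only add an overload and
// leave the abstract two-argument method undefined.
class LookupAlgorithm(lookup: String => Double)
    extends AlgorithmBase[Model, Query, Result] {
  override def predict(model: Model, query: Query): Result =
    Result(model.bias + query.value + lookup("offset"))
}
```

The point is only that the signature is fixed by the base class, so any
extra data has to reach predict through the model or the algorithm instance.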

Donald, I read through the LEventStore documentation and will start
prototyping with that path, as it does not require the SparkContext to get
data from the EventStore. I would still like to test using the SparkContext
in the predict method if that is possible. Would you also recommend
PAlgorithm as a way to get a SparkContext in the predict method? I'm also
going to look into batch predict. For context, here is my use case and what
I am trying to do in the predict method:

Use Case:

   - Get 100 ids from the Query object
   - Use those 100 ids to filter down an RDD (oppRDD) with events from the
   "Opportunity" EventType
   - Cogroup or join oppRDD with a second RDD (accountRDD) to produce
   combinedRDD, and then return 45 attributes for each id from combinedRDD
   that I will pass into the model
   - Create predictions for all 100 ids and pass them to the Serving class
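
The steps above can be sketched with plain Scala collections standing in for
oppRDD and accountRDD. This is only an illustration of the filter and join
shape, not PredictionIO or Spark code. The record types, field names, and the
combinedFeatures helper are all hypothetical.

```scala
// Hypothetical record types; the field names are assumptions for illustration.
final case class Opportunity(id: String, accountId: String, attrs: Map[String, Double])
final case class Account(id: String, attrs: Map[String, Double])

object LookupSketch {
  /** Keep only the requested opportunities and merge in their account attributes. */
  def combinedFeatures(
      ids: Set[String],
      opportunities: Seq[Opportunity],
      accounts: Map[String, Account]): Map[String, Map[String, Double]] =
    opportunities
      .filter(o => ids.contains(o.id)) // filter down to the ids from the query
      .map { o =>
        // Left-join semantics: an opportunity with no matching account keeps
        // only its own attributes.
        val accountAttrs =
          accounts.get(o.accountId).map(_.attrs).getOrElse(Map.empty[String, Double])
        // On a key collision the account value overwrites the opportunity value.
        o.id -> (o.attrs ++ accountAttrs)
      }
      .toMap
}
```

With RDDs the same shape would be a filter on the id set followed by a join
keyed on the account id. The resulting per-id attribute map is what I would
pass into the model.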

Another thought I had was to do this work in the train method, so that the
combinedRDD would hold preprocessed lookup data that I could access in the
predict method. The problem with this is that new events would not
influence the predictions, as the combinedRDD would not be updated until
the next retrain.

Thanks for your support.

Shane

*Shane Johnson | 801.360.3350*
LinkedIn <https://www.linkedin.com/in/shanewjohnson> | Facebook
<https://www.facebook.com/shane.johnson.71653>

On Thu, Oct 5, 2017 at 6:50 PM, Shane Johnson <shanewaldenjohn...@gmail.com>
wrote:

> Thanks Donald,
>
> By aggregative, do you mean using aggregateProperties()?
>
> We are looking to use the aggregate function to get the comprehensive most
> recent value for a given entity.
>
> If I am able to use LEventStore, I assume I can stay with
> P2Algorithm instead of switching to PAlgorithm. Is that correct?
>
> @Pat Ferrel, did your team do something similar, i.e. use ids to fetch
> aggregate properties in the predict method? I prototyped with the Universal
> Recommender template a couple of months back and remember that the predict
> endpoint only required ids.
>
> Thanks for the support.
>
> On Thu, Oct 5, 2017 at 5:15 PM Donald Szeto <don...@apache.org> wrote:
>
>> Hi Shane,
>>
>> If you are not looking to do aggregative work on Spark when you retrieve
>> additional information from the event store, you should probably look at
>> using LEventStore, which does not go through Spark. Depending on your use
>> case, the round-trip time of involving Spark in your predict method might
>> not be acceptable. (Using `pio batchpredict` could be an exception.)
>>
>> Regards,
>> Donald
>>
>> On Thu, Oct 5, 2017 at 3:59 PM, Shane Johnson <
>> shanewaldenjohn...@gmail.com> wrote:
>>
>>> Thanks Daniel! I'll go look for that in the docs.
>>>
>>> On Thu, Oct 5, 2017 at 3:23 PM Shane Johnson <
>>> shanewaldenjohn...@gmail.com> wrote:
>>>
>>>> Thanks Daniel. I may be missing what you are saying. I actually think I
>>>> need the SparkContext for what I am trying to do. I want to extend the
>>>> predict method to use ids from the Query object and then go back into
>>>> the EventStore to get additional attributes that were not passed in the
>>>> Query.
>>>>
>>>> To do this I need to have an sc, or I cannot get the properties from the
>>>> EventStore. Does that make sense? If I remove the SparkContext it does
>>>> indeed work, but not for what I am trying to do.
>>>>
>>>> Thanks
>>>>
>>>> On Thu, Oct 5, 2017 at 3:19 PM, Daniel O' Shaughnessy <
>>>> danieljamesda...@gmail.com> wrote:
>>>>
>>>>> It actually doesn't look like you use the SparkContext within the predict
>>>>> method itself...
>>>>>
>>>>> Try removing the SparkContext reference from the method parameters and
>>>>> also the (sc) at the end of the predict method.
>>>>>
>>>>> On Thu, Oct 5, 2017 at 10:13 PM Shane Johnson <
>>>>> shanewaldenjohn...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Daniel. I had the P2Algorithm working before I had to query
>>>>>> the EventStore within the predict method. Do you think this is still the
>>>>>> issue, given that I had it working before attempting to add the
>>>>>> SparkContext?
>>>>>>
>>>>>> On Thu, Oct 5, 2017 at 3:07 PM, Daniel O' Shaughnessy <
>>>>>> danieljamesda...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Shane,
>>>>>>>
>>>>>>> Your RFAlgorithm class needs to use PAlgorithm instead of
>>>>>>> P2Algorithm. You then need to write some code to save and load your
>>>>>>> model, Spark context, etc.
>>>>>>>
>>>>>>> There should be examples of this on the PredictionIO site somewhere.
>>>>>>>
>>>>>>> On Thu, Oct 5, 2017 at 10:00 PM Shane Johnson <
>>>>>>> shanewaldenjohn...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi team,
>>>>>>>>
>>>>>>>> Can someone guide me on how to add a SparkContext to the predict
>>>>>>>> method? I am using unique ids that I gather from the Query to pull back
>>>>>>>> additional attributes from the PEventStore, and I am getting an error that
>>>>>>>> "sc" cannot be found. When I add SparkContext to the method parameters, I
>>>>>>>> get the following error.
>>>>>>>>
>>>>>>>> Can anyone provide direction here?
>>>>>>>>
>>>>>>>> Thank you
>>>>>>>>
>>>>>>>> Adding SparkContext eliminates the first error but produces another.
>>>>>>>>
>>>>>>>>   def predict(sc: SparkContext, model: RFModel, query: Query): PredictedResult = {
>>>>>>>>
>>>>>>>>     val featureIndex = model.featureIndex
>>>>>>>>     val featureCategoricalIntMap = model.featureCategoricalIntMap
>>>>>>>>
>>>>>>>>
>>>>>>>>     val responses: List[PredictionResponses] = query.predictionRequests
>>>>>>>>     .map {
>>>>>>>>         Predictions =>
>>>>>>>>
>>>>>>>>             val oppPost = PEventStore.aggregateProperties(
>>>>>>>>               appName = sys.env("PIO_EVENTSERVER_APP_NAME"),
>>>>>>>>               entityType = "Opportunity"
>>>>>>>>             )(sc)
>>>>>>>>
>>>>>>>>
>>>>>>>> First error when sc: SparkContext is not added to the method
>>>>>>>> parameters:
>>>>>>>>
>>>>>>>> not found: value sc
>>>>>>>> [INFO] [Engine$] [error]             )(sc)
>>>>>>>> [INFO] [Engine$] [error]               ^
>>>>>>>>
>>>>>>>>
>>>>>>>> Error after adding SparkContext:
>>>>>>>>
>>>>>>>> class RFAlgorithm needs to be abstract, since method predict in
>>>>>>>> class P2LAlgorithm of type (model: org.template.liftscoring.RFModel,
>>>>>>>> query: org.template.liftscoring.Query)org.template.liftscoring.PredictedResult
>>>>>>>> is not defined
>>>>>>>> [INFO] [Engine$] [error] class RFAlgorithm(val ap: RFAlgorithmParams)
>>>>>>>> [INFO] [Engine$] [error]       ^
>>>>>>>> [INFO] [Engine$] [error] one error found
>>>>>>>> [INFO] [Engine$] [error] (compile:compileIncremental) Compilation failed
>>>>>>>> [INFO] [Engine$] [error] Total time: 6 s, completed Oct 5, 2017 2:44:51 PM
>>>>>>>>
