Re: Re: Re: Spark Streaming prediction

2017-01-03 Thread Marco Mistroni
Hi
 OK, then my suggestion stays: check out Spark ML.
You can train your ML model on past data (say, yesterday or the past x
days) to have Spark find out the relation between the value you have at
T-zero and the value you have at T+n hours. You can also try the ML part
outside your streaming app first: gather data for x days, feed it to your
model and see the results.
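Something along these lines, as a very rough sketch (the column names, the
paths and the choice of a plain linear regression are all made up here,
just to show the shape):

import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression

// historical data, one row per device and minute, with the value observed
// at that time ("valueNow") and the value observed n hours later
// ("valueInNHours") -- both column names are placeholders
val history = spark.read.parquet("/data/device-history")

val assembler = new VectorAssembler()
  .setInputCols(Array("valueNow"))
  .setOutputCol("features")

val lr = new LinearRegression()
  .setLabelCol("valueInNHours")
  .setFeaturesCol("features")

val model = lr.fit(assembler.transform(history))

// persist the fitted model so the streaming app can load it later
model.write.overwrite().save("/models/device-regression")

You would have to build the "valueInNHours" label yourself by joining each
historical reading with the reading n hours later, and in practice you will
probably want more features than a single value, but that is the general
idea.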
Hth

On Mon, Jan 2, 2017 at 9:51 PM, Daniela S <daniela_4...@gmx.at> wrote:

> Dear Marco
>
> No problem, thank you very much for your help!
> Yes, that is correct. I always know the minute values for the next e.g.
> 180 minutes (this may vary between devices), and I want to predict the
> values for the next 24 hours (one value per minute). So for the period
> where I already know the values (e.g. the first 180 minutes) I would of
> course like to use them, and the remaining values up to 24 hours (one
> value per minute) should be predicted.
>
> Thank you in advance.
>
> Regards,
> Daniela
>
> *Sent:* Monday, 02 January 2017 at 22:30
> *From:* "Marco Mistroni" <mmistr...@gmail.com>
> *To:* "Daniela S" <daniela_4...@gmx.at>
> *Cc:* User <user@spark.apache.org>
> *Subject:* Re: Re: Spark Streaming prediction
> Apologies, perhaps I misunderstood your use case.
> My assumption was that you have 2-3 hours' worth of data and you want to
> know the values for the next 24 hours based on the values you already
> have; that is why I suggested the ML path.
> If that is not the case, please ignore everything I said.
>
> So, let's take the simple case where you have only 1 device.
> Every event contains the minute values of that device for the next 180
> minutes, so at any point in time you only have visibility of the next 180
> minutes, correct?
> Now, do you want to predict what the values will be for the next 24
> hours, or do you just want to accumulate 24 hours' worth of data and
> display it in the dashboard?
> Or is it something else?
>
> For the dashboard update, I guess you either
> - poll a database (where you store the output of your Spark logic)
> periodically, or
> - propagate events from your Spark streaming application to your
> dashboard (via actors, JMS or whatever mechanism)
>
> kr
>  marco
>
> On Mon, Jan 2, 2017 at 8:26 PM, Daniela S <daniela_4...@gmx.at> wrote:
>>
>> Hi
>>
>> Thank you very much for your answer!
>>
>> My problem is that I know the values for the next 2-3 hours in advance
>> but I do not know the values from hour 2 or 3 to hour 24. How is it
>> possible to combine the known values with the predicted values, as both
>> are values in the future? And how can I ensure that there are always
>> 1440 values?
>> And I do not know how to map the values for the 1440 minutes to a
>> specific time on the dashboard (e.g. how does the dashboard know that
>> the value for minute 300 maps to time 15:05)?
>>
>> Thank you in advance.
>>
>> Best regards,
>> Daniela
>>
>>
>>
>> *Sent:* Monday, 02 January 2017 at 21:07
>> *From:* "Marco Mistroni" <mmistr...@gmail.com>
>> *To:* "Daniela S" <daniela_4...@gmx.at>
>> *Cc:* User <user@spark.apache.org>
>> *Subject:* Re: Spark Streaming prediction
>> Hi
>> You might want to have a look at the ML regression algorithms and
>> integrate them in your Spark Streaming application; I'm sure someone on
>> the list has a similar use case.
>> In short, you'd want to process all your events and feed them through an
>> ML model which, based on your inputs, will predict the output.
>> You say that your events contain minute values for the next 2-3 hours:
>> gather data for a day and train your model on that. Then save the model
>> somewhere, have your streaming app load it, and have it do the
>> predictions based on the incoming events from your streaming app.
>> Save the results somewhere and have your dashboard periodically poll
>> your data store to read the predictions.
>> I have seen people on the list doing ML over a Spark Streaming app; I'm
>> sure someone can reply back.
>> Hopefully I gave you a starting point.
>>
>> hth
>>  marco
>>
>> On 2 Jan 2017 4:03 pm, "Daniela S" <daniela_4...@gmx.at> wrote:
>>>
>>> Hi
>>>
>>> I am trying to solve the following problem with Spark Streaming.
>>> I receive timestamped events from Kafka. Each event refers to a device
>>> and contains values for every minute of the next 2 to 3 hours. What I
>>> would like to do is to predict the minute values for the next 24 hours.

Aw: Re: Re: Spark Streaming prediction

2017-01-02 Thread Daniela S
Dear Marco

No problem, thank you very much for your help!

Yes, that is correct. I always know the minute values for the next e.g. 180 minutes (this may vary between devices), and I want to predict the values for the next 24 hours (one value per minute). So for the period where I already know the values (e.g. the first 180 minutes) I would of course like to use them, and the remaining values up to 24 hours (one value per minute) should be predicted.

Thank you in advance.

Regards,
Daniela

Sent: Monday, 02 January 2017 at 22:30
From: "Marco Mistroni" <mmistr...@gmail.com>
To: "Daniela S" <daniela_4...@gmx.at>
Cc: User <user@spark.apache.org>
Subject: Re: Re: Spark Streaming prediction

Apologies, perhaps I misunderstood your use case.
My assumption was that you have 2-3 hours' worth of data and you want to know the values for the next 24 hours based on the values you already have; that is why I suggested the ML path.
If that is not the case, please ignore everything I said.

So, let's take the simple case where you have only 1 device.
Every event contains the minute values of that device for the next 180 minutes, so at any point in time you only have visibility of the next 180 minutes, correct?
Now, do you want to predict what the values will be for the next 24 hours, or do you just want to accumulate 24 hours' worth of data and display it in the dashboard?
Or is it something else?

For the dashboard update, I guess you either
- poll a database (where you store the output of your Spark logic) periodically, or
- propagate events from your Spark streaming application to your dashboard (via actors, JMS or whatever mechanism)

kr
 marco

On Mon, Jan 2, 2017 at 8:26 PM, Daniela S <daniela_4...@gmx.at> wrote:

Hi

Thank you very much for your answer!

My problem is that I know the values for the next 2-3 hours in advance but I do not know the values from hour 2 or 3 to hour 24. How is it possible to combine the known values with the predicted values, as both are values in the future? And how can I ensure that there are always 1440 values?
And I do not know how to map the values for the 1440 minutes to a specific time on the dashboard (e.g. how does the dashboard know that the value for minute 300 maps to time 15:05)?

Thank you in advance.

Best regards,
Daniela
Sent: Monday, 02 January 2017 at 21:07
From: "Marco Mistroni" <mmistr...@gmail.com>
To: "Daniela S" <daniela_4...@gmx.at>
Cc: User <user@spark.apache.org>
Subject: Re: Spark Streaming prediction

Hi
You might want to have a look at the ML regression algorithms and integrate them in your Spark Streaming application; I'm sure someone on the list has a similar use case.

In short, you'd want to process all your events and feed them through an ML model which, based on your inputs, will predict the output.

You say that your events contain minute values for the next 2-3 hours: gather data for a day and train your model on that. Then save the model somewhere, have your streaming app load it, and have it do the predictions based on the incoming events from your streaming app.
Save the results somewhere and have your dashboard periodically poll your data store to read the predictions.

I have seen people on the list doing ML over a Spark Streaming app; I'm sure someone can reply back.

Hopefully I gave you a starting point.

hth
 marco

On 2 Jan 2017 4:03 pm, "Daniela S" <daniela_4...@gmx.at> wrote:

Hi

I am trying to solve the following problem with Spark Streaming.
I receive timestamped events from Kafka. Each event refers to a device and contains values for every minute of the next 2 to 3 hours. What I would like to do is to predict the minute values for the next 24 hours. So I would like to use the known values and predict the remaining values to achieve the 24-hour prediction. My thought was to use arrays with a length of 1440 (1440 minutes = 24 hours), one for the known values and one for the predicted values, for each device. Then I would like to show the next 24 hours on a dashboard. The dashboard should be updated automatically in real time.

My questions:
- Is this a possible solution?
- How is it possible to combine known future values and predicted values?
- How should I treat the timestamp, as the length of 1440 does not correspond to a timestamp?
- How is it possible to update the dashboard automatically in real time?

Thank you in advance!

Best regards,
Daniela





Re: Re: Spark Streaming prediction

2017-01-02 Thread Marco Mistroni
Apologies, perhaps I misunderstood your use case.
My assumption was that you have 2-3 hours' worth of data and you want to
know the values for the next 24 hours based on the values you already
have; that is why I suggested the ML path.
If that is not the case, please ignore everything I said.

So, let's take the simple case where you have only 1 device.
Every event contains the minute values of that device for the next 180
minutes, so at any point in time you only have visibility of the next 180
minutes, correct?
Now, do you want to predict what the values will be for the next 24 hours,
or do you just want to accumulate 24 hours' worth of data and display it
in the dashboard?
Or is it something else?

For the dashboard update, I guess you either
- poll a database (where you store the output of your Spark logic)
periodically, as in the rough sketch below, or
- propagate events from your Spark streaming application to your dashboard
(via actors, JMS or whatever mechanism)
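
For the first option, a very rough sketch of what the streaming side could
look like (the event fields, the paths, the SparkSession called spark, the
DStream called eventStream and the linear regression model saved from an
earlier training step are all assumptions, just to show the shape):

import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegressionModel
import org.apache.spark.sql.functions._

// made-up event shape: one record per device, minute offset and known value
case class DeviceEvent(deviceId: String, eventTime: java.sql.Timestamp,
                       minuteOffset: Int, knownValue: Double)

// model trained and saved beforehand (path is a placeholder)
val model = LinearRegressionModel.load("/models/device-regression")
val assembler = new VectorAssembler()
  .setInputCols(Array("knownValue"))
  .setOutputCol("features")

// eventStream: DStream[DeviceEvent] coming from your Kafka receiver
eventStream.foreachRDD { rdd =>
  import spark.implicits._   // spark is your SparkSession
  val events = rdd.toDF()

  val scored = model.transform(assembler.transform(events))
    // minute slot N of an event maps to eventTime + N minutes,
    // so the dashboard gets an absolute timestamp for every value
    .withColumn("minuteTs",
      (unix_timestamp($"eventTime") + $"minuteOffset" * 60).cast("timestamp"))
    .select($"deviceId", $"minuteTs", $"knownValue", $"prediction")

  // write to whatever store the dashboard polls (parquet just as an example)
  scored.write.mode("append").parquet("/data/predictions")
}

Note the minuteTs column: the dashboard does not need to know that slot 300
means 15:05, because you store the absolute timestamp (event time plus
offset) next to each value and simply query the next 24 hours from the
store.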

kr
 marco
On Mon, Jan 2, 2017 at 8:26 PM, Daniela S  wrote:

> Hi
>
> Thank you very much for your answer!
>
> My problem is that I know the values for the next 2-3 hours in advance
> but I do not know the values from hour 2 or 3 to hour 24. How is it
> possible to combine the known values with the predicted values, as both
> are values in the future? And how can I ensure that there are always
> 1440 values?
> And I do not know how to map the values for the 1440 minutes to a
> specific time on the dashboard (e.g. how does the dashboard know that
> the value for minute 300 maps to time 15:05)?
>
> Thank you in advance.
>
> Best regards,
> Daniela
>
>
>
> *Sent:* Monday, 02 January 2017 at 21:07
> *From:* "Marco Mistroni" 
> *To:* "Daniela S" 
> *Cc:* User 
> *Subject:* Re: Spark Streaming prediction
> Hi
> You might want to have a look at the ML regression algorithms and
> integrate them in your Spark Streaming application; I'm sure someone on
> the list has a similar use case.
> In short, you'd want to process all your events and feed them through an
> ML model which, based on your inputs, will predict the output.
> You say that your events contain minute values for the next 2-3 hours:
> gather data for a day and train your model on that. Then save the model
> somewhere, have your streaming app load it, and have it do the
> predictions based on the incoming events from your streaming app.
> Save the results somewhere and have your dashboard periodically poll
> your data store to read the predictions.
> I have seen people on the list doing ML over a Spark Streaming app; I'm
> sure someone can reply back.
> Hopefully I gave you a starting point.
>
> hth
>  marco
>
> On 2 Jan 2017 4:03 pm, "Daniela S"  wrote:
>>
>> Hi
>>
>> I am trying to solve the following problem with Spark Streaming.
>> I receive timestamped events from Kafka. Each event refers to a device
>> and contains values for every minute of the next 2 to 3 hours. What I
>> would like to do is to predict the minute values for the next 24 hours.
>> So I would like to use the known values and predict the remaining values
>> to achieve the 24-hour prediction. My thought was to use arrays with a
>> length of 1440 (1440 minutes = 24 hours), one for the known values and
>> one for the predicted values, for each device. Then I would like to show
>> the next 24 hours on a dashboard. The dashboard should be updated
>> automatically in real time.
>>
>> My questions:
>> - Is this a possible solution?
>> - How is it possible to combine known future values and predicted values?
>> - How should I treat the timestamp, as the length of 1440 does not
>> correspond to a timestamp?
>> - How is it possible to update the dashboard automatically in real time?
>>
>> Thank you in advance!
>>
>> Best regards,
>> Daniela