Many thanks to all of you for the replies!

OK, I have now developed a rough concept for training Mahout's OnlineLogisticRegression model on time series (please correct me if you spot any issues):

Given the following observations for patient 1, where a predictor is 'Heart Rate' and a target variable is 'State':

Hour | Heart Rate (mean) | State
---------------------------------
1    | 90                | stable
2    | 92                | stable
3    | 94                | stable
4    | 98                | stable
5    | 100               | instable

I want to train Mahout to predict the 'State' 1 hour in the future (future window), based on the data from 1 hour in the past (past window). Assume we are at hour 2 in the table. To create a training example, we take the 'Heart Rate' (or some deltas derived from the heart rates) from hour 1 and the 'State' from hour 3. The next training example uses the 'Heart Rate' from hour 2 and the 'State' from hour 4, and so on.
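
To make the windowing concrete, here is a minimal sketch in plain Python (not Mahout code). The function name, its parameters, and the tuple layout are assumptions made up for illustration; it only shows how the (past 'Heart Rate', future 'State') pairs described above would be built from the table.

```python
# Hypothetical helper: names and data layout are illustrative assumptions,
# not part of Mahout.
def make_training_examples(observations, past_window=1, future_window=1):
    """Pair the heart rate from `past_window` hours before "now" with the
    state from `future_window` hours after "now"."""
    examples = []
    # "now" must have enough history behind it and enough future ahead of it.
    for now in range(past_window, len(observations) - future_window):
        _, past_rate, _ = observations[now - past_window]
        _, _, future_state = observations[now + future_window]
        examples.append((past_rate, future_state))
    return examples


# Observations for patient 1: (hour, mean heart rate, state)
patient_1 = [
    (1, 90, "stable"),
    (2, 92, "stable"),
    (3, 94, "stable"),
    (4, 98, "stable"),
    (5, 100, "instable"),
]

examples = make_training_examples(patient_1)
for rate, state in examples:
    print(rate, "->", state)
```

With the five rows above this gives three examples, one each for "now" = hour 2, 3 and 4. Each pair carries its time offset inside itself, so the list can be reordered without changing which heart rate belongs to which state.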

My question is: how does Mahout discover the 'time' aspect of the training? Won't I get the same result if I swap the training examples? Am I missing something? Are there other issues with the concept?

Thanks and best regards,

Svetlomir.


On 06.06.2011 at 22:30, Josh Patterson wrote:
I've done a bit of time series data mining with Hadoop; I've written
up some basics on time series and map reduce at our blog:

http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/

While these articles won't help you on the LR end of things, they do
give you working code on GitHub to use as a basis for time
series and secondary sort (and sliding windows).

Josh

On Sun, Jun 5, 2011 at 10:08 AM, Svetlomir Dimitrov Kasabov
<[email protected]>  wrote:
Hello,

I plan to use Apache Mahout's Logistic Regression (LR) implementation in my
Master's thesis. We plan to use time series to predict whether a
particular patient will soon have an instable blood flow or not. That's why
I want to ask you whether it is possible to use Mahout with time
series. Do you see any potential problems or risks?

Many thanks and best regards!

Svetlomir Kasabov.



--
Svetlomir Dimitrov Kasabov





