Many thanks to all of you for the replies!
OK, I have now developed a rough concept of how to train Mahout's
OnlineLogisticRegression model using time series (please correct me if
you spot any issues):
Given the following observations for patient 1, where a predictor is
'Heart Rate' and a target variable is 'State':
Hour | Heart Rate (mean) | State
-----------------------------------------------
1    | 90                | stable
2    | 92                | stable
3    | 94                | stable
4    | 98                | stable
5    | 100               | instable
I want to train Mahout to predict the 'State' 1 hour in the future
(future window), based on the data from 1 hour in the past (past
window). Assume we are at hour 2 in the table. To create a training
example, we take the 'Heart Rate' (or some deltas derived from the
heart rates) from hour 1 and the 'State' from hour 3. The next training
example pairs the 'Heart Rate' from hour 2 with the 'State' from hour 4,
and so on.
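To make the pairing concrete, here is a minimal Python sketch of how such training examples could be built from the table. The function name make_examples, the window parameters and the 0/1 label encoding are my own assumptions for illustration; nothing here is Mahout API. Each resulting pair would then be handed to the learner as a feature vector plus an integer target.

```python
# Hourly observations for one patient, copied from the table:
# (hour, mean heart rate, state)
observations = [
    (1, 90, "stable"),
    (2, 92, "stable"),
    (3, 94, "stable"),
    (4, 98, "stable"),
    (5, 100, "instable"),
]

# Assumed label encoding (hypothetical): 0 = stable, 1 = instable.
LABELS = {"stable": 0, "instable": 1}


def make_examples(rows, past=1, future=1):
    """Pair the heart rate `past` hours before 'now' with the
    state `future` hours after 'now', for every possible 'now'."""
    examples = []
    for now in range(past, len(rows) - future):
        _, heart_rate, _ = rows[now - past]   # predictor from the past window
        _, _, state = rows[now + future]      # target from the future window
        examples.append(([float(heart_rate)], LABELS[state]))
    return examples


examples = make_examples(observations)
for features, target in examples:
    print(features, target)
```

With the five rows above this gives three examples: heart rate of hour 1 with the state of hour 3, hour 2 with hour 4, and hour 3 with hour 5. Only the last one carries the 'instable' label.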
My question is: how does Mahout discover the time aspect of the
training? Won't I get the same result if I swap the training
examples? Am I missing something? Are there other issues with the concept?
Thanks and best regards,
Svetlomir.
On 06.06.2011 22:30, Josh Patterson wrote:
I've done a bit of time series data mining with Hadoop; I've written
up some basics on time series and map reduce at our blog:
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
While these articles won't help you on the LR end of things, they do
give you working code on GitHub to use as a basis for time series,
secondary sort, and sliding windows.
Josh
On Sun, Jun 5, 2011 at 10:08 AM, Svetlomir Dimitrov Kasabov
<[email protected]> wrote:
Hello,
I plan to use Apache Mahout's Logistic Regression (LR) implementation in my
master's thesis. We plan to use time series to predict whether a
particular patient will soon have an instable blood flow. That's why
I want to ask whether it is possible to use Mahout with time
series. Do you see any potential problems or risks?
Many thanks and best regards!
Svetlomir Kasabov.
--
Svetlomir Dimitrov Kasabov