Re: clustering in a time series

2012-05-16 Thread Marc Lalancette
Thanks for the replies,

Yes, the head is assumed rigid and the between-marker distances should be 
invariant except for measurement error AND when a marker falls off, which is 
the first thing I want to detect.  Once that is done, I also want to detect 
movement of the subject's head, as James said, with the goal of partitioning 
the time series into intervals where there was no detected movement (above 
noise).  For now, I've been concentrating on the first aspect, and so dealing 
with the 3 distances between pairs of markers that should be constant.
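
A minimal sketch of the marker-detachment check, assuming the three between-marker distances are already available as one tuple per time sample and that the markers stayed attached for most of the recording (so the median is a fair reference). The function name `flag_detached`, the threshold `k=5.0`, and the toy data are illustrative assumptions, not part of the setup described here.

```python
import statistics

def flag_detached(dist_series, k=5.0):
    """Flag samples where any between-marker distance departs from its median.

    `dist_series` is a list of (d01, d02, d12) tuples, one per time sample.
    The scale is a robust one: 1.4826 * MAD, which matches the standard
    deviation for Gaussian noise. `k` is an arbitrary illustrative threshold.
    """
    flags = [False] * len(dist_series)
    for j in range(3):
        column = [d[j] for d in dist_series]
        med = statistics.median(column)
        mad = statistics.median(abs(v - med) for v in column)
        scale = 1.4826 * mad
        for i, v in enumerate(column):
            if abs(v - med) > k * scale:
                flags[i] = True
    return flags

# Toy series: small deterministic jitter, then marker 2 shifts at sample 20.
series = []
for i in range(25):
    jitter = 0.01 * ((i % 3) - 1)        # -0.01, 0.0, +0.01
    shift = 0.5 if i >= 20 else 0.0      # hypothetical detachment
    series.append((3.0 + jitter, 4.0 + jitter + shift, 5.0 - jitter + shift))

flags = flag_detached(series)
first_flagged = flags.index(True)
print(first_flagged)
```

Only the two distances involving marker 2 change in the toy data, which is also a hint as to which marker moved. If the MAD is exactly zero, any nonzero deviation is flagged, so real data may need a floor on the scale.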

We do co-registration with MRI images and we could also co-register multiple 
MEG scans as Bill suggested, but that is a separate issue; at this point I'm 
interested in within-scan movement.

My main question, I suppose, is whether the method I described has major flaws, 
or if there would be better (simpler, more efficient) alternatives.  The aspect 
I'm least comfortable with is how to determine (automatically) the measurement 
error so I can detect real change.  In what I suggested, I use the distribution 
of distances between adjacent time points and compare that with distances 
between all time points (within a cluster) to decide if it needs to be split, 
and where.  But that assumes there are few rapid movements (slow ones aren't as 
bad).  Is there a better way to extract the noise profile or to detect real 
change against the unknown noise?  Another question more relevant to this list: 
are there better clustering methods I could use here?  I haven't yet found a 
clustering method that is based on 3-d distances but still restricts clusters 
to be time intervals.
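
One way to keep clusters contiguous in time is to treat the problem as change-point detection rather than general clustering. The sketch below is a swapped-in technique (least-squares binary segmentation), not the KS-based scheme described in this thread. It assumes independent noise of equal variance in each coordinate, estimates that variance from adjacent-sample differences, and accepts a split only when the drop in squared error exceeds a BIC-style penalty. All names, the penalty, `min_len`, and the toy data are assumptions for illustration.

```python
import math
import statistics

def sse(points):
    """Sum of squared Euclidean distances from the segment mean."""
    dim = len(points[0])
    means = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    return sum((p[j] - means[j]) ** 2 for p in points for j in range(dim))

def noise_variance(points):
    """Per-coordinate noise variance from adjacent-sample differences.

    For independent noise, a first difference has variance 2 * sigma^2.
    The median of the squared differences is used so that a few real
    jumps do not inflate the estimate; 0.4549 is the median of a
    chi-square variable with 1 degree of freedom.
    """
    sq = [(b[j] - a[j]) ** 2
          for a, b in zip(points, points[1:]) for j in range(len(a))]
    return statistics.median(sq) / (2 * 0.4549)

def segment(points, start, end, sigma2, penalty, min_len=5):
    """Recursively split points[start:end]; return sorted change indices."""
    best_gain, best_k = 0.0, None
    total = sse(points[start:end])
    for k in range(start + min_len, end - min_len + 1):
        gain = total - sse(points[start:k]) - sse(points[k:end])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain < penalty * sigma2:
        return []
    return (segment(points, start, best_k, sigma2, penalty, min_len)
            + [best_k]
            + segment(points, best_k, end, sigma2, penalty, min_len))

# Toy data: a repeating jitter pattern stands in for noise, with one
# abrupt jump of 2.0 along the first coordinate at sample 40.
pattern = [0.0, 0.1, -0.1, 0.05, -0.05]
points = []
for i in range(80):
    step = 2.0 if i >= 40 else 0.0
    points.append(tuple(10.0 * (j + 1) + pattern[(i + j) % 5]
                        + (step if j == 0 else 0.0) for j in range(3)))

sigma2 = noise_variance(points)
# Assumed penalty: (dim + 1) parameters per extra segment, times log(n).
penalty = (len(points[0]) + 1) * math.log(len(points))
changes = segment(points, 0, len(points), sigma2, penalty)
print(changes)
```

Each returned index starts a new interval, so the intervals are always contiguous in time by construction. Slow drifts would show up as several small splits rather than one, and the penalty controls how readily that happens.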

Thanks again!
Marc



This e-mail may contain confidential, personal and/or health 
information (information which may be subject to legal restrictions on use, 
retention and/or disclosure) for the sole use of the intended recipient. Any 
review or distribution by anyone other than the person for whom it was 
originally intended is strictly prohibited. If you have received this e-mail in 
error, please contact the sender and delete all copies.

--
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l


clustering in a time series

2012-05-15 Thread Marc Lalancette
Hello,

I was referred to this list by someone on sci.stat.math 
(https://groups.google.com/forum/?fromgroups#!topic/sci.stat.math/-bDEys5WTjk). 
 I apologize if this is not the right forum for this type of question.  I have 
limited stats knowledge and I've been doing some research to find a good 
solution to my problem.  I'll first describe what I want to do and then what I 
came up with based on some more or less fruitful research.  I'd appreciate some 
suggestions, tips, references to similar or better methods, etc.

I have 3-d motion data for 3 markers stuck on a person's head, while he/she is 
trying to be still.  The first step is making sure the markers did not fall 
off, so I calculate the 3 between-marker distances across time.  In what 
follows I basically treat those as a 3-d Euclidean vector, even though the 
space isn't Euclidean; I'm not sure how else to combine these 3 distances...  
(Also, later, I could ask the same questions about each marker's position to 
see if the person moved, and in that case it is Euclidean 3-d space.)  I want 
to detect changes, and get a good partition of the time series into intervals 
based on when changes occurred, i.e. a list of roughly stationary intervals and 
an associated position for each interval.  I'm assuming there won't be many 
changes and that they would mostly be fast (i.e. step-like time series), but 
they could also be slow, in which case I'd still want to detect them and split 
the series into chunks that are mostly still, depending on the measurement error.
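
A minimal sketch of the distance computation described above, assuming each time sample is given as three (x, y, z) tuples, one per marker. The function name `pair_distances` and the data layout are illustrative assumptions.

```python
import math
from itertools import combinations

def pair_distances(sample):
    """Return the 3 between-marker distances for one time sample.

    `sample` is a sequence of three (x, y, z) marker positions
    (hypothetical layout). Order of pairs: (0, 1), (0, 2), (1, 2).
    """
    return tuple(math.dist(a, b) for a, b in combinations(sample, 2))

# Toy example: a 3-4-5 right triangle in the z = 0 plane.
sample = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
distances = pair_distances(sample)
print(distances)  # (3.0, 4.0, 5.0)
```

Because these distances are unchanged by any rigid translation or rotation of the head, a change in them points to a marker problem rather than to subject movement.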

After some research, here's what I came up with.  My first idea was to use 3-d 
distances between samples adjacent in time to evaluate the measurement error, 
i.e. an approximation of the distribution if there were no movement (that's why 
the assumption of few changes is important).  Comparing that distribution with 
the distribution of all inter-point distances (or a subset: maybe all distances 
from the first point) would tell me if there was any movement.  I was thinking 
of using the Kolmogorov-Smirnov test for this.  Then I thought I could use a 
hierarchical clustering method based on the same idea, a divisive one since I 
expect few clusters.  I would recursively look for the boundary point in time that 
would maximize the KS test probability for the intervals on both sides 
(one-sided test since intervals that are unusually stationary would be ok).  
Then I looked for something that would account for model complexity (thinking 
of reduced chi-square) and found the AIC.  Maybe I could interpret the combined 
KS probabilities as a likelihood for that particular partition and use the AIC 
to decide when to stop dividing the intervals.
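
The comparison described above can be sketched as follows, computing only the two-sample KS statistic (the largest gap between the two empirical distribution functions). The function names and the toy data generator are assumptions for illustration. One caveat worth stating: standard KS p-values assume independent observations, and these distances are not independent (adjacent distances share points, and all distances from the first point share the same reference), so any probability derived from the statistic should be treated with caution.

```python
import math
import random
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a - ECDF_b|."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)

def adjacent_distances(points):
    """3-d distances between samples adjacent in time (noise proxy)."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def distances_from_first(points):
    """3-d distances from the first sample to every later sample."""
    return [math.dist(points[0], p) for p in points[1:]]

rng = random.Random(0)

def noisy_series(n, step_at=None, sigma=0.05, step=2.0):
    """Hypothetical toy data: Gaussian jitter, optional jump along x."""
    pts = []
    for i in range(n):
        shift = step if step_at is not None and i >= step_at else 0.0
        pts.append((shift + rng.gauss(0, sigma),
                    rng.gauss(0, sigma), rng.gauss(0, sigma)))
    return pts

still = noisy_series(200)
moved = noisy_series(200, step_at=100)
d_still = ks_statistic(adjacent_distances(still), distances_from_first(still))
d_moved = ks_statistic(adjacent_distances(moved), distances_from_first(moved))
print(d_still, d_moved)
```

With a jump halfway through, roughly half of the distances from the first point are large while nearly all adjacent distances stay small, so the statistic for the moved series is close to 0.5 or above.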

This is what I came up with based on what I found in my research.  Almost all 
of the concepts and methods I mention I didn't know about a week ago, so I 
assume the resulting amalgamation has quite a few weaknesses even though it 
might work.  I'd be happy to hear what knowledgeable people would have to say 
about this.  Feel free to contact me directly by email.  I'll also monitor the 
list for replies.

Thanks!

Marc Lalancette
Research MEG Lab Project Manager
Program in Neurosciences and Mental Health, Department of Diagnostic Imaging, 
The Hospital for Sick Children, Toronto, Canada





Re: clustering in a time series

2012-05-15 Thread F. James Rohlf
I am not exactly sure what question is being asked. I assume the head is rigid, 
so the 3 distances should be invariant except for measurement error. To detect 
movement one would want to detect changes in the location of the 3 points in 
space; their orientation would also be of interest. Does that sound 
reasonable?
---
Sent remotely by F. James Rohlf,
John S. Toll Professor, Stony Brook University
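
A rough sketch of the location and orientation summaries mentioned in this reply, assuming one sample is three (x, y, z) marker positions. Location is taken as the centroid and orientation as the unit normal of the plane through the markers; both choices and all function names are assumptions for illustration. The normal captures tilt only, so a rotation about the normal itself would need an additional in-plane direction.

```python
import math

def centroid(sample):
    """Mean position of the markers in one time sample."""
    n = len(sample)
    return tuple(sum(p[j] for p in sample) / n for j in range(3))

def unit_normal(sample):
    """Unit normal of the plane through the three markers."""
    a, b, c = sample
    u = [b[j] - a[j] for j in range(3)]
    v = [c[j] - a[j] for j in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n)

def angle_between(n1, n2):
    """Angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(n1, n2))))
    return math.degrees(math.acos(dot))

before = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
translated = [(x, y, z + 1.0) for x, y, z in before]      # pure shift along z
tilted = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 4.0)]  # 90 deg about x

shift = math.dist(centroid(before), centroid(translated))
tilt_translated = angle_between(unit_normal(before), unit_normal(translated))
tilt_rotated = angle_between(unit_normal(before), unit_normal(tilted))
print(shift, tilt_translated, tilt_rotated)
```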

-----Original Message-----
From: Marc Lalancette marc.lalance...@sickkids.ca
Sender: Classification, clustering, and phylogeny estimation 
CLASS-L@lists.sunysb.edu
Date: Tue, 15 May 2012 21:05:15 
To: CLASS-L@lists.sunysb.edu
Reply-To: Classification, clustering, and phylogeny estimation
  CLASS-L@lists.sunysb.edu
Subject: clustering in a time series



Re: clustering in a time series

2012-05-15 Thread F. James Rohlf
Perhaps un-morphometrics. In morphometrics we discard location, orientation, 
and size and analyze what variation is left - shape. Here it sounds like you 
want the opposite.  

Jim
---
Sent remotely by F. James Rohlf,
John S. Toll Professor, Stony Brook University

-----Original Message-----
From: Shannon, William wshan...@dom.wustl.edu
Sender: Classification, clustering, and phylogeny estimation 
CLASS-L@lists.sunysb.edu
Date: Tue, 15 May 2012 18:55:15 
To: CLASS-L@lists.sunysb.edu
Reply-To: Classification, clustering, and phylogeny estimation
  CLASS-L@lists.sunysb.edu
Subject: Re: clustering in a time series

Jim

I was thinking similarly that the goal is to keep the head of the person being 
imaged (using MEG) completely immobile.  If they are running multiple scans 
then the registration of one image with the next is greatly simplified by 
knowing the brain has not moved.

I also was thinking that morphometric techniques might be useful. Instead of 
transforming one set of morphometric measurements onto another set, they could 
use these to test that no transformation is required.
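
A very simple version of "testing that no transformation is required" can be sketched as below. It skips any superimposition step (so it is not a Procrustes analysis) and only compares the raw RMS displacement of the markers between two time points with what noise alone would produce. It assumes independent isotropic Gaussian noise with a known per-coordinate standard deviation `sigma` in both measurements; the function names and the factor `k=3.0` are illustrative assumptions.

```python
import math

def rms_displacement(ref, cur):
    """Root-mean-square distance between corresponding markers."""
    sq = [math.dist(p, q) ** 2 for p, q in zip(ref, cur)]
    return math.sqrt(sum(sq) / len(sq))

def needs_realignment(ref, cur, sigma, k=3.0):
    """True if the displacement is well above the pure-noise level.

    With noise of standard deviation sigma per coordinate in both
    measurements, each coordinate difference has variance 2 * sigma^2,
    so the expected squared marker displacement is 6 * sigma^2.
    """
    return rms_displacement(ref, cur) > k * math.sqrt(6.0) * sigma

ref = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
shifted = [(x + 0.5, y, z) for x, y, z in ref]   # hypothetical 0.5 shift in x

print(needs_realignment(ref, shifted, sigma=0.01))  # True
print(needs_realignment(ref, ref, sigma=0.01))      # False
```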

Marc, does this sound right?

Thank you

Bill Shannon, PhD, MBA (In Progress)
Professor of Biostatistics in Medicine
Washington University School of Medicine
Director, Biostatistical Consulting Center
314-454-8356


From: Classification, clustering, and phylogeny estimation 
[CLASS-L@LISTS.SUNYSB.EDU] On Behalf Of F. James Rohlf 
[ro...@life.bio.sunysb.edu]
Sent: Tuesday, May 15, 2012 6:39 PM
To: CLASS-L@LISTS.SUNYSB.EDU
Subject: Re: clustering in a time series

I am not exactly sure what question is being asked. I assume the head is rigid, 
so the 3 distances should be invariant except for measurement error. To detect 
movement one would want to detect changes in the location of the 3 points in 
space; their orientation would also be of interest. Does that sound 
reasonable?
---
Sent remotely by F. James Rohlf,
John S. Toll Professor, Stony Brook University

From: Marc Lalancette marc.lalance...@sickkids.ca
Sender: Classification, clustering, and phylogeny estimation 
CLASS-L@lists.sunysb.edu
Date: Tue, 15 May 2012 21:05:15 +
To: CLASS-L@lists.sunysb.edu
ReplyTo: Classification, clustering, and phylogeny estimation 
CLASS-L@lists.sunysb.edu
Subject: clustering in a time series
