On Fri, Jan 21, 2011 at 12:39 AM, Vasil Vasilev <[email protected]> wrote:
>
> dimension 1: Using linear regression with gradient descent algorithm I find
> what is the trend of the line, i.e. is it increasing, decreasing or
> straight line
> dimension 2: Knowing the approximating line (from the linear regression) I
> count how many times this line gets crossed by the original signal. This
> helps in separating the cyclic data from all the rest
> dimension 3: What is the biggest increase/decrease of a single signal line.
> This helps find shifts
>
> So to say - I put a semantics for the data that are to be clustered (I
> don't know if it is correct to do that, but I couldn't think of how an
> algorithm could cope with the task without such additional semantics)
>

It is very common for feature extraction like this to be the key for
data-mining projects. Such features are absolutely critical for most time
series mining and are highly application dependent.

One key aspect of your features is that they are shift invariant.

> Also I developed a small swing application which visualizes the clustered
> signals and which helped me in playing with the algorithms.
>

Great idea.
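For readers following the thread, here is a rough Python sketch of the three features described in the quoted message. It is not the original code, and it rests on a few assumptions: it uses a closed-form least-squares fit instead of gradient descent (both should give the same line for this problem), it reads "biggest increase/decrease" as the largest change between consecutive samples, and it assumes evenly spaced samples. All function names are made up for illustration.

```python
def fit_line(signal):
    """Least-squares fit y = slope * t + intercept, with t = 0, 1, ..., n-1.

    Assumption: closed-form fit used here in place of gradient descent.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def count_crossings(signal, slope, intercept):
    """Count how often the signal moves from one side of the fitted line to the other."""
    residuals = [y - (slope * t + intercept) for t, y in enumerate(signal)]
    # Samples lying exactly on the line are skipped so they do not count twice.
    sides = [r > 0 for r in residuals if r != 0]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


def largest_step(signal):
    """Largest absolute change between consecutive samples (assumed reading of dimension 3)."""
    return max((abs(b - a) for a, b in zip(signal, signal[1:])), default=0.0)


def extract_features(signal):
    """Return (trend slope, line crossings, largest step) for one signal."""
    slope, intercept = fit_line(signal)
    return slope, count_crossings(signal, slope, intercept), largest_step(signal)
```

In this sketch, a steadily rising signal such as [0, 1, 2, 3, 4] gives a positive slope and no crossings, while an alternating signal gives many crossings. Adding a constant offset to a signal leaves all three values unchanged, which is one possible reading of the shift invariance mentioned above.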
