If you have discrete data, then I would think that simple cooccurrence mining would be more useful than full-on association mining.
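
A minimal sketch of what simple cooccurrence counting over discrete data could look like, assuming each record is a small set of discrete event labels. The function name `cooccurrence_counts` and the example labels are hypothetical, and this only produces raw pair counts, with no significance scoring on top:

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count how often each unordered pair of labels appears in the same record."""
    counts = Counter()
    for labels in records:
        # De-duplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical discrete events extracted per customer-day.
records = [
    ["evening_peak", "weekday", "heating_on"],
    ["evening_peak", "weekday"],
    ["night_peak", "weekday"],
]
counts = cooccurrence_counts(records)
top_pair = counts.most_common(1)[0]
```

Here `top_pair` holds the most frequent pair together with its count; pairs that never share a record simply do not appear in `counts`.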
But is your data really a time series? Are you extracting discrete features from the time series?

In the following, I am assuming that when you say "real-time energy data" you actually mean something like smart meter consumption data for electricity. You could conceivably mean total energy emitted by a particular set of three thousand quasars as well, but I assume the former is more likely. Please correct me if I am wrong.

One very useful approach that I have seen with time series uses past data to predict the next sample (in the sense of regression). If you have such a regression model, you can use Bayesian model clustering to find multiple patterns for regression. The output of this clustering is useful as the continuous equivalent of association mining.

To be more concrete, suppose that you have several kinds of energy customers:

- normal consumers who leave their house empty during the day, but have a substantial bump in energy consumption in the late afternoon or evening and then have a more spread-out pattern of usage on the weekend

- normal consumers who work a night shift

- light offices which have peak usage during normal working hours

- light industry with shift work that has relatively constant energy usage

If you build models for the energy consumption of these customers, normalized to their previous week's total consumption, and use the following features

- time of day expressed as 4 sinusoids

- day of week expressed as a 1 of 7 indicator

- weekend expressed as a boolean

I think that you will find that Bayesian model clustering will recover your original classes very nicely.

On Sun, Jan 13, 2013 at 3:41 PM, Florents Tselai <[email protected]> wrote:

> Real-time energy data,
> Association mining is in fact the core analysis applied (but not the only
> one, e.g. it could be classification as well).
>
> On Mon, Jan 14, 2013 at 1:34 AM, Ted Dunning <[email protected]>
> wrote:
>
> > Can you say more about what kind of data and what kind of analysis?
> >
> > It is usually best if the work you do is motivated by your needs.
> >
> > On Sun, Jan 13, 2013 at 3:18 PM, Florents Tselai
> > <[email protected]> wrote:
> >
> > > Hello,
> > >
> > > In the next weeks/months I'll be using mahout for analyzing some big data
> > > for a start-up and I'd like my work there to be also reflected in mahout.
> > > So I'd like to be a committer. I've already read all the wikis, guidelines
> > > and have browsed through the jira issues.
> > >
> > > Firstly, I'd like to have a GOOD overview of the codebase and the overall
> > > design.
> > > So, my first thought was to start doing some refactorings (decomposing
> > > methods and so on).
> > >
> > > Is there a specific place in the code that needs "cleaning"?
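
A minimal sketch of the feature encoding described in the reply at the top of this message. It rests on assumptions that the message does not state: the "4 sinusoids" are taken to be sine and cosine at periods of 24 and 12 hours, Saturday and Sunday count as the weekend, and the names `time_features` and `normalize_reading` are hypothetical. It covers only the features and the normalization, not the regression model or the Bayesian model clustering.

```python
import math
from datetime import datetime
from typing import List


def time_features(ts: datetime) -> List[float]:
    """Encode a timestamp as 4 sinusoids + 1-of-7 day indicator + weekend flag."""
    hours = ts.hour + ts.minute / 60.0 + ts.second / 3600.0
    angle = 2.0 * math.pi * hours / 24.0
    # Assumption: first two daily harmonics (24 h and 12 h periods).
    sinusoids = [
        math.sin(angle), math.cos(angle),
        math.sin(2.0 * angle), math.cos(2.0 * angle),
    ]
    day = ts.weekday()  # Monday == 0 ... Sunday == 6
    day_indicator = [1.0 if i == day else 0.0 for i in range(7)]
    weekend = 1.0 if day >= 5 else 0.0
    return sinusoids + day_indicator + [weekend]


def normalize_reading(reading: float, previous_week_total: float) -> float:
    """Scale a reading by the customer's total consumption in the previous week."""
    if previous_week_total <= 0.0:
        raise ValueError("previous week's total must be positive")
    return reading / previous_week_total


# Sunday, Jan 13, 2013 at 06:00.
features = time_features(datetime(2013, 1, 13, 6, 0))
target = normalize_reading(1.5, 300.0)
```

Each timestamp yields a 12-element vector, and `target` is the normalized value a per-customer regression would be trained to predict.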
