albinali wrote:
> Hi all,
> Thanks for the responses. I will try to explain the problem more clearly
> using a different example:
> Let's assume we have a room filled with sensors (e.g. a temperature sensor, a TV
> sensor that senses whether the TV is on or not, a weight sensor on the floor, a
> pressure sensor on the tables, etc.). Let's also assume that activities
> are taking place in that room (e.g. dinner, lunch, watching a movie,
> chatting on the phone, etc.). Assume for the sake of this example that we have
> 4 activities. The activities can possibly be detected using particular
> sensors (e.g. watching TV can be detected using the TV sensor, maybe the
> weight sensor of the couch in front of the TV (i.e. someone is watching),
> possibly also a camera image that determines whether the person is looking
> at the screen or not). My goal is to determine what sensor set to use for
> inferring the activities.

I wonder why you need to eliminate some sensors to infer activities. I don't see why you can't use all the sensors, in some form of predictive model, to predict activities. Correlation (or lack thereof) between variables may be a useful piece of information.

> The data set collected from the space has
> sensor readings along with the activities. Moreover, I want to construct a
> Bayesian network for every activity. Notice that if I include sensors that
> are somehow linearly dependent, the Bayesian network will get multiple
> evidence from the same source, so that's why I would like to eliminate highly
> correlated variables.

I don't have much background with Bayesian networks specifically, but in other forms of modelling that I am familiar with, I would not recommend this. You may want to find a single model that predicts all activities, not a model for activity A, a separate model for activity B, and so on. Many modelling methods, such as discriminant analysis and neural networks, use a single model to predict whether activity A is happening, whether activity B is happening, and so forth.

Eliminate highly correlated variables? Many modelling techniques actually benefit from the presence of highly correlated variables. Nevertheless, I don't see this elimination of variables as a goal of your study or as a goal of good mathematical modelling. I see it as a requirement that someone has imposed upon a study. There may be a sound reason for doing so, but if there is such a reason, you haven't articulated it yet.

> The approach that I am using to tackle this problem
> is:
> 1) Variable screening using logistic regression where the response variable
> is the activity (non-ordinal, categorical) and the sensors are the
> independent variables.

Variable selection based upon correlated predictor variables is a notoriously dangerous approach. You may select the wrong variables.
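
The warning above about screening correlated predictors can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the poster's procedure: the two sensors (`tv`, `couch`), the 15% noise rate, and the sample size are all hypothetical, and a simple agreement rate stands in for the logistic-regression screening step. Both sensors are built to be equally good, so any "winner" is decided by sampling noise alone.

```python
import random


def simulate_study(rng, n=200, noise=0.15):
    """Simulate one data collection: activity plus two noisy sensors.

    Both sensors are noisy readings of the same activity (hypothetical
    setup), so they are correlated with each other and equally informative.
    """
    def noisy(bit):
        # Flip the reading with probability `noise`.
        return bit ^ int(rng.random() < noise)

    activity = [rng.randint(0, 1) for _ in range(n)]
    tv = [noisy(a) for a in activity]
    couch = [noisy(a) for a in activity]
    return activity, tv, couch


def agreement(sensor, activity):
    """Fraction of observations where the sensor matches the activity."""
    return sum(s == a for s, a in zip(sensor, activity)) / len(activity)


def pick_best_sensor(activity, sensors):
    """Keep the single sensor with the highest agreement (ties: first listed)."""
    return max(sensors, key=lambda name: agreement(sensors[name], activity))


def selection_counts(repeats=200, seed=0):
    """Repeat the simulated study and count which sensor gets selected."""
    rng = random.Random(seed)
    counts = {"tv": 0, "couch": 0}
    for _ in range(repeats):
        activity, tv, couch = simulate_study(rng)
        winner = pick_best_sensor(activity, {"tv": tv, "couch": couch})
        counts[winner] += 1
    return counts


if __name__ == "__main__":
    print(selection_counts())
```

Across the repeated simulated studies, each sensor ends up selected in a substantial share of runs, even though neither is better than the other by construction. A single data set would have reported only one of them as "the" selected variable.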

There has been a lot of discussion of this in the statistical literature and in these newsgroups.

> 2) Building a Bayesian model using the variables selected

Okay, let's assume you have a set of variables obtained somehow, and you are happy with this selection. Then I have no particular problem with this.

> 3) Determining the probabilities and likelihoods for the Bayesian model
> using the collected data

To do what? To state how well the model actually predicts activities? To understand what influence each predictor variable has? To optimize some criterion? What would success from all this modelling look like? How would you know if you were successful?

Earlier you said "My goal is to determine what sensor set to use for inferring the activities". If that is your goal, you don't really need the probabilities and likelihoods; you just need to know that these variables are good predictors. I see an inconsistency here.

> Does that make sense?

You're getting closer, but it doesn't really make sense to me yet.

-- 
Paige Miller
Eastman Kodak Company
http://www.kodak.com

"It's nothing until I call it!" -- Bill Klem, NL Umpire
"When you get the choice to sit it out or dance, I hope you dance" -- Lee Ann Womack
