Thanks everyone for the replies; they are indeed very helpful. Here are some additional clarifications:

- I am suggesting eliminating highly dependent variables to avoid counting the same evidence from different sources. For example, a light intensity sensor is likely to be correlated with a sensor that detects whether the curtains on a window are open. Using both sensors as evidence sources to infer a particular activity (e.g. breakfast) might erroneously increase the belief that breakfast is taking place, when in fact both sensors are reflecting the same evidence. That's why I need to filter them out. But again, maybe I am missing something.

- The Bayesian network and its probabilities will be run against a test set that includes sensor readings along with activities. If the network is successful in classifying the activities, then it is a good model for classifying the activities in that particular space, given some particular individuals and some particular patterns of behavior.
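
To make the double-counting concern in the first point concrete, here is a minimal sketch in Python. It assumes a naive structure in which every sensor is treated as conditionally independent given the activity, and it uses made-up numbers (a 0.5 prior and likelihoods of 0.8 and 0.4) that are not taken from any collected data. The curtain sensor is assumed to be an exact copy of the light sensor, which is the extreme case of dependence. The function name `naive_posterior` is only illustrative.

```python
def naive_posterior(prior, lik_if_active, lik_if_not):
    """P(activity | evidence), treating each sensor reading as
    conditionally independent given the activity (naive assumption)."""
    joint_active = prior          # running P(activity, evidence)
    joint_not = 1.0 - prior       # running P(no activity, evidence)
    for l_active, l_not in zip(lik_if_active, lik_if_not):
        joint_active *= l_active
        joint_not *= l_not
    return joint_active / (joint_active + joint_not)


# Hypothetical numbers, chosen only for illustration.
PRIOR_BREAKFAST = 0.5
P_BRIGHT_GIVEN_BREAKFAST = 0.8
P_BRIGHT_GIVEN_OTHER = 0.4

# Evidence from the light sensor alone.
one_sensor = naive_posterior(
    PRIOR_BREAKFAST,
    [P_BRIGHT_GIVEN_BREAKFAST],
    [P_BRIGHT_GIVEN_OTHER],
)

# Same evidence counted twice: the curtain sensor is assumed to mirror
# the light sensor exactly, yet it is multiplied in as if independent.
two_sensors = naive_posterior(
    PRIOR_BREAKFAST,
    [P_BRIGHT_GIVEN_BREAKFAST, P_BRIGHT_GIVEN_BREAKFAST],
    [P_BRIGHT_GIVEN_OTHER, P_BRIGHT_GIVEN_OTHER],
)

print(f"light sensor only:      {one_sensor:.3f}")
print(f"light + curtain (copy): {two_sensors:.3f}")
```

With these assumed numbers the belief in breakfast rises from about 0.67 to 0.80, even though the second sensor adds no new information. This only demonstrates the effect under the independence assumption; a network that models the dependence between the two sensors explicitly would not necessarily double count.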
fahd

"Paige Miller" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> albinali wrote:
> > Hi all,
> > Thanks for the responses, I will try and explain the problem more clearly
> > using a different example:
> > Let's assume we have a room filled with sensors (e.g. a temperature sensor, a TV
> > sensor that senses if the TV is on or not, a weight sensor on the floor, a
> > pressure sensor on the tables, etc.). Let's also assume that activities
> > are taking place in that room (e.g. dinner, lunch, watching a movie,
> > chatting on the phone, etc.). Assume for the sake of this example that we have
> > 4 activities, which can possibly be detected using particular
> > sensors (e.g. watching TV can be detected using the TV sensor, maybe the
> > weight sensor of the couch in front of the TV (i.e. someone is watching),
> > possibly also a camera image that determines whether the person is looking
> > at the screen or not). My goal is to determine what sensor set to use for
> > inferring the activities.
>
> I wonder why you need to eliminate some sensors to infer activities. I
> don't see why you can't use all the sensors, in some form of predictive
> model, to predict activities. Correlation (or lack thereof) between
> variables may be a useful piece of information.
>
> > The data set collected from the space has
> > sensor readings along with the activities. Moreover, I want to construct a
> > Bayesian network for every activity. Notice that if I include sensors that
> > are somehow linearly dependent, the Bayesian network will get multiple
> > evidence from the same source, so that's why I would like to eliminate highly
> > correlated variables.
>
> I don't have much background with Bayesian Networks specifically, but in
> other forms of modelling that I am familiar with, I would not recommend
> this. You may want to find a single model that predicts all activities,
> not a model for activity A, and a separate model for activity B, and so
> on.
> Many modeling methods, such as discriminant analysis and neural
> networks, use a single model to predict whether activity A is happening,
> whether activity B is happening, and so forth.
>
> Eliminate highly correlated variables? Many modelling techniques
> actually benefit from the presence of highly correlated variables.
> Nevertheless, I don't see this elimination of variables as a goal of
> either your study or as a goal of good mathematical modelling. I see it
> as a requirement that someone has imposed upon a study -- there may be a
> sound reason for doing so, but if there is such a sound reason, you
> haven't articulated it yet.
>
> > The approach that I am using to tackle this problem
> > is:
> > 1) Variable screening using logistic regression where the response variable
> > is the activity (non-ordinal, categorical) and the sensors are the
> > independent variables.
>
> Variable selection based upon correlated predictor variables is a
> notoriously dangerous approach. You may select the wrong variables.
> There has been a lot of discussion of this in the statistical literature
> and in these newsgroups.
>
> > 2) Building a Bayesian model using the variables selected
>
> Okay, let's assume you have a set of variables obtained somehow, and you
> are happy with this selection ... then I have no particular problem with
> this.
>
> > 3) Determining the probabilities and likelihoods for the Bayesian model
> > using the collected data
>
> To do what? To state how well the model actually predicts activities? To
> understand what influence each predictor variable has? To optimize some
> criterion? What would a success from all this modelling look like? How
> would you know if you were successful?
>
> Earlier you said "My goal is to determine what sensor set to use for
> inferring the activities". If that is your goal, you don't really need
> the probabilities and likelihoods, you just need to know that these
> variables are good predictors.
> I see an inconsistency here.
>
> > Does that make sense?
>
> You're getting closer, but I don't really think it makes sense to me yet.
>
> --
> Paige Miller
> Eastman Kodak Company
> [EMAIL PROTECTED]
> http://www.kodak.com
>
> "It's nothing until I call it!" -- Bill Klem, NL Umpire
> "When you get the choice to sit it out or dance, I hope you dance" --
> Lee Ann Womack
