On 21 Jan 2003 11:27:56 -0800, [EMAIL PROTECTED] (KR) wrote:

> I need to create a big data set out of 3 smaller ones. Each small
> data set is arranged by year!
>
> Each smaller data set is for a certain small geographic area. The big
> data should represent a large area that contains these smaller areas.
> The problem is that some of the smaller data sets are missing years so
> can't just be added together.
>

What do you mean, "added together"? Sure, you can do that in a mechanical way.
You can concatenate the files. You can put the data for one area into
one record, or you can create dummy lines for missing years.

How much do the missing years matter? How much "information" is
missing? Well, that's a pragmatic question, isn't it? A human census
for towns will extrapolate linearly quite well across just two or
three years. The wheat crop? It won't.

> I have considered taking averages, if there are two available for a
> year take the midpoint, if there are three take the average of three,
> and if there is one just use that one, but I'm not sure if this is a
> good idea or not!

Who has what hypothesis about what? I don't see how you can figure out
how to handle "missing" unless you have a particular question in mind.
If annual numbers vary drastically, you can't extrapolate; you're
pretty much stuck with using whatever is jointly present.

>
> Someone has suggested to me that one of the areas represents the
> entire area better than the rest.
>
> Any suggestions or opinions on how to form this data set?
>

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
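
For what it's worth, the mechanical options discussed in this thread (one record per year with dummy entries for missing years, the average of whatever is available, and keeping only the jointly present years) could be sketched in Python roughly as follows. The three areas, their years, and every number are made up for illustration, and the function names are hypothetical; nothing here comes from KR's actual data.

```python
from statistics import mean

# Hypothetical yearly values for three small areas (invented numbers).
# Area "b" is missing 2001 and area "c" is missing 2000.
areas = {
    "a": {2000: 10.0, 2001: 12.0, 2002: 11.0},
    "b": {2000: 20.0, 2002: 22.0},
    "c": {2001: 31.0, 2002: 30.0},
}


def wide_table(areas):
    """One record per year; None is the dummy entry for a missing year."""
    years = sorted(set().union(*(d.keys() for d in areas.values())))
    return {y: {name: d.get(y) for name, d in areas.items()} for y in years}


def mean_of_available(table):
    """The rule proposed in the question: average whatever is present."""
    return {
        y: mean(v for v in row.values() if v is not None)
        for y, row in table.items()
    }


def jointly_present(table):
    """Keep only the years that were observed in every area."""
    return {
        y: row
        for y, row in table.items()
        if all(v is not None for v in row.values())
    }


table = wide_table(areas)
averages = mean_of_available(table)
complete = jointly_present(table)

for year in table:
    print(year, table[year], averages[year], year in complete)
```

In this made-up example the average of available values goes from 15.0 in 2000 to 21.5 in 2001, mostly because a different pair of areas is present in each year rather than because any one area changed much. Whether that matters depends on the question being asked of the data.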
