On 21 Jan 2003 11:27:56 -0800, [EMAIL PROTECTED] (KR) wrote:

> I need to create a big data set out of 3 smaller ones.  Each small
> data set is arranged by year!
> 
> Each smaller data set is for a certain small geographic area.  The big
> data should represent a large area that contains these smaller areas. 
> The problem is that some of the smaller data sets are missing years so
> can't just be added together.
> 
What do you mean, "added together"?  Sure, you can do that
in a mechanical way.

You can concatenate the files.  You can put the data for
one area into one record, or you can create dummy-lines for
missing years.
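
As a rough illustration of the "dummy lines" option, here is a minimal
Python sketch. It assumes each area's data is a mapping from year to a
single number; the area names, the numbers, and the function name
align_by_year are all made up for the example.

```python
def align_by_year(areas):
    """Return one row per year: (year, value for each area in name order).

    A year that an area lacks gets None, i.e. a dummy entry.
    """
    # Every year that appears in at least one area
    years = sorted(set().union(*(series.keys() for series in areas.values())))
    names = sorted(areas)
    return [(year, *(areas[name].get(year) for name in names)) for year in years]


# Hypothetical data: area B lacks 1991, area C lacks 1990
areas = {
    "A": {1990: 10.0, 1991: 12.0, 1992: 11.0},
    "B": {1990: 20.0, 1992: 22.0},
    "C": {1991: 31.0, 1992: 30.0},
}

rows = align_by_year(areas)
for row in rows:
    print(row)
```

Keeping the gaps as explicit None entries postpones the decision about
how to treat them until there is a specific question to answer.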

How much do the missing years matter?  How much
"information" is missing?  Well, that's a pragmatic
question, isn't it?  The human census for towns is
going to be filled in pretty well by linear extrapolation
across just two or three years.  The wheat crop? - won't.
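
For a slowly changing series, filling a short gap linearly might look
like the sketch below. It is only an illustration under assumptions:
the function name fill_short_gaps, the max_gap limit of three years,
and the population figures are all hypothetical, and it only fills
gaps that have known years on both sides.

```python
def fill_short_gaps(series, max_gap=3):
    """Linearly fill interior gaps of at most max_gap missing years.

    series maps year -> value. Longer gaps, and gaps at either end of
    the series, are left missing.
    """
    filled = dict(series)
    known = sorted(series)
    for left, right in zip(known, known[1:]):
        missing = right - left - 1
        if 0 < missing <= max_gap:
            # Constant change per year between the two known values
            step = (series[right] - series[left]) / (right - left)
            for year in range(left + 1, right):
                filled[year] = series[left] + step * (year - left)
    return filled


# Hypothetical town population with 1991 and 1992 missing
population = {1990: 1000.0, 1993: 1060.0}
filled = fill_short_gaps(population)
print(sorted(filled.items()))
```

The same code applied to a volatile series such as a crop yield would
run without complaint, which is exactly why the decision to use it has
to come from knowledge of the data.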


> I have considered taking averages, if there are two available for a
> year take the midpoint, if there are three take the average of three,
> and if there is one just use that one, but I'm not sure if this is a
> good idea or not!
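
The averaging rule proposed in the quoted text could be sketched as
follows, purely to make it concrete. The data layout (year -> value
per area), the numbers, and the function name average_available are
assumptions. The sketch also records how many areas contributed to
each year, since that count changes from year to year.

```python
from statistics import mean


def average_available(areas):
    """For each year, average whichever areas have a value.

    Returns year -> (average, number of areas that contributed).
    """
    years = sorted(set().union(*(series.keys() for series in areas.values())))
    result = {}
    for year in years:
        values = [series[year] for series in areas.values() if year in series]
        result[year] = (mean(values), len(values))
    return result


# Hypothetical data: area B lacks 1991, area C lacks 1990
areas = {
    "A": {1990: 10.0, 1991: 12.0, 1992: 11.0},
    "B": {1990: 20.0, 1992: 22.0},
    "C": {1991: 31.0, 1992: 30.0},
}

averages = average_available(areas)
for year, (avg, n) in averages.items():
    print(year, avg, n)
```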

Who has what hypothesis about what?
I don't see how you can figure out how to handle "missing"
unless you have a particular question in mind.
If annual numbers vary drastically, you can't
extrapolate; you're pretty much stuck with using
whatever is jointly present.
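
Using only what is jointly present could be sketched like this. The
data layout, the numbers, and the function names are hypothetical, and
summing the areas assumes the values are counts or totals that can
sensibly be added; for rates or averages a different combination would
be needed.

```python
def common_years(areas):
    """Years for which every area has a value."""
    year_sets = [set(series) for series in areas.values()]
    return sorted(set.intersection(*year_sets))


def total_over_common_years(areas):
    """Sum the areas, restricted to years present in all of them."""
    return {
        year: sum(series[year] for series in areas.values())
        for year in common_years(areas)
    }


# Hypothetical data: area B lacks 1991, area C lacks 1990
areas = {
    "A": {1990: 10.0, 1991: 12.0, 1992: 11.0},
    "B": {1990: 20.0, 1992: 22.0},
    "C": {1991: 31.0, 1992: 30.0},
}

totals = total_over_common_years(areas)
print(common_years(areas))
print(totals)
```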

> 
> Someone has suggested to me that one of the areas represents the
> entire area better than the rest.
> 
> Any suggestions or opinions on how to form this data set?
> 
-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
.
.
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
