I am trying to create a 2x2xk contingency table. The variables are GDP and
an income inequality statistic, with year providing the k levels. I
eventually want to fit a loglinear model to the data. Currently the data
are organized either by country or by year. For example, by country:
Country Year log(GDP) sqrt(INEQ)
1 1980 24 5.3
1 1981 25 5.45
1 1982 24.5 5.4
1 1983 25 5.3
1 1984 25.5 5.5
or, by year:
Country Year log(GDP) sqrt(INEQ)
1 1980 25 5.5
2 1980 22 6.5
3 1980 23.8 6.8
4 1980 26.7 5.2
5 1980 24 6
6 1980 26 5.5
I want to reorganize the data so it's like:
Year GDP>median(for the ith year) INEQ>sqrt(40) count
1980 1 1 3
1980 1 0 6
1980 0 1 8
1980 0 0 9
1981 1 1 2
1981 1 0 7
1981 0 1 7
1981 0 0 9
So far, I've been using sort() to order the data, and then
f63<-sort(data1963$Gdp)
data1963$INEQ[data1963$Gdp>median(f63)]
data1963$INEQ[data1963$Gdp<median(f63)]
to separate the data. However, there are missing values, and the NAs are
still counted when I use length(). I'm not sure how to get the data into
the form I need without simply doing it by hand. I might have to do that,
but I would really rather not.
Any advice would be much appreciated,
Chris
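
For illustration, here is a minimal sketch of one way the table described
above might be built in base R. It assumes the data sit in a single long
data frame with one row per country-year. The data frame `dat`, its column
names (Country, Year, logGDP, sqrtINEQ) and all values are made up for the
example. It also assumes the inequality column is already on the square-root
scale, so the cut-off is sqrt(40).

```r
# Hypothetical long-format data: one row per country-year.
# Names and values are invented for this sketch, including two NAs.
dat <- data.frame(
  Country  = rep(1:6, times = 2),
  Year     = rep(c(1980, 1981), each = 6),
  logGDP   = c(25, 22, 23.8, 26.7, 24, 26,
               25.5, 22.4, NA, 27, 24.2, 26.1),
  sqrtINEQ = c(5.5, 6.5, 6.8, 5.2, 6, 5.5,
               5.4, 6.6, 6.7, NA, 6.1, 5.6)
)

# Median of log(GDP) within each year, ignoring missing values.
# ave() returns a vector aligned with the rows of dat.
yr_median <- ave(dat$logGDP, dat$Year,
                 FUN = function(x) median(x, na.rm = TRUE))

# 0/1 indicators; a comparison involving NA stays NA.
dat$highGDP  <- as.integer(dat$logGDP > yr_median)
dat$highINEQ <- as.integer(dat$sqrtINEQ > sqrt(40))

# table() leaves out NA by default, so rows with a missing
# indicator are not counted. The result is a 2 x 2 x k array.
tab <- table(highGDP  = dat$highGDP,
             highINEQ = dat$highINEQ,
             Year     = dat$Year)

# Long layout: one row per (highGDP, highINEQ, Year) cell with its count.
counts <- as.data.frame(tab, responseName = "count")
print(counts)
```

Because table() leaves out NA values by default, no separate length() call
is needed. To count the non-missing entries of a single vector x,
sum(!is.na(x)) is an alternative. The array `tab` or the data frame
`counts` could then be passed to a loglinear fitting routine such as
loglin() or glm() with family = poisson, depending on the model wanted.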