Re: [R] Rare Cases and SOM

2005-02-05 Thread Gabor Grothendieck
Manuel Gutierrez manuel_gutierrez_lopez at yahoo.es writes:

: 
: I am trying to understand how the SOM algorithm works
: using library(class) SOM function.
: I have a 1000*10 matrix and I want to be able to
: summarize the different types of 10-element vectors.
: In my real world case it is likely that most of the
: 1000 vectors are of one kind and the rest of another (this is
: an oversimplification).
: Say for example:
: 
: InputA <- matrix(cos(1:10),nrow=900,ncol=10,byrow=TRUE)
: InputB <- matrix(sin(5:14),nrow=100,ncol=10,byrow=TRUE)
: Input <- rbind(InputA,InputB)
: 
: I thought that a small grid of 3*3 would be enough to
: extract the patterns in such a simple matrix:
: GridWidth <- 3
: GridLength <- 3
: gr <- somgrid(xdim=GridWidth,ydim=GridLength,topo="hexagonal")
: test.som <- SOM(Input, gr)
: par(mfrow=c(GridLength,GridWidth))
: for(i in 1:(GridWidth*GridLength))
:   plot(test.som$codes[i,],type="l")
: 
: Only when I use a larger grid (say, 7*3) do I
: get some representatives of the sin pattern.
: This must have something to do with the initialization
: of the grid: as the sin is so rare, it is unlikely that
: I get it as a reference vector. Afterwards, because
: the selection for the training is also random, it is
: also unlikely that the sin rows are picked.
: I've been trying to modify some of the other
: parameters for the SOM also, but I would appreciate
: some input to keep me going until I receive the
: reference books from my bookstore.
: 
: Are my suspicions right?
: Should I be using the SOM for my study or should I
: look somewhere else?
: NOTE: I have no prior knowledge of whether the
: datasets I want to analyse will have rare cases or not
: or where they will be located.

I don't have a direct answer to your question, as I have not
used that package, but I have used randomForest, and it has
stratified sampling facilities so that you can be sure that a rare case
is represented.  Check out the sampsize= argument.  There is also
an article in R News on randomForest, and searching this list will
turn up some relevant comments by the author of randomForest.
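
As a rough illustration of the sampsize= suggestion, the sketch below draws
the same number of rows from each stratum for every tree. It assumes the
randomForest package is installed and, importantly, that class labels are
available to stratify on; the label vector y and the names "common" and
"rare" are hypothetical and are not part of the original question, where no
labels are known in advance.

```r
library(randomForest)

set.seed(1)

# Toy data from the question: 900 common rows, 100 rare rows.
x <- rbind(matrix(cos(1:10), nrow = 900, ncol = 10, byrow = TRUE),
           matrix(sin(5:14), nrow = 100, ncol = 10, byrow = TRUE))

# Hypothetical labels (an assumption made only for this sketch).
y <- factor(rep(c("common", "rare"), times = c(900, 100)))

# Stratify on y and draw 50 rows from each stratum per tree, so the rare
# class is present in every tree's sample. The order of sampsize follows
# the factor levels of the strata.
rf <- randomForest(x, y, strata = y, sampsize = c(50, 50), ntree = 100)

print(rf$confusion)
```

Without labels this form of stratification is not directly applicable, so it
is best read as a demonstration of the argument rather than a solution to the
unsupervised problem.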

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Rare Cases and SOM

2005-02-04 Thread Manuel Gutierrez
I am trying to understand how the SOM algorithm works
using library(class) SOM function.
I have a 1000*10 matrix and I want to be able to
summarize the different types of 10-element vectors.
In my real world case it is likely that most of the
1000 vectors are of one kind and the rest of another (this is
an oversimplification).
Say for example:

InputA <- matrix(cos(1:10),nrow=900,ncol=10,byrow=TRUE)
InputB <- matrix(sin(5:14),nrow=100,ncol=10,byrow=TRUE)
Input <- rbind(InputA,InputB)

I thought that a small grid of 3*3 would be enough to
extract the patterns in such a simple matrix:
GridWidth <- 3
GridLength <- 3
gr <- somgrid(xdim=GridWidth,ydim=GridLength,topo="hexagonal")
test.som <- SOM(Input, gr)
par(mfrow=c(GridLength,GridWidth))
for(i in 1:(GridWidth*GridLength))
  plot(test.som$codes[i,],type="l")

Only when I use a larger grid (say, 7*3) do I
get some representatives of the sin pattern.
This must have something to do with the initialization
of the grid: as the sin is so rare, it is unlikely that
I get it as a reference vector. Afterwards, because
the selection for the training is also random, it is
also unlikely that the sin rows are picked.
I've been trying to modify some of the other
parameters for the SOM also, but I would appreciate
some input to keep me going until I receive the
reference books from my bookstore.
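
The initialization concern above can be checked with a small sketch. It
assumes the documented default of SOM() in package class, where the initial
reference vectors are rows drawn at random, without replacement, from the
data when init is not supplied. The names n_units, p_none and init_rows are
introduced here for illustration only.

```r
library(class)

set.seed(42)

InputA <- matrix(cos(1:10), nrow = 900, ncol = 10, byrow = TRUE)
InputB <- matrix(sin(5:14), nrow = 100, ncol = 10, byrow = TRUE)
Input  <- rbind(InputA, InputB)

n_units <- 9  # a 3*3 grid

# Chance that none of the 9 initial reference vectors is a sin row when
# 9 rows are drawn without replacement from 100 sin and 900 cos rows.
p_none <- dhyper(0, m = 100, n = 900, k = n_units)
print(p_none)  # about 0.39

# One possible workaround: pass init explicitly so that at least one
# initial reference vector comes from the rare rows (901 to 1000).
gr <- somgrid(xdim = 3, ydim = 3, topo = "hexagonal")
init_rows <- c(sample(1:900, n_units - 1), sample(901:1000, 1))
test.som <- SOM(Input, gr, init = Input[init_rows, ])
```

Seeding the initial codes this way does not guarantee that a sin-like
reference vector survives training, since the cos rows are still presented
about nine times as often; it only removes the initial draw as a cause.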

Are my suspicions right?
Should I be using the SOM for my study or should I
look somewhere else?
NOTE: I have no prior knowledge of whether the
datasets I want to analyse will have rare cases or not
or where they will be located.
Thanks,
Manuel
