Dear all,

I have a problem with arrays.
Simplified, I have two arrays:

A =
[,,1]
1   2   3
4   5   6
7   8   9

[,,2]
10  11  12
13  14  15
16  17  18

B=
1   1   2
1   1   2
3   3   2

Basically, B declares which cluster each value of A belongs to.  Now I
want an array C in which each value of A is replaced by the mean of its
cluster, computed separately for each slice of the third dimension (that
is, the mean over the first 2 dimensions):

C=
[,,1]
3    3    6
3    3    6
7.5  7.5  6
[,,2]
12    12    15
12    12    15
16.5  16.5  15

The following short solution works:

A <- aperm(array(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18),
                 dim=c(3,3,2)),c(2,1,3))
B <- matrix(c(1,1,2,1,1,2,3,3,2),nrow=3,byrow=TRUE)
C <- array(NA,dim=c(3,3,2))

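for(a in 1:3){
  # values of cluster a: one row per slice of the third dimension
  TS <- matrix(A[B==a],nrow=2,byrow=TRUE)
  # write each slice's cluster mean back to every position in the cluster
  C[B==a] <- rep(rowMeans(TS),each=sum(B==a))
}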

But unfortunately, my dataset is a 64x64x64x120 array, and I have
approximately 29000 clusters, and I'd have to repeat the process on 500
simulated datasets.
If I try the code above, it takes about 2 hours...

Strangely enough, it is the first line in the loop that takes so long:
extracting only the data points that belong to a given cluster...
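
One direction that might avoid the repeated B==a subsetting is sketched
below. I have only tried it on the toy data, not timed it on the full
array. The idea is to view A as a matrix with one row per position of B
and one column per slice, sum the rows per cluster with rowsum() from
base R, divide by the cluster sizes, and index the means back out.
cluster_means is a made-up helper name, and the sketch assumes that B
has the same extent as the leading dimensions of A.

```r
# Toy data from the example above
A <- aperm(array(1:18, dim = c(3, 3, 2)), c(2, 1, 3))
B <- matrix(c(1, 1, 2, 1, 1, 2, 3, 3, 2), nrow = 3, byrow = TRUE)

# cluster_means() is a hypothetical helper, not an existing function.
# Assumption: dim(B) equals the leading dimensions of A.
cluster_means <- function(A, B) {
  n_pos <- length(B)                      # positions per slice
  g     <- as.vector(B)                   # cluster label of each position
  # one row per position, one column per slice (column-major reshape)
  Amat  <- array(A, dim = c(n_pos, length(A) / n_pos))
  sums  <- rowsum(Amat, g)                # per-cluster sums, rows sorted by label
  sizes <- as.vector(rowsum(rep(1, n_pos), g))  # cluster sizes, same order
  means <- sums / sizes                   # sizes recycles down each column
  idx   <- match(g, sort(unique(g)))      # row of 'means' for each position
  array(means[idx, , drop = FALSE], dim = dim(A))
}

C_fast <- cluster_means(A, B)
```

For the real data this would work on a 262144 x 120 matrix in a single
pass instead of about 29000 logical subsets of the whole array, but
whether it is actually faster there is something I would still have to
measure.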

Would anyone know a faster method?

Kind regards,

Joke Durnez
PhD student
Ghent University
Department of Data Analysis

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.