Anne,

 

> After the above step I need to convert my ff_matrix to a data.frame to 
> discretize the whole matrix and calculate the mutual information.

> The calculated result should be saved as an ffdf-object or something similar.
> disc <- as.ffdf(discretize(as.data.frame(as.ffdf(ffmat)), disc="equalwidth", 
> nbins=5))

 

ffdf objects are ff's equivalent of data.frames: they handle many rows (up to 
2^31-1) and a limited number of columns (with potentially different
column types). Like data.frames, they are not suitable for millions of columns. 
You probably want to store your data in one big ff matrix.



If you use ff objects because you don't have the RAM for standard R objects, 
converting ff to a data.frame is not an option because it will require too much 
RAM.

If 'discretize' expects a data.frame, you cannot call it directly on an ff 
matrix either. But if 'discretize' treats each column independently, you can 
call it on chunks of columns that you coerce to data.frames one chunk at a time.

 

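Something like this (with the same arguments as in your call):

for (i in chunk(from=1, to=ncol(ffmat), by=10)) {
  # coerce only the current chunk of columns to a data.frame
  ffmat[,i] <- as.matrix(discretize(as.data.frame(ffmat[,i]), disc="equalwidth", nbins=5))
}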

 

If discretize returns integers, you might prefer to write the results to a 
separate integer ff matrix, because this saves disk space and improves caching.
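
To illustrate the chunk-wise pattern without needing the real data, here is a 
minimal base-R sketch. It makes several assumptions: an ordinary matrix 'x' 
stands in for the ff matrix, an ordinary integer matrix 'out' stands in for the 
integer ff matrix, and 'equalwidth_bins' is a hypothetical stand-in for 
'discretize' that bins each column independently into equal-width intervals. 
It is not the package function and may treat edge cases differently.

```r
# Hypothetical stand-in for discretize(): equal-width binning of each
# column into integer codes 1..nbins, one column at a time.
equalwidth_bins <- function(df, nbins = 5) {
  as.data.frame(lapply(df, function(col) {
    lo <- min(col)
    hi <- max(col)
    if (lo == hi) return(rep(1L, length(col)))   # constant column: one bin
    width <- (hi - lo) / nbins
    # the column maximum would land in bin nbins + 1, so cap at nbins
    as.integer(pmin(nbins, floor((col - lo) / width) + 1))
  }))
}

set.seed(1)
x   <- matrix(rnorm(200 * 23), nrow = 200)      # stand-in for the ff matrix
out <- matrix(NA_integer_, nrow(x), ncol(x))    # stand-in for an integer ff matrix

by <- 10                                        # columns per chunk
for (start in seq(1, ncol(x), by = by)) {
  idx   <- start:min(start + by - 1, ncol(x))   # current block of columns
  block <- as.data.frame(x[, idx, drop = FALSE])
  out[, idx] <- as.matrix(equalwidth_bins(block))
}
```

Because each column is binned on its own range, the chunked result is the same 
as discretizing all columns at once; only the peak memory use differs. With ff, 
only the two matrices change type, the loop stays the same.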

 

HTH

Jens Oehlschlägel

 

 

 

 
