Uli Tuerk wrote:

Hi everybody!

I have a large dataset, about 2 million entries of the following format, which I would like to import into a data frame:
<integer><integer><float><string><float><string><string>


Because of the huge amount of data, I've chosen a binary format instead of a text format when exporting from Matlab.
My import function is attached below. It works fine for a small number of entries but is painfully slow when trying to read the complete set.


Does anybody have some pointers for me for improving the import or handling such large data sets?

Suggestions:

a) Use a database!!!



And only for very strong reasons against a):

b) Rewrite your import code in C.

c) Optimize the code below by initializing the objects at full length (e.g. imp.v <- numeric(n)) instead of growing them one element at a time inside the loop. Maybe you can read n from the header or derive it from the size of the file.
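
A minimal sketch of suggestion c), assuming the number of records (or at least an upper bound) is known in advance, for example from the Matlab export. The function name read.DET.prealloc and its argument n are hypothetical and not part of the original code; the record layout is copied from the function quoted below. The last lines only write a tiny test file so the sketch can be tried on its own.

```r
# Hypothetical variant of read.DET.data(): `n` is an assumed upper bound on
# the number of records in the file.
read.DET.prealloc <- function(f, n) {
    # Allocate every column once, at full length
    spk.v   <- integer(n)
    imp.v   <- integer(n)
    score.v <- numeric(n)
    th.v    <- numeric(n)
    type.v  <- character(n)
    ses.v   <- character(n)
    rec.v   <- character(n)

    fid <- file(f, "rb")
    on.exit(close(fid))                # close the connection even on error

    count <- 0
    while (count < n) {
        tempi <- readBin(fid, integer(), size = 1, signed = FALSE)
        if (length(tempi) == 0) break  # end of file reached
        count <- count + 1
        spk.v[count]   <- tempi
        imp.v[count]   <- readBin(fid, integer(), size = 1, signed = FALSE)
        score.v[count] <- readBin(fid, numeric(), size = 4)
        type.v[count]  <- readBin(fid, character())
        th.v[count]    <- readBin(fid, numeric(), size = 4)
        ses.v[count]   <- readBin(fid, character())
        rec.v[count]   <- readBin(fid, character())
    }

    keep <- seq_len(count)             # drop the unused tail if n was too large
    data.frame(spk = factor(spk.v[keep]), imp = factor(imp.v[keep]),
               score = score.v[keep], th = th.v[keep], ses = ses.v[keep],
               rec = rec.v[keep], type = type.v[keep])
}

# Small self-made test file with three records in the assumed layout
tmp <- tempfile()
out <- file(tmp, "wb")
for (i in 1:3) {
    writeBin(as.integer(i), out, size = 1)       # spk
    writeBin(as.integer(i + 10), out, size = 1)  # imp
    writeBin(i / 2, out, size = 4)               # score
    writeBin("tgt", out)                         # type (nul-terminated)
    writeBin(0.25, out, size = 4)                # th
    writeBin("ses1", out)                        # ses
    writeBin("rec1", out)                        # rec
}
close(out)

d <- read.DET.prealloc(tmp, n = 10)
```

Passing an n that is too large only costs some memory, because the unused tail is dropped at the end. If n is too small, this sketch stops after n records rather than growing the vectors again.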


Uwe Ligges



Thanks in advance!

Uli



read.DET.data <- function ( f ) {
        counter <- 1
        spk.v <- c()
        imp.v <- c()
        score.v <- c()
        th.v <- c()
        ses.v <- c()
        rec.v <- c()
        type.v <- c()
        fid <- file( f ,"rb")
        tempi <- readBin(fid , integer(), size=1, signed=FALSE)
        while ( length(tempi) != 0) {
                spk.v[ counter ] <- tempi
                imp.v[ counter ] <- readBin(fid, integer(), size=1, signed=FALSE)
                score.v[ counter  ] <- readBin(fid, numeric(), size=4)
                type.v[ counter ] <- readBin(fid, character())
                th.v[ counter ] <- readBin(fid, numeric(), size=4)
                ses.v[ counter ] <- readBin(fid, character())
                rec.v[ counter ] <- readBin(fid, character())
                counter <- counter + 1
                tempi <- readBin(fid, integer(), size=1, signed=FALSE)
        }
        close( fid )
        spkf <- factor ( spk.v )
        impf <- factor ( imp.v )
        
        det.f <- data.frame( spk=spkf, imp=impf, score=score.v, th=th.v, ses=ses.v,
                             rec=rec.v, type=type.v)

        det.f
}

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
