Re: [R] best option for big 3D arrays?
Steven, sorry for the delay in responding. I have been investigating this also, and here is the way I do it (though probably not the best way):

    # .. define a 3D array
    ngen <- 904
    gratios <- ff(NA, dim = rep(ngen, 3), vmode = "double")
    # .. fill the array with standard R functions
    ffsave(gratios, file = "mydir/myfile")    # without extension
    finalizer(gratios) <- "delete"

So: you first define the ff object, you put the data inside, and you ffsave it. The ffsave function will generate two files, with extensions .ffData and .RData. Then you set 'delete' as the 'finalizer' of the object; this way you keep ff from leaving a copy in some tmp dir that occupies disk space forever. Then you can access your object in the next R session:

    ffload("mydir/myfile")    # also without extension

I hope this helped. Cheers, djordje

2012/2/23 steven mosher <mosherste...@gmail.com>:
> Did you have to use a particular filename? or extension? I created a
> similar file but then could not read it back in. Steve [...]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
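Djordje's workflow above can be run end to end as one self-contained sketch (assuming the CRAN package 'ff' is installed; the dimensions are shrunk from 904 to 10 and the file goes to a tempdir, both purely illustrative):

```r
# Round trip: create a filebacked array, fill it, ffsave it, reload it.
# Assumes the CRAN package 'ff' is installed; the small dims and tempdir
# path are illustrative stand-ins for the thread's 904^3 array.
library(ff)

ngen <- 10
path <- file.path(tempdir(), "myfile")      # no extension, as in the post

gratios <- ff(NA, dim = rep(ngen, 3), vmode = "double")
gratios[, , 1] <- matrix(1, ngen, ngen)     # ordinary R indexing works

ffsave(gratios, file = path)                # writes path.ffData + path.RData
finalizer(gratios) <- "delete"              # backing file cleaned up on gc

rm(gratios)
ffload(path, overwrite = TRUE)              # restores 'gratios' by name
gratios[1, 1, 1]                            # the value written before saving
```

The key point of the finalizer line is that the temporary backing file does not outlive the object once the data are safely archived by ffsave.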
Re: [R] best option for big 3D arrays?
Thanks, that helped. Maybe when I get a chance I will do some blog posts on the basics of ff; I think some tutorials would be a good idea. Steve

On Mon, Feb 27, 2012 at 3:47 AM, Djordje Bajic <je.li@gmail.com> wrote:
> Steven, sorry for the delay in responding. I have been investigating this
> also, and here is the way I do it (though probably not the best way): [...]
Re: [R] best option for big 3D arrays?
Did you have to use a particular filename? or extension? I created a similar file but then could not read it back in. Steve

On Mon, Feb 13, 2012 at 6:45 AM, Djordje Bajic <je.li@gmail.com> wrote:
> I've been investigating and I can partially respond myself. I tried the
> packages 'bigmemory' and 'ff', and for me the latter did the work I
> needed pretty straightforwardly. [...]
Re: [R] best option for big 3D arrays?
I've been investigating and I can partially respond myself. I tried the packages 'bigmemory' and 'ff', and for me the latter did the work I needed pretty straightforwardly. I create the array in filebacked form with the function ff, and it seems that the usual R indexing works well. I have yet to see the limitations, but I hope it helps. A toy example:

    myArr <- ff(NA, dim = rep(904, 3), filename = "arr.ffd", vmode = "double")
    myMat <- matrix(1:904^2, ncol = 904)
    for (i in 1:904) {
      myArr[, , i] <- myMat
    }

Thanks all,

2012/2/11 Duncan Murdoch <murdoch.dun...@gmail.com>:
> I'd really recommend getting more RAM, so you can have the whole thing
> loaded in memory. [...]
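As a quick sanity check on the example above (again assuming the CRAN package 'ff' is installed; the size is shrunk from 904 to 8 so it runs in a blink, and the tempdir filename stands in for "arr.ffd"), slices read back out of the filebacked array with the usual indexing:

```r
# Fill a small filebacked array and read a slice back by ordinary indexing.
# Assumes the CRAN package 'ff' is installed; the tempdir filename is an
# illustrative stand-in for the post's "arr.ffd".
library(ff)

n <- 8
myArr <- ff(NA, dim = rep(n, 3), vmode = "double",
            filename = file.path(tempdir(), "arr_demo.ffd"))
myMat <- matrix(1:n^2, ncol = n)

for (i in 1:n) {
  myArr[, , i] <- myMat
}

slice <- myArr[, , 3]    # comes back as a plain n x n matrix
all(slice == myMat)      # every slice holds the same matrix
```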
Re: [R] best option for big 3D arrays?
On 12-02-10 9:12 AM, Djordje Bajic wrote:
> Hi all, I am trying to fill a 904x904x904 array, but at some point of the
> loop R states that the 5.5Gb sized vector is too big to allocate. [...]

I'd really recommend getting more RAM, so you can have the whole thing loaded in memory. 16 GB would be nice, but even 8 GB should make a substantial difference. It's going to be too big to store as an array since arrays have a limit of 2^31 - 1 entries, but you could store it as a list of matrices, e.g.

    x <- vector("list", 904)
    for (i in 1:904)
      x[[i]] <- matrix(0, 904, 904)

and then refer to entry (i, j, k) as x[[i]][j, k].

Duncan Murdoch
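Duncan's list-of-matrices workaround can be dressed up with tiny accessor helpers so calling code still reads like 3D indexing (base R only; the helper names get3/set3 are mine, not from the thread):

```r
# A list of matrices standing in for a big 3D array, with hypothetical
# get3/set3 helpers so access still looks like (i, j, k) indexing.
n <- 5                                  # illustrative stand-in for 904
x <- vector("list", n)
for (i in 1:n)
  x[[i]] <- matrix(0, n, n)

get3 <- function(x, i, j, k) x[[i]][j, k]
set3 <- function(x, i, j, k, value) { x[[i]][j, k] <- value; x }

x <- set3(x, 2, 3, 4, 99)
get3(x, 2, 3, 4)                        # → 99
```

Because each matrix is a separate object, no single allocation ever exceeds one 904x904 slice, which is the point of the workaround.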
[R] best option for big 3D arrays?
Hi all,

I am trying to fill a 904x904x904 array, but at some point of the loop R states that the 5.5 Gb vector is too big to allocate. I have looked at packages such as 'bigmemory', but I need help deciding which is the best way to store such an object. It would be perfect to store it in this cube form (for indexing and computation purposes). If that is not possible, maybe the best option is to store the 904 matrices separately and read them individually when needed? I have never dealt with such a big dataset, so any help will be appreciated. (R + ESS, Debian 64-bit, 4 GB RAM, 4 cores)
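The failed allocation is easy to check arithmetically: 904^3 doubles come to roughly 5.5 GiB, well beyond the 4 GB of RAM mentioned (though, for what it's worth, under R's 2^31 - 1 element-per-vector limit). A quick back-of-envelope in base R that allocates nothing large:

```r
# Size of a 904 x 904 x 904 array of doubles, without allocating it.
n_elements <- 904^3                  # 738,763,264 cells
bytes      <- n_elements * 8         # 8 bytes per double
round(bytes / 2^30, 1)               # → 5.5 (GiB), vs. 4 GB of RAM
n_elements < 2^31 - 1                # → TRUE: under the per-vector limit
```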