Re: [R] best option for big 3D arrays?

2012-02-27 Thread Djordje Bajic
Steven, sorry for the delay in responding,

I have been investigating this also and here is the way I do it (though
probably not the best way):

# .. define a 3D array
ngen <- 904
gratios <- ff(NA, dim = rep(ngen, 3), vmode = "double")

# .. fill the array with standard R functions

ffsave(gratios, file = "mydir/myfile")   # without extension
finalizer(gratios) <- "delete"

# ..

So: you first define the ff object, fill it with data, and ffsave it. The
ffsave function generates two files, with extensions .ffData and .RData. Then
you set 'delete' as the 'finalizer' of the object; this way you prevent ff
from keeping its backing file in some temporary directory and occupying disk
space forever. You can then access your object in the next R session:

ffload("mydir/myfile")   # also without extension
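For the fill step, something along these lines should work (a minimal sketch,
assuming the ff package is loaded and that each 904 x 904 slice fits
comfortably in RAM; rnorm is only a placeholder for the real per-slice data):

for (i in seq_len(ngen)) {
  slice <- matrix(rnorm(ngen^2), ngen, ngen)   # placeholder data for slice i
  gratios[, , i] <- slice                      # write one slice at a time to the file-backed array
}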

I hope this helped.

Cheers,

djordje





Re: [R] best option for big 3D arrays?

2012-02-27 Thread steven mosher
Thanks,

That helped. Maybe when I get a chance I will do some blog posts on the
basics of ff; I think some tutorials would be a good idea.

Steve



Re: [R] best option for big 3D arrays?

2012-02-23 Thread steven mosher
Did you have to use a particular filename or extension?

I created a similar file but then could not read it back in.

Steve



Re: [R] best option for big 3D arrays?

2012-02-13 Thread Djordje Bajic
I've been investigating and can partially answer my own question. I tried the
packages 'bigmemory' and 'ff', and for me the latter did what I needed pretty
straightforwardly. I create the array in file-backed form with the function
ff, and it seems that the usual R indexing works well. I have yet to see the
limitations, but I hope it helps.

a foo example:

library(ff)

myArr <- ff(NA, dim = rep(904, 3), filename = "arr.ffd", vmode = "double")
myMat <- matrix(1:904^2, ncol = 904)
for (i in 1:904) {
  myArr[, , i] <- myMat
}
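
A quick way to confirm the round trip (a sketch; reading from an ff object
with ordinary indexing returns a regular in-memory value):

dim(myArr)                        # 904 904 904
myArr[1, 2, 10] == myMat[1, 2]    # should be TRUE: the slice written to disk reads back intact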

Thanks all,



Re: [R] best option for big 3D arrays?

2012-02-11 Thread Duncan Murdoch



I'd really recommend getting more RAM, so you can have the whole thing 
loaded in memory.  16 Gb would be nice, but even 8Gb should make a 
substantial difference.  It's going to be too big to store as an array 
since arrays have a limit of 2^31-1 entries, but you could store it as a 
list of matrices, e.g.


x <- vector("list", 904)
for (i in 1:904)
  x[[i]] <- matrix(0, 904, 904)

and then refer to entry i,j,k as x[[i]][j,k].

Duncan Murdoch
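
For scale, the arithmetic behind that 5.5 Gb figure (a back-of-the-envelope
sketch):

904^3              # 738,763,264 double-precision entries
904^3 * 8 / 2^30   # ~5.5 GiB of doubles, hence the failed allocation on a 4 Gb machine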



[R] best option for big 3D arrays?

2012-02-10 Thread Djordje Bajic
Hi all,

I am trying to fill a 904x904x904 array, but at some point of the loop R
states that the 5.5 Gb vector is too big to allocate. I have looked at
packages such as bigmemory, but I need help deciding which is the best way to
store such an object. It would be perfect to store it in this cube form (for
indexing and computation purposes). If that is not possible, maybe the best
option is to store the 904 matrices separately and read them individually
when needed?

I have never dealt with such a big dataset, so any help will be appreciated.

(R + ESS, Debian 64-bit, 4 Gb RAM, 4 cores)


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.