Hello,

I do some computations on datasets that come from climate models. These
datasets are huge arrays, significantly larger than typically available RAM,
so they have to be accessed row-by-row, or rather slice-by-slice, depending on
the task. I would like to make an R package to easily access such datasets
within R. The C++ backend is ready and is being used under Windows/.NET/Visual
Basic, but I have yet to learn the specifics of R programming to make a good
R interface.

I think it should be possible to make a package (call it "slice") that could
be used like this:

library (slice)
dataset <- load.virtualarray ("dataset_definition.xml")
# Load a portion of the data from disk and extract it
ordinaryvector <- dataset [ , 2, 3]

In the above "dataset" is an object that holds a definition of a
3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector.
The subscripting operator fetches necessary data from disk and extracts a
required slice, taking care of caching and other technical details. So, my
questions are:

Has anyone ever made a similar extension, with virtual (lazy) arrays?

Can the subscript operator be overloaded like that in R? (I know it can be in
S, at least for vectors.)
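
To make the question concrete, here is a minimal sketch of the kind of
overloading I mean, assuming that S3 dispatch on "[" is available for a
user-defined class. The class name "virtualarray", the constructor
new_virtualarray and the fetch function are all made up for illustration, and
an in-memory array stands in for the C++ backend:

```r
# Hypothetical constructor: 'fetch' stands in for the C++ backend and
# returns the requested block, given one index vector per dimension.
new_virtualarray <- function(dims, fetch) {
  structure(list(dims = dims, fetch = fetch), class = "virtualarray")
}

# S3 method for the subscript operator on class "virtualarray".
"[.virtualarray" <- function(x, i, j, k, drop = TRUE) {
  # A missing subscript means "take the whole dimension".
  if (missing(i)) i <- seq_len(x$dims[1])
  if (missing(j)) j <- seq_len(x$dims[2])
  if (missing(k)) k <- seq_len(x$dims[3])
  block <- x$fetch(i, j, k)   # only the requested block is read
  dim(block) <- c(length(i), length(j), length(k))
  if (drop) base::drop(block) else block
}

# In-memory stand-in for data that would really live on disk.
backing <- array(1:24, dim = c(2, 3, 4))
dataset <- new_virtualarray(
  dim(backing),
  function(i, j, k) backing[i, j, k, drop = FALSE]
)

ordinaryvector <- dataset[, 2, 3]   # ordinary vector of length 2
```

Caching and the real disk access would sit behind the fetch function; the
sketch only shows the dispatch part.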

And a tough one: is it possible to make an expression like "[1]" (without
quotes) meaningful in R? At the moment it results in a syntax error. I
would like it to return an object of a special class that, when used to
subscript my virtual array, is interpreted as "drop this dimension",
like this:

# Return a 3-dimensional array
dataset [, 2, 3, drop = F]
# Return a 2-dimensional array
dataset [, [2], 3, drop = F]
# Return a 1-dimensional array, like dataset [, 2, 3]
dataset [, [2], [3], drop = F]
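
Since "[2]" does not parse, one fallback I could imagine is a small marker
function used in its place. A rough sketch, where d(), the class "dropdim" and
slice3() are invented names and a plain array stands in for the virtual one:

```r
# Hypothetical marker: wraps an index and flags its dimension for dropping.
d <- function(index) structure(list(index = index), class = "dropdim")

# Subset a 3-d array, dropping only the dimensions wrapped in d().
slice3 <- function(a, i, j, k) {
  subs <- list(i, j, k)
  marked <- vapply(subs, inherits, logical(1), what = "dropdim")
  idx <- lapply(subs, function(s) if (inherits(s, "dropdim")) s$index else s)
  out <- a[idx[[1]], idx[[2]], idx[[3]], drop = FALSE]
  # A marked dimension must have length 1 to be dropped.
  stopifnot(all(dim(out)[marked] == 1))
  keep <- dim(out)[!marked]
  if (length(keep) <= 1) as.vector(out) else array(out, dim = keep)
}

a <- array(1:24, dim = c(2, 3, 4))
all_rows <- seq_len(dim(a)[1])
dim(slice3(a, all_rows, 2, 3))       # 2 1 1
dim(slice3(a, all_rows, d(2), 3))    # 2 1
slice3(a, all_rows, d(2), d(3))      # ordinary vector
```

In the package the same test for the marker class would go inside the
subscript method, so that dataset [, d(2), 3, drop = F] could stand in for
dataset [, [2], 3, drop = F].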

Thanks in advance for any help,

Maciej.


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
