To respond a little more directly to what you seem to be asking for:

You would like an "automatic" conversion from your class (you don't give us its 
name, let's call it frameHDF for now) to "data.frame".

In R (and in OOP generally) this sounds like inheritance: you want a frameHDF 
to be valid wherever a data.frame is wanted.

_IF_ that is really a good idea, it can be done by using setIs() to define the 
correspondence.  Then methods for data.frame objects (formal methods at least) 
will convert the argument automatically.

(As noted previously, the simple assignment operation doesn't check types.)
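
A minimal sketch of the setIs() route, under stated assumptions: frameHDF here 
is a hypothetical class with a single filename slot, a CSV file stands in for 
the HDF5 store, and nRecords() is an invented generic used only to show a 
formal data.frame method receiving a converted argument.

```r
library(methods)

# Hypothetical stand-in for the disk-backed class: a CSV file plays the
# role of the HDF5 store (an assumption, not the poster's implementation).
setClass("frameHDF", representation(filename = "character"))

# Declare that a frameHDF "is" a data.frame via explicit conversion.
# coerce reads the whole file into memory; replace writes a data frame back.
setIs("frameHDF", "data.frame",
      coerce = function(from) read.csv(from@filename),
      replace = function(from, value) {
          write.csv(value, from@filename, row.names = FALSE)
          from
      })

# A formal (S4) method written only for data.frame (hypothetical generic).
setGeneric("nRecords", function(x) standardGeneric("nRecords"))
setMethod("nRecords", "data.frame", function(x) nrow(x))

# Small example file so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c(2.5, 3.5, 4.5)), path, row.names = FALSE)
big <- new("frameHDF", filename = path)

isDataFrame <- is(big, "data.frame")  # inheritance declared by setIs()
n <- nRecords(big)                    # argument is converted before the method runs
```

Note that every such call reads the entire file, which is exactly the cost the 
class was meant to avoid.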

However, this doesn't sound like such a good idea.  The point of your class is 
to handle objects too large for ordinary data frames.  Converting automatically 
sounds like a recipe for unpleasant surprises.

A more cautious approach would be for the user to state explicitly when a 
conversion is needed.  The general tool for defining this is setAs(), which is 
very similar to setIs() but does not make things automatic: the user then says 
as(x, "data.frame") to get the conversion.

The online documentation for these two functions says some more; see also 
section 9.3 of my 2008 book, referenced in the documentation.
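
The explicit route via setAs() might look like the sketch below.  As before, 
frameHDF is a hypothetical class name and a CSV file stands in for the HDF5 
store; only setClass(), setAs(), as() and is() are actual methods-package 
functions.

```r
library(methods)

# Hypothetical disk-backed class; a CSV file stands in for HDF5 (assumption).
setClass("frameHDF", representation(filename = "character"))

# Define how to convert, without declaring any inheritance.
setAs("frameHDF", "data.frame", function(from) read.csv(from@filename))

# Small example file so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c(2.5, 3.5, 4.5)), path, row.names = FALSE)
big <- new("frameHDF", filename = path)

full <- as(big, "data.frame")          # conversion happens only on request
inherits_df <- is(big, "data.frame")   # FALSE: nothing is automatic
```

With this design the full load into memory happens only where the user has 
written as(x, "data.frame").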

One more comment.  It is likely that your HDF5 objects have reference 
semantics--any changes made are seen by all the functions using that object.  
This is different from R's functional semantics as in S4 classes, and the 
differences can cause incorrect results in some situations.  The more recent 
reference classes (?ReferenceClasses) were designed to mimic the behavior of 
C++, Java, and similar languages.  (They are used in Rcpp to import C++ classes.)
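
To illustrate the difference, here is a small sketch contrasting the two 
semantics.  The class names ValueCounter and RefCounter and the function 
bumpValue() are invented for the illustration and have nothing to do with HDF5.

```r
library(methods)

# S4 class: functional semantics.  A function that modifies its argument
# changes only a local copy and must return the new object.
setClass("ValueCounter", representation(count = "numeric"))
bumpValue <- function(obj) {
    obj@count <- obj@count + 1
    obj
}

v  <- new("ValueCounter", count = 0)
v2 <- bumpValue(v)      # v itself is unchanged; v2 holds the update

# Reference class: reference semantics.  All names bound to the object
# see a change made through any one of them.
RefCounter <- setRefClass("RefCounter",
    fields  = list(count = "numeric"),
    methods = list(
        bump = function() {
            count <<- count + 1
            invisible(.self)
        }
    ))

r     <- RefCounter$new(count = 0)
alias <- r              # not a copy: both names refer to the same object
alias$bump()            # the change is visible through r as well
```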

John


On Jan 7, 2013, at 3:23 PM, Chris Jewell wrote:

> Hi All,
> 
> I'm currently trying to write an S4 class that mimics a data.frame, but 
> stores data on disc in HDF5 format.  The idea is that the dataset is likely 
> to be too large to fit into a standard desktop machine, and by using 
> subscripts, the user may load bits of the dataset at a time, e.g.:
> 
>> myLargeData <- LargeData("/path/to/file")
>> mySubSet <- myLargeData[1:10, seq(1,15,by=3)]
> 
> I've therefore defined my LargeData class thus:
> 
>> LargeData <- setClass("LargeData", representation(filename="character"))
>> setMethod("initialize", "LargeData", function(.Object, filename) {
>>     .Object@filename <- filename
>>     .Object   # an initialize method must return the object
>> })
> 
> I've then defined the "[" method to call a C++ function (Rcpp), opening the 
> HDF5 file, and returning the required rows/cols as a data.frame.
> 
> However, what if the user wants to load the entire dataset into memory?  
> Which method do I overload to achieve the following?
> 
>> fullData <- myLargeData
>> class(fullData)
> [1] "data.frame"
> 
> or apply transformations:
> 
>> myEigen <- eigen(myLargeData)
> 
> In C++ I would normally overload the "double" or "float" operator to achieve 
> this -- can I do the same thing in R?
> 
> Thanks,
> 
> Chris
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
