A bit late and possibly tangential.
The mmap package has something called struct(), which is really a row-wise array
of heterogeneous columns.
As Simon and others have pointed out, R has no way to handle this natively, but
mmap does provide a very measurable performance gain by orienting rows
Antonio Piccolboni antonio at piccolboni.info writes:
Hi,
I was wondering if there is anything more efficient than split to do the
kind of conversion in the subject. If I create a data frame as in
system.time({fd = data.frame(x=1:2000, y = rnorm(2000), id = paste("x",
1:2000, sep = ""))})
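For anyone following along, here is a minimal sketch of the conversion as I read the question, assuming the goal is to turn a data frame into a list with one element per row. The quoted "x" prefix, stringsAsFactors = FALSE, and the helper name rows_via_columns are my assumptions for illustration. This is one possible alternative to split, not necessarily what the thread settled on.

```r
set.seed(1)
fd <- data.frame(x = 1:2000, y = rnorm(2000),
                 id = paste("x", 1:2000, sep = ""),
                 stringsAsFactors = FALSE)

# Baseline: split() on the row index yields a list of one-row data frames.
rows_split <- split(fd, seq_len(nrow(fd)))

# Alternative sketch (hypothetical helper): walk the columns in parallel and
# build one plain named list per row, instead of 2000 small data frames.
rows_via_columns <- function(df) {
  args <- c(list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE), as.list(df))
  do.call(mapply, args)
}
rows_list <- rows_via_columns(fd)

# Each element carries the fields of one row.
str(rows_list[[3]])
```

Wrapping either call in system.time() lets you compare the two on your own machine. Note that the two results differ in shape: split returns one-row data frames, while the sketch returns plain lists.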
On 01/05/2012 00:28, Antonio Piccolboni wrote:
Hi,
I was wondering if there is anything more efficient than split to do the
kind of conversion in the subject. If I create a data frame as in
system.time({fd = data.frame(x=1:2000, y = rnorm(2000), id = paste("x",
1:2000, sep = ""))})
user system
It seems like people need to hear more context, happy to provide it. I am
implementing a serialization format (typedbytes, HADOOP-1722 if people want
the gory details) to make R and Hadoop interoperate better (RHadoop
project, package rmr). It is a row-first format and it's already
implemented as
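To make "row first" concrete, here is a rough sketch of what writing records row by row looks like with base R connections. This is not the typedbytes format itself. The two-field layout (a 4-byte integer followed by an 8-byte double), the toy data, and the variable names are all assumptions made up for illustration.

```r
# Toy data: two typed columns (hypothetical, for illustration only).
df <- data.frame(x = 1:3, y = c(0.5, 1.5, 2.5))

# Write row by row: the fields of one record are adjacent in the byte stream.
con <- rawConnection(raw(0), "w")
for (i in seq_len(nrow(df))) {
  writeBin(df$x[i], con, size = 4L, endian = "big")  # 4-byte integer field
  writeBin(df$y[i], con, size = 8L, endian = "big")  # 8-byte double field
}
bytes <- rawConnectionValue(con)
close(con)

# Read the records back in the same row-first order.
rc <- rawConnection(bytes, "r")
rows <- lapply(seq_len(nrow(df)), function(i) {
  list(x = readBin(rc, "integer", n = 1L, size = 4L, endian = "big"),
       y = readBin(rc, "double",  n = 1L, size = 8L, endian = "big"))
})
close(rc)
```

The point of the sketch is only that a row-first writer has to touch every column once per record, which is why a cheap data-frame-to-rows conversion matters here.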
On May 1, 2012, at 1:26 PM, Antonio Piccolboni anto...@piccolboni.info wrote:
It seems like people need to hear more context, happy to provide it. I am
implementing a serialization format (typedbytes, HADOOP-1722 if people want
the gory details) to make R and Hadoop interoperate better
On Tue, May 1, 2012 at 11:29 AM, Simon Urbanek
simon.urba...@r-project.org wrote:
On May 1, 2012, at 1:26 PM, Antonio Piccolboni anto...@piccolboni.info
wrote:
It seems like people need to hear more context, happy to provide it. I am
implementing a serialization format (typedbytes,
Hi,
I was wondering if there is anything more efficient than split to do the
kind of conversion in the subject. If I create a data frame as in
system.time({fd = data.frame(x=1:2000, y = rnorm(2000), id = paste("x",
1:2000, sep = ""))})
user system elapsed
0.004 0.000 0.004
and then I try to