I'm wondering if anyone has written some functions or code for handling
very large files in R. I am working with a data file that is 41
variables times who knows how many observations making up 27MB altogether.
The sort of thing that I am thinking of having R do is
- count the number of lines
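A line count does not require loading the whole file. The following is only a sketch, assuming a plain text file read through a connection in chunks; `count_lines` and `chunk_size` are hypothetical names, not functions from R or from this thread.

```r
# Hypothetical helper (not from the thread): count lines chunk by chunk,
# so that only chunk_size lines are held in memory at any time.
count_lines <- function(path, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  repeat {
    chunk <- readLines(con, n = chunk_size)  # next block of lines
    if (length(chunk) == 0L) break           # end of file reached
    total <- total + length(chunk)
  }
  total
}

# Usage on a small temporary file standing in for the real data file
tmp <- tempfile(fileext = ".csv")
writeLines(sprintf("%d,%d", 1:25, 26:50), tmp)
n_lines <- count_lines(tmp, chunk_size = 10L)
```

A larger `chunk_size` trades memory for fewer read calls.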
Hi,
Have you looked at R Data Import/Export?
On Mon, 25 Aug 2003, Murray Jorgensen wrote:
Date: Mon, 25 Aug 2003 16:04:17 +1200
From: Murray Jorgensen [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: R-help [EMAIL PROTECTED]
Subject: [R] R tools for large files
I'm wondering if anyone
Could you be more specific? Do you mean the chapter on connections?
Ko-Kang Kevin Wang wrote:
Hi,
Have you looked at R Data Import/Export?
On Mon, 25 Aug 2003, Murray Jorgensen wrote:
__
[EMAIL PROTECTED] mailing list
Dear Murray,
One way that works very well for many people (including me)
is to store the data in an external database, such as MySQL,
and read in just the bits you want using the excellent
package RODBC. Getting a database to do all the selecting
is very fast and efficient, leaving R to
Andrew,
This is no doubt true, but some things in R work very well with big
files without the need for any extra software:
readLines("c:/data/perry/data.csv", n=12)
# prints out the first 12 lines as strings
flows <- read.csv("c:/data/perry/data.csv", na.strings="?",
header=FALSE, nrows=1000)
# makes a data
Dear Murray,
Perhaps if you gave an example of why/what you actually
wish to do, you may get more useful advice. If the data
easily fits into R, then you could do the subsetting there.
Otherwise, the external database approach is good. It
depends a bit on what resources you have available and how
Murray Jorgensen [EMAIL PROTECTED] wrote:
I'm wondering if anyone has written some functions or code for handling
very large files in R. I am working with a data file that is 41
variables times who knows how many observations making up 27MB altogether.
Does that
On Mon, 25 Aug 2003, Patrick Connolly wrote:
version
_
platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status
major 1
minor 7.1
year 2003
I think that is only a medium-sized file.
On Mon, 25 Aug 2003, Murray Jorgensen wrote:
I'm wondering if anyone has written some functions or code for handling
very large files in R. I am working with a data file that is 41
variables times who knows how many observations making up 27MB
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Stored inside a mysql-db-table I've collected question-by-question response
time for a survey. First solution could be:
1) extract with SQL query all fields (with NA values due to the presence of
question-filters and conventional jump)
2) handle data
At 08:12 25/08/2003 +0100, Prof Brian Ripley wrote:
I think that is only a medium-sized file.
Large for my purposes means more than I really want to read into memory
which in turn means takes more than 30s. I'm at home now and the file
isn't so I'm not sure if the file is large or not.
More
In plot(), when using option asp=1 the xlim and ylim have no effect because
they are changed in order to fill the whole plot region. Is there a way to
automatically set xlim and ylim when asp has been set to 1?
For example:
#This is a box of the plot ranges I want:
On Mon, 25 Aug 2003, Murray Jorgensen wrote:
At 08:12 25/08/2003 +0100, Prof Brian Ripley wrote:
I think that is only a medium-sized file.
Large for my purposes means more than I really want to read into memory
which in turn means takes more than 30s. I'm at home now and the file
isn't so
On Mon, 25 Aug 2003, Angel wrote:
In plot(), when using option asp=1 the xlim and ylim have no effect because
they are changed in order to fill the whole plot region.
Not true: try xlim=c(-2,2) in your example.
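One possible way to get both limits exactly as requested with asp=1 is to give the plot region the same shape as the requested ranges. This is only a sketch under assumptions not stated in the thread: it uses par(pin=) to fix the plot region size in inches and "i"-style axes so the limits are not padded, and the 4 x 2 ranges are purely illustrative.

```r
# Sketch only: match the plot region's shape to the requested ranges so
# that asp = 1 has nothing left to adjust.
pdf(file = NULL)                      # null PDF device, nothing written to disk
xlim <- c(0, 4)
ylim <- c(0, 2)
# Plot region of 4 x 2 inches: same width/height ratio as the ranges
par(pin = c(diff(xlim), diff(ylim)))
plot(0, 0, type = "n", xlim = xlim, ylim = ylim, asp = 1,
     xaxs = "i", yaxs = "i", xlab = "x", ylab = "y")
usr <- par("usr")                     # limits actually used: x1, x2, y1, y2
invisible(dev.off())
```

If the plot region has a different shape from the ranges, one of the two limits is widened to keep the aspect ratio.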
Is there a way to automatically set
xlim and ylim when asp has been
?xyplot, look at `scales' and in particular how to rotate axis labels.
as in
xyplot(sunspot ~ 1:37, type = "l", aspect = "xy",
scales = list(x = list(rot = 45), y = list(log = TRUE)),
sub = "log scales")
On Mon, 25 Aug 2003, Mahbub Latif wrote:
Hi,
I want to use (similar to) las options
Thanks for the advice. Sorry, I should have explained better.
As you say xlim and ylim have an effect. But when they do not match the
width and height of the plot region, one of them is modified in order to
make the plot fill the whole plot region with the aspect ratio given.
I would have
On Mon, 25 Aug 2003, Angel wrote:
Thanks for the advice. Sorry, I should have explained better.
As you say xlim and ylim have an effect. But when they do not match the
width and height of the plot region, one of them is modified in order to
make the plot fill the whole plot region with the
Dear R users,
I'm trying to do some sort of floodfill or seedfill with data stored
within a matrix in R (usually floating numbers), where a marker value is
given to specify the limits of an area to be filled. A reduced example
may demonstrate this below. Although I wrote a simple C function
Hi,
Is there no function in R similar to jpeg(...) or postscript(...) for Windows
metafiles?
The function savePlot(...) is not really what I need.
I'd like to save the plot on my disk without opening a new plot window.
And I don't want to save it on my disk and convert it from a *.* to .wmf
Sorry for my mail, I have found the function win.metafile().
Thomas
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Unternährer Thomas, uth wrote:
Hi,
Is there no function in R similar to jpeg(...) or postscript(...) for Windows metafiles?
The function savePlot(...) is not really what I need.
I'd like to save the plot on my disk without opening a new plot window.
And I don't want to save it on my disk and
I'm wondering if anyone has written some functions or code for
handling
very large files in R. I am working with a data file that is 41
variables times who knows how many observations making up 27MB
altogether.
The sort of thing that I am thinking of having R do is
- count the number
Hi, does anyone out there have a recommendation for multilevel / random
effects and longitudinal analysis?
My dream book would be something that's both accessible to a
non-statistician but rigorous (because I seem to be slowly turning into a
statistician) and ideally would use R.
Peter
Barry Rowlingson wrote:
How about this I just bashed up from a quick search:
boundaryFill <- function(mat, x,y,fill,boundary)
[...]
note it fills 4-connected regions. I wouldn't like to do it on anything
complex since it'll be awful slow
Yes, this is in principle the same solution I use
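For readers without the elided function body, a 4-connected boundary fill of the kind discussed can be sketched with an explicit stack instead of recursion. This is not Barry Rowlingson's boundaryFill; `flood_fill_sketch` is a hypothetical name, and the sketch assumes x is a row index, y is a column index, and the matrix contains no NA values.

```r
# Hypothetical sketch (not the function posted in the thread): fill the
# 4-connected region around (x, y), stopping at cells equal to `boundary`.
# Assumes x = row index, y = column index, and no NA values in `mat`.
flood_fill_sketch <- function(mat, x, y, fill, boundary) {
  stack <- list(c(x, y))
  while (length(stack) > 0L) {
    p <- stack[[length(stack)]]          # take the last cell pushed
    stack[[length(stack)]] <- NULL       # and remove it from the stack
    i <- p[1]
    j <- p[2]
    if (i < 1L || i > nrow(mat) || j < 1L || j > ncol(mat)) next
    if (mat[i, j] == boundary || mat[i, j] == fill) next
    mat[i, j] <- fill
    # push the four direct neighbours (up, down, left, right)
    stack <- c(stack, list(c(i - 1L, j), c(i + 1L, j),
                           c(i, j - 1L), c(i, j + 1L)))
  }
  mat
}

# Usage: a 5 x 5 matrix with a ring of marker value 9 around cell (3, 3)
m <- matrix(0, nrow = 5, ncol = 5)
m[2, 2:4] <- 9
m[4, 2:4] <- 9
m[2:4, 2] <- 9
m[2:4, 4] <- 9
inside  <- flood_fill_sketch(m, 3, 3, fill = 1, boundary = 9)
outside <- flood_fill_sketch(m, 1, 1, fill = 1, boundary = 9)
```

Growing a list one cell at a time is slow in R, so this is meant for small matrices only.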
On Mon, 25 Aug 2003, Thomas Petzoldt wrote:
Barry Rowlingson wrote:
How about this I just bashed up from a quick search:
boundaryFill <- function(mat, x,y,fill,boundary)
[...]
note it fills 4-connected regions. I wouldn't like to do it on anything
complex since it'll be awful
Jose C. Pinheiro and Douglas M. Bates (2000)
Mixed effects models in S and S-PLUS. NY, Springer, 2000.
ISBN: 0-387-98957-9, LC: QA 76.73 .S15 P561 2000 (locally)
- tom blackwell - u michigan medical school - ann arbor -
On Mon, 25 Aug 2003, Peter Muhlberger wrote:
Hi, does anyone out
Three suggestions:
1.) Raudenbush and Bryk, _Hierarchical Linear Models: Second Edition_
(Sage, 2002)
2.) Pinheiro and Bates, _Mixed-Effects Models in S and S-Plus_ (Springer)
3.) Fox, _An R and S-Plus Companion to Applied Regression_, plus the
appendix available via the web on multilevel
Hello everybody,
I have tried to connect to external databases (specifically, to a MS Access database
in my computer) from R using the RODBC package. Unfortunately I haven't been able to
do it, even if I 'followed' the instructions in the manual. Could someone please help
me?
I have a MS
So follow the instructions more precisely. You don't say which manual,
but the R Data Import/Export Manual says
We use a database @code{testdb} we created earlier, and had the DSN (data
source name) set up in @file{~/.odbc.ini} under @code{unixODBC}. Exactly
the same code worked using MyODBC
I read with interest comments about diamond graphs recently described in
the American Statistician by my colleagues in the Johns Hopkins Department
of Epidemiology led by Dr. Alvaro Munoz.
Permit three brief reactions.
First, diamond graphs were developed as part of the Multi-center AIDS
Cohort
On Mon, 25 Aug 2003 15:37:29 -0400
Scott Zeger [EMAIL PROTECTED] wrote:
I read with interest comments about diamond graphs recently described in
the American Statistician by my colleagues in the Johns Hopkins Department
of Epidemiology led by Dr. Alvaro Munoz.
Permit three brief
On Mon, 25-Aug-2003 at 08:03AM +0100, Prof Brian Ripley wrote:
| On Mon, 25 Aug 2003, Patrick Connolly wrote:
|
| version
[...]
| However, what wasn't obvious to me was that it is necessary to specify
| what family to use. If no family is specified, the default family
| does appear to be