Re: [R] maximum string length in RdbiPgSQL and in R
William McCoy wrote:
> library(RdbiPgSQL)
> conn <- dbConnect(PgSQL(), host = "localhost", dbname = "agdb")
> test.sql <- readLines("queryfile")
> test.df <- dbGetQuery(conn, paste(test.sql, collapse = " "))
>
> This works fine for all the multiline files I have tried -- except one. I have recently encountered a problem with a moderately complex, moderately long query (12 lines, 459 characters). I can execute the query with no problem in psql and it returns the 14 rows that I expect. When I execute the query in R as above, I get a dataframe with the expected column names, but no rows. I get no error message. I am wondering if the query string is too long. Is there a maximum length for queries in RdbiPgSQL or for strings in R?

I tried using this for a queryfile:

8<--------------------
select length(
'0123456789...repeated for total length of 500...0123456789'
)
8<--------------------

and it works fine for me:

8<--------------------
conn <- dbConnect(PgSQL(), dbname = "regression")
sql <- readLines("/tmp/queryfile")
df <- dbGetQuery(conn, paste(sql, collapse = " "))
df
length
1500
8<--------------------

so I don't think length is the issue. Maybe you have an embedded control character? Or is it possible that you are introducing a space somewhere unexpected in your query, preventing a match? Try doing

    paste(test.sql, collapse = " ")

and then cut and paste the result into psql.

HTH,

Joe

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
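For anyone following along, the two checks suggested above (look for embedded control characters, then inspect the collapsed string before pasting it into psql) could be sketched roughly as below. The query lines are made up for illustration and stand in for the result of readLines("queryfile"); nothing here touches the database.

```r
# Hypothetical query lines standing in for readLines("queryfile");
# the third line deliberately contains a tab character.
test.sql <- c("select *", "from t", "where name = 'a\tb'")

# Flag lines that contain any control character (tab, CR, etc.).
has_ctrl <- grepl("[[:cntrl:]]", test.sql)
which(has_ctrl)

# Collapse the lines the same way as before passing them to dbGetQuery(),
# then print the exact string so it can be pasted into psql.
query <- paste(test.sql, collapse = " ")
nchar(query)
cat(query, "\n")
```

If which(has_ctrl) reports any lines, those are the first places to look for a literal that no longer matches.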
Re: [R] Storing data frame in a RDBMS
Gabor Grothendieck wrote:
> On 6/4/05, Adam Witney [EMAIL PROTECTED] wrote:
>> I am using PL/R in PostgreSQL and have written some functions to build my data frame. However this can take some time with some large datasets and I would like to not have to repeat the process and so I would like to save the data frame. Rather than save/load into the file system I would like to be able to save the entire data frame as a single object in the database.
>>
>> Is this possible?
>
> Check out ?serialize

Looks like serialize should work nicely:

    create or replace function test_serialize(text)
    returns text as '
      mydf <- pg.spi.exec(arg1)
      return (serialize(mydf, NULL, ascii = TRUE))
    ' language 'plr';

    create table saved_df (id int, df text);

    insert into saved_df
      select 1, f
      from test_serialize('select oid, typname from pg_type where typname = ''oid'' or typname = ''text''') as t(f);

    create or replace function restore_df(text)
    returns setof record as '
      unserialize(arg1)
    ' language 'plr';

    select * from restore_df((select df from saved_df where id = 1))
      as t(oid oid, typname name);
     oid | typname
    -----+---------
      25 | text
      26 | oid
    (2 rows)

HTH,

Joe
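The same round trip can be tried in plain R, outside the database, as a rough sketch. The data frame below is invented to mirror the two rows shown above. One assumption to note: in current versions of R, serialize(x, NULL) returns a raw vector, so this sketch converts it to and from a character string explicitly before treating it as text.

```r
# Made-up data frame mirroring the pg_type rows in the example.
mydf <- data.frame(oid = c(25L, 26L),
                   typname = c("text", "oid"),
                   stringsAsFactors = FALSE)

# Serialize to an ASCII representation; current R returns a raw vector here.
raw_bytes <- serialize(mydf, NULL, ascii = TRUE)

# Turn it into a single string, as one might store it in a text column.
txt <- rawToChar(raw_bytes)

# Restore the object from the stored string.
restored <- unserialize(charToRaw(txt))
identical(mydf, restored)
```

Storing the ASCII form keeps the value printable, at the cost of a larger string than the binary format would produce.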
Re: [R] Plotting with Statistics::R, Perl/R
Dirk Eddelbuettel wrote:
> On Fri, Jan 21, 2005 at 06:06:45PM -0800, Leah Barrera wrote:
>> I am trying to plot in R from a perl script using the Statistics::R package as my bridge. The following are the conditions:
>>
>> 0. I am running from a Linux server.
>
> Plotting certain formats requires the X11 server to be present, as the font metrics for those formats can be supplied only by the X11 server. Other drivers don't need the font metrics from X11 -- I think pdf is a good counterexample. When you run in 'batch' via a Perl script, you don't have the X11 server -- even though it may be on the machine and running, it is not associated with the particular session running your Perl job.
>
> There are two common fixes:
>
> a) if you must have png() as a format, you can start a virtual X11 server with the xvfb server -- this is a bit involved, but doable;

Attached is an init script I use to start up xvfb on Linux.

HTH,

Joe

    #!/bin/bash
    #
    # syslog        Starts Xvfb.
    #
    #
    # chkconfig: 2345 12 88
    # description: Xvfb is a facility that applications requiring an X frame buffer \
    # can use in place of actually running X on the server

    # Source function library.
    . /etc/init.d/functions

    [ -f /usr/X11R6/bin/Xvfb ] || exit 0

    XVFB="/usr/X11R6/bin/Xvfb :5 -screen 0 1024x768x16"

    RETVAL=0

    umask 077

    start() {
        echo -n $"Starting Xvfb: "
        # run the server in the background so the script can return
        $XVFB &
        RETVAL=$?
        echo_success
        echo
        [ $RETVAL = 0 ] && touch /var/lock/subsys/Xvfb
        return $RETVAL
    }
    stop() {
        echo -n $"Shutting down Xvfb: "
        killproc Xvfb
        RETVAL=$?
        echo
        [ $RETVAL = 0 ] && rm -f /var/lock/subsys/Xvfb
        return $RETVAL
    }
    restart() {
        stop
        start
    }

    case "$1" in
      start)
        start
        ;;
      stop)
        stop
        ;;
      restart|reload)
        restart
        ;;
      condrestart)
        [ -f /var/lock/subsys/Xvfb ] && restart || :
        ;;
      *)
        echo $"Usage: $0 {start|stop|restart|condrestart}"
        exit 1
    esac

    exit $RETVAL
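As a side note to the pdf remark quoted above: when a PDF is acceptable, a batch job can avoid Xvfb altogether, because the pdf() device writes straight to a file and does not need a display. A minimal sketch, with an arbitrary file name and made-up data:

```r
# Write a plot to a PDF file without any X11 display.
# The output path and the plotted data are placeholders.
out <- file.path(tempdir(), "g.pdf")

pdf(file = out, width = 6, height = 6)
plot(1:10, (1:10)^2, type = "b", main = "Batch plot, no display required")
dev.off()

file.exists(out)
```

This can be run from a non-interactive session (for example via Rscript or R --no-save) with DISPLAY unset.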
Re: [R] PL/R calls fail
Stefan Sobernig wrote:
> I am currently trying to create a development environment including PostgreSQL 8.0.0rc1, R 2.0.1 and PL/R on a system running Fedora Core 1. So far, I have succeeded in setting up PostgreSQL and R as a shared library - unfortunately I have not been able to link these two spheres by adding the PostgreSQL add-on PL/R due to some mysterious probs.

This is an inappropriate list for such a PL/R specific question. Please sign up for the PL/R list here:
http://gborg.postgresql.org/mailman/listinfo/plr-general

Thanks,

Joe
Re: [R] Calling R from a non-X shell script to plot?
Seth Falcon wrote:
> On Mon, Dec 13, 2004 at 12:25:01PM -0500, doktora v wrote:
>> Is anyone familiar with this (i.e. running R from a non-X environment)? Is there a way to get around this? I've seen some stuff about virtual devices, but have no idea if it works or where to start. If there is a simpler solution, please let me know.
>
> I've used Xvfb in this situation. After installing Xvfb, you can do something like this:
>
>     Xvfb :15 &
>     export DISPLAY=localhost:15
>     # Run R

FWIW, here's what I've used in the past for an Xvfb init script:

8<--------------------
#!/bin/bash
#
# syslog        Starts Xvfb.
#
#
# chkconfig: 2345 12 88
# description: Xvfb is a facility that applications requiring an X frame buffer \
# can use in place of actually running X on the server

# Source function library.
. /etc/init.d/functions

[ -f /usr/X11R6/bin/Xvfb ] || exit 0

XVFB="/usr/X11R6/bin/Xvfb :5 -screen 0 1024x768x16"

RETVAL=0

umask 077

start() {
    echo -n $"Starting Xvfb: "
    # run the server in the background so the script can return
    $XVFB &
    RETVAL=$?
    echo_success
    echo
    [ $RETVAL = 0 ] && touch /var/lock/subsys/Xvfb
    return $RETVAL
}
stop() {
    echo -n $"Shutting down Xvfb: "
    killproc Xvfb
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && rm -f /var/lock/subsys/Xvfb
    return $RETVAL
}
restart() {
    stop
    start
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  restart|reload)
    restart
    ;;
  condrestart)
    [ -f /var/lock/subsys/Xvfb ] && restart || :
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|condrestart}"
    exit 1
esac

exit $RETVAL
8<--------------------

HTH,

Joe
Re: [R] cgi/servlets/httpd in R
[EMAIL PROTECTED] wrote:
> Your solution seems what I am looking for, actually. Questions:

FWIW, another possible option if your data is stored in an RDBMS is Postgres with PL/R; see:
http://www.joeconway.com/plr/

As of Postgres 7.4 you can preload and initialize libraries at postmaster start, which means that each forked database backend includes a fully initialized copy of libR.

If you are interested, there is some information available regarding how to use this with PHP to generate online charts here:
http://www.joeconway.com/oscon-pres-2003-1.pdf

HTH,

Joe
Re: [R] PL/R article
Douglas Bates wrote:
> Interesting article. Thanks for bringing it to our attention, Joe. There are a couple of quotes that I like, such as "It's simply amazing the things that you can learn when data is presented in a graphical format."
>
> It appears that the author is using <<- when he only needs <- in one function definition. There isn't anything peculiar about PL/R that would require <<-, is there?

No, nothing at all. The only time you might need <<- (and I'm admittedly no expert on R, so this may be inappropriate use), is when you want to create a variable in one PL/R function, and then access it from another PL/R function. For instance, when you want to prepare a query, and then execute it multiple times. By preparing the query, you save the time of parsing and planning each time you execute it. There is an example here (see pg.spi.execp):
http://www.joeconway.com/plr/doc/plr-spi-rsupport-funcs.html

In Robert's example he's calling pg.spi.exec directly, so there is no need for <<-.

Thanks for the comments!

Joe
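To make the distinction concrete, here is a small plain-R sketch (not PL/R; the variable and function names are invented) of how <- inside a function creates a local variable, while <<- modifies a variable in an enclosing environment so that a later call can see it:

```r
counter <- 0

# '<-' assigns to a new local variable; the outer 'counter' is untouched.
bump_local <- function() {
  counter <- counter + 1
  invisible(counter)
}

# '<<-' assigns to 'counter' in the enclosing environment,
# so the change is visible after the function returns.
bump_shared <- function() {
  counter <<- counter + 1
  invisible(counter)
}

bump_local()
after_local <- counter     # still 0

bump_shared()
after_shared <- counter    # now 1
```

This is the same mechanism one would rely on to keep a prepared query plan around between function calls.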
Re: [R] running R from PHP
[EMAIL PROTECTED] wrote:
> Both run from terminals and png.R will run without a normal X server if Xvfb is running. Neither runs under PHP, though (when invoked as R --no-save < xxx.R). They yield the following errors (with the R startup banner deleted for compactness):
>
>     could not open PNG file `g.png'
>
>     cannot open `pdf' file argument `/usr/pkg/share/httpd/htdocs/test-R/g.pdf'

From the evidence above, I'd guess a file permission error. The web server probably runs as the user apache or something similar -- does that user have write permission to the place where you are trying to create the images? Try writing to /tmp/g.png and /tmp/g.pdf and see if the files get created.

HTH,

Joe
Re: [R] running R from PHP
[EMAIL PROTECTED] wrote:
> Is there a trick to creating a graphics device in the absence of an actual display in order to create an image in a file?

Look for Xvfb (X virtual frame buffer). Not sure what OS you are running, but on RH9 and Fedora, at least, there is a package called XFree86-Xvfb.

I use Xvfb with the following command:

    /usr/X11R6/bin/Xvfb :5 -screen 0 1024x768x16

More specifically, I wrote an init script and set Xvfb up to start as a service on boot. Then in R I use:

    x11(display=":5")

HTH,

Joe
Re: [R] Persistent state of R
michael watson (IAH-C) wrote:
> I am trying to make my cgi scripts quicker and it turns out that the bottle-neck is the loading of the libraries into R - for example loading up marrayPlots into R takes 10-20 seconds, which although not long, is long enough for users to imagine it is not working and start clicking reload.
>
> So I just wondered if anyone had a neat solution whereby I could somehow have the required libraries permanently loaded into R - perhaps I need a persistent R process with the libraries in memory that I can pipe commands to? Is this possible?

If you are processing data already stored in a database, you could use Postgres and PL/R. See:
http://www.joeconway.com/

Use Postgres 7.4 and preload PL/R for the best performance -- i.e., put the following line in $PGDATA/postgresql.conf:

    preload_libraries = '$libdir/plr:plr_init'

HTH,

Joe
Re: [R] R Production Performance
Zitan Broth wrote:
> Right but R is only preloaded once?

Yes. The plr shared library gets loaded and initialized (which in turn loads and initializes libR) only once -- on Postgres's postmaster startup. From that point forward, every new database connection gets a forked copy of the postmaster, and hence a preinitialized copy of the R interpreter.

Of course (as I think I mentioned already, but it is worth repeating) to get this performance enhancement you need to be using either Postgres 7.4 beta or a patched version of Postgres 7.3 (found at the URL on the original post), and have the following line in your postgresql.conf:

    preload_libraries = '$libdir/plr:plr_init'

This is getting a bit off topic, so if you have any more PL/R specific questions, please write me off list.

Joe
Re: [R] R Production Performance
Paul Meagher wrote:
> Below is the test I ran awhile back on invoking R as a system call. It might be faster if you had a c-extension to R but before I went that route I would want to know 1) roughly how fast Python and Perl are in returning results with their c-bindings/embedded stuff/dcom stuff, 2) whether R can be run as a daemon process so you don't incur start up costs, and 3) whether R can act as a math server in the sense that it will fork children or threads as multiple users establish sessions with it.

I agree it would be nice to have a better interface to R than via a system call. I'm doing something similar using PL/R (an R procedural language handler extension to Postgres that I wrote) with Postgres, R, and PHP.

In Postgres 7.4 (currently at beta3) or with a back-patched copy of 7.3, you can preload the R interpreter when the Postgres postmaster first starts. This means that essentially R is running as part of the Postgres daemon. Whenever a connection is made to the database, the forked process already has an initialized copy of R running inside it. The startup savings I see are similar to what you did (2.2 seconds versus 0.009 seconds):

    -- Function -- intentionally very simple:
    create or replace function echo(text) returns text as 'print(arg1)' language 'plr';

    Without preloading (first function call):
    regression=# explain analyze select echo('hello');
     Total runtime: 2195.35 msec

    Without preloading (second function call):
    regression=# explain analyze select echo('hello');
     Total runtime: 0.55 msec

    With preloading (first function call):
    regression=# explain analyze select echo('hello');
     Total runtime: 9.74 msec

    With preloading (second function call):
    regression=# explain analyze select echo('hello');
     Total runtime: 0.59 msec

In both cases the second (and subsequent) function calls are even faster because the PL/R function itself has been precompiled and cached.

I call the PL/R function from PHP to read my data directly from the database, process it, and generate whatever charts I need. Here's a very simple example:

The PL/R function:

    create type histtup as ( break float8, count int );

    create or replace function hist(text, text)
    returns setof histtup as '
      sql <- paste("select id_val from sample_numeric_data ",
                   "where ia_id=''", arg1, "''", sep="")
      rs <- pg.spi.exec(sql)
      if (!is.na(arg2))
      {
        x11(display=":5")
        jpeg(file=arg2, width = 480, height = 480, pointsize = 12, quality = 75)
        par(ask = FALSE, bg = "#F8F8F8")
        sql <- paste("select ia_attname as val from atts ",
                     "where ia_id=''", arg1, "''", sep="")
        attname <- pg.spi.exec(sql)
        h <- hist(rs[,1], col = "blue", main = paste("Histogram of", attname$val), xlab = attname$val);
        dev.off()
        system(paste("chmod 666 ", arg2, sep=""), intern = FALSE, ignore.stderr = TRUE)
      }
      else
        h <- hist(rs[,1], plot = FALSE);
      result = data.frame(breaks = h$breaks[1:length(h$breaks)-1], count = h$counts);
      return(result)
    ' language 'plr';

The PHP page:

    <HTML><BODY>
    <?PHP
    echo "
    <FORM ACTION='$PHP_SELF' METHOD='post' NAME='proto_form'>
    <TABLE WIDTH='482' CELLSPACING='0' CELLPADDING='1' BORDER='0'>
    <TR>
    <TD>Data</TD>
    <TD><INPUT TYPE='text' NAME='userdata' value='' size='80'></TD>
    </TR>
    <TR>
    <TD colspan='2'>
    <INPUT TYPE='submit' NAME='submit' value='Submit'>
    </TD>
    </TR>
    </TABLE>
    </FORM>
    ";
    if ($_POST['submit'] == "Submit")
    {
      $tmpfilename = 'charts/hist1.jpg';
      $conn = pg_connect("dbname=oscon user=postgres");
      $sql = "select * from hist('" . $_POST['userdata'] . "','" . "/tmp/" . $tmpfilename . "')";
      $rs = pg_query($conn,$sql);
      echo "<img src='$tmpfilename' border=0>";
    }
    ?>
    </BODY></HTML>

Hopefully this gives you some ideas about what is possible. If you're interested in PL/R, you can grab a copy (along with a patched 7.3.4 source RPM for Postgres) here:
http://www.joeconway.com/

HTH,

Joe
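The part of the function above that turns a histogram into rows can be tried in plain R without a database or a graphics device. The following is only a sketch under that assumption: the data are random numbers rather than query results, and head(x, -1) is used here as an alternative way of dropping the last break so that breaks and counts have the same length.

```r
# Random numbers standing in for the values returned by pg.spi.exec().
set.seed(1)
x <- rnorm(100)

# Compute the histogram without drawing anything.
h <- hist(x, plot = FALSE)

# There is one more break than there are bins, so drop the last break
# and pair each bin's lower edge with its count.
result <- data.frame(breaks = head(h$breaks, -1), count = h$counts)
result
```

Each row then corresponds to one bin, which is the shape a set-returning function needs.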
Re: [R] Statistical analysis of huge datasets.
[EMAIL PROTECTED] wrote:
> One possibility is to use a DBMS like MySQL or Postgresql, and RODBC to connect to these. Search the archives for previous postings about these, have a look at the first R-Newsletter and at Data Import-Export manual.

If you use PostgreSQL, you might want to try PL/R; see:
http://www.joeconway.com/plr/

It allows your R functions to run inside the backend database process, minimizing data I/O.

HTH,

Joe
Re: [R] Need help installing qtoolbox
Toby Popenfoose wrote:
> Is there another control chart library for R that I should be trying instead?

I looked around and could not find an R package for Shewhart control charts. I'll post the function I wrote for my own needs, but note that I am not a statistician, nor am I particularly experienced with R -- so use at your own risk ;-)

It only does X-bar, R, and geometric moving average. I used the traditional calculations (i.e. use constants based on sample group size, and range, to calculate UCL/LCL) instead of a more rigorous approach.

Below is the function and an example of how to use it (if anyone has suggestions for improvement, I'd love to hear them).

HTH,

Joe

8<--------------------
controlChart <- function(xdata, ssize, CLnumGroups = 0)
{
  if (!is.vector(xdata))
    stop("Data must be a vector")
  if (!is.numeric(xdata))
    stop("Data vector must be numeric")

  # drop trailing points that do not fill a complete sample group
  xdatalen <- length(xdata)
  xdataresid <- xdatalen %% ssize
  newxdatalen <- xdatalen - xdataresid
  if (xdataresid != 0)
    xdata <- xdata[1:newxdatalen]

  if (ssize < 1 | ssize > 10)
  {
    stop("Sample size must be in the range of 1 to 10")
  }
  else if (ssize > 1 & ssize < 11)
  {
    # Xbar/R factors
    ng <- c(2:10)
    D3 <- c(0,0,0,0,0,0.08,0.14,0.18,0.22)
    D4 <- c(3.27,2.57,2.28,2.11,2.00,1.92,1.86,1.82,1.78)
    A2 <- c(1.88,1.02,0.73,0.58,0.48,0.42,0.37,0.34,0.31)
    d2 <- c(1.13,1.69,2.06,2.33,2.53,2.70,2.85,2.97,3.08)
    v <- data.frame(ng, D3, D4, A2, d2)

    # row of the factor table that matches this sample size
    idx <- match(ssize, v$ng)

    # put into sample groups
    m <- matrix(xdata, ncol = ssize, byrow = TRUE)

    # number of groups
    numgroups <- nrow(m)

    # Adjust number of points used to calculate control limits.
    if (numgroups < CLnumGroups | CLnumGroups == 0)
      CLnumGroups = numgroups

    # range for each group
    r <- apply(m, 1, range)
    r <- r[2,] - r[1,]

    # Rbar
    rb <- mean(r[1:CLnumGroups])
    rb <- rep(rb, numgroups)

    # R UCL and LCL
    rucl <- v$D4[idx] * rb
    rlcl <- v$D3[idx] * rb

    # Xbar
    xb <- apply(m, 1, mean)

    # Xbarbar
    xbb <- mean(xb[1:numgroups])
    xbb <- rep(xbb, numgroups)

    # X UCL and LCL
    xucl <- xbb + (v$A2[idx] * rb)
    xlcl <- xbb - (v$A2[idx] * rb)
  }
  else # sample size is 1
  {
    m <- xdata

    # number of groups
    numgroups <- length(m)

    # Adjust number of points used to calculate control limits.
    if (numgroups < CLnumGroups | CLnumGroups == 0)
      CLnumGroups = numgroups

    # set range for each group to 0
    r <- rep(0, numgroups)

    # Rbar
    rb <- rep(0, numgroups)

    # R UCL and LCL
    rucl <- rep(0, numgroups)
    rlcl <- rep(0, numgroups)

    # Xbar is a copy of the individual data points
    xb <- m

    # Xbarbar is mean over the data
    xbb <- mean(xb[1:CLnumGroups])
    xbb <- rep(xbb, numgroups)

    # standard deviation over the data
    xsd <- sd(xb[1:CLnumGroups])

    # X UCL and LCL
    xucl <- xbb + 3 * xsd
    xlcl <- xbb - 3 * xsd
  }

  # geometric moving average
  if (numgroups > 1)
  {
    rg <- 0.25
    gma = c(xb[1])
    for(i in 2:numgroups)
      gma[i] = (rg * xb[i]) + ((1 - rg) * gma[i - 1])
  }
  else
  {
    gma <- rep(0, numgroups)
  }

  # create a single dataframe with all the plot data
  controlChartSummary <- data.frame(1:numgroups, xb, xbb, xucl, xlcl, r, rb, rucl, rlcl, gma)

  return(controlChartSummary)
}

# sample data
xdata <- 12 + 4 * rnorm(90)

# sample size
ssize <- 3

# get control chart data
cc <- controlChart(xdata, ssize)

# get number of sample groups
numgroups <- length(cc$xb)

# X Bar chart
plotxrange <- range(c(1:numgroups))
plotyrange <- range(cc$xb, cc$xucl, cc$xlcl)
plotyrange[1] <- plotyrange[1] - (plotyrange[2] - plotyrange[1]) * 0.1
plotyrange[2] <- plotyrange[2] + (plotyrange[2] - plotyrange[1]) * 0.1
plot(c(1:numgroups), xlim = plotxrange, ylim = plotyrange, cc$xb, type = "b", lty = 1)
lines(c(1:numgroups), cc$xbb, lty = 1)
lines(c(1:numgroups), cc$xucl, lty = 1)
lines(c(1:numgroups), cc$xlcl, lty = 1)

# R chart
plotxrange <- range(c(1:numgroups))
plotyrange <- range(cc$r, cc$rucl, cc$rlcl)
plotyrange[1] <- plotyrange[1] - (plotyrange[2] - plotyrange[1]) * 0.1
plotyrange[2] <- plotyrange[2] + (plotyrange[2] - plotyrange[1]) * 0.1
plot(c(1:numgroups), xlim = plotxrange, ylim = plotyrange, cc$r, type = "b", lty = 1)
lines(c(1:numgroups), cc$rb, lty = 1)
lines(c(1:numgroups), cc$rucl, lty = 1)
lines(c(1:numgroups), cc$rlcl, lty = 1)

# Geometric Moving Average chart
plotxrange <- range(c(1:numgroups))
plotyrange <- range(cc$gma)
plotyrange[1] <- plotyrange[1] - (plotyrange[2] - plotyrange[1]) * 0.1
plotyrange[2] <- plotyrange[2] + (plotyrange[2] - plotyrange[1]) * 0.1
plot(c(1:numgroups), xlim = plotxrange, ylim =
Re: [R] PL/R - R procedural language handler for PostgreSQL
Hisaji Ono wrote:
> This can be built on Win32?

Not presently (well, maybe under cygwin, but I haven't yet tried). There is a good chance that PostgreSQL will have a native win32 port when version 7.4 comes out. If so, I'll make sure that PL/R will support it.

Joe
Re: [R] PL/R - R procedural language handler for PostgreSQL
[EMAIL PROTECTED] wrote:
> But R does not build under Cygwin (last time I looked and I would be surprised if it would without a lot of tinkering), and the Windows port of R does not have libR.so but a different (and much older) mechanism using R.dll.

Hmmm. I neglected to think about that angle :-(

Is there a desire to get R to build under Cygwin, or is it preferable to put any effort into the Windows port?

Joe