Below is the test I ran a while back on invoking R as a system call. It might be faster with a C extension to R, but before going that route I would want to know 1) roughly how fast Python and Perl are in returning results with their C bindings/embedded stuff/DCOM stuff, 2) whether R can be run as a daemon process so you don't incur startup costs, and 3) whether R can act as a math server in the sense that it will fork children or threads as multiple users establish sessions with it. I agree it would be nice to have a better interface to R than via a system call.
I'm doing something similar using PL/R (an R procedural language handler extension to Postgres that I wrote) with Postgres, R, and PHP. In Postgres 7.4 (currently at beta3) or with a back-patched copy of 7.3, you can preload the R interpreter when the Postgres postmaster first starts. This means that essentially R is running as part of the Postgres daemon. Whenever a connection is made to the database, the forked process already has an initialized copy of R running inside it. The startup savings I see are similar to what you did (2.2 seconds versus 0.009 seconds):
------------------------------------------------------------------
Function -- intentionally very simple:
--------------------------------------
create or replace function echo(text) returns text as 'print(arg1)' language 'plr';

Without preloading (first function call):
-----------------------------------------
regression=# explain analyze select echo('hello');
Total runtime: 2195.35 msec

Without preloading (second function call):
-----------------------------------------
regression=# explain analyze select echo('hello');
Total runtime: 0.55 msec

With preloading (first function call):
-----------------------------------------
regression=# explain analyze select echo('hello');
Total runtime: 9.74 msec

With preloading (second function call):
-----------------------------------------
regression=# explain analyze select echo('hello');
Total runtime: 0.59 msec
------------------------------------------------------------------

In both cases the second (and subsequent) function calls are even faster because the PL/R function itself has been precompiled and cached.
I call the PL/R function from PHP to read my data directly from the database, process it, and generate whatever charts I need. Here's a very simple example:
The PL/R function:
------------------------------------------------------------------
create type histtup as ( break float8, count int );

create or replace function hist(text, text)
returns setof histtup as '
  # fetch the data for the requested attribute id
  sql <- paste("select id_val from sample_numeric_data ",
               "where ia_id=''", arg1, "''", sep="")
  rs <- pg.spi.exec(sql)

  if (!is.na(arg2)) {
    # a file name was given, so draw the histogram into a JPEG
    x11(display=":5")
    jpeg(file=arg2, width = 480, height = 480,
         pointsize = 12, quality = 75)
    par(ask = FALSE, bg = "#F8F8F8")
    sql <- paste("select ia_attname as val from atts ",
                 "where ia_id=''", arg1, "''", sep="")
    attname <- pg.spi.exec(sql)
    h <- hist(rs[,1], col = "blue",
              main = paste("Histogram of", attname$val),
              xlab = attname$val)
    dev.off()
    # open up the file permissions so the web server can read the image
    system(paste("chmod 666 ", arg2, sep=""),
           intern = FALSE, ignore.stderr = TRUE)
  } else {
    # no file name, so only compute the bins
    h <- hist(rs[,1], plot = FALSE)
  }

  # one row per bin: lower break and count (the last break is dropped)
  result <- data.frame(breaks = h$breaks[-length(h$breaks)],
                       count = h$counts)
  return(result)
' language 'plr';
------------------------------------------------------------------
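To see why the function drops the last break before building the result, here is a minimal standalone sketch in plain R, outside PL/R. The sample vector and the explicit breaks are made up for illustration; in the function above the data comes from pg.spi.exec() and R picks the breaks itself.

```r
# Made-up sample data; in the PL/R function this comes from pg.spi.exec().
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)

# Explicit breaks (an assumption for this sketch) so the bins are easy to follow.
h <- hist(x, breaks = 0:5, plot = FALSE)

# h$breaks holds the bin edges, so it has one more element than h$counts.
# Dropping the last edge pairs each count with the lower edge of its bin.
result <- data.frame(breaks = h$breaks[-length(h$breaks)],
                     count  = h$counts)
print(result)
```

Each row of the data frame then maps onto one histtup row (break, count) returned by the set-returning function.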
The PHP page:
------------------------------------------------------------------
<HTML><BODY>
<?PHP
echo "
<FORM ACTION='$PHP_SELF' METHOD='post' NAME='proto_form'>
<TABLE WIDTH='482' CELLSPACING='0' CELLPADDING='1' BORDER='0'>
<TR>
<TD>Data</TD>
<TD><INPUT TYPE='text' NAME='userdata' value='' size='80'></TD>
</TR>
<TR>
<TD colspan='2'>
<INPUT TYPE='submit' NAME='submit' value='Submit'>
</TD>
</TR>
</TABLE>
</FORM>
";

if ($_POST['submit'] == "Submit")
{
$tmpfilename = 'charts/hist1.jpg';
$conn = pg_connect("dbname=oscon user=postgres");
// escape the user input before splicing it into the SQL string
$sql = "select * from hist('" . pg_escape_string($conn, $_POST['userdata']) . "','" .
"/tmp/" . $tmpfilename . "')";
$rs = pg_query($conn,$sql);
echo "<img src='$tmpfilename' border=0>";
}
?>
</BODY></HTML>
------------------------------------------------------------------

Hopefully this gives you some ideas about what is possible. If you're interested in PL/R, you can grab a copy (along with a patched 7.3.4 source RPM for Postgres) here: http://www.joeconway.com/

HTH,
Joe
