Re: [GENERAL] PL/R etc.

Mark Morgan Lloyd Fri, 10 May 2013 13:23:33 -0700

Merlin Moncure wrote:

On Fri, May 10, 2013 at 2:32 PM, Mark Morgan Lloyd
<markmll.pgsql-gene...@telemetry.co.uk> wrote:

Merlin Moncure wrote:

On Fri, May 10, 2013 at 4:41 AM, Mark Morgan Lloyd
<markmll.pgsql-gene...@telemetry.co.uk> wrote:

I don't know whether anybody active on the list has R (and in particular
PL/R) experience, but just in case... :-)


i)   Something like APL can operate on an array with minimal regard for
index order, i.e. operations across the array are as easily-expressed and
as efficient as operations down the array. Does this apply to PL/R?

ii)  Things like OpenOffice can be very inefficient if operating over a
table comprising a non-trivial number of rows. Does PL/R offer a
significant improvement, e.g. by using a cursor rather than trying to read
an entire resultset into memory?


pl/r (via R) is very terse and expressive.  it will probably meet or beat
any performance expectations you have coming from openoffice.   that
said, it's definitely a memory bound language; typically problem
solving involves stuffing data into huge data frames which then pass
to the high level problem solving functions like glm.

you have full access to sql within the pl/r function, so nothing is
keeping you from paging data into the frame via a cursor, but that
only helps so much.
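
To illustrate the paging idea, here is a minimal sketch in Python rather than PL/R. It assumes an in-memory SQLite database as a stand-in for a PostgreSQL connection, and the table name `readings` and the helper `column_sums_paged` are hypothetical. It only shows the general shape of reading a result set in fixed-size batches through a cursor:

```python
import sqlite3

def column_sums_paged(conn, query, batch_size=2):
    """Sum each column of a query result, fetching batch_size rows at a time."""
    cur = conn.cursor()
    cur.execute(query)
    totals = None
    while True:
        # Only one batch of rows is held in memory at any moment.
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            if totals is None:
                totals = list(row)
            else:
                totals = [t + v for t, v in zip(totals, row)]
    return totals

# Hypothetical table; in-memory SQLite stands in for a PostgreSQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)],
)
print(column_sums_paged(conn, "SELECT a, b, c, d FROM readings"))
```

A running total like this can be built one batch at a time. A model fit such as glm generally wants the whole data set at once, which is presumably why paging "only helps so much".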

a lot depends on the specific problem you solve of course.


Thanks Merlin and Joe. As an occasional APL user "terse and oppressive"
doesn't really bother me :-)

As a particular example of the sort of thing I'm thinking, using "pure" SQL
the operation of summing the columns in each row and summing the rows in
each column are very different.

In contrast, in APL if I have an array

        B
1  2  3  4
5  6  7  8
9 10 11 12

I can perform a reduction operation using + over whichever axis I specify:

        +/[1]B
15 18 21 24
        +/[2]B
10 26 42

or even by default

        +/B
10 26 42

Does PL/R provide that sort of abstraction in a uniform fashion?
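
For comparison, the kind of uniform reduction being asked about can be sketched in plain Python (not PL/R). The helper `reduce_axis` is hypothetical, and its axis numbering is chosen to mirror the APL examples above:

```python
from functools import reduce
import operator

def reduce_axis(op, matrix, axis):
    """Reduce a rectangular list-of-rows with a binary operator.

    axis=1 collapses the rows (one result per column), like APL +/[1]B.
    axis=2 collapses the columns (one result per row), like APL +/[2]B.
    """
    if axis == 1:
        lines = zip(*matrix)  # transpose, so each line is a column
    elif axis == 2:
        lines = matrix        # each line is a row
    else:
        raise ValueError("axis must be 1 or 2")
    return [reduce(op, line) for line in lines]

B = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

print(reduce_axis(operator.add, B, 1))  # [15, 18, 21, 24]
print(reduce_axis(operator.add, B, 2))  # [10, 26, 42]
```

In R itself the nearest built-ins are colSums(B) and rowSums(B), or apply(B, 2, sum) and apply(B, 1, sum) for an arbitrary function. Note that apply's margin names the dimension that is kept, so its numbering runs the opposite way to the APL axis.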


certainly (for example see here:
http://stackoverflow.com/questions/13352180/sum-different-columns-in-a-data-frame)
-- getting good at R can take some time but it's worth it.   R is
"hot" right now with all the buzz around big data lately.  The main
challenge actually is the language is so rich it can be difficult to
zero in on the precise behaviors you need.   Also, the documentation
is all over the place.

pl/r plays in nicely because with some thought you can marry the R
analysis functions directly to the query in terms of both inputs and
outputs -- basically very, very sweet syntax sugar.   It's a little
capricious though (and be advised: Joe has put up some very important
and necessary fixes quite recently) so usually I work out the R code
in the R console first before putting it in the database.

[Peruse] Thanks, I think I get the general idea. I'm aware of the
significance of R, and in particular that it's attracting attention due
to the undesirability of hiding functionality in spreadsheets where
these usurped APL for certain types of operation.


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
