Re: [R] parsing strings between [ ] in columns

2010-02-18 Thread Barry Rowlingson
On Thu, Feb 18, 2010 at 8:29 AM, milton ruser milton.ru...@gmail.com wrote:
 Dear all,

 I have a data.frame with a column like the x shown below
 myDF<-data.frame(cbind(x=c("[[1, 0, 0], [0, 1]]",
   "[[1, 1, 0], [0, 1]]","[[1, 0, 0], [1, 1]]",
   "[[0, 0, 1], [0, 1]]")))
 myDF

 After identifying the groups I would like
 to identify the subgroups:
  A1 A2 A3  B1 B2
 1 1  0  0   0  1
 2 1  1  0   0  1
 3 1  0  0   1  1
 4 0  0  1   0  1

Maybe it's not too early in the morning. Given your myDF above:

# how is the first one structured?
# (fromJSON needs a JSON package loaded first, e.g. library(rjson))
 lets = unlist(lapply(fromJSON(as.character(myDF[1,])),length))

# 3 then 2:
 lets
[1] 3 2

# make the letters (fails for more than 26 groups)
 rep(LETTERS[1:length(lets)],lets)
[1] "A" "A" "A" "B" "B"

# handy sequence function makes the numbers:
 sequence(lets)
[1] 1 2 3 1 2

# splat them together:
 paste(rep(LETTERS[1:length(lets)],lets),sequence(lets),sep="")
[1] "A1" "A2" "A3" "B1" "B2"

 then you can just make this the column names of your new dataframe.
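
 For the record, here is a rough end-to-end sketch of the steps above. It
is an assumption-laden variant, not the code from this thread: it uses only
base R string handling instead of fromJSON, and the names 'parse_group',
'groups', 'vals' and 'out' are made up for illustration.

```r
x <- c("[[1, 0, 0], [0, 1]]", "[[1, 1, 0], [0, 1]]",
       "[[1, 0, 0], [1, 1]]", "[[0, 0, 1], [0, 1]]")

# pull out each inner "[...]" group from every string
groups <- regmatches(x, gregexpr("\\[[^\\[\\]]*\\]", x, perl = TRUE))

# turn one "[1, 0, 0]" group into a numeric vector
parse_group <- function(g) {
  as.numeric(strsplit(gsub("\\[|\\]", "", g), ",")[[1]])
}

# group sizes, taken from the first row (3 then 2)
lets <- vapply(groups[[1]], function(g) length(parse_group(g)),
               integer(1), USE.NAMES = FALSE)

# column names: A1 A2 A3 B1 B2
nms <- paste(rep(LETTERS[seq_along(lets)], lets), sequence(lets), sep = "")

# one row of numbers per input string
vals <- t(sapply(groups, function(gs) unlist(lapply(gs, parse_group))))
out <- as.data.frame(vals)
names(out) <- nms
out
```

 This assumes every row has the same group structure as the first one.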

 I think the morning coffee has got through the blood-brain barrier now.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rearranging data

2010-02-18 Thread Barry Rowlingson
On Thu, Feb 18, 2010 at 9:21 AM, Anna Carter anna_carte...@yahoo.com wrote:

 My objective is to rearrange filtered1 as

 date     corp1   corp2   corp11   corp17
 17-Feb   65      95      30       16
 16-Feb   70      135
 15-Feb   69      140
 14-Feb   89
 13-Feb   88

 #(The above figures represent the corp-wise and date-wise rates for 
 investment_id = 1.)

 Please guide me how the filtered1 can be rearranged?

 Ooh, so close! If you'd said 'reshaped' you'd be half way there:

 filtered1
   corp_id   date investment_id stock_rate
1    corp1 17-Feb             1         65
2    corp1 16-Feb             1         70
3    corp1 15-Feb             1         69
4    corp1 14-Feb             1         89
5    corp1 13-Feb             1         88
6    corp2 17-Feb             1         95
7    corp2 16-Feb             1        135
8    corp2 15-Feb             1        140
13  corp11 17-Feb             1         30
14  corp17 17-Feb             1         16

 cast(filtered1, date~corp_id,value="stock_rate")
    date corp1 corp11 corp17 corp2
1 13-Feb    88     NA     NA    NA
2 14-Feb    89     NA     NA    NA
3 15-Feb    69     NA     NA   140
4 16-Feb    70     NA     NA   135
5 17-Feb    65     30     16    95

The ordering differs from what you wanted but it's trivial to rearrange
it if you need. Also you get NAs where there's no value in that cell.

Are you doing this purely for presentation? Because if not then it might
be easier to keep the data in the long format and work on it like that
with the various 'apply' type functions. See also the 'plyr' package.
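
If you don't want an extra package, something like the following base R
sketch gets the same wide layout with tapply. It is an alternative under
assumptions, not the cast() call itself: the data frame is retyped by hand
from the listing, and 'wide' is just an illustrative name.

```r
filtered1 <- data.frame(
  corp_id = c(rep("corp1", 5), rep("corp2", 3), "corp11", "corp17"),
  date = c("17-Feb", "16-Feb", "15-Feb", "14-Feb", "13-Feb",
           "17-Feb", "16-Feb", "15-Feb", "17-Feb", "17-Feb"),
  investment_id = 1,
  stock_rate = c(65, 70, 69, 89, 88, 95, 135, 140, 30, 16),
  stringsAsFactors = FALSE
)

# one cell per (date, corp_id) pair; empty cells come out as NA
wide <- with(filtered1,
             tapply(stock_rate, list(date = date, corp_id = corp_id), sum))

# put rows and columns in the order asked for
wide <- wide[rev(rownames(wide)), c("corp1", "corp2", "corp11", "corp17")]
wide
```

The result is a matrix rather than a data frame, which is usually fine for
presentation.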

Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman



Re: [R] Rearranging data

2010-02-18 Thread Barry Rowlingson
On Thu, Feb 18, 2010 at 12:14 PM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:

  Ooh, so close! If you'd said 'reshaped' you'd be half way there:

 .. because that is all in the reshape package. So do

  library(reshape)

 first.

Oops



Re: [R] Use of R in clinical trials

2010-02-18 Thread Barry Rowlingson
On Thu, Feb 18, 2010 at 12:12 PM, John Sorkin
jsor...@grecc.umaryland.edu wrote:
 It is easy to devolve into visceral response mode, lose objectivity and slip 
 into intolerance. R, S, S-Plus, SAS, PASW (nee SPSS), STATA, are all tools. 
 Each has strengths and weaknesses. No one is inherently better, or worse than 
 the other.

Sometimes it seems the name of the tool is more important. SPSS became
PASW for a brief moment until someone at IBM perhaps
recognised the enormous value of just the name and decided they
had better stick with it, but prefix everything with 'IBM'.
Corporate ego trip anyone?

http://spss.com/software/statistics/

Barry



Re: [R] reading quattro pro spreadsheet .qpw into R

2010-02-16 Thread Barry Rowlingson
On Tue, Feb 16, 2010 at 4:12 PM, stephen sefick ssef...@gmail.com wrote:
 I have many quattro pro spreadsheets and no quattro pro.  Is there a
 way to access the data using R, or any other solution that anyone can
 think of?

 OpenOffice claims it can read Quattro Pro 6.0 'wb2' files, but maybe
they are different to .qpw files. MS Excel claims some Quattro Pro
readability - I've just read something about wb1 files. Maybe Gnumeric
can read them? What have you tried?

 Perhaps if you put a representative file somewhere we can download and try?

Barry



Re: [R] executable R script under xp (to avoid migration toward Matlab or C++)

2010-02-15 Thread Barry Rowlingson
On Mon, Feb 15, 2010 at 10:37 AM, PtitBleu ptit_b...@yahoo.fr wrote:

 Hello,

 I discovered R two years ago and thanks to the R-community I managed to
 write some scripts to analyze my data stored in mysql databases.
 The only problem is that I am the only one using R in the lab. Colleagues
 mainly use Matlab (but not with mysql, only with text files) but regularly
 come to me to get data treated with R-scripts !!!.

 To allow the use of my scripts by other people, my boss asked me to make
 executables (.exe) with my scripts or to pay someone (I'm only end-user and
 not a computer scientist) to translate them into matlab language (and then
 into .exe) or into C++ meaning abandoning R. And I don't want to.

 So is it possible to make executables from R scripts ? It seems it is not
 the case but I hope I missed a way to do it.

If your boss isn't dumb enough to insist on binary executables and
just wants some processing programs that can be delivered to the
matlab guys, then you can create standalone R scripts - easy with
littler:

http://dirk.eddelbuettel.com/code/littler.html

You still need R and any required packages installed on the target
machine, but the user doesn't need to know R at all. Just run the R
script, and feed it whatever arguments it needs. You can even build
simple GUIs with the tcltk package (or complex GUIs with RGtk?).

Your colleagues can also look at the R scripts and modify them if they
don't quite do what they want them to do. Then these guys and girls
will be learning R. You can't do that with a binary .exe file. Big
wins all round.
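
As a rough illustration of a standalone script that takes arguments, here
is a minimal sketch. The file name 'summarise.R', the function 'main' and
its two arguments are hypothetical; the only real pieces are Rscript and
commandArgs(), and R still has to be installed on the target machine.

```r
#!/usr/bin/env Rscript
# summarise.R (hypothetical name): print the mean of one column of a CSV file.
# Run as:  Rscript summarise.R data.csv columnname

main <- function(args) {
  if (length(args) != 2) {
    stop("usage: Rscript summarise.R <file.csv> <column>")
  }
  dat <- read.csv(args[1])
  mean(dat[[args[2]]], na.rm = TRUE)
}

# only do the work when arguments were actually supplied
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) {
  print(main(args))
}
```

Colleagues then only need to know the command line, not R.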

Barry




Re: [R] executable R script under xp (to avoid migration toward Matlab or C++)

2010-02-15 Thread Barry Rowlingson
On Mon, Feb 15, 2010 at 12:03 PM, PtitBleu ptit_b...@yahoo.fr wrote:

 Thanks to all for your advice.

 littler is not for xp, is it?

Maybe, maybe not, but Rscript definitely is:

http://blog.revolution-computing.com/2009/01/using-r-as-a-scripting-language-with-rscript.html

You didn't actually mention your OS in your first post!

Barry


Re: [R] Which method is called in command like class(x)='something'?

2010-02-15 Thread Barry Rowlingson
On Mon, Feb 15, 2010 at 5:07 PM, blue sky bluesky...@gmail.com wrote:
 x=3
 `class<-`(x,'something')#this command prints
 [1] 3
 attr(,"class")
 [1] "something"
 x=3
 class(x)='something'#this command doesn't print anything

 The first of the above two commands prints the content of 'x' but the
 second doesn't, although both of them set the argument 'x'. I'm
 wondering which method is called in the latter one.


The same thing is called. It's not a question of the method doing the
printing - the printing is done by the R interpreter when it finishes
evaluating your line.

 Normally evaluations (eg sqrt(2)) print out but if you wrap them in
'invisible' they don't - try invisible(sqrt(2)).

All assignments have their invisibility set when run interactively:

  x=1:10
  dim(x)=c(2,5)
  x

 see how nothing is printed, either at the 'x=1:10' or the
'dim(x)=...'? Calling the assignment function directly returns a value
without the invisibility cloak, which is what you want when typing
'sqrt(2)':

 x=1:10
 `dim<-`(x,c(2,5))
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

 I suspect this behaviour is specified somewhere in the R parser. But
be honest, why does it matter? You would quickly get irritated by R
telling you what you just did every time you typed an assignment:

  x=1:10
  [1]  1  2  3  4  5  6  7  8  9 10
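
 If you want to see the visibility flag for yourself, base R's
withVisible() reports it. A small sketch (the variable names are just for
illustration):

```r
x <- 1:10

# an ordinary assignment is evaluated invisibly
v_assign <- withVisible(x <- 1:10)$visible

# calling the replacement function directly returns a visible value
v_direct <- withVisible(`dim<-`(x, c(2, 5)))$visible

# invisible() switches printing off, a plain call leaves it on
v_invis <- withVisible(invisible(sqrt(2)))$visible
v_plain <- withVisible(sqrt(2))$visible

c(assign = v_assign, direct = v_direct, invisible = v_invis, plain = v_plain)
```

 The first and third come back FALSE, the other two TRUE, which matches
what you see at the prompt.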

Barry





Re: [R] Creating a new Access database with R

2010-02-11 Thread Barry Rowlingson
On Thu, Feb 11, 2010 at 6:23 PM, Dieter Menne
dieter.me...@menne-biomed.de wrote:


 Paul- wrote:

 As a workaround, you can keep an empty mdb file on your filesystem. When
 you need a new database, you can copy and rename the empty file.



 Creating a new database is not part of (R)ODBC because there are too many
 differences between implementations. You could use some RDCOM method to do
 that, though.

 Create an empty mdb file in the usual way, then read it into R using
a binary file connection, save it as an R object. To create, spew the
raw bytes back out to another connection. Something like:

mdb = readBin("test1.mdb",what="raw",n=file.info("test1.mdb")$size)
 length(mdb)
[1] 65536

Ooh, 64kbytes. If that's too big, run length encoding will shrink it somewhat:

 rle(mdb)
Run Length Encoding
  lengths: int [1:3953] 1 1 2 1 1 1 1 1 1 1 ...
  values : raw [1:3953] 00 01 00 53 ...

 However, I just created two via the MS ODBC dialog, and they aren't
identical. They are wildly different. I suspect it's just creation
dates and times in the system tables. You should be okay.
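
 A sketch of the whole round trip, under assumptions: the "template" here
is a stand-in file of arbitrary bytes so the example runs anywhere, and the
file names are temporary ones invented for illustration. With a real empty
.mdb you would read that instead.

```r
# stand-in for an empty .mdb created in the usual way
template <- tempfile(fileext = ".mdb")
writeBin(as.raw(rep(0:255, 4)), template)

# read the whole file as raw bytes and keep it as an R object
mdb <- readBin(template, what = "raw", n = file.info(template)$size)

# "create" a new database by spewing the bytes back out
new_db <- tempfile(fileext = ".mdb")
writeBin(mdb, new_db)

# the copy is byte-for-byte the same as the template
same <- identical(mdb, readBin(new_db, what = "raw",
                               n = file.info(new_db)$size))
same
```

 You could save(mdb, file=...) the raw vector alongside your code so the
template travels with it.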

Barry



Re: [R] difftime result for days not an integer?

2010-02-11 Thread Barry Rowlingson
On Thu, Feb 11, 2010 at 8:40 PM, Jonathan jonsle...@gmail.com wrote:
 Anybody have an idea why I would get a non-integer value for the
 number of days here?

 difftime('2004-08-05','2001-01-03',units='days')
 Time difference of 1309.958 days


 Would you just round off?

It's one hour short of an integer number of days:

 difftime('2004-08-05','2001-01-03',units='hours')/24
Time difference of 1309.958 hours
 (1+difftime('2004-08-05','2001-01-03',units='hours'))/24
Time difference of 1310 hours

 Why do you think that might be?

 Here's another hint: a few years ago my birthday only lasted 23
hours. I never got that hour back.
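
 The hint is pointing at daylight saving time. A sketch that makes the
effect explicit, assuming your system knows the "Europe/London" time zone
(any zone with summer time would do; the variable names are illustrative):

```r
# in a zone with summer time, the August midnight is one hour "early"
a <- as.POSIXct("2001-01-03", tz = "Europe/London")
b <- as.POSIXct("2004-08-05", tz = "Europe/London")
d_local <- difftime(b, a, units = "days")

# in UTC there is no clock change, so the difference is whole days
d_utc <- difftime(as.POSIXct("2004-08-05", tz = "UTC"),
                  as.POSIXct("2001-01-03", tz = "UTC"),
                  units = "days")

# Date objects ignore clock time altogether
d_date <- as.numeric(as.Date("2004-08-05") - as.Date("2001-01-03"))

d_local
d_utc
d_date
```

 So rather than rounding, work in UTC or with Date objects if you only care
about calendar days.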

Barry



Re: [R] Using R to format a file using a server (PDB to PQR file)

2010-02-10 Thread Barry Rowlingson
On Wed, Feb 10, 2010 at 6:16 AM, Amitoj S. Chopra amit...@gmail.com wrote:

 I am trying to write a program that uses R and takes a pdb file, and converts
 it to a pqr file. This task is simple generally, using the website,
 http://pdb2pqr-1.wustl.edu/pdb2pqr/. How do you use R to input a pdb file
 (that is on hand) into the upload pdb file input, and run the website and
 give the return file to be a pqr file. Thanks for your help.

 You can use RCurl to upload a file to an HTML form POST URL using
postForm (and see also fileUpload).

 Then you need to poll the returned URL to see when the conversion is
finished, so sleep for 30 seconds or so, and use getURL until the
returned HTML looks like the completion page. Then find the link to
the output file and use getURL to download it.
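
 The polling part might look something like this sketch. It is deliberately
generic: 'poll_until' and its arguments are made-up names, and the fetch
function is passed in so you can plug in RCurl's getURL. The status URL and
the text that marks a finished job are assumptions you would have to read
off the real site.

```r
# keep fetching a page until is_done() says the job has finished
poll_until <- function(fetch, is_done, interval = 30, max_tries = 20) {
  for (i in seq_len(max_tries)) {
    page <- fetch()
    if (is_done(page)) {
      return(page)
    }
    Sys.sleep(interval)  # wait before asking again
  }
  stop("gave up waiting for the job to finish")
}

# Hypothetical use with RCurl (status_url and "complete" are placeholders):
# page <- poll_until(function() RCurl::getURL(status_url),
#                    function(p) grepl("complete", p))
```

 Passing the fetcher in as a function also means you can test the loop
without hammering the server.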

RCurl is on CRAN, and docs are here - lots of examples:

 http://www.omegahat.org/RCurl/

Warning: this post contains small parts. Some assembly required.

Barry



Re: [R] transparent concentric circles

2010-02-09 Thread Barry Rowlingson
On Tue, Feb 9, 2010 at 2:20 PM, Karin Lagesen kar...@cbs.dtu.dk wrote:
 I have a data set which I would like to plot as a set of concentric
 circles. The data represent a count of the number of characteristics
 shared by various elements - an example would look like this:

 1 100
 2 75
 3 50
 4 25

 I.e. all four sets share 25 characteristics, three of them share 50
 characteristics, and so on.

 I would like to plot these as concentric circles, with the circle size
 preferentially being proportional to the size of the number of elements
 (this is not a must, however). I would also like the colors of the circles
 to become stronger/deeper as we progress to the innermost circle (which
 would be the one containing the number of characteristics shared by all
 four).

 Can somebody point me to what I can use to do this?

 help.search("circle")?

Have you tried any of those? Specifically:

plotrix::draw.circle    Draw a circle.
shape::filledcircle     adds colored circle to a plot
grid::grid.circle       Draw a Circle

 - assuming you have those packages installed...
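
 If none of those packages is to hand, base graphics' symbols() will also
do it. A minimal sketch, with assumptions: areas (not radii) are made
proportional to the counts, grey levels stand in for "stronger" colours,
and the plot goes to a temporary PDF so it runs without a screen.

```r
counts <- c(100, 75, 50, 25)   # shared by 1, 2, 3 and all 4 sets

# radius chosen so that circle area is proportional to the count
radii <- sqrt(counts / pi)

# light grey outside, darker towards the innermost circle
cols <- gray(seq(0.9, 0.3, length.out = length(counts)))

out <- tempfile(fileext = ".pdf")
pdf(out)
# circles are drawn in order, so the largest goes down first
symbols(rep(0, length(counts)), rep(0, length(counts)),
        circles = radii, inches = FALSE, bg = cols, fg = "black",
        asp = 1, xlab = "", ylab = "")
invisible(dev.off())
```

 Swap pdf() for your usual device to see it on screen.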

Barry



Re: [R] The KJV

2010-02-07 Thread Barry Rowlingson
On Sun, Feb 7, 2010 at 8:28 AM, Ted Harding
ted.hard...@manchester.ac.uk wrote:

 Delightful! And fascinating in the detail too.

  length(tt)
  # [1] 5078

 with slight changes like:

  barplot(rev(tt[1:50]),horiz=TRUE,las=1,cex.names=0.6,log="x")
  # ...
  barplot(rev(tt[101:150]),horiz=TRUE,las=1,cex.names=0.6,log="x")
  # ...

 and see the likes of

  tt["lord"]
  # lord
  # 1939

  tt["god"]
  # god
  # 822

  tt["men"]
  # men
  # 204

  tt["women"]
  # women
  #    26

 I'm now wondering how it matches up with Zipf's Law (or perhaps
 Fisher's logarithmic ... )

 Thanks, Ben!

 I'm wondering if someone is now going to write an R package to look
for 'bible codes':

http://en.wikipedia.org/wiki/Bible_code

 it's all in there:

http://www.biblecodewisdom.com/code/model-goodness-fit-test

Barry



Re: [R] convert R plots into annotated web-graphics

2010-02-07 Thread Barry Rowlingson
On Sun, Feb 7, 2010 at 2:35 PM, Rainer Tischler rainer_...@yahoo.de wrote:
 Dear all,

 I would like to make a large scatter plot created with R available as an 
 interactive web graphic, in combination with additional text-annotations for 
 each data point in the plot. The idea is to present the text-annotations in 
 an HTML-table and inter-link the data points in the plot with their 
 corresponding entries in the table, i.e. when clicking on a data point in the 
 plot, the corresponding entry in the table should be highlighted or centered 
 and vice-versa, when clicking on a table-entry, the corresponding point in 
 the plot should be highlighted.

 I have seen that CRAN contains various R-packages for SVG-based output of 
 interactive graphics (with hyperlinks and tool-tip annotations for each data 
 point); however, SVG is not supported by all browsers. Is anybody aware of 
 another solution for this problem (maybe based on image-maps and javascript)?
 If you have alternative ideas for interlinking tabular annotations with 
 plotted data points, I would appreciate any recommendation/suggestion.
 (I work with R 2.8.1 on different 32-bit PCs with both Linux and Windows 
 operating systems).


 My 'imagemap' package?

https://r-forge.r-project.org/projects/imagemap/

Barry



Re: [R] convert R plots into annotated web-graphics

2010-02-07 Thread Barry Rowlingson
On Sun, Feb 7, 2010 at 2:35 PM, Rainer Tischler rainer_...@yahoo.de wrote:

 If you have alternative ideas for interlinking tabular annotations with 
 plotted data points, I would appreciate any recommendation/suggestion.
 (I work with R 2.8.1 on different 32-bit PCs with both Linux and Windows 
 operating systems).

 As an alternative suggestion to my imagemap package, you could use a
javascript chart plotting library and just generate a data file and
the html from R. Maybe flot:

http://code.google.com/p/flot/

I find the R 'brew' package ideal for creating JS or HTML output files
from R objects.
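
For very simple cases you can get by without a templating package at all.
The sketch below writes a JavaScript data file using only base R; the
helper 'to_js_points', the variable name 'points' and the file name are
all made up, and the array-of-[x, y]-pairs layout is my assumption about
what the plotting library wants.

```r
# turn two numeric vectors into a JavaScript array of [x, y] pairs
to_js_points <- function(x, y) {
  paste0("[", paste(sprintf("[%s, %s]", x, y), collapse = ", "), "]")
}

js <- paste0("var points = ", to_js_points(c(1, 2, 3), c(10, 20, 15)), ";")

# write it somewhere the HTML page can load it from
out <- file.path(tempdir(), "points.js")
writeLines(js, out)
js
```

Anything more involved than this (loops, conditionals in the HTML) is where
a templating approach starts to pay off.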

 Warning: this answer contains small parts. Some assembly required.

Barry



Re: [R] reading csv files

2010-02-05 Thread Barry Rowlingson
On Fri, Feb 5, 2010 at 10:23 AM, analys...@hotmail.com
analys...@hotmail.com wrote:
 the csv files are downloaded from a database and it looks like some
 character fields contain the CR-LF sequence within them.

 This causes R to see a new record/row and the number of rows it sees
 is different (usually higher) from the number of rows actually
 extracted.

 Hard to tell without an example, but I just tried this in a file:

1,2,"this
is a test",99
2,3,"oneliner",45

and:

 read.table("test.csv",sep=",")
  V1 V2              V3 V4
1  1  2 this\nis a test 99
2  2  3        oneliner 45

seemed to work. But if your strings aren't quoted (hard to tell
without an example) then you might have to find another way. Hard to
tell without an example.
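
Here is that test as something you can paste in, a sketch assuming the
character fields are double-quoted (the temporary file name and the
variable 'd' are just for illustration):

```r
f <- tempfile(fileext = ".csv")

# three physical lines, but only two records: the first field spans a line break
writeLines(c('1,2,"this', 'is a test",99', '2,3,"oneliner",45'), f)

d <- read.table(f, sep = ",", stringsAsFactors = FALSE)
nrow(d)
d$V3
```

The quoted line break ends up inside the string rather than starting a new
row.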

Barry



Re: [R] how to plot single frames as a movie?

2010-02-04 Thread Barry Rowlingson
On Thu, Feb 4, 2010 at 10:58 AM, Eik Vettorazzi
e.vettora...@uke.uni-hamburg.de wrote:
 Hi Javier,
 have a look at the animation-package on CRAN  and
 http://animation.yihui.name/

I've had some fun making web-based animations using the mootools
javascript library. Instructions here:

http://www.maths.lancs.ac.uk/~rowlings/Chicas/DIY/

It's simply a case of making a bunch of plot0001.png files with
frame numbers and then editing a bit of javascript.
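
Making the numbered frames is the easy bit. A minimal sketch, assuming your
R build has PNG support; the curve being plotted, the frame count and the
temporary directory are placeholders:

```r
frame_dir <- tempdir()
n_frames <- 5

# plot0001.png, plot0002.png, ...
frames <- file.path(frame_dir, sprintf("plot%04d.png", seq_len(n_frames)))

if (capabilities("png")) {
  for (i in seq_len(n_frames)) {
    png(frames[i])
    plot(1:10, (1:10)^(i / 2), type = "l", main = paste("frame", i))
    dev.off()
  }
}
basename(frames)
```

Zero-padded names keep the frames in the right order when the javascript
(or anything else) sorts them alphabetically.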

Barry



Re: [R] how to read this data file into R?

2010-02-03 Thread Barry Rowlingson
On Tue, Feb 2, 2010 at 11:40 PM, David Winsemius dwinsem...@comcast.net wrote:

 The real solution is to grab the miscreant sender by the throat , er,
  tactfully discuss with your valued customer ,,, and shake out a machine
 readable form that has all of one row in a row.

Indeed. But you might get away with something else...

It is composed of blocks of (header +  25 data) rows - so using
read.table with skip= set to N*26  and nrows=25 would let you read
each block, and then use cbind to make up a big matrix.

# Here's my test example, which I did with 26 rows just to make sure
you understand it and don't just blindly cut n paste (or maybe I can't
count):

# test - create a matrix and dump it in this format to /tmp/m.txt:
m=matrix(sample(26*40),26,40)
m
sink("/tmp/m.txt")
m
sink()
# now read the second chunk:
read.table("/tmp/m.txt",skip=27,nrows=26,sep="")

# how to do the whole thing:

# gotta have something to bind on to for starters:
 mm=matrix(0,nrow=26,ncol=1)

 for(i in 0:3){
   mm = cbind(mm,read.table("/tmp/m.txt",skip=i*27,nrows=26,sep=""))
 }
# get rid of that first column:
 mm=mm[,-1]

# and now
 all(mm==m)
[1] TRUE

 Recovery!
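
 A slightly tidier variant of the same idea, offered as a sketch rather
than a replacement: it counts the blocks instead of hard-coding them and
binds everything in one go, so no dummy first column is needed. The names
'block_len', 'n_blocks' and 'blocks' are invented here.

```r
set.seed(1)
m <- matrix(sample(26 * 40), 26, 40)

# dump the printed matrix to a file, as a transcript would have it
f <- tempfile(fileext = ".txt")
writeLines(capture.output(print(m)), f)

# each block is one header line plus nrow(m) data lines
block_len <- nrow(m) + 1
n_blocks <- length(readLines(f)) / block_len

# read every block; read.table spots the header line by itself
blocks <- lapply(seq_len(n_blocks) - 1, function(i) {
  read.table(f, skip = i * block_len, nrows = nrow(m), sep = "")
})

mm <- as.matrix(do.call(cbind, blocks))
all(mm == m)
```

 This still relies on every block having the same number of rows.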


But yes, if someone gave you this file then they done wrong, but
sometimes all you have is an R transcript from the distant past (or
possibly even an old S-plus transcript with an S-plus .Data that you
can't read any more).

Barry



Re: [R] R performance

2010-01-31 Thread Barry Rowlingson
On Sun, Jan 31, 2010 at 7:55 PM, Marc Jekel feuerw...@gmx.de wrote:
 Dear R Fans,

 I was recently asking myself how quick R is in code execution.

 R can be a million times quicker than C code. Badly written C code.
Next question...

 I have been
 running simulations lately that need quite a time for execution and I was
 wondering if it is reasonable at all to do more computational extensive
 projects with R. Of course, it is possible to compare execution time for the
 same code written in several languages but maybe someone has some experience
 on the subject?

 The corollary to my first answer is that badly written R can be a
million times slower than well written R. There are lots of techniques
for writing faster (==better? discuss!) R code, including vectorizing
loops, writing parts in C or Fortran and so on. The one thing you need
to do first is...

 Profile Your Code!

 See help(Rprof). Also make sure you know how to tell what system
resources your program is using in terms of real and swap memory (you
didn't mention your platform - Unix or Windows?).
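
 To give a flavour of what vectorizing buys you, here is a toy comparison.
It is only a sketch: the two functions are invented for illustration,
system.time() is a cruder tool than Rprof, and the timings will vary from
machine to machine.

```r
# explicit loop: one addition per element, done by the interpreter
slow_sumsq <- function(x) {
  total <- 0
  for (xi in x) {
    total <- total + xi^2
  }
  total
}

# vectorized: the loop happens in compiled code
fast_sumsq <- function(x) sum(x^2)

x <- as.numeric(1:1e6)
t_slow <- system.time(s1 <- slow_sumsq(x))
t_fast <- system.time(s2 <- fast_sumsq(x))

all.equal(s1, s2)
rbind(slow = t_slow, fast = t_fast)[, "elapsed"]
```

 For real code, wrap the slow part in Rprof("out.prof") ... Rprof(NULL) and
look at summaryRprof("out.prof") to find out where the time actually goes.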

 Eventually you will hit speed limits and CPUs aren't getting much
faster. But there are more of them, so you'll need to look into the
various parallel processing magic things, eg in the HPC Task View on
CRAN:

http://ftp.heanet.ie/mirrors/cran.r-project.org/web/views/HighPerformanceComputing.html

 The future is very definitely multi-core.

Barry



Re: [R] Randomly rearranging elements of sets

2010-01-27 Thread Barry Rowlingson
On Wed, Jan 27, 2010 at 11:11 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

 ?sample


 And/Or read Knuth.

 Does this sound like a homework problem?



Re: [R] R-forge getting the wrong package

2010-01-26 Thread Barry Rowlingson
On Sun, Jan 24, 2010 at 10:09 AM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:
 After accusing someone of typing 'install.packages("weather")' instead
 of 'install.packages("webmaps")', I discovered that R-forge really is
 currently returning the wrong source tarball for packages after
 'Repitools' in the alphabet.

  I'm hypothesizing it's in the way the PACKAGES file is being
 constructed, but haven't tested that hypothesis yet - I'm sure the
 R-forge admins will fix this.

Stefan Theussl has now fixed this, so if anyone wants to try
install.packages("webmaps",repos="http://R-Forge.R-project.org") they
now can, without getting the 'weather' package installed instead.

Barry



[R] R-forge getting the wrong package

2010-01-24 Thread Barry Rowlingson
After accusing someone of typing 'install.packages("weather")' instead
of 'install.packages("webmaps")', I discovered that R-forge really is
currently returning the wrong source tarball for packages after
'Repitools' in the alphabet.

The data returned from available.packages in install.packages goes out
of sync at 'Repitools':

RemoteREngine  RemoteREngine_0.0-8.tar.gz
RemoteSensing  RemoteSensing_0.2-5.tar.gz
Repitools      RepitoolsExamples_1.01.tar.gz
Rglpk          Repitools_0.0.107.tar.gz
Ripop          Rglpk_0.3-2.tar.gz
Rllvm          Ripop_0.1.tar.gz
RlpSolveAPI    Rllvm_0.1.tar.gz

 There's a report on the R-forge forum, but I thought I'd post this
here in case anyone else is staring at the screen in bewilderment.

 I'm hypothesizing it's in the way the PACKAGES file is being
constructed, but haven't tested that hypothesis yet - I'm sure the
R-forge admins will fix this.
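
 A quick way to spot where such a listing goes wrong, sketched on the rows
above typed in by hand (the data frame 'idx' and its column names are made
up; this is not how install.packages checks anything):

```r
idx <- data.frame(
  package = c("RemoteREngine", "RemoteSensing", "Repitools", "Rglpk",
              "Ripop", "Rllvm", "RlpSolveAPI"),
  tarball = c("RemoteREngine_0.0-8.tar.gz", "RemoteSensing_0.2-5.tar.gz",
              "RepitoolsExamples_1.01.tar.gz", "Repitools_0.0.107.tar.gz",
              "Rglpk_0.3-2.tar.gz", "Ripop_0.1.tar.gz", "Rllvm_0.1.tar.gz"),
  stringsAsFactors = FALSE
)

# a tarball should be named <package>_<version>.tar.gz
ok <- startsWith(idx$tarball, paste0(idx$package, "_"))

# packages whose tarball name does not match
idx$package[!ok]
```

 The first mismatch is 'Repitools', and everything after it is shifted by
one.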

Barry



Re: [R] Read files in a folder when new data files come

2010-01-24 Thread Barry Rowlingson
On Sun, Jan 24, 2010 at 8:05 PM, jlfmssm jlfm...@gmail.com wrote:
 Hello,

 I am working on a project. New data files arrive as the data
 collectors gather data and put them in a folder. I need to
 read these new data files once they are in the folder.
 So far I have done this job manually: each time, I go to
 that folder, find the new data files, then use my R program to
 read them. I am wondering if anyone knows how to
 perform this job automatically in R.

Without needing some operating-system specific hackery, the easiest
way would be to use 'list.files()' and look for new files every so
many minutes or seconds (depending on how urgent it is). Or to check
file.info() on your directory and test the modification time. You'd
then write that into a  .R file and run that in the background using
your operating system's background job functionality (as a 'service'
in Windows, or as a background process in Unix). Use
Sys.sleep(seconds) to wait in your loop. Something like (totally
untested):

lastChange = file.info(dumpLocation)$mtime
while(TRUE){
  currentM = file.info(dumpLocation)$mtime
  if(currentM != lastChange){
    lastChange = currentM
    doSomethingWithStuffIn(dumpLocation)
  }
  # try again in 10 minutes
  Sys.sleep(600)
}

 There are ways for programs to get directory content change events
when files appear in directories, but they will probably be very
operating system specific. There's also the problem of your code
firing up when a file is only half-uploaded - what do you do then?
Does your data format have an 'end of data' marker?
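
 The list.files() version of the check could look like this sketch. The
helper 'new_files', the temporary folder and the file name are invented
for illustration, and it makes no attempt to deal with half-uploaded
files.

```r
# files in 'dir' that we have not already dealt with
new_files <- function(dir, seen) {
  setdiff(list.files(dir, full.names = TRUE), seen)
}

# a throwaway folder standing in for the real drop location
watch_dir <- tempfile("incoming")
dir.create(watch_dir)
seen <- list.files(watch_dir, full.names = TRUE)

# a data collector drops a file...
writeLines("a,b", file.path(watch_dir, "batch1.csv"))

# ...and the next check picks it up exactly once
fresh <- new_files(watch_dir, seen)
seen <- c(seen, fresh)
basename(fresh)
```

 In the real loop you would call new_files() after each Sys.sleep() and
read whatever comes back.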

 Barry



Re: [R] Error in plot.new()

2010-01-24 Thread Barry Rowlingson
On Sun, Jan 24, 2010 at 9:15 PM, jean luc picard peter.wohlm...@gmx.at wrote:

 Dear all,

 I have received the following error message since R 2.10.1 for the first
 time and I am not able to draw graphics any more:

 plot(1:5,1:5)
 plot(1:5,1:5)
 Error in plot.new() : figure margins too large
 In addition: Warning message:
 Display list redraw incomplete
 sessionInfo()
 R version 2.10.1 (2009-12-14)
 i486-pc-linux-gnu

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base


 Do you have an idea, how to solve the problem?

 I have an identical system according to sessionInfo() but no problem...

 Are you sure you don't have a 'plot' function defined yourself that
is being loaded? Run R with --vanilla from the command line:

R --vanilla

 and see if it still fails. The '--vanilla' option stops R loading a
saved .RData file from the current directory, along with any site or
user startup files such as .Rprofile.
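
 You can also check from inside R whether something in your workspace is
masking 'plot'. A small sketch (the helper name 'masks_in_workspace' is
made up):

```r
# TRUE if an object of this name lives in the global workspace
masks_in_workspace <- function(name) {
  exists(name, envir = globalenv(), inherits = FALSE)
}

masks_in_workspace("plot")

# where R finds 'plot' on the search path, first hit first
find("plot")
```

 If the first line says TRUE, or find() lists ".GlobalEnv", you have your
own 'plot' getting in the way.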

 Ooh, I can duplicate your problem if I resize my plot window very
small after the first plot. What do you see after your first
plot(1:5,1:5)? Anything? Possibly you've got some setting that's
making your graphics window very small, but I don't see why it happens
on the second plot.

 Anyway, try --vanilla and that might cut out some possibilities.
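
 The "own plot function" theory can also be checked from inside R. A
minimal sketch follows; the helper name masked_in_workspace is made up
for this example and is not part of R:

```r
# Hypothetical helper: TRUE if an object called 'name' lives in the
# user's workspace (the global environment), where it would mask the
# packaged version of the same name.
masked_in_workspace <- function(name) {
  exists(name, envir = globalenv(), inherits = FALSE)
}

masked_in_workspace("plot")  # TRUE would mean a user-defined 'plot' is in the way
find("plot")                 # every place on the search path that has a 'plot'
```

 If the first call says TRUE, rm(plot) in that session should bring the
packaged function back.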

Barry



Re: [R] Error in plot.new()

2010-01-24 Thread Barry Rowlingson
On Sun, Jan 24, 2010 at 9:40 PM, jean luc picard peter.wohlm...@gmx.at wrote:

 Running R --vanilla

 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 plot(1:5,1:5)
 Error in plot.new() : figure margins too large
 In addition: Warning messages:
 1: Display list redraw incomplete
 2: Display list redraw incomplete

 The plot statement only opens the graphics window but no graphic is
 displayed (the same problem as in the original session).

 Spooky. What's your Linux distribution?

 Can you try this: again from R --vanilla, do plot(1:5,1:5) and then
from another unix shell do a:

xwininfo

 and click on the plot window. Paste the result back here...

 And while you're at it, what do you get for the output of
'capabilities()' in R:

 capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE

 Does it happen if you do X11(type="Xlib") first to get an Xlib
graphics device? Then compare and contrast with X11(type="cairo") -
again, restarting R from a --vanilla run every time.

Barry



Re: [R] use R from python

2010-01-21 Thread Barry Rowlingson
On Thu, Jan 21, 2010 at 8:35 PM, Massimo Di Stefano
massimodisa...@yahoo.it wrote:
 Hi All,

 please aplogize me if my qustion is a bit OT here,
 but maybe is there someone that uses R from inside python
 using rpy or rpy2 interface.

 In [54]: x = rdiv( ( rdiff( x, rmin(x) ) ) , ( rdiff( rmax(x) , rmin(x) ) ) )

 In [55]: y = rdiv( ( rdiff( r_sorted, rmin(r_sorted) ) ) , ( rdiff( 
 rmax(r_sorted) , rmin(r_sorted) ) ) )
 Errore in .Primitive(-)(my...@data$elevation.dem, 
 my...@data$elevation.dem) :
  argomento non numerico trasformato in operatore binario
 
 Traceback (most recent call last):
  File ipython console, line 1, in module
  File build/bdist.macosx-10.6-universal/egg/rpy2/robjects/__init__.py, line 
 423, in __call__
 RRuntimeError: Errore in .Primitive(-)(my...@data$elevation.dem, 
 my...@data$elevation.dem) :
  argomento non numerico trasformato in operatore binario

 My non-existent Italian is telling me this is non-numeric argument in
binary operator. Something like:

 "hello" - "goodbye"
Error in "hello" - "goodbye" : non-numeric argument to binary operator

 - because you are subtracting the strings my...@data$elevation.dem.

 Tracking back, those strings come from:

r_sorted = rsort('my...@data$elevation.dem', decreasing=True)

 - which is sorting the string vector! Like this:

 sort('my...@data$elevation.dem', decreasing=TRUE)
[1] "my...@data$elevation.dem"

 You want to sort the *value* of that object.  You want to sort the
$elevation.dem column of the @data slot of the python R object mymap.

 A functional form of that, which should translate to your style of
rpy2, would be this in R:

get("$")(slot(mymap,"data"),"elevation.dem")

 You may need to get 'get' and 'slot' from r.robjects in the way you
do other functions. This looks a bit weird to me, but I'm used to
rpy-1 - maybe rpy 2 is like this!
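
 To see the difference between sorting the name and sorting the value
in plain R, here is a small sketch. The S4 class DemoMap and the
elevation numbers are invented stand-ins for the real map object, so
treat them as assumptions:

```r
library(methods)

# Toy S4 class standing in for the real map object; 'DemoMap' and the
# elevation values are assumptions for illustration only.
setClass("DemoMap", representation(data = "data.frame"))
mymap <- new("DemoMap", data = data.frame(elevation.dem = c(12, 40, 7)))

# Sorting the *name* just hands the one-element string vector back:
sort("mymap@data$elevation.dem", decreasing = TRUE)

# Functional form of mymap@data$elevation.dem, then sort the values:
elev <- get("$")(slot(mymap, "data"), "elevation.dem")
r_sorted <- sort(elev, decreasing = TRUE)
r_sorted
```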

 Hope that points you in the right direction.

Barry



Re: [R] Predict polynomial problem

2010-01-19 Thread Barry Rowlingson
On Tue, Jan 19, 2010 at 1:36 AM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:

 Its the environment thing.

 I think you want something like this:

        models[[i]]=lm( bquote( y ~ poly(x,.(i)) ), data=d)

 Use
        terms( mmn[[3]] )

 both with and without this change and


        ls( env = environment( formula( mmn[[3]] ) ) )
        get(i,env=environment(formula(mmn[[3]])))
        sapply(mmn,function(x) environment( formula( x ) ) )


 to see what gives.

 Think I see it now. predict involves evaluating poly, and poly here
needs 'i' for the order. If the right 'i' isn't gotten when predict is
called then I get the error. Your fix sticks the right 'i' into the
environment when predict is called.

 I haven't quite got my head round _how_ it does it, and I have no
idea how I could have figured this out for myself. Oh well...
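
 One way to watch this happen is to look inside the formula's
environment, along the lines suggested above. This is only a sketch:
fit_all is a made-up name for the loop-in-a-function pattern, and the
seed is arbitrary.

```r
set.seed(42)
d <- data.frame(x = 1:10, y = runif(10))

# Hypothetical stand-in for the fitting function: models built in a loop
fit_all <- function(d, n) {
  models <- list()
  for (i in n) models[[i]] <- lm(y ~ poly(x, i), data = d)
  models
}
mmn <- fit_all(d, 3:5)

# The formula remembers the function's frame, not the value i had
# when this particular model was fitted:
env <- environment(formula(mmn[[3]]))
ls(envir = env)
i_seen <- get("i", envir = env)
i_seen   # the value i was left at when the loop finished, not 3
```

 All the models share that one environment, so at predict time every
one of them looks up the same leftover 'i'.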

 The following lines are also illustrative:

d = data.frame(x=1:10,y=runif(10))

i=3
#1 naive model:
m1 = lm(y~poly(x,i),data=d)
#2,3 bquote, without or with i-wrapping:
m2 = lm(bquote(y~poly(x,i)),data=d)
m3 = lm(bquote(y~poly(x,.(i))),data=d)

#1 works, gets 'i' from global i=3 above:
predict(m1,newdata=data.frame(x=9:11))
#2 fails - why?
predict(m2,newdata=data.frame(x=9:11))
#3 works, gets 'i' from within:
predict(m3,newdata=data.frame(x=9:11))

rm(i)

#1 now fails because we removed 'i' from top level:
predict(m1,newdata=data.frame(x=9:11))
#2 still fails:
predict(m2,newdata=data.frame(x=9:11))
#3 still works:
predict(m3,newdata=data.frame(x=9:11))

Thanks



Re: [R] Predict polynomial problem

2010-01-19 Thread Barry Rowlingson
On Tue, Jan 19, 2010 at 5:37 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:


 Note:

   i <- 20
   bquote(y ~ poly(x,.(i)))

 y ~ poly(x, 20)


 I see it now. bquote(y~poly(x,.(i))) gets its 'i' there and then, sticks
it in the returned expression as the value '20', so any further evaluations
get poly(x,20). This is reminiscent of the way macro languages work...
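
 Putting that into the loop from the original post might look like the
sketch below. The name lmn2 and the explicit eval() around bquote() are
choices made for this example, not something taken from the thread; the
eval() is there only so that lm() is handed a ready-made formula object.

```r
lmn2 <- function(d, n) {
  models <- list()
  for (i in n) {
    # bquote() substitutes the current value of i for .(i);
    # eval() turns the resulting call into a formula object
    f <- eval(bquote(y ~ poly(x, .(i))))
    models[[i]] <- lm(f, data = d)
  }
  models
}

set.seed(1)
d <- data.frame(x = 1:10, y = runif(10))
mmn <- lmn2(d, 3:5)

formula(mmn[[3]])   # the order is now a literal number, not 'i'
p <- predict(mmn[[3]], newdata = data.frame(x = c(9, 10, 11)))
p
```

 Because the order is baked into each formula, predict() no longer
cares what 'i' happens to be when it is called.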

Thanks,

Barry



Re: [R] Number of download.

2010-01-19 Thread Barry Rowlingson
On Tue, Jan 19, 2010 at 8:23 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Jan 19, 2010, at 2:51 PM, Christophe Genolini wrote:

 Hi the list

 Is there a way to know how many times an R package (on CRAN) has been
 download ?

 No, or at least not a comprehensive number. The question came up and was
 discussed in March last year. Search term popular.:

 http://finzi.psych.upenn.edu/Rhelp08/2009-March/thread.html

  The was considerable disagreement about the validity of any such number.
  This is Dirk's offering, which I assume is specific to Debian and appears
 to be the only data offered:

 http://qa.debian.org/popcon.php?package=r-base

 More generally, it was stated that the CRAN mirroring mechanism does not
 support data collection of this sort. (But it appears that the UCLA server
 may be an exception.)

 In a similar vein, has anyone ever put any 'phone home' code in a
package, so that authors can track usage? Something in the package
startup code that pings a logging server, for example?

 Yes I know doing such a thing without telling the user and giving
them an opt-out is evil.

Barry



Re: [R] Number of download.

2010-01-19 Thread Barry Rowlingson
On Tue, Jan 19, 2010 at 9:51 PM, Liviu Andronic landronim...@gmail.com wrote:

 Why would this be evil? For R, for example? I've already read some
 objections to this on r-help, but I'm not sure I understand the
 reasons. As long as the 'ping' happens once, at first start,
 anonymously, and requires confirmation from the user, I do not see an
 issue with the behaviour.

I did say 'without telling the user'.

This kind of behaviour got its bad name from closed-source software
'phoning home'  and raising privacy concerns since nobody could tell
what the seemingly random stream of bytes heading off your computer
consisted of.

 This kind of behaviour seems acceptable:

  library(foo)
 Foo library would like to report its usage to foo.com. Would you like
 it to do this every time you use it? No personal data about you or
 your computer is transmitted [y/N] Y
 Reporting use of library(foo) at 12:34:56 7-8-2010 to foo.com

 Then next time:

  library(foo)
  Reporting use of library(foo) at 12:34:56 7-8-2010 to foo.com
  To stop usage reporting, do foo.reporting("off")
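
 The opt-in part of that could be sketched roughly as follows.
Everything here is hypothetical: report_usage, the "foo.reporting"
option and foo.com are invented for illustration, and the network call
is deliberately left out.

```r
# All names here (report_usage, the 'foo.reporting' option, foo.com) are
# hypothetical and exist only for this illustration.
report_usage <- function(pkg, consent = getOption("foo.reporting", FALSE)) {
  if (!isTRUE(consent)) {
    return(invisible(FALSE))   # opt-in only: do nothing unless the user said yes
  }
  message("Reporting use of library(", pkg, ") at ",
          format(Sys.time(), "%H:%M:%S %d-%m-%Y"), " to foo.com")
  # a real package would contact its logging server here
  invisible(TRUE)
}

report_usage("foo")               # silent: nobody has opted in yet
options(foo.reporting = TRUE)     # the user says yes
report_usage("foo")               # now announces what it is doing
options(foo.reporting = NULL)     # and this switches it off again
```

 Defaulting to FALSE means a user who never answers the question is
never reported on.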

Barry



[R] Predict polynomial problem

2010-01-18 Thread Barry Rowlingson
 I have a function that fits polynomial models for the orders in n:

lmn <- function(d,n){
  models=list()
  for(i in n){
    models[[i]]=lm(y~poly(x,i),data=d)
  }
  return(models)
}

 My data is:

  d=data.frame(x=1:10,y=runif(10))

 So first just do it for a cubic:

  mmn = lmn(d,3)
  predict(mmn[[3]])
        1         2         3         4         5         6         7         8
0.6228353 0.5752811 0.5319524 0.4957381 0.4695269 0.4562077 0.4586691 0.4798001
        9        10
0.5224893 0.5896255

and let's extrapolate a bit:

  predict(mmn[[3]],newdata=data.frame(x=c(9,10,11)))
        1         2         3
0.5224893 0.5896255 0.6840976

 now let's do it for cubic to quintic:

  mmn = lmn(d,3:5)

 check the cubic:

  predict(mmn[[3]])
        1         2         3         4         5         6         7         8
0.6228353 0.5752811 0.5319524 0.4957381 0.4695269 0.4562077 0.4586691 0.4798001
        9        10
0.5224893 0.5896255

 - that's the same as last time. Extrapolate?

  predict(mmn[[3]],newdata=data.frame(x=c(9,10,11)))
Error: variable 'poly(x, i)' was fitted with type nmatrix.3 but type
nmatrix.5 was supplied
In addition: Warning message:
In Z/rep(sqrt(norm2[-1L]), each = length(x)) :
  longer object length is not a multiple of shorter object length

it falls over. I can't see the difference between the objects,
summary() looks the same. Is something wrapped up in an environment
somewhere, or some lazy evaluation thing, or have I just done
something stupid?

Here's a complete example you can paste in - R --vanilla < this.R
gives the error above - R 2.10.1 on Ubuntu, and also on R 2.8.1 I had
lying around on a Windows box:

d = data.frame(x=1:10,y=runif(10))

lmn <- function(d,n){
  models=list()
  for(i in n){
    models[[i]]=lm(y~poly(x,i),data=d)
  }
  return(models)
}

mmn = lmn(d,3)
predict(mmn[[3]])
predict(mmn[[3]],newdata=data.frame(x=c(9,10,11)))

mmn2 = lmn(d,3:5)
predict(mmn2[[3]])
predict(mmn2[[3]],newdata=data.frame(x=c(9,10,11)))


Barry



Re: [R] advice/opinion on <- vs = in teaching R

2010-01-15 Thread Barry Rowlingson
On Fri, Jan 15, 2010 at 3:45 AM, Erin Hodgess erinm.hodg...@gmail.com wrote:

 Hi R People:

 I'm teaching a statistical computing class using R starting next week
 (yay!) and I have an opinion type question, please.

 I'm old school and use <- in an assignment.


 You call that 'old school'?? I still use ' x_1'! Of course ESS turns the
underscore into '<-' magically.

 Perhaps these guys should redo their tests with slightly different syntax:

http://www.cs.mdx.ac.uk/research/PhDArea/saeed/paper1.pdf

Barry



Re: [R] advice/opinion on <- vs = in teaching R

2010-01-15 Thread Barry Rowlingson
On Fri, Jan 15, 2010 at 6:57 AM, Ted Harding
ted.hard...@manchester.ac.uk wrote:


 There is at least one context where the distinction must be
 preserved. Example:

  pnorm(1.5)
  # [1] 0.9331928
  pnorm(x=1.5)
  # Error in pnorm(x = 1.5) : unused argument(s) (x = 1.5)
  pnorm(x<-1.5)
  # [1] 0.9331928
  x
  # [1] 1.5

 Ted.


 I would regard modifying a variable within the parameters of a function
call as pretty tasteless. What does:


 foo(x<-2,x)
or
 foo(x,x<-3)

do that couldn't be done more clearly with two lines of code?

 Remember: 'eschew obfuscation'.
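
 To make the obfuscation concrete, here is a small sketch. The two
functions add_ab and add_ba are invented for the example; the point is
that R evaluates arguments lazily, so what the second argument sees
depends on the order in which the function body touches them.

```r
add_ab <- function(a, b) a + b   # forces a first, then b
add_ba <- function(a, b) b + a   # forces b first, then a

x <- 1
r1 <- add_ab(x <- 2, x)   # a assigns x = 2 before b looks x up

x <- 1
r2 <- add_ba(x <- 2, x)   # b sees the old x = 1 before a assigns

c(r1, r2)                 # same call, different answers

# The two-line version leaves no doubt about what is being added:
x <- 2
r3 <- add_ab(x, x)
```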

Barry



Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions

2010-01-14 Thread Barry Rowlingson
On Thu, Jan 14, 2010 at 12:39 AM, Muenchen, Robert A (Bob) muenc...@utk.edu
 wrote:


 One of the things I updated was to *remove* the now-obsolete PASW! Since
 IBM bought the company, they did away with that and renamed things IBM SPSS
  See the list at:
 http://spss.com/software/statistics/


Sheesh! I knew IBM had bought it up but I guess I missed the memo about
changing the name-changing to a different name-change.

 I'm wondering if you should have a spatial statistics category as well as
(or instead of) a GIS category.  GIS would be more about making maps and
manipulating geographic data in a non-statistical way, whereas spatial
statistics has wider applications to 2-d data in general. The Spatial Task
View you reference talks about both aspects. For R, spatial stats packages
include spatstat, splancs and gstat. I don't know if SAS or SPSS do
point-pattern analysis or Kriging.

 It seems odd that SPSS Maps is 'defunct', as you say, since there's such a
rise in geospatial technologies. Have they got anything to replace it? Can
you make maps in IBM SPSS Whatever?

Barry



Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions

2010-01-13 Thread Barry Rowlingson
On Wed, Jan 13, 2010 at 11:53 PM, Muenchen, Robert A (Bob) muenc...@utk.edu
 wrote:

 Hi All,

 I have substantially expanded the table that compares SAS and SPSS
 add-on modules to somewhat equivalent R packages. This new version is
 at:
 http://r4stats.com/add-on-modules
 and I would very much appreciate any feedback you might have on it.

 The site http://r4stats.com is the replacement to
 http://RforSASandSPSSusers.com and includes the support files for both
 R for SAS and SPSS Users and the new R for Stata Users, due out in
 March from Springer. I'll phase the older site out eventually and change
 the URL to point to the new one.


Maybe the first thing you should do is a global search and replace of 'SPSS'
with 'PASW'

 http://www.spss.com/software/product-name-guide/

Barry



Re: [R] data frame names in sequence. please help!!!

2010-01-10 Thread Barry Rowlingson
On Sun, Jan 10, 2010 at 7:16 AM, Berend Hasselman b...@xs4all.nl wrote:



 Zoho wrote:

 I've been stuck with this problem for a whole afternoon. It's silly but
 totally pissed me off. I have a set of data frames with names in a
 sequence: df_1, df_2, df_3, ..., df_20. Now I want to access each data
 frame (read or write) in a for loop, in a way something like this:

 for (i in 1:20) {
   df_i <- ##
   length(which(df_i[,7]==1))
   ##
 }

 I tried paste or cat ("df_", i, sep=""). But neither way works. Your help
 is highly appreciated!! Thanks in advance!


 df_1 <- data.frame(x1=3,x2=5)
 df_2 <- data.frame(x1=2,x2=7)
 df_3 <- data.frame(x1=-1,x2=1)

 for(k in 1:3){v <- paste("df_",k,sep=""); print(get(v))}
 for(k in 1:3){v <- paste("df",k,sep="_"); print(get(v)[,2])}

 Have a look at get:

 ?get

 Or better still, have a look at making a *list* instead of a bunch of
data frames with numbers in their names, then you can index in a
sensible way without having to construct names with paste and get.
Here's a list of data frames:

 L = list()
 for(i in 1 :10){
  L[[i]]=data.frame(x=runif(10))
}

 Now you can loop over L[[i]]

 This has been asked a zillion times on R-help. Sure, if you've
already mistakenly created 200 data frames then you need the paste/get
solution, but don't make the same mistake twice. Use a list.
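
 If the numbered data frames already exist, one way to rescue them
into a list in a single step is mget(). A sketch, reusing the three toy
data frames from the reply quoted above (the variable names dfs and
second_col are just for this example):

```r
df_1 <- data.frame(x1 = 3,  x2 = 5)
df_2 <- data.frame(x1 = 2,  x2 = 7)
df_3 <- data.frame(x1 = -1, x2 = 1)

# Collect the numbered objects into one named list...
dfs <- mget(paste("df", 1:3, sep = "_"))

# ...and from then on work with the list, not with constructed names:
second_col <- sapply(dfs, function(d) d[, 2])
second_col

for (i in seq_along(dfs)) {
  print(dfs[[i]])
}
```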

Barry



Re: [R] adding 3D arrows to 3D plots

2010-01-10 Thread Barry Rowlingson
Have a go with this:

arrow3d - function(p0=c(0,1,0),p1=c(1,1,1),s=0.1,theta=pi/4,n=3,...){
 ##p0: start point
 ##p1: end point
 ## s: length of barb as fraction of line length
 ## theta: opening angle of barbs
 ## n: number of barbs
 ##   ...: args passed to lines3d for line styling

 require(geometry)
 require(rgl)

 ## rotational angles of barbs
 phi=seq(0,2*pi,len=n+1)[-1]

 ## length of line
 lp = sqrt(sum((p1-p0)^2))

 ## point down the line where the barb ends line up; the barb length is
 ## s*lp, so as a fraction of the line this is 1 - s*cos(theta)
 cpt=(1-(s*cos(theta)))*(p1-p0)

 ## draw the main line
 line = lines3d(c(p0[1],p1[1]),c(p0[2],p1[2]),c(p0[3],p1[3]),...)

 ## need to find a right-angle to the line. So create a random point:
 rpt = jitter(c(
   runif(1,min(p0[1],p1[1]),max(p0[1],p1[1])),
   runif(1,min(p0[2],p1[2]),max(p0[2],p1[2])),
   runif(1,min(p0[3],p1[3]),max(p0[3],p1[3]))
   ))

 ## and if it's NOT on the line the cross-product gives us a vector
 ## at right angles:
 r = extprod3d(p1-p0,rpt)
 ## normalise it:
 r = r / sqrt(sum(r^2))

 ## now compute the barb end points and draw:
 pts = list()
 for(i in 1:length(phi)){
   ptb=rotate3d(r,phi[i],(p1-p0)[1],(p1-p0)[2],(p1-p0)[3])
   lines3d(
   c(p1[1],cpt[1]+p0[1]+lp*s*sin(theta)*ptb[1]),
   c(p1[2],cpt[2]+p0[2]+lp*s*sin(theta)*ptb[2]),
   c(p1[3],cpt[3]+p0[3]+lp*s*sin(theta)*ptb[3]),
   ...
   )
 }
 return(line)
}

This creates a line with 'n' arrow barbs at one end, equally spaced
when you look at the line end-on. The barb length is 's' times the
length of the line, and the opening angle is theta. Just do arrow3d()
to get something.

Might be useful, plus I wanted to brush up on my vector geometry anyway...

Barry



Re: [R] parsing pdf files

2010-01-09 Thread Barry Rowlingson
On Sat, Jan 9, 2010 at 1:11 PM, David Kane d...@kanecap.com wrote:
 I have a pdf file that I would like to parse into R:

 http://www.williams.edu/Registrar/geninfo/faculty.pdf

 For now, I open the file in Acrobat by hand, then save it as text
 and then use readLines(). That works fine but a) I am concerned that
 some information may be lost and b) I may be doing this a lot, so I
 would rather have R grab the information from the pdf file directly.

 So: is there something like readPDF() for R?

 What could it do that saving as text from Acrobat couldn't do? Here's
the problem - PDF is a page description format, it's not designed to
be read back. There's no guarantee that the letters on the page appear
in the PDF in the same order as they seem on the page. The page could
have all the letter 'a's, then the 'b's and so on, positioned in their
right places to make up words. To reconstruct the words you'd have to
spot where the letters were being placed, and then figure out the
breaks and make up the words. Good luck making the sentences.

 Most PDFs aren't that perverse, and you can often get sensible text
out of them. But then you run into font encodings and graphics and
column layouts and stuff. Any effort put into writing a readPDF()
would have to be redone every time someone tried to read a PDF :)

 On Linux/Unix there's a bunch of command line tools for trying to do
this kind of thing with PDF files - see pdftotext for example. You
could run that from R with system() and then read the text with
readLines. But there's absolutely no guarantees this will work.
Windows/Mac versions (did you say what your platform was?) of the
command line tools may be available.
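
 Here is a rough sketch of the pdftotext route. It assumes the
pdftotext program (from xpdf or poppler) is installed and on the PATH;
the wrapper name pdf_to_lines is made up, and system2() is used here in
place of system() so the exit status is easy to check. The same caveat
applies: no guarantee the text comes out in a sensible order.

```r
# Hypothetical wrapper: convert a PDF to text with the external
# 'pdftotext' tool, then read the result back in as lines.
pdf_to_lines <- function(pdf, txt = tempfile(fileext = ".txt")) {
  if (!nzchar(Sys.which("pdftotext"))) {
    stop("pdftotext not found on the PATH")
  }
  # -layout asks pdftotext to try to keep the physical page layout
  status <- system2("pdftotext", c("-layout", shQuote(pdf), shQuote(txt)))
  if (status != 0) stop("pdftotext failed with status ", status)
  readLines(txt, warn = FALSE)
}

# Usage, once the PDF has been saved locally:
# lines <- pdf_to_lines("faculty.pdf")
```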

 The real answer is to get the original data in a format with some
kind of semantics that R could read, for example a CSV or some nice
XML format.

Barry



Re: [R] Working with source file

2010-01-06 Thread Barry Rowlingson
On Wed, Jan 6, 2010 at 8:34 PM, D Kelly O'Day ko...@processtrends.com wrote:


 I am trying to build an easy to use climate data analysis tool kit that
 will
 let non-R users run my detailed r script with minimum R learning curve
 effort.

 Here's an example:

 link <-
 "http://chartsgraphs.wordpress.com/files/2010/01/nsidc_trend_plot_2.doc"
 source(link)

  Why is it called '.doc'?  .R would be normal, this makes me think it's a
word document...



 My questions:

 1. how can user get list of just data.frames to investigate the data?


 ls() ? I'm not sure exactly what you mean here...
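
 If the question is how to list only the data frames among whatever
the script created, something along these lines might do. The function
name list_data_frames is invented for this sketch:

```r
# Hypothetical helper: names of all data frames in an environment
# (the workspace by default).
list_data_frames <- function(env = globalenv()) {
  objs <- ls(env)
  is_df <- vapply(objs,
                  function(nm) is.data.frame(get(nm, envir = env)),
                  logical(1))
  objs[is_df]
}

# Small demonstration in a scratch environment:
e <- new.env()
assign("ice", data.frame(year = 2008:2010), envir = e)
assign("n", 3, envir = e)
list_data_frames(e)
```

 A user could then call list_data_frames() after source(link) and
inspect each result with str() or head().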


 2. how can user list my R script in the session so that he/she can adjust
 it?


 My R scripts retrieve raw data, process it and organize it for analysis.
 I'd
 like the users to be able to save the data, subset, modify data and make
 their own plots.

 I can't seem to find out any information on how to look at sourced R
 scripts. Suggestions on where to look will be appreciated.


 You can use a URL in many places where you can use a filename in R. So I
can read in your script above with:

 f=readLines(
"http://chartsgraphs.wordpress.com/files/2010/01/nsidc_trend_plot_2.doc")

 now f[1] is the first line of your file, f[2] is the second and so on.

 You might be better off using download.file(), which gets and stores a
file. Then persuade your users to edit it with an editor (use
file.edit(filename)) and then source it again.


 Useful?

Barry



Re: [R] Emacs vs Eclipse vs Rcmdr

2010-01-04 Thread Barry Rowlingson
On Sun, Jan 3, 2010 at 10:59 PM, Charlotte Maia mai...@gmail.com wrote:

 I'm not so much interested in which is the best user interface for R.
 Rather which is the best ***platform*** for developing ***new*** user
 interfaces for R.
 Noting I'm using the term user interface is a very general sense.
 (i.e. Can include anything from console/pseudoterminal widgets, to
 text editors with customised syntax highlighting, to elaborate menus
 and dialog boxes).

 Here are my initial thoughts:

 Emacs Pros:
 - A lot of computer experts use it.
 - Plus some high profile R people are involved in the development of ESS.
 - High level of customisation.

 Emacs Cons:
 - Need to know Lisp.
 - Counter intuitive.
 - It's really ugly.
 - No decent widget set (which is probably why it's ugly).

 Eclipse Pros:
 - It's kind of fashionable and nice looking.

 Eclipse Cons:
 - Unnecessarily complicated.
 - Need to know SWT (and maybe XML too?).
 - The process for installing (and finding) add on packages, is terrible.

 Rcmdr Pros and Cons:
 - I haven't used it for a long time, so can't really comment.
 - However, I was surprised by how many reverse dependencies it has. So
 I will assume it has some potential.

 Other people's thoughts welcome...

 Python + Rpy +  Widget set of your choice (Qt or wx would be the front-runners)

Pros:
 cross-platform
 full, mature, and standard widget set
 easy integration between R and Python
 can integrate existing code for editing (e.g. Scintilla)

I'd say it was a medium-weight solution - you need R, Python, Rpy, and
PyQt/wx but they are all open source so you can distribute them with
your code or get your users to install them quite easily.

There appears to be an effort to make a direct interface to Qt from R:

http://qtinterfaces.r-forge.r-project.org/

It seems to be at a very early stage, but it would remove the need for Python.

Barry



Re: [R] F77_CALL, F77_NAME definition

2010-01-03 Thread Barry Rowlingson
On Sun, Jan 3, 2010 at 2:11 PM,  rkevinbur...@charter.net wrote:
 I give up. Maybe it is my search (Windows) but I cannot seem to find the 
 definition of the F77_CALL or F77_NAME macros. Either there are too many 
 matches or the search just doesn't find it. For example where is the source 
 for:

 F77_CALL(dpotri)

 ?

 I'm not sure what the Windows equivalent of 'grep -r F77_CALL .' is,
but the developer who wrote lbfgsb.c left a blatant clue which popped
up as the third match:

./appl/lbfgsb.c:#include <R_ext/RS.h> /* for F77_CALL */

About three screenfuls later the actual definition itself appeared.

 If you are going to do a lot of this on a windows box, get cygwin and
learn to use the unix utilities in a cygwin bash shell!
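
 For anyone without grep to hand, a crude recursive search can be put
together in R itself. This is only a sketch; grep_files is an invented
name, and the default file pattern (C sources and headers) is an
assumption about where one would want to look:

```r
# Hypothetical helper: a rough stand-in for 'grep -r pattern dir'
# restricted to files whose names match file_pattern.
grep_files <- function(pattern, dir = ".", file_pattern = "\\.[ch]$") {
  files <- list.files(dir, pattern = file_pattern,
                      recursive = TRUE, full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, fixed = TRUE)   # literal match, not a regex
    if (length(idx) == 0) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx],
               stringsAsFactors = FALSE)
  })
  hits <- Filter(Negate(is.null), hits)         # drop files with no match
  do.call(rbind, hits)
}

# Usage, run from the top of the R source tree:
# grep_files("define F77_CALL", "src")
```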


Barry



Re: [R] F77_CALL, F77_NAME definition

2010-01-03 Thread Barry Rowlingson
On Sun, Jan 3, 2010 at 3:20 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 I think it's better to use a reasonable text editor here; I used Textpad.  I
 don't think there's anything too special about it, but it does have Search
 | Find in files, and I can list the file pattern (obviously *.h for a macro
 definition), and the folder (R-devel/src on my system), and then I only get
 six hits:  two definitions and 4 uses. That's a lot better than 3
 screenfuls.

 Of course. Emacs + Etags. Then it's about four keystrokes :)

Barry



Re: [R] Remove double quotation marks

2009-12-29 Thread Barry Rowlingson
On Tue, Dec 29, 2009 at 6:31 PM, Lisa lisa...@gmail.com wrote:

 Thank you for your reply. But in the following case, “cat()” or “print()”
 doesn’t work.

 data.frame(cbind(variable 1, variable 2, cat(paste(variable, x), \n))),
 where x is a random number generated by other R script.

 Lisa

 Yes, because you are Doing It Wrong. If you have data that is indexed
by an integer, don't store it in variables called variable1, variable2
etc, because very soon you will be posting a message to R-help that is
covered in the R FAQ...

 Store it in a list:

 v = list()
 v[[1]] = c(1,2,3,4,5)
 v[[2]] = c(4,5,6,7,8,9,9,9)

then you can do v[[i]] for integer values of i.

If you really really must get values of variable by name, perhaps
because someone has given you a data file with variables called
variable1 to variable99, then use the paste() construction together
with the 'get' function:

[ not tested, but should work ]

  v1=99
  v2=102
  i=2
  get(paste("v",i,sep=""))
 [1] 102

Barry



Re: [R] What might be the security issues from installing R?

2009-12-28 Thread Barry Rowlingson
On Mon, Dec 28, 2009 at 6:23 PM, Peterson, Eric B. ebpeter...@usbr.gov wrote:

 My guess is that we may run into problems due to R being open-source, leading 
 to a potential perception that the code might be poorly controlled. This 
 could be further complicated by the need for downloading additional 
 open-source packages.  At present, I am not aware of any open source software 
 that has passed through the approval process, though I am also not aware of 
 any policy against open-source.

 The 'Core' of R is code committed (and therefore 'controlled') by a
smallish group of  people:

http://www.r-project.org/contributors.html

 The real problem would come when you start adding additional packages
from CRAN or R-forge or some other source. These are written by
hundreds or possibly thousands of people.

 I've not heard of any malicious code ever being found in an R
package, but maybe one day I'll sneak a back-door server into one of
mine and see how long before it gets spotted. I don't think any formal
review of CRAN package code is ever done (someone may prove me wrong
here, but there's zillions of lines of code in CRAN now).

Barry



Re: [R] Object of type 'closure' not subsettable

2009-12-21 Thread Barry Rowlingson
On Mon, Dec 21, 2009 at 1:43 PM, Muhammad Rahiz
muhammad.ra...@ouce.ox.ac.uk wrote:
 Thanks Barry for the clarification.

 With regards to the following;

 d2[[i]] <- file[[i]] - mean

 renamed to

 d2[[i]] <- f[[i]] - m

 The object f contains the following so clearly f is subsettable

 [[1]]
  V1
 1 10
 2 10
 3 10

 [[2]]
  V1
 1 11
 2 11
 3 11

 [[3]]
  V1
 1 12
 2 12
 3 12

 My plan is to subtract m from each subset of f and to store the output as
 each individual subsets. So,

 output1 <- f[[1]] - m
 output2 <- f[[2]] - m
 output3 <- f[[3]] - m

 If I run the following to achieve the desired result, there is an error as
 written in the subject heading.

 d2[[i]] <- f[[i]] - m

 If I run the following, the error is gone but I'm not getting the output for
 each individual file I require

 d2 <- f[[i]] - m

 The issue is, how do I make d2 subsettable?



 Well you haven't shown us this time how you initialised d2. Create it
as an empty list:

 d2 <- list()

and then you can do:

  d2[[1]]=c(1,2,3)
  d2[[2]]=c(4,5,6)
  d2
[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6

 As I said, read one of the basic R documents linked on the
documentation section of the R web site, and play with lists and
vectors for a while.
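
 Putting that together with your f, it might look something like the sketch
below. I'm assuming f is a list of one-column data frames as you printed it,
and that m is a single number; you haven't shown how m is computed, so the
value here is made up.

```r
f <- list(data.frame(V1 = rep(10, 3)),
          data.frame(V1 = rep(11, 3)),
          data.frame(V1 = rep(12, 3)))
m <- 11  # assumed value, for illustration only

# Start from an empty list, then fill one element per pass
d2 <- list()
for (i in seq_along(f)) {
  d2[[i]] <- f[[i]] - m
}

# The same thing without an explicit loop
d2_alt <- lapply(f, function(x) x - m)
```

 Either way d2[[1]], d2[[2]] and d2[[3]] are then the three outputs you
were after.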

Barry



Re: [R] Object of type 'closure' not subsettable

2009-12-20 Thread Barry Rowlingson
On Sun, Dec 20, 2009 at 7:40 PM, Muhammad Rahiz
muhammad.ra...@ouce.ox.ac.uk wrote:
 Hi all,

 How can I overcome the error "object of type 'closure' not subsettable"?

 I ran the following script
 seq <- paste(seq(1914, 1916, by=1), "*.y", sep=".") # make sequence
 c <- 3 # total number of files
 d2 <- file # creates dummy file

 No it doesn't. It copies the object called 'file' into an object
called 'd2'. What's the object called 'file'? If you've not created
one already, it's the 'file' function that R uses to read stuff from
files. So when you do:


 d2[[i]] <- file[[i]] - mean

 you are trying to subset from d2 (and from 'file'). If I do this:

  file[[2]]

 I get your error message:

Error in file[[2]] : object of type 'closure' is not subsettable

 So clearly you aren't doing what you think you're doing.

 Some hints:

1. Read a good introduction to R. You are making a number of
fundamental mistakes here.

2. Run each line separately and check what value you get back by printing it.

3. Don't give your objects the same name as R functions (you're using
'seq', 'file', and 'mean'). Although this may work, it will confuse
people later...
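
 A small sketch of what is going on, safe to run in a fresh session. The
variable names are made up, and it assumes you have no object of your own
called 'file':

```r
# With no object of your own called 'file', the name refers to R's function
file_is_function <- is.function(file)

# Subsetting a function is what triggers the 'closure' error; keep the message
msg <- tryCatch(file[[2]], error = function(e) conditionMessage(e))

# Hint 3: a variable called 'mean' does not break calls to mean(), because R
# skips non-functions when it looks up the name in a function call...
mean <- 11
avg <- mean(c(10, 11, 12))   # still calls the function
# ...but anyone reading 'f[[i]] - mean' has to work out which 'mean' is meant
rm(mean)
```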

Barry



Re: [R] Exchange NAs for mean

2009-12-17 Thread Barry Rowlingson
2009/12/17 Joel Fürstenberg-Hägg joel_furstenberg_h...@hotmail.com:

 Hi all,



 I have a matrix (X) with observations as rows and parameters as columns. 
 I'm trying to exchange all missing values in a column by the column mean 
 using the code below, but so far, nothing happens with the NAs... Can anyone 
 see where the problem is?



 N<-nrow(X) # Calculate number of rows = 108
 p<-ncol(X) # Calculate number of columns = 88


 # Replace by columnwise mean
 for (i in colnames(X)) # Do for all columns in the matrix
 {
   for (j in rownames(X)) # Go through all rows
   {
      if(is.na(X[j,i])) # Search for missing value in the given position
      {
         X[j,i]=mean(X[1:p, i]) # Change missing value to the mean of the column
      }
   }
 }



 mean(anything with an NA in it) == NA. You want mean(X[1:p,i],na.rm=TRUE)

  mean(c(1,2,3,NA,4))
 [1] NA
  mean(c(1,2,3,NA,4),na.rm=TRUE)
 [1] 2.5

I'll leave it to someone else to show you how to speed this code up by
removing the loops...
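
For the record, one loop-free way might look like the sketch below. The small
matrix is made up for illustration. Note that colMeans works over all the
rows, whereas X[1:p, i] in the quoted code only takes the first p rows (p is
the number of columns there), which is probably not what was intended.

```r
# Made-up matrix with one NA in each column
X <- matrix(c(1, NA, 3, 4,
              5, NA, 7, 8),
            nrow = 4, dimnames = list(NULL, c("a", "b")))

col_means <- colMeans(X, na.rm = TRUE)       # one mean per column, NAs ignored
na_pos <- which(is.na(X), arr.ind = TRUE)    # row/col index of every NA
X[na_pos] <- col_means[na_pos[, "col"]]      # fill each NA with its column mean
```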

Barry



Re: [R] R question type in Moodle

2009-12-15 Thread Barry Rowlingson
On Tue, Dec 15, 2009 at 8:05 PM, David Kane d...@kanecap.com wrote:
 Moodle (www.moodle.org) is an open source course management system, a
 competitor to Blackboard. I am writing several hundred R questions
 that will be used within the quiz module in Moodle. Unfortunately,
 Moodle does not have a built in question type for R. You can read
 about the different questions types in Moodle here:

 http://moodle.org/mod/data/view.php?d=13&perpage=40&search=Question+Type&sort=0&order=DESC&advanced=0&filter=1&advanced=1&f_44=&f_45=&f_46=Question+Type

 Note the Junit question type. Given that, it should be easy (?) to
 make an R question type. Here are the instructions:

 http://docs.moodle.org/en/Development:Question_type_plugin_how_to

 Question: Has anyone made an R question type for Moodle? If not, would
 anyone be interested in collaborating with me on the project?

 The tricky bit is going to be the security model. It seems that Java
can run in a security sandbox with the various -Djava.security
options. Otherwise how do you stop your students doing
'system("rm -rf /")'?

 My approach to testing programming code is that it should never be
run on the server - the server should send test case data, the client
(ie the user who wrote the thing) runs it, and returns the results.
The server then compares the results with what it thinks is the right
answer.
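
 In R terms the idea is roughly the following. This is only a sketch of the
principle, not the client I wrote; the test case and the function names are
invented for illustration.

```r
# Server side: a test case and the answer the server expects (made-up example)
test_input <- c(3, 1, 2)
expected   <- c(1, 2, 3)

# Client side: the student's own function, run on the student's machine
student_sort <- function(x) x[order(x)]
submitted <- student_sort(test_input)

# Server side again: only the returned result is inspected, never the code
passed <- isTRUE(all.equal(submitted, expected))
```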

 I have written such a client in R, as a proof of concept, but the
people interested in automatic testing then went on to say they
wanted the automatic system to decide if the code was good code style
or bad code style. At that point I decided they had unrealistic
expectations and got on with stuff I felt could be achieved in my
lifetime. They then decided humans had to see the code to grade its
style, and those same humans could check if there was any nasty code
(system("rm -rf /")) and run the examples themselves. Fair enough.

Barry



Re: [R] SAS datalines or cards statement equivalent in R?

2009-12-07 Thread Barry Rowlingson
On Mon, Dec 7, 2009 at 3:53 PM, Marshall Feldman ma...@uri.edu wrote:
 Regarding the various methods people have suggested, what if a typical
 tab-delimited data line looks like:

     SMS11001 1990 M01 688.0

 and the SAS INPUT statement is

   INPUT survey $ 1-2 seasonal $ 3 state $ 4-5 area $ 6-10 supersector $
 11-12 @13 industry $8. datatype $ 21-22  year period $ value footnote $ ;

 Note that most data lines have no footnote item, as in the sample.

 Here (I think) we'd want all the character variables to be read as factors,
 possibly year as a date, and value as numeric.

 Actually I'm surprised that nobody has yet said what a clearly
bonkers thing it is to mix up your data and your analysis code in a
single file. Now suppose you have another set of data you want to
analyse with the same code? Are you going to create a new file and
paste the new data in? You've now got two copies of your analysis code
- good luck keeping corrections to that code synchronised.

 This just seems like horrendously bad practice, which is one reason
it's kludgy in R. If it was good practice, someone would surely have
written a way to do it neatly.

 Keep your data in data files, and your functions in .R function
files. You'll thank me later.
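
 A toy version of that layout, as a sketch. The file contents and the
function name 'summarise_values' are made up, and temporary files stand in
for your real data and .R files:

```r
# A function file and two data files, written to temporary locations
code_file <- tempfile(fileext = ".R")
writeLines("summarise_values <- function(d) mean(d$value)", code_file)

data_a <- tempfile(fileext = ".csv")
data_b <- tempfile(fileext = ".csv")
write.csv(data.frame(value = c(1, 2, 3)), data_a, row.names = FALSE)
write.csv(data.frame(value = c(10, 20)), data_b, row.names = FALSE)

# One copy of the analysis code, applied to as many data files as you like
source(code_file)
results <- sapply(c(data_a, data_b),
                  function(f) summarise_values(read.csv(f)),
                  USE.NAMES = FALSE)
```

 Fix a bug in the function file and every analysis that sources it picks up
the fix.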

Barry



Re: [R] SAS datalines or cards statement equivalent in R?

2009-12-07 Thread Barry Rowlingson
On Mon, Dec 7, 2009 at 5:37 PM, Marshall Feldman ma...@uri.edu wrote:
 I totally agree with Barry, although it's sometimes convenient to include
 data with analysis code for debugging and/or documentation purposes.

 However, the example actually applies equally to separate data files. In
 fact, the example is from the U.S. Bureau of Labor Statistics at
 ftp://ftp.bls.gov/pub/time.series/sm/, which contains nothing but data and
 documentation files. At issue is not where the data come from, but rather
 how to parse relatively complex data organized inconsistently. SAS has
 built-in the ability to parse five different organizations of data: list
 (delimited), modified list, column, formatted, and mixed (see
 http://www.masil.org/sas/input.html). It seems R can parse such data, but
 only with considerable work by the user. It would be great to have a
 function/package that implements something as easy (hah!) and flexible
 as SAS.

 I'd love to duplicate this functionality of SAS, however, I fear:

http://www.sas.com/news/preleases/SASsuit.html

 but yes, some kind of declarative, template-driven data definition
system might be useful. Such a thing may already exist, and be based
on XML instead of what looks like line noise:

INPUT #1 No 7.0 #2 name $CHAR15. / address $CHAR50. #4 phone $CHAR12.;
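
 In the meantime, a column-style INPUT like the one quoted earlier in this
thread can be approximated in base R by reading the delimited fields and then
cutting up the first one by position. A sketch only: the record below is made
up (I don't have the real file to hand), the column names are mine, and the
optional footnote field is ignored.

```r
# Made-up record: a 22-character series id, then year, period and value
series <- paste0("SM", "S", "01", "00000", "00", "00000000", "01")
raw <- paste(series, "1990", "M01", "688.0", sep = "\t")

d <- read.delim(text = raw, header = FALSE,
                col.names  = c("series_id", "year", "period", "value"),
                colClasses = c("character", "integer", "character", "numeric"))

# Column positions copied from the quoted INPUT statement
fields <- c("survey", "seasonal", "state", "area",
            "supersector", "industry", "datatype")
starts <- c(1, 3, 4, 6, 11, 13, 21)
ends   <- c(2, 3, 5, 10, 12, 20, 22)

# Cut each field out of the id and store it as a factor
for (k in seq_along(fields)) {
  d[[fields[k]]] <- factor(substring(d$series_id, starts[k], ends[k]))
}
```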

Barry



Re: [R] SAS datalines or cards statement equivalent in R?

2009-12-07 Thread Barry Rowlingson
On Mon, Dec 7, 2009 at 8:54 PM, Marshall Feldman ma...@uri.edu wrote:


 Barry Rowlingson wrote:

  I'd love to duplicate this functionality of SAS, however, I fear:

 http://www.sas.com/news/preleases/SASsuit.html



 Amazing, since input statements in SAS bear an uncanny resemblance to how
 PL/I handles input from text files.


 The lawsuit appears to not be for copying behaviour of SAS per se,
but in using an Educational License version of SAS to test and
benchmark the WPS version.

Barry



Re: [R] Data Manipulation Question

2009-12-04 Thread Barry Rowlingson
On Thu, Dec 3, 2009 at 9:52 PM, John Filben johnfil...@yahoo.com wrote:
 Can R support data manipulation programming that is available in the SAS 
 datastep?  Specifically, can R support the following:
 -  Read multiple dataset one record at a time and compare values from 
 each; then base on if-then logic write to multiple output files
 -  Load a lookup table and then process a different file; based on 
 if-then logic, access and lookup values in the table
 -  Support modular “gosub” programming
 -  Sort files
 -  Date math and conversions
 -  Would it be able to support the following type of logic:
    -  Start
       -  Read Record from File 1
       -  Read Record from File 2
       -  Match
          -  If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
          -  If Key 1 = Key 2, Write to output file B
          -  If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
       -  Goto Start until File 1 Done
  John Filben

I'll expand on Hadley Wickham's "Yes" to say "Yes, and it wouldn't be
much of a 'system for statistical computation and graphics' if it
couldn't do that".

Remember R uses the 'S' and C programming languages and is Open
Source. If it _can't_ do something you want it to do, you can write
code that does it. Like the date math and conversions. Originally,
maybe way back in R version 0.something, it didn't have that. But
someone wrote it, and wisely contributed it, and the community saw
that it was good. And now we have date math and conversions. And
nobody has to write any date math or conversion codes ever again.

  Now tell me how to get something into the SAS core code.
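
To make the "Yes" a bit more concrete, here is a rough sketch of the matching
logic and some date math in base R. The two data frames and all the names are
made up. It also works on whole tables at once rather than record by record,
which is the more usual way to do it in R.

```r
file1 <- data.frame(key = c(1, 2, 4), x = c("a", "b", "c"))
file2 <- data.frame(key = c(2, 3, 4), y = c("p", "q", "r"))

only_in_1 <- file1[!(file1$key %in% file2$key), ]   # keys only in file 1
in_both   <- merge(file1, file2, by = "key")         # the 'Key 1 = Key 2' case
only_in_2 <- file2[!(file2$key %in% file1$key), ]   # keys only in file 2

# Date math: Date objects support arithmetic and differences directly
start <- as.Date("2009-12-04")
later <- start + 30
gap_days <- as.numeric(difftime(later, start, units = "days"))
```

Each of the three data frames can then go to its own output file with
write.csv or write.table.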

Barry

P.S. I see a very obvious optimisation you can do on this line:

  If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A

but maybe that's some kind of weird SASism



Re: [R] Loop doesn't work

2009-12-01 Thread Barry Rowlingson
On Tue, Dec 1, 2009 at 2:01 PM, Trafim rdapam...@gmail.com wrote:
 Hi everybody,

 I have the following problem, the following code seems to run only once for
 i and j and for k from one to M.
 Doesn't R for increase the argument by itself?

 for (i in 1:N){
  for (j in 1:(Tk-1)){
  if((XGrid[i] < Xk[j+1])&&(Xk[j] <= XGrid[i])){
        for (k in 1:M){
           if ((RBins[k]<=Rk[j+1])&&(Rk[j+1]<RBins[k+1])){
              GR[k] <- +1
           }
        }
  }
  }
 }


 Of course it does. Try this, which is something we call a complete
reproducible example:

N=10
for(i in 1:N){
 print(i)
}

How do we know your N isn't 1 and your Tk isn't 2?

Barry



Re: [R] How to display an image on RGL plot?

2009-11-26 Thread Barry Rowlingson
On Thu, Nov 26, 2009 at 7:14 AM, Vladimir Eremeev wl2...@gmail.com wrote:

 The underlying picture is a JPEG image, loaded with the rimage package and
 coerced to the matrix.
 Spheres denote control points, collected from this picture and must be
 situated over the certain points of the image.
 I display the image with rgl.points.
 In case of the standard video camera image (704x576) it has to display over
 40 points which is rather slow and memory consuming.
 How can I put an original JPEG on this plot?

 Another problem is that the picture is color initially, but was converted to
 the grayscale. I'd like to preserve colors.


 Try using surface3d with a flat surface (z=0) and a matrix of colours
in the col= argument. See help(surface3d) for an example.

 Also STOP POSTING!!! There seems to be about 8 copies of your message
in my inbox!!!

Barry



Re: [R] Concave hull

2009-11-26 Thread Barry Rowlingson
On Thu, Nov 26, 2009 at 9:45 PM, Ted Harding
ted.hard...@manchester.ac.uk wrote:

 So it is still an undefined solution. As is yours -- since you might
 want to use different radii of spheres from different directions.

 I think the formal and rigorous definition is "a nice polygon that goes
round my points".

Barry



Re: [R] Mysterious R script behavior when called from webserver

2009-11-25 Thread Barry Rowlingson
On Wed, Nov 25, 2009 at 8:15 PM, Dylan Beaudette
debeaude...@ucdavis.edu wrote:
 Hi,

 I am trying to transition a system based on dynamic image generation (via R)
 from our development system to a production environment. Our R script
 functions as expected when run by a regular user. However the script dies
 when calling the png() function, when started by the webserver user.

 Here are some details

sessionInfo()
 R version 2.9.2 (2009-08-24)
 i686-pc-linux-gnu
 locale:
 C
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 The script gets to this line:

 png(file=filename, width=600, height=400)

 and then dies. It leaves an empty PNG file where it should be, however it
 never finishes the file. If

 I replace png() with pdf() an output file is generated and closed by dev.off()
 as expected.

 It seems like the environment is setup just as when started by a regular user,
 specifically the LD_LIBRARY_PATH variable.

 This behavior suggests that R is encountering an error, and stopping. However
 there is no reporting of the error. Is there any way to get more verbose
 error reporting?

 How is R run from your web server? Does it start a new R process or
is it an apache module thing with a dynamically linked R (if such a
thing even exists)? Can't figure out how you could get more error
reporting without knowing that - you need to see where stderr is
going, possibly to the apache error.log file - have you looked there?

 Have you tried a trivial png generating example, just a three liner:

png(file="/tmp/wherever/foo.png")
plot(1:10)
dev.off()

 just in case it's something else previous in your script that's
breaking things.

 In the old days of R you needed an X11 display connection to do PNG
graphics, but that was fixed before 2.9, I think. Try it interactively
but unset the DISPLAY variable first:

 export -n DISPLAY
 R
   png(... etc etc)

 Does that work?
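
 If it doesn't, something like the sketch below might get the error out
into the open. It assumes the script is run as a fresh R process whose stderr
ends up somewhere you can read; the output path is just a temporary file.

```r
# Does this build of R think it can produce PNGs, and does it see an X11 display?
caps <- capabilities(c("png", "X11"))

out <- file.path(tempdir(), "foo.png")
result <- tryCatch({
  png(filename = out, width = 600, height = 400)
  plot(1:10)
  dev.off()
  "ok"
}, error = function(e) conditionMessage(e))

# message() writes to stderr, which is where a web server would usually log it
message("png capability: ", caps[["png"]], "; result: ", result)
```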

Barry



Re: [R] Concave hull

2009-11-25 Thread Barry Rowlingson
On Wed, Nov 25, 2009 at 11:39 PM, Remko Duursma remkoduur...@gmail.com wrote:
 See the function 'convhulln' in the 'geometry' package. It uses this
 algorithm : http://www.qhull.org/

 That looks like a CONVEX hull, the original poster asked about
CONCAVE hulls (and in all CAPS to emphasise this!).

 I've seen various algorithms for generating 'concave hulls' of point
sets, the one that tops google searches is not available in source
code but there is a web applet and set of java class files which
appear to be based on a patented algorithm.

 There's a lot of discussion on algorithms for this, and some
implementations by processing the point data with GRASS. The main
idea appears to be to first generate the convex hull and then
replace single edges with two edges based on some minimum or maximum
distance criteria...

 Couldn't find an implementation in R or Python though...

Barry



Re: [R] Natural colours for topographic data

2009-11-24 Thread Barry Rowlingson
On Tue, Nov 24, 2009 at 8:42 AM, Karl Ove Hufthammer k...@huftis.org wrote:
 On Mon, 23 Nov 2009 12:21:03 -0500 David Winsemius
 dwinsem...@comcast.net wrote:
  I would be happy with a simple one, that just mapped negative values
  to water colours and positive values to land colours.

 Searching with the strategy color positive negative zero in r-search
 and limiting it to r-help replies,  I get this Jim Lemon reply using
 (naturally) plotrix's color.scale:

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/90837.html

 The application to your needs looks pretty immediate.

 Thanks for the suggestion, but the arguments that 'color.scale' takes
 (range of red, green and blue values) makes it not very useful for this
 purpose.


 Have you tried my colourscheme package? It's not on CRAN but you can
get it from here:

http://r-forge.r-project.org/projects/colourscheme/

And can be installed thus:
  install.packages("colourschemes",repos="http://r-forge.r-project.org")

 [sorry about the inconsistency between 'colourscheme' and 'colourschemes'!]

Vignette via r-forge's source code browser:
http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/pkg/inst/doc/colourschemes.pdf?rev=19&root=colourscheme

 It defines various ways of mapping values to colours and using those
colours in plots. I use topographic-style colour schemes in the
examples, so this might be just what you want. The example in
?multiRamp is this:

# topological colour scheme - water, land, ice:
 tramp = multiRamp(rbind(c(-2000,0),c(0,1000),c(1000,9000)),
  list(c("black","blue"),c("green","brown"),c("gray70","gray70"))
  )

then:

  tramp(-100)
 [1] #F2FF

 - is  a colour between black and blue in the ocean

 tramp(500)
 [1] #539515FF

 - is somewhere between green and brown in the land

  tramp(1500)
 [1] #B3B3B3FF

 - is the gray of the ice.

No logarithmic colour scaling, but I do detail in the vignette how to
write your own colour scheme functions that are compatible with the
ones supplied. Will be glad to help more on this!
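
If you'd rather not install anything, a similar piecewise mapping can be
sketched with base R's colorRamp. This is not the package's code, just an
approximation of the same idea; 'height_colour' is a made-up helper and it
expects a single height between -2000 and 9000.

```r
ocean <- colorRamp(c("black", "blue"))    # maps 0..1 onto black..blue
land  <- colorRamp(c("green", "brown"))   # maps 0..1 onto green..brown

# Hypothetical helper: pick the ramp by height band, rescale to 0..1
height_colour <- function(h) {
  if (h < 0) {
    m <- ocean((h + 2000) / 2000)
  } else if (h < 1000) {
    m <- land(h / 1000)
  } else {
    m <- t(col2rgb("gray70"))             # constant colour for the ice
  }
  rgb(m, maxColorValue = 255)
}

cols <- sapply(c(-100, 500, 1500), height_colour)
```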

Barry



Re: [R] how do i persuade IT to install R on PCs ?? ...and should I ??

2009-11-23 Thread Barry Rowlingson
On Mon, Nov 23, 2009 at 1:01 PM, David Winsemius dwinsem...@comcast.net wrote:

 It was a good read. We had a recent example submitted to r-help where
 I had occasion to test their solution 2 (use OO.org's Calc) and found
 it to be just as bad at curve fitting for a polynomial as had been
 Excel. Take a look at the pdf attached to this r-help item:

 https://stat.ethz.ch/pipermail/r-help/2009-November/218005.html

 Well, the OO.org guys are trying to make something compatible with a
piece of MS software, but maybe this is taking it too far.

 I love the story that sections of the Microsoft XML spec for Office say
things like "Do whatever Excel does", thus setting into stone the bugs
inherent in that package as an ISO standard...

Barry



Re: [R] Do you keep an archive of useful R code? and if so - how?

2009-11-22 Thread Barry Rowlingson
On Sun, Nov 22, 2009 at 5:45 PM, Tal Galili tal.gal...@gmail.com wrote:
 Hello Marc and Jeff,
 Thank you for replying.

 I am using winXP, and any recommendation for GUI based system will be
 welcomed.

 However, my initial question was not how to maintain code that I write
 and develop, but rather how to keep a filing system for other peoples code
 that I find useful.
 Here are some simple examples:

   - A code to allow me to start a window with history recording turned
   on.
   - A code to have wider margins so to allow more space for the plot
   labels.
   - A code for creating an ellipse plot of a matrix of correlations.

 All of these example are things I wouldn't put into a Subversion system or a
 new package.

I just use plain text files for keeping notes - generally each project
directory I work on has a 'notes.txt' file which is a working log of
what I'm doing. If I think 'how did I do that the other day?' I can
search my text files.

 Recently I've been experimenting with using 'personal' or 'desktop'
wiki systems for this. Like Wikipedia but just for you, and stored as
files on your PC, and edited with a local client program instead of
over the web (although some personal wikis work over the web). I've
found 'zim' to be pretty good for this. It organises notes, lets you
link pages, timestamps things, has various plugins and MOST
importantly it's Open Source so you won't ever have your notes locked
up in a proprietary format that you need to keep paying a license fee
for.

 Not sure if there's a Windows port of it, but I'm certain similar
systems exist for Windows.

 Another idea is to have a public blog for R tips and tricks. That way
not only do you get free storage (from blogspot.com or some other blog
provider) but also it's searchable and other people can find it and
comment and improve on it.

 Or you could contribute to the R-wiki:

http://wiki.r-project.org/rwiki/doku.php?id=tips:tips

Barry



Re: [R] python

2009-11-21 Thread Barry Rowlingson
On Sat, Nov 21, 2009 at 2:29 PM, Jean Legeande jean.legea...@gmail.com wrote:
 Dear R users,

 I would like to make my R code for MCMC faster. It is possible to integrate
 C code into R but I think C is too complicated for me. I would need a C
 introduction only for MCMC and I do not know if such a thing exists.

 I was thinking of Python (and scipy). Where could I read about its
 integration into R ? How developed are the statistical packages in Python ?
 I could not find a Python package on the web with functions to simulate
 Wishart, or multivariate gamma or student distributions.

 Since I am a little bit lost, I write this message to the R help list. Sorry
 for these naive questions and thanks for your help.


 Have you done a profile of your MCMC code to see where the bottleneck
is? Without doing that first any effort could be a total waste of
time.

 R can do a lot of its calculations at the same level as C, so if 80%
of your time is spent inverting matrices then converting to Python or
C (or even assembly language) isn't going to help much since R's
matrix inversion is done using C code (and quite possibly very
optimised C code with maybe some assembly language too).

 So do a profile (see ?Rprof) and work out the bottleneck. It might be
one of your functions, in which case just re-writing that in C and
linking to R (see programmers guide and a good C book) will do the
job.
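
 A minimal profiling session might look like the sketch below. The loop is
only a stand-in for your MCMC code (I haven't seen it), and the matrix sizes
are arbitrary.

```r
prof_file <- tempfile()

Rprof(prof_file)                       # start sampling the call stack
for (i in 1:300) {
  a <- matrix(rnorm(100 * 100), 100) + diag(100) * 100  # well-conditioned matrix
  inv <- solve(a)                      # stand-in for the expensive step
}
Rprof(NULL)                            # stop profiling

# by.self ranks functions by time spent in their own code;
# a very short run may record no samples, hence the tryCatch
prof <- tryCatch(summaryRprof(prof_file)$by.self,
                 error = function(e) NULL)
```

 Whatever sits at the top of that table is where any rewriting effort
should go first.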

 My hunch is that Python and R run at about the same speed, and both
use C libraries for speedups (Python primarily via the numpy package).

 You can call the GSL from Python, and there are probably tricks for
getting the distributions you want:

http://www.mailinglistarchive.com/help-...@gnu.org/msg00096.html

 describes how to get samples from a Wishart.

 However using the GSL from Python probably won't be much faster than
using R because again it's all at the C level already. Did I suggest
you profile your code?

Barry



Re: [R] can R scripts detect signals sent by the task scheduler ?

2009-11-20 Thread Barry Rowlingson
On Fri, Nov 20, 2009 at 1:25 PM,  mau...@alice.it wrote:
 In general, is it possible to run R scripts through cron jobs ?

 Yes, the only problem might be if you use anything that needs a
graphics window. In the old days you needed an X11 display to create
png graphics with the png() function, but not any more. I'm not sure
if you need an X11 system for anything apart from showing graphs. But I
don't know everything.

 Is it possible to make  the script detect the system interrupt, save its 
 current status and then exit so that next time it is rescheduled it can pick 
 up from where it left ?

?Signals:

Interrupting Execution of R
Description:
 On receiving ‘SIGUSR1’ R will save the workspace and quit.
 ‘SIGUSR2’ has the same result except that the ‘.Last’ function and
 ‘on.exit’ expressions will not be called.

 So if you can send that signal to your process you might be in luck.
However if by 'system interrupt' you mean the SIGQUIT signal, then I'm
not sure that R can trap interrupts to that extent. You might want to
see if your operating system supports 'checkpointing', in which case
any process can be saved and restarted.

 If by 'system interrupt' you mean SIGKILL (signal 9) then you are very stuck.

 What is going on in your system? Are you trying to checkpoint if the
machine is shut down?
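
 If trapping signals turns out to be a dead end, a different approach (not
what ?Signals describes) is for the script to save its own state as it goes
and pick it up on the next run. A sketch, with a made-up function
'run_chunk' and a trivial running sum standing in for the real work:

```r
# Do at most max_steps units of work per run, saving state after each one
run_chunk <- function(checkpoint, total = 10, max_steps = 4) {
  state <- if (file.exists(checkpoint)) {
    readRDS(checkpoint)                # resume from the last saved state
  } else {
    list(i = 0, acc = 0)               # first run: start from scratch
  }
  steps <- 0
  while (state$i < total && steps < max_steps) {
    state$i   <- state$i + 1
    state$acc <- state$acc + state$i   # the "work": a running sum
    saveRDS(state, checkpoint)         # save after every unit of work
    steps <- steps + 1
  }
  state
}

checkpoint <- tempfile(fileext = ".rds")
first  <- run_chunk(checkpoint)        # as if run by cron once
second <- run_chunk(checkpoint)        # next scheduled run carries on
third  <- run_chunk(checkpoint)        # finishes the remaining steps
```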

 Barry



Re: [R] can R scripts detect signals sent by the task scheduler ?

2009-11-20 Thread Barry Rowlingson
On Fri, Nov 20, 2009 at 11:01 PM,  mau...@alice.it wrote:
 I have just updated R version to 2.10 for Windows.
 I cannot find package fork which seems to include  the exception handling
 functions.
 The list that pops up when I select Install Package does not contain any
 fork package (even spelt
 with capital letters).  Where am I supposed to get it from ?

 I suspect there's no such thing for Windows. Process control and
signalling is something very different between operating systems, so
someone needs to write a dedicated Windows version of that package.

 See here:

http://ftp.heanet.ie/mirrors/cran.r-project.org/web/packages/fork/index.html

Barry



Re: [R] parsing Google search results

2009-11-17 Thread Barry Rowlingson
On Mon, Nov 16, 2009 at 7:29 PM, Philip Leifeld leif...@coll.mpg.de wrote:
 Hi,

 how can I parse Google search results? The following code returns
 integer(0) instead of 1 although the results of the query clearly
 contain the regex cran.

 
 address <- url("http://www.google.com/search?q=cran")
 open(address)
 lines <- readLines(address)
 grep("cran", lines[3])
 

 Hmmm how could that be? It's not like you're getting any warnings or
anything...

 Or are you? I get a couple:

  address <- url("http://www.google.com/search?q=cran")
  open(address)
  lines <- readLines(address)
 Warning message:
 In readLines(address) :
   incomplete final line found on 'http://www.google.com/search?q=cran'

 - but that's probably because there's no newline at the end of the
data. Ignore that.

  grep("cran",lines[3])
 integer(0)
 Warning message:
 In grep("cran", lines[3]) : input string 1 is invalid in this locale

 Oh now that looks serious. And relevant. Did you get this warning?
You didn't say. I'll assume you didn't, because otherwise you surely
would have mentioned it. So I won't waste my time typing my solution
in now.

 Oh alright. You may need to set the encoding when you open the url to 'latin1':

  address <- url("http://www.google.com/search?q=cran",encoding="latin1")

  grep("cran",lines[3])
 [1] 1

So is that the problem? Did you get the warning message and not show
us? Transcripts (inputs and outputs) are good.
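
The same effect can be seen without going near Google. The sketch below
builds a string from raw latin1 bytes (made up for the purpose), which is not
valid UTF-8 as it stands, and converts it explicitly with iconv before
searching:

```r
# Bytes for "cran " followed by e-acute in latin1; not valid UTF-8 as they stand
line_raw <- rawToChar(as.raw(c(0x63, 0x72, 0x61, 0x6e, 0x20, 0xe9)))

# Declare what the bytes really are and convert, then search as usual
line_utf8 <- iconv(line_raw, from = "latin1", to = "UTF-8")
hit <- grep("cran", line_utf8)
```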

Barry



Re: [R] R crashing

2009-11-15 Thread Barry Rowlingson
On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman dimitri...@gmail.com wrote:
 Hello,

 This is what I am trying to do: I wrote a little function that takes
 addresses (coordinates) as input, and returns the road distance between
 every two points using Google Maps. Catch is, there are 2000 addresses, so I
 have to get around 2x10^6 addresses. On my first go, this is what I did:

 I hope on your first go you didn't run it with 2000 addresses. You
did test it with 13 addresses first didn't you?

 Another idea is to replace your Distance function with a function
that returns runif(1). This will either make your code fail much much
quicker or identify that the problem is in the Distance function (some
memory leak there).

 Also, you should check the return value from your google query - I've
seen google get a bit upset about repeated automated queries and
return a message saying "This looks like an automated query" and a
CAPTCHA test.


 grid2=grid[!is.na(grid)]
 n = length(grid2)
 for (i in 1:n) {
 temp = Distances(grid2[i])
 write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
 }

This won't work - you're overwriting distances.csv with the new value
of 'temp' every time. Another good reason to test with 13 values
before waiting and failing after six hours, and then having to hammer
google's map server again.

I'd write this as a simple loop, and dump all the apply stuff. And
rewrite Distance to be a function of two lat-longs:

Distance=function(lat1,lon1,lat2,lon2){

return(distance)
}

Then (untested):

Dmat = matrix(NA,nrow(X),nrow(X))

for(i in 2:nrow(X)){
 for(j in 1:i){
  d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
  Dmat[i,j]=d
}
}

 I'm not sure apply wins much here.

Barry



Re: [R] R crashing

2009-11-15 Thread Barry Rowlingson
On Sun, Nov 15, 2009 at 11:57 AM, Dimitri Szerman dimitri...@gmail.com wrote:


 Thanks. The reason I didn't want to do something like that is because, in
 the event of a crash, I'll lose everything that was done. That's why I
 thought of appending the results often.

 Oops yes, I missed the 'append=TRUE' flag. That's a good idea.

 Last time I did something similar to this I used a relational
database for saving. I created a table of all the i,j pairs with
columns i,j,distance and 'ok'. 'ok' was set to False initially. Then
I'd query the db for a row with 'ok=False', and go about getting the
distance. If I got a good distance back I set 'ok=True' and never
bothered getting that again.

  This was in Python with SQLite as the database engine, but you can
do something similar in R. With a distributed database you could
easily split the queries between as many servers as you can get your
hands on.
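
The same bookkeeping can be sketched in plain R with a CSV file standing in for the database table. This is only an illustration of the idea: get_distance() is a hypothetical placeholder for the real Google Maps lookup, and the file location and the run_jobs() helper are assumptions.

```r
# Hypothetical stand-in for the real distance lookup.
get_distance <- function(i, j) abs(i - j)

jobs_file <- tempfile(fileext = ".csv")   # assumed location of the job table

# Build the table of i,j pairs once, with every row marked not done.
n <- 4
jobs <- subset(expand.grid(i = 1:n, j = 1:n), i > j)
jobs$distance <- NA_real_
jobs$ok <- FALSE
write.csv(jobs, jobs_file, row.names = FALSE)

# Work through the rows that are not ok yet; safe to rerun after a crash.
run_jobs <- function(path, max_rows = Inf) {
  jobs <- read.csv(path)
  todo <- which(!jobs$ok)
  if (length(todo) > max_rows) todo <- todo[seq_len(max_rows)]
  for (k in todo) {
    d <- try(get_distance(jobs$i[k], jobs$j[k]), silent = TRUE)
    if (!inherits(d, "try-error")) {
      jobs$distance[k] <- d
      jobs$ok[k] <- TRUE
    }
    write.csv(jobs, path, row.names = FALSE)  # save progress after each row
  }
  invisible(jobs)
}

run_jobs(jobs_file, max_rows = 2)   # pretend we were interrupted here
done <- run_jobs(jobs_file)         # second run picks up the remaining rows
```

Rewriting the whole file after every row is wasteful for two million pairs; a real database updates one row at a time, which is the point of using one.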

 Barry



Re: [R] R, NIH and FDA

2009-11-13 Thread Barry Rowlingson
On Fri, Nov 13, 2009 at 11:21 AM, Federico Calboli
f.calb...@imperial.ac.uk wrote:
 Dear All,

 I will soon be working with NIH and possibly FDA. Will I be able to use R or
 will I be forced to use SAS?

 "Working with" is different to "Working for". Assuming they want to
work with you then they want you for your abilities and skills, and if
those skills are with R then you go ahead and use R.

 You don't employ a bricklayer to build a wall and then tell them you
want it made from reinforced concrete.

Barry



Re: [R] polygon kills X-server

2009-11-10 Thread Barry Rowlingson
2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:

 This one is extraordinarily dangerous: it also killed my kind of X server
 called Windows completely so that I had to reset the machine.

 Perhaps it should be debugged on the Linux side with less serious side
 effects

 The best way to debug this might be via a nested X server - something
like Xnest or Xephyr - or possibly on a virtual machine in the hope
that the virtual machine's display system crashes before the host's
does.

 I haven't tried this yet!

Barry



Re: [R] which data structure to choose to keep multile objects?

2009-11-07 Thread Barry Rowlingson
On Fri, Nov 6, 2009 at 11:58 PM, clue_less suhai_tim_...@yahoo.com wrote:

 I have a function called nnmf which takes in one matrix and  returns two
 matrices. for example,

 X
     [,1] [,2] [,3] [,4]
 [1,]    1    4    7   10
 [2,]    2    5    8   11
 [3,]    3    6    9   12

 z=nnmf(X,2)

 z$W
          [,1]      [,2]
 [1,] 0.8645422 0.6643681
 [2,] 1.7411863 0.5377504
 [3,] 2.6179287 0.4111063
 z$H
           [,1]     [,2]     [,3]     [,4]
 [1,] 1.14299486 1.692260 2.241279  2.79030
 [2,] 0.01838514 3.818559 7.619719 11.42087

 

 Now I would like to run it many times --

 z2 = nnmf(X,2)
 z3 = nnmf(X,3)
 z4 = nnmf(X,4)
 z5 = nnmf(X,5)
 ...

 But I would like to do it automatically , something like -

 xprocess <- function(max_val) {
   for (iter in  2: max_val) {

      zz = sprintf("z%s", iter)

      zz <- nnmf(X,iter)

   }

 }

 xprocess(10)


 

 But how could I keep collection of my results each run?

 Shall I have a data structure to keep appending results?

 something like

 theta = {}

 ?

 which data structure to choose to keep multiple objects?


 You're already using one! It's called a list:

zz=list()
 for(i in 1:10){
  zz[[i]] = nnmf(X,i)
}

then you can do:

zz[[1]]$W and zz[[1]]$H

 Note the BIG difference between zz[1] and zz[[1]] though.
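
To make that difference concrete, here is a small sketch. The list contents are made-up stand-ins for what nnmf() returns; only the bracket behaviour matters.

```r
# Made-up stand-in for one nnmf() result stored in a list.
zz <- list(list(W = matrix(1, 2, 2), H = matrix(2, 2, 2)))

one_bracket <- zz[1]     # still a list of length 1, wrapping the result
two_bracket <- zz[[1]]   # the result itself

class(one_bracket)       # "list"
names(two_bracket)       # "W" "H"
is.null(one_bracket$W)   # TRUE: there is no element called W at this level
dim(two_bracket$W)       # 2 2
```

So zz[1]$W quietly gives NULL, while zz[[1]]$W gives the matrix.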

Barry



Re: [R] read.table but more tables at once

2009-10-28 Thread Barry Rowlingson
On Wed, Oct 28, 2009 at 9:38 AM, Sybille Wendel
wendel.sybi...@googlemail.com wrote:
 Dear all,

 I have a lot of data files (.txt) that I want to read in all at once, if
 possible.
 the files have names in time system. for example: RA940101, RA940102,
 RA940103, RA940104 and so on.
 (meaning: RA, year: 94, month: here january, day of the month.)

 I tried something like

 vektor <- c("RA940101","RA940102","RA940103")

 for (x in 1:3)
 { data <- read.table(paste(vektor[x],sep=""),header=F) }

 But how can I put the vektor on the left side, so that data would be instead
 of data the three first days of the year 1994?

 Store in a list:

data = list()
 for(x  in 1:3){
   data[[vektor[x]]] = read.table(...)
}

 then you can do data[["RA940101"]] to get that set of data.

 You can also do this by number:

 data[[x]] = read.table()

 and then get data[[1]], data[[2]] etc etc.

See any basic R help/tutorial for more information about 'lists'.
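
A self-contained sketch of the list approach, for what it's worth. The scratch directory and the tiny files written into it are made up so that the example runs anywhere; real use would point at the existing RA* files instead.

```r
# Create three small scratch files so the sketch is self-contained.
dir <- tempfile()
dir.create(dir)
vektor <- c("RA940101", "RA940102", "RA940103")
for (v in vektor) {
  write.table(data.frame(a = 1:2, b = c(0.5, 1.5)),
              file.path(dir, v), row.names = FALSE, col.names = FALSE)
}

# Read every file and name each list element after its file.
data <- lapply(file.path(dir, vektor), read.table, header = FALSE)
names(data) <- vektor

data[["RA940102"]]   # by name
data[[1]]            # by position
```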

Barry



Re: [R] regression on large file

2009-10-28 Thread Barry Rowlingson
On Wed, Oct 28, 2009 at 11:50 AM, Georg Ehret georgeh...@gmail.com wrote:
 Dear R community,
   I have a fairly large file with variables in rows. Every variable
 (thousands) needs to be regressed on a reference variable. The file is too
 big to load into R (or R gets too slow having done it) and I do now read in
 line by line with scan (see below) and write the results to out. Although
 improved, this is still very slow... Can someone please help me and suggest
 how I can make this faster?

 Thank you and best regards, Georg.
 ***
 Georg Ehret, Johns Hopkins U, Baltimore MD, USA


 for (i in 16:nmax){

 line<-scan(file=paste(file),nlines=1,skip=(i-1),what="integer",sep=",")
        d<-as.numeric(line[-1])
        name<-line[1]
        modela <- lm(s1~a+a2+b+s+M+W)
        modelb <- lm(s2~a+a2+b+s+M+W+d)
        modelc <- lm(s3~a+2+b+s+M+W+d+d*s)
        p_main <- anova(modela,modelb)$P[2]
        p_main_i <- anova(modela,modelc)$P[2]
        p_i <- anova(modelb,modelc)$P[2]

 cat(c(name,p_main,p_main_i,p_i),file=paste(out,".txt",sep=""),append=T)
        cat("\n",file=paste(out,".txt",sep=""),append=T)
 }

 Normally you shouldn't try to optimise something until you know where
the time is going. It could be that fitting your three linear models
is taking most time, in which case there's no point optimising the
input/output...

 But I reckon (and this is a guess) the time is taken by the fact that
scan() is having to skip from the start every time. You can confirm
this by commenting out all the stuff inside the loop except for the
line<-scan(...) line. If this still takes ages then we've found the
bottleneck.

 So, what you then do to fix that is to get R to read from a
connection - this is an object that you can read from sequentially
without having to skip from the start every time. There's examples in
help(connections) that will get you going.
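
A minimal sketch of the connection idea, assuming a comma-separated file with the variable name first on each line. The scratch file and the sum() placeholder are made up; the model fitting would go where the placeholder is.

```r
# Scratch file standing in for the big data file: name, then values.
path <- tempfile()
writeLines(c("v1,1,2,3", "v2,4,5,6", "v3,7,8,9"), path)

con <- file(path, open = "r")   # open once; each read continues where the last stopped
results <- list()
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0) break                         # end of file
  fields <- strsplit(line, ",")[[1]]
  results[[fields[1]]] <- sum(as.numeric(fields[-1]))  # placeholder for the lm() work
}
close(con)
```

Because the connection stays open, nothing is re-read from the top of the file on each pass.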


Barry



Re: [R] User input when running R code in batch mode

2009-10-27 Thread Barry Rowlingson
On Tue, Oct 27, 2009 at 9:20 AM, Kaushik Krishnan
kaushik.s.krish...@gmail.com wrote:

 $ r --vanilla < test.r
 a <- scan(what='character',n=1); a
 1: Read 0 items
 character(0)
 
 Now it's not working.

 Assuming this is a unix environment, the syntax '< test.r' means 'my
standard input stream is the file test.r'. That's not what you want.
Give R the file name as an argument and let the standard input stream
remain user input:

$ r --vanilla test.r
1: hello
Read 1 item
[1] "hello"

 Note that this is 'r' and not 'R'. For me this comes from the
'littler' package in Ubuntu Linux. The same thing with 'R' doesn't
work:

$ R --vanilla test.r
ARGUMENT 'test.r' __ignored__
[banner]
   [the R prompt appears]

 Maybe there's a way of doing this with big R, but I think littler is
designed for this kind of thing.

 Is there any way to make R stop for the user to enter values when
 running in batch mode either by changing the way I invoke scan() or
 readLines() or by using any other function?

 An alternative is to use the tcltk package to make a dialog for user input.
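
One more possibility with big R, offered as a sketch rather than something tested on every platform: read from file("stdin"), which refers to the process's standard input rather than to the script. The ask() helper is a made-up name, and the text connection only stands in for a terminal so that the example runs unattended.

```r
# Read one line of user input from a connection. The default, file("stdin"),
# is the process's standard input, not the script file.
ask <- function(prompt, con = file("stdin")) {
  cat(prompt)
  readLines(con, n = 1)
}

# Non-interactive demonstration: a text connection in place of a terminal.
demo_con <- textConnection("hello")
answer <- ask("Enter a value: ", con = demo_con)
close(demo_con)
```

Saved in test.r and started with 'Rscript test.r', a plain ask("Enter a value: ") should then wait for the user to type a line.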

Barry



Re: [R] Datasets for The Statistical Sleuth

2009-10-25 Thread Barry Rowlingson
On Sun, Oct 25, 2009 at 5:48 AM, Yihui Xie xieyi...@gmail.com wrote:
 Hi everyone,

 I wonder if there already exists any R packages containing all the
 data sets for the book The Statistical Sleuth
 (http://www.proaxis.com/~panorama/home.htm; also available at StatLib
 http://lib.stat.cmu.edu/datasets/sleuth).

 I'm writing an R package with a friend for one of our stat courses
 where SAS is the main tool being used. As the time is limited and half
 of the semester has gone, we want to finish the package ASAP before
 the biased (my personal feeling) impression towards R comes up. It
 will save us some time (especially the time on writing R
 documentation) if anyone has already done the work of packing up all
 the data sets. Thanks a lot!

 You should be able to read the spss versions of the data files using
'read.spss' from the foreign package. I've just read in all the .sav
files from the 2nd edition data sets with no errors.

 Probably all you then need to do is convert them to data frames and
save them as a .RData file which your students can attach. Actually
it's turning out quicker for me to do this than to tell you how :)

 Get the spss.exe, unzip it to create a load of .sav files, install
the 'foreign' package if you don't have it already, then do this in R:

require(foreign)
e=new.env()
for(f in list.files(pattern="\\.sav$")){
  name = sub("\\.sav$","",f)
  data = as.data.frame(read.spss(f))
  assign(name,data,envir=e)
}
save(file="statsleuth.RData",list=ls(e),envir=e)

Then to test start a new R session and do:

  attach("statsleuth.RData")
  summary(ex1611)
      COUNTRY      PCTCATH         P2PRATIO       PCTINDIG
 Argentina: 1   Min.   : 1.20   Min.   : 0.9   Min.   : 13.00
 Australia: 1   1st Qu.:28.60   1st Qu.: 1.8   1st Qu.: 58.50
 Bolivia  : 1   Median :82.10   Median : 3.8   Median : 76.00
 Brazil   : 1   Mean   :63.74   Mean   : 5.1   Mean   : 70.53
 Chile    : 1   3rd Qu.:95.50   3rd Qu.: 8.3   3rd Qu.: 92.00
 Ecuador  : 1   Max.   :97.60   Max.   :11.9   Max.   :100.00
 (Other)  :15  NA's   :  2.00

  ls("file:statsleuth.RData")
  [1] "case0101" "case0102" "case0201" "case0202" "case0301" "case0302"
  [7] "case0401" "case0402" "case0501" "case0502" "case0601" "case0602"
 [13] "case0701" "case0702" "case0801" "case0802" "case0901" "case0902"
[etc etc etc etc]

 My only worry is whether all the data sets convert to data frames
okay, and nothing is lost in the conversion. It's possible that SPSS
has all sorts of other metadata that is dropped, or something. I'd
suggest you check all 140 data sets first...
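
A rough way to run that check over everything at once, assuming the data sets sit in an environment as in the conversion loop. The two objects created here are made-up stand-ins, not the real Sleuth data.

```r
# Made-up stand-ins for the converted data sets.
e <- new.env()
assign("case0101", data.frame(x = 1:3), envir = e)
assign("ex1611", data.frame(COUNTRY = c("Chile", "Bolivia"),
                            PCTCATH = c(1.2, 97.6)), envir = e)

# One row per data set: is it a data frame, and how big is it?
check <- do.call(rbind, lapply(ls(e), function(nm) {
  obj <- get(nm, envir = e)
  data.frame(name = nm, is_df = is.data.frame(obj),
             rows = NROW(obj), cols = NCOL(obj))
}))

check[!check$is_df | check$rows == 0, ]   # anything suspicious shows up here
```

This only catches gross failures; lost SPSS metadata such as value labels would still need a look by eye.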

Barry



Re: [R] arima crashes too

2009-10-23 Thread Barry Rowlingson
On Fri, Oct 23, 2009 at 12:32 PM, Alberto Monteiro
albm...@centroin.com.br wrote:

 I mean that, if I run a loop, it doesn't finish. Or, more
 catastrophically, if I am running a loop and saving data to an
 open file, it terminates the loop and does not close the file.

 Reproducible example:

 test.arima <- function() {
  lets.crash.arima <- c(71, 78, 95, 59) # , 113
  for (x in 90:120) {
    reg <- arima(c(lets.crash.arima, x), order = c(1,0,0))
    cat("ok for x =", x, "\n")
  }
  cat("close file and prepare a nice summary\n")
  return("arima passed the test")
 }

 test.arima()

 As you can see, the loop aborts, the function never returns, with
 potentially nasty effects (namely: I have to finish R with q() to
 close the files and examine them).

 If you're doing anything in a loop that has the potential to fail
because of singularities or other conditions when your model can't be
fitted, you need to stick what you are doing in a 'try' clause. This
lets you trap errors and do something with them.

 Plenty of examples in help(try) or this from me:

 for(i in 1:10){
 print(solve(matrix(c(3,3,3,i),2,2)))
 }

 This stops the loop at i=3. Now stick it in a try() clause:

 for(i in 1:10){
 print(try(solve(matrix(c(3,3,3,i),2,2))))
 }

 and it gives a warning and carries on. If you want your code to do
something with the failure cases then the help for try() tells you
what to look for.
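
For the record, the thing to look for is an object of class "try-error". A small sketch using the same matrix example; the results and failed variables are just illustrative names.

```r
results <- vector("list", 5)
failed <- logical(5)
for (i in 1:5) {
  res <- try(solve(matrix(c(3, 3, 3, i), 2, 2)), silent = TRUE)
  if (inherits(res, "try-error")) {
    failed[i] <- TRUE          # record the failure and carry on
  } else {
    results[[i]] <- res
  }
}
which(failed)   # only i = 3 gives a singular matrix
```

The same pattern wrapped round the arima() call would let the loop finish and the file get closed.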

 I'm not sure why your arima produces an error, but I'm assuming the
numbers are such that the model can't be fitted. I don't really know
what arima is doing.

Barry



Re: [R] Determining which file is newer

2009-10-23 Thread Barry Rowlingson
On Fri, Oct 23, 2009 at 5:02 PM, Dennis Fisher fis...@plessthan.com wrote:
 Colleagues,

 I wish to execute a task only if a particular file is newer than a second
 file.  I can access the file modification dates using file.info()$mtime.
 This yields a time object (? POSIX).
                        size    isdir   mode    mtime
           ctime
        Testfile        4421    FALSE   755     2009-10-23 08:59:09
   2009-10-23 08:59:09
 What is the simplest means to compare these two objects?

 Split the string representation of the object into date and time
parts, then split those parts into day, month , year and hour minute
second, taking care of internationalisation and localisation, then
recombine all those parts into a number of seconds since Jan 1 1970,
taking care of leap seconds and the slowing down of the earth due to
the tides of the moon.

 Or just use the usual operators:

  file.info("/etc/passwd")$mtime <= file.info("/etc/motd")$mtime
[1] TRUE
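
A self-contained version of the same comparison, as a sketch. The two scratch files and the times forced onto them are made up so that the answer is predictable.

```r
old_file <- tempfile()
new_file <- tempfile()
invisible(file.create(old_file, new_file))

# Force known modification times so the comparison is predictable.
Sys.setFileTime(old_file, as.POSIXct("2009-10-23 08:59:09", tz = "UTC"))
Sys.setFileTime(new_file, as.POSIXct("2009-10-23 09:30:00", tz = "UTC"))

is_newer <- file.info(new_file)$mtime > file.info(old_file)$mtime
also_newer <- utils::file_test("-nt", new_file, old_file)  # same question, shell-style
if (is_newer) message("new_file is newer, run the task")
```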

Barry



Re: [R] arima crashes too

2009-10-22 Thread Barry Rowlingson
On Thu, Oct 22, 2009 at 6:19 PM, Alberto Monteiro
albm...@centroin.com.br wrote:
 Another pathological test.

 arima does not crash for that series that crashes arma:

 arima(c(2.01, 2.22, 2.09, 2.17, 2.42), order=c(1,0,0))

 However, arima crashes for this:

 arima(c(1.71, 1.78, 1.95, 1.59, 2.13), order=c(1,0,0))

 arima seems pretty consistent in its crashing behaviour, since crashing for
 one series means crashing for all affine series:

 I'm not getting what I'd call 'crashes' with your arma or arima
examples- I get an error message and a warning:

  arma(c(2.01, 2.22, 2.09, 2.17, 2.42), order=c(1,0))
Error in AA %*% t(X) : requires numeric/complex matrix/vector arguments
In addition: Warning message:
In ar.ols(x, order.max = k, aic = FALSE, demean = FALSE, intercept =
include.intercept) :
  model order: 2singularities in the computation of the projection
matrixresults are only valid up to model order1

 You've not told us what you get, and the phrase 'crash' normally
means some kind of memory error that *terminates* a running R session.
Are you really crashing R such that it terminates? In which case, what
version number/platform etc?

Barry



Re: [R] PDF too large, PNG bad quality

2009-10-22 Thread Barry Rowlingson
On Thu, Oct 22, 2009 at 8:28 PM, Greg Snow greg.s...@imail.org wrote:
 The problem with the pdf files is that they are storing the information for 
 every one of your points, even the ones that are overplotted by other points. 
  The png file is smaller because it only stores information on which color 
 each pixel should be, not how many points contributed to a particular pixel 
 being a given color.  But then png files convert the text to pixel 
 information as well which don't look good if there is post scaling.

 If you want to go the pdf route, then you need to find some way to reduce 
 redundant information while still getting the main points of the plot.  With 
 so many points, I would suggest looking at the hexbin package (bioconductor I
 think) as one approach, it will not be an identical scatterplot, but will 
 convey the information (possibly better) with much smaller graphics file 
 sizes.  There are other tools like sunflower plots or others, but hexbin has 
 worked well for me.


 I've seen this kind of thing happen after waiting an hour for one of
my printouts when queued after something submitted by one of our
extreme value stats people. I've seen them make plots containing maybe
a million points, most of which are in a big black blob, but they want
to be able to show the important sixty or so points at the extremes.

 I'm not sure what the best way to print this kind of thing is - if
they know where the big blob is going to be then they could apply some
cutoff to the plot and only show points outside the cutoff, and fill
the region inside the cutoff with a black polygon...

 Another idea may be to do a high resolution plot as a PNG (think 300
pixels per inch of your desired final output) but do it without text
and add that on later in a graphics package.
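
Something along these lines, as an untested-in-anger sketch: points only, 300 pixels per inch, no axes or labels. The function name, default sizes and the random data are all assumptions, and it needs an R build with PNG support.

```r
# Draw only the points at 300 pixels per inch; text is left off so it can be
# added later in a graphics package.
plot_points_png <- function(x, y, file, width_in = 6, height_in = 4) {
  png(file, width = width_in, height = height_in, units = "in", res = 300)
  on.exit(dev.off())
  par(mar = c(0, 0, 0, 0))
  plot(x, y, pch = ".", axes = FALSE, ann = FALSE)
  invisible(file)
}

if (capabilities("png")) {
  set.seed(1)
  out <- plot_points_png(rnorm(1e5), rnorm(1e5), tempfile(fileext = ".png"))
}
```

The file size then depends on the pixel dimensions rather than on the number of points.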

Barry



Re: [R] News on R's largest corporate partner REVolution Computing and SPSS CEO:

2009-10-21 Thread Barry Rowlingson
On Wed, Oct 21, 2009 at 8:09 PM, Charles Annis, P.E.
charles.an...@statisticalengineering.com wrote:
 David:

 Do you mean inappropriate or embarrassing?

 How would we R-ians know what has happened at REVolution were it not for
 Ajay's note?  Were you planning a press release?  Something like, 47% of
 Revolution summarily fired.  Nobody left with more than a year of
 experience...?

 Ajay's note was a verbatim cut n paste from Danese Cooper's blog,
with no comment, editorial, opinion, or anything. Although Danese may
well be an open-source evangelist, that doesn't mean her blog is
public domain. This was essentially MLP[1] which I don't think we want
on the R-help list. Anyone who cares about REvolution will already be
following the relevant blogs, reading the New York Times, and the
words of the prophet that were written on the subway walls. Oops I
went all Simon and Garfunkel there.

 If people want to add value to relevant blog posts onto R-news by
commenting and inviting discussion then that's fine by me. But no more
cut n paste jobs.

Barry

[1] Mindless Link Propagation - when someone just emails "Hey guyz
look at this : http://example.com/lulz"



Re: [R] random numbers between 0 and 1

2009-10-21 Thread Barry Rowlingson
On Wed, Oct 21, 2009 at 8:37 PM, Gábor Csárdi csa...@rmki.kfki.hu wrote:
 I would suggest to use the generator at
 http://submoon.freeshell.org/pix/valium/dilbert_rng.jpg
 and subtract 8.5.

 You may laugh (indeed I did) but some medical trials have used (and
poss still do) telephone-a-human random numbers. When deciding to give
medicine or placebo the doctor calls the phone number and asks for a
random number, which is read out, and this decides what the patient
gets.

 I suppose this could be implemented in R with an interface to a
speech recognition engine and a telephone... but runif(100) is easier.

Barry



Re: [R] Variant of cloud with sticks from points to surface

2009-10-19 Thread Barry Rowlingson
On Mon, Oct 19, 2009 at 5:26 AM, PerfectTiling perfecttil...@yahoo.com wrote:

 Hi,

 I'd like to

  (1) plot a perspective view of a 3D scatterplot, with a fitted (curved)
 surface;
  (2) have a stick from each point vertically to the surface.

 The latter helps one visualize where a point lies in 3D, relative to the
 surface.  Is there a variant of the cloud function (lattice package) which
 might do this?  As far as I can tell, the cloud function will (essentially)
 only plot a stick from a point to z=0, rather than to the surface.

 Thanks in advance!

 If you use the persp function from base graphics instead of cloud
from lattice you can then use the trans3d function to convert from
your 3d coords to 2d viewport coords. Then you can add the sticks
using 'lines' or 'segments'.

 There's an example usage in help(persp) where a sine-curve is plotted
along the edge of a 3d plot.

 I don't know if an equivalent transformation function exists for
lattice graphics.
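
A sketch of the persp plus trans3d route, assuming the fitted surface can be evaluated at each point. The surface f() and the scattered points are invented for illustration, and pdf(NULL) is only there so that it runs without opening a window.

```r
pdf(NULL)   # throwaway device so the sketch runs non-interactively

f <- function(x, y) x^2 - y^2          # made-up "fitted" surface
gx <- seq(-2, 2, length.out = 25)
gz <- outer(gx, gx, f)

set.seed(1)
px <- runif(20, -2, 2)
py <- runif(20, -2, 2)
pz <- f(px, py) + rnorm(20)            # made-up data scattered about the surface

# persp() returns the viewing transformation that trans3d() needs.
pmat <- persp(gx, gx, gz, zlim = range(gz, pz), theta = 30, phi = 25)

top  <- trans3d(px, py, pz, pmat)          # the points in 2D plot coordinates
foot <- trans3d(px, py, f(px, py), pmat)   # where each stick meets the surface
segments(foot$x, foot$y, top$x, top$y)
points(top$x, top$y, pch = 16)

dev.off()
```

Points behind the surface are drawn on top of it regardless, since base graphics does no hidden-line removal for things added afterwards.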

Barry



Re: [R] Pausing R

2009-10-19 Thread Barry Rowlingson
On Mon, Oct 19, 2009 at 2:01 PM, Karl Ove Hufthammer k...@huftis.org wrote:

 If you start the application using the command line, just press
 'Ctrl + Z' to pause/suspend it. Then type 'fg' when you want to
 resume it.

 If you can't get to the command line where you started R, then you
can send the process the 'STOP' and 'CONT' signals using the 'kill'
command. You need to get the process ID (see 'man ps' for this) and
then use 'kill -STOP 12345' and 'kill -CONT 12345' where 12345 is the
process ID.

 Quite what happens if you are running multiple R threads via the
multicore package or any of the other multiple process packages... I
don't know...

 Of course the real answer is to run long processes on a server -
preferably one with an uninterruptible power supply and a diesel
generator - or make your process checkpointable so you can kill it and
restart it again.
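
Checkpointing can be as simple as saving the loop state after each step. A minimal sketch, with a made-up running total standing in for the real work; the state file location and the run() helper are assumptions.

```r
state_file <- tempfile(fileext = ".rds")   # assumed checkpoint location

run <- function(n, stop_after = n) {
  # Resume from the checkpoint if there is one, otherwise start fresh.
  state <- if (file.exists(state_file)) readRDS(state_file) else list(i = 0, total = 0)
  while (state$i < min(n, stop_after)) {
    state$i <- state$i + 1
    state$total <- state$total + state$i   # placeholder for the real work
    saveRDS(state, state_file)             # checkpoint after every step
  }
  state
}

run(10, stop_after = 4)   # simulate the process being killed part way
final <- run(10)          # restart: carries on from step 5
```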

Barry



Re: [R] Matrixes as data

2009-10-16 Thread Barry Rowlingson
On Fri, Oct 16, 2009 at 4:36 PM, Kjetil Halvorsen
kjetilbrinchmannhalvor...@gmail.com wrote:
 Hola!

 I am working on a problem where data points are (square) matrices. Is
 there a way to make a
 vector of matrices, such that it can be stored in a data.frame? Can
 that be done with S3?
 or do I have to learn S4 objects & methods?


 If the matrices are all the same size then you could store them in an
array, which is essentially a 3 or more dimensional matrix.

 Otherwise, you can store them in a list, and get them by number:

foo = list(matrix(1:9,3,3),matrix(1:16,4,4))
foo[[1]]
foo[[2]]

and so forth.
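
Both options in one short sketch, including a list stored as a data frame column since that was the original question. The id column and the matrix contents are made up.

```r
# Same-sized matrices stacked in a 3-d array: slice k is arr[, , k].
arr <- array(1:18, dim = c(3, 3, 2))
arr[, , 2]

# Different-sized matrices as a list column alongside other variables.
df <- data.frame(id = 1:2)
df$m <- list(matrix(1:9, 3, 3), matrix(1:16, 4, 4))
dim(df$m[[2]])   # 4 4
```

Printing and summarising a data frame with a list column can be awkward, so a plain list next to the data frame may be easier to live with.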

You'll only need to create new object classes (with S3 or S4) if you
want special behaviour of vectors of these things (such as plot(foo)
doing something sensible).

 With S3 it's easy:

class(foo)="squareMatrixVector"

plot.squareMatrixVector=function(x,y,...){
  cat("ouch\n")
}

 plot(foo)
ouch

Barry



Re: [R] Complex? import of pdf files (criminal records) into R table

2009-10-15 Thread Barry Rowlingson
On Thu, Oct 15, 2009 at 3:28 PM, Marc Schwartz marc_schwa...@me.com wrote:
 On Oct 15, 2009, at 3:43 AM, Biedermann, Jürgen wrote:

 You don't indicate the OS you are on, but you will want to get a hold of
 'pdftotext', which is a command line application that can extract the
 textual content from the PDF files.

 That's assuming the text is in the PDF as a text object. If it's a
scan of a paper document the chances are that all you have is an
image, in which case you need to do OCR (optical character
recognition) or get someone to type it all in again.

 Even if you can get all the text out with pdftotext, R might not be the
right tool for the job - I'd do this kind of text processing and
matching job in Python (and before Python, I'd have used Perl). But if
all you have is a wRench...

Barry



Re: [R] default borders in boxplot and barplot

2009-10-14 Thread Barry Rowlingson
On Wed, Oct 14, 2009 at 2:21 AM, Jennifer Young
jennifer.yo...@math.mcmaster.ca wrote:
 This is my first post so hopefully I haven't mucked up the rules.

 I'm trying to change the default borders in either boxplot or barplot so
 that, at the request of a journal, all of my figures have the same type of
 border.

 I've successfully used par(bty="o")  using plot(1:10, bty="o"), but it
 seems that barplot and boxplot have their own defaults that override this.

 I've tried both
 par( bty="o")
 barplot(stuff)

 and

 barplot(stuff, bty="o")


 Does anyone know a trick that doesn't involve using abline() to force
 borders?


 Just do box() to draw a box round your plot area? Using the example
from ?barplot

  require(grDevices) # for colours
   tN <- table(Ni <- stats::rpois(100, lambda=5))
   r <- barplot(tN, col=rainbow(20))
  box()

Barry



Re: [R] Function to find prime numbers

2009-10-13 Thread Barry Rowlingson
On Tue, Oct 13, 2009 at 2:41 PM, Thomas Lumley tlum...@u.washington.edu wrote:
 On Tue, 13 Oct 2009, AJ83 wrote:


 I need to create a function to find all the prime numbers in an array. Can
 anyone point me in the right direction?

 This almost sounds like a homework problem to me... So here's a
solution that you can happily present to a tutor - if you can explain
how it works, then you deserve full marks!

primer=function(v){
  return(regexpr("^1$|^(11+?)\\1+$",
                 unlist(lapply(v,function(z){paste(rep(1,z),sep='',collapse='')})),
                 perl=TRUE) == -1)
}

Test:

  (1:30)[primer(1:30)]
 [1]  2  3  5  7 11 13 17 19 23 29

I'm not sure how big a number this works for
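
For comparison, the regexp version writes each number n as a string of n ones and asks whether that string splits into equal groups of two or more, so its cost grows with n itself. A plainer trial-division sketch (is_prime is a made-up name, not a base R function) copes with much larger values:

```r
# TRUE for each element of v that is prime, by trial division up to sqrt(n).
is_prime <- function(v) {
  vapply(v, function(n) {
    if (n < 2) return(FALSE)
    if (n < 4) return(TRUE)            # 2 and 3
    all(n %% 2:floor(sqrt(n)) != 0)
  }, logical(1))
}

(1:30)[is_prime(1:30)]
```

It gives the same ten primes below 30 as the test above, just with less entertainment value.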

R golf anyone?

Barry



Re: [R] General means of matching a color specification to an official R color name

2009-10-13 Thread Barry Rowlingson
On Tue, Oct 13, 2009 at 9:43 PM, Bryan Hanson han...@depauw.edu wrote:
 Hello List Dwellers:

 I've looked around quite a bit, but don't quite see an answer that I
 understand.

 I'm looking for a way to take any kind of color specification (rgb, hsv,
 hcl, hex) and match it to the n-nearest R official color names.  Clearly it
 is easy to interconvert different specification schemes and color spaces,
 but matching to the name seems a bit trickier.  Seems like if one has a
 specification, it could be fuzzy-matched to the list of official R colors
 expressed in the same specification.  Unfortunately, I don't know much about
 fuzzy matching.

 For example, following some examples I found in the archives and the wiki, I
 wrote this little function to create a table of official R colors and sort
 it if desired:

 colorSpecTable <- function(col = colors(), sort = NULL){
    require(gplots)
    rgbcodes <- t(col2rgb(col))
    names <- col
    hex <- col2hex(col)
    df <- data.frame(name = names, hex.code = hex, rgbcodes)
   # additional elements for other color spaces could be added
    if (!identical(sort, NULL)) df <- sort.data.frame(df, by = sort)
    }

 Note that sort.data.frame is from the R-wiki and is appended below.  Is
 there a clever way to search a table created by this function, and identify
 the n-closest colors based upon some reasonable criteria?  What I hope for
 is something like this:

 colorMatch <- function(hex = NULL, n, plot = HOPEFULLY) {
    df.rgb <- colorSpecTable(sort = ~red+green+blue) # master table
    # now search for the n closest matches of hex in df.rgb$hex.code
    # perhaps hex should be converted into a different color space 1st
    # eventually would like to display matches side by side w/hex
    }

 You just need to define your distance in colour space. Simplest might
be a euclidean distance in three-dimensional r,g,b coordinates,
something like:

nearColour <- function(r,g,b){
  ctable = col2rgb(colors())
  cdiff = ctable - c(r,g,b)
  cdist = cdiff[1,]*cdiff[1,]+cdiff[2,]*cdiff[2,]+cdiff[3,]*cdiff[3,]
  return(colors()[cdist == min(cdist)])
  }

 This gives colour names nearest to r,g,b triples, with possible
multiple results:

  nearColour(0,0,0)
 [1] "black" "gray0" "grey0"
  nearColour(1,1,1)
 [1] "black" "gray0" "grey0"
  nearColour(255,255,255)
 [1] "white"   "gray100" "grey100"
  nearColour(128,0,0)
 [1] "darkred" "red4"

Any good? You could also do it in hsv space, but there's probably
enough colours in the colors() vector that it wouldn't make much
difference...
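
Since the question asked for the n nearest names rather than only the ties for first place, here is a small variation on the same distance. The function name nearColours and the default n are assumptions.

```r
nearColours <- function(r, g, b, n = 5) {
  ctable <- col2rgb(colors())                 # 3 x N matrix of red, green, blue
  cdist <- colSums((ctable - c(r, g, b))^2)   # squared Euclidean distance
  colors()[order(cdist)[seq_len(n)]]          # names of the n closest colours
}

nearColours(255, 0, 0, n = 3)
```

Ties come back in the order they appear in colors(), so exact duplicates such as gray0 and grey0 take up several of the n slots.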

Barry



Re: [R] General means of matching a color specification to an official R color name

2009-10-13 Thread Barry Rowlingson
On Tue, Oct 13, 2009 at 10:58 PM, Bryan Hanson han...@depauw.edu wrote:
 Works perfectly!  Thanks Barry.  I had actually seen some suggestions on
 using a distance, but by then I was thinking about hcl spaces and distance
 isn't as simple there.  I'm too tired I think.

 Anyway, you've got me running again!  Thanks, Bryan

 There's a CPAN module for Perl that does hcl colour similarity:

 http://search.cpan.org/~mbarbon/Color-Similarity-HCL-0.04/lib/Color/Similarity/HCL.pm

 the Perl code is pretty neat, looks easy to R-ify - released under
the perl license.

Barry



Re: [R] How do I reverse the digits of a number

2009-10-11 Thread Barry Rowlingson
On Sun, Oct 11, 2009 at 12:53 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Try this:

 library(tcltk)
 as.numeric(tcl("string", "reverse", 123))
 [1] 321

The bit where the original poster said 'unknown length' worried me:

 > as.numeric(tcl("string", "reverse", 12377656534))
[1] 0.4356568

 > as.numeric(paste(rev(strsplit(as.character(1234567890123727723),"")[[1]]),collapse=""))
[1] 6.167273e+18

As well as the use of the word 'number' - both solutions give NA for
negative integers and various things for decimals...
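
One way round both worries is to keep the thing as a string from start to finish, so nothing gets squeezed through a double. A sketch only; revDigits is a made-up name, and it assumes the input is a whole number written as text, with an optional leading minus sign:

```r
# Sketch: reverse the digits of an integer supplied as a string.
# 'revDigits' is a hypothetical helper, not part of base R or tcltk.
revDigits <- function(s) {
  s <- as.character(s)
  neg <- startsWith(s, "-")                 # remember the sign
  digits <- sub("^-", "", s)                # drop it before reversing
  stopifnot(grepl("^[0-9]+$", digits))      # refuse decimals, exponents, etc.
  rev1 <- function(d) paste(rev(d), collapse = "")
  out <- vapply(strsplit(digits, ""), rev1, character(1))
  paste0(ifelse(neg, "-", ""), out)         # put the sign back in front
}

revDigits("1234567890123727723")
revDigits("-120")
```

The result stays a string on purpose: converting a 19-digit answer back with as.numeric() would lose the low digits again.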

Just a bit of Sunday morning pedantry for you :)

Barry



Re: [R] Error in family$family : $ operator is invalid for atomic vectors

2009-10-11 Thread Barry Rowlingson
On Sun, Oct 11, 2009 at 4:54 PM, romunov romu...@gmail.com wrote:
 Dear List,

 I'm having a problem with an exercise from The R Book (M.J. Crawley) on page
 567.
 Here is the entire code up to the point where I get an error.

 data(UCBAdmissions)
 x <- aperm(UCBAdmissions, c(2, 1, 3))
 names(dimnames(x)) <- c("Sex", "Admit?", "Department")
 ftable(x)
 fourfoldplot(x, margin = 2)
 dept<-gl(6,4)
 sex<-gl(2,1,24)
 admit<-gl(2,2,24)
 model1<-glm(as.vector(x) ~dept*sex*admit,poisson)

 This last line returns:

 Error in family$family : $ operator is invalid for atomic vectors

 I've searched older posts but found nothing that would help resolve my
 problem. Has anyone encountered anything similar and/or knows a fix?


 Works for me:

 > model1<-glm(as.vector(x) ~dept*sex*admit,poisson)
 > model1

Call:  glm(formula = as.vector(x) ~ dept * sex * admit, family = poisson)

Coefficients:
  (Intercept)  dept2  dept3  dept4
  6.23832   -0.37186   -1.45083   -1.31107
 [etc]

What's your version:

 > version
               _
platform       i486-pc-linux-gnu
arch           i486
os             linux-gnu
system         i486, linux-gnu
status
major          2
minor          9.2
year           2009
month          08
day            24
svn rev        49384
language       R
version.string R version 2.9.2 (2009-08-24)
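
A guess, and only a guess since nothing in the thread confirms it: that error is what you would see if something else called poisson were sitting in your workspace, so that glm is handed a vector instead of the family function. A small sketch with made-up data (counts, g and the stray poisson object are all hypothetical):

```r
# Hypothetical reproduction: a workspace object named 'poisson'
# hides stats::poisson when glm() evaluates its family argument.
counts <- c(18, 17, 15, 20, 10, 20)   # made-up counts
g <- gl(3, 2)                         # made-up grouping factor

poisson <- 1:3                        # stray object masking the family function
bad <- try(glm(counts ~ g, family = poisson), silent = TRUE)
inherits(bad, "try-error")            # the call fails

# Naming the family explicitly sidesteps the masking:
ok <- glm(counts ~ g, family = stats::poisson())

rm(poisson)                           # or simply remove the stray object
```

If ls() in the failing session shows an object called poisson, removing it would be the first thing to try.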

Barry



Re: [R] Social networking around R

2009-10-11 Thread Barry Rowlingson
On Sun, Oct 11, 2009 at 7:51 PM, HBaize hba...@buttecounty.net wrote:



 Harsh-7 wrote:

 Hi R users,

 I'd be interested in what R users think about social networking around all
 things R. For this, I've set up a social network @
 www.rstuff.socialgo.comand it would be great if you could post your
 comments on the forum created
 for this discussion.



 Could you provide a complete URL. I'm getting a 404 not found.

http://www.rstuff.socialgo.com/

 - original post had a space missing between 'com' and 'and'. Or your
local DNS hasn't sorted itself out yet if this is a new name, but it
should

Barry



Re: [R] Running R scripts from a GUI interface

2009-10-10 Thread Barry Rowlingson
On Sat, Oct 10, 2009 at 1:01 PM, Jason Rupert jasonkrup...@yahoo.com wrote:
 Thank you very much for your response and it looks like R Commander is very 
 capable, but I think it is heading the wrong direction from where we are 
 looking to go, i.e. simpler interface.

 I guess (and I may be dating myself) when I was previously working with 
 MATLAB I could use something like Real-time Workshop and EmbeddedCoder to 
 export a C/C++ code of all our scripts that could be built into a DLL.  I 
 would then use VisualBasic (I would like to use TCL/TK or other more current 
 language) to create a very basic GUI that just had four radials and one text 
 entry field and load in the DLL.  It would not have all commandline and all 
 the other unnecessary bits, especially since the script reached maturity and 
 would not need to be altered.


There is a tcltk package for R that lets you build guis, or you could
write your application in Python, use PyQt4 to build your gui (which
comes with a graphical gui designer) and Rpy to communicate between
Python and R.

 There are some other suggestions in the Graphics Task View:
http://ftp.heanet.ie/mirrors/cran.r-project.org/web/views/Graphics.html

Barry



Re: [R] Problems with code

2009-10-09 Thread Barry Rowlingson
On Fri, Oct 9, 2009 at 4:54 PM, Anne Buunk ann3bu...@hotmail.com wrote:
                          text(0.5,0.5, text = paste(letters[i], "+", 
 numbers[j],"=", letters [i+j+k])

Missing ) on the end there. You have one ( for text( and one for
paste( but only one ).

 Use an editor that matches parentheses, and read error messages to
figure out where things are going wrong.
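
With the parentheses balanced, the call would look something like this. The values of i, j, k and the numbers vector are invented here so the snippet runs on its own, and it uses labels= because that is the argument name text() actually has:

```r
# Hypothetical stand-ins for the poster's loop variables and data
i <- 1; j <- 1; k <- 1
numbers <- 1:9

# Build the label first: one ( for paste, and it is closed here
lab <- paste(letters[i], "+", numbers[j], "=", letters[i + j + k])

plot.new()                      # empty plot region with 0..1 coordinates
text(0.5, 0.5, labels = lab)    # one ( for text, closed at the end of the line
```

Building the string on its own line first makes a missing bracket much easier to spot.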

Barry



Re: [R] Satellite ocean color palette?

2009-10-09 Thread Barry Rowlingson
On Fri, Oct 9, 2009 at 7:51 PM, Tim Clark mudiver1...@yahoo.com wrote:
 Dear List,

 Is there a color palette available similar to what is used in satellite ocean 
 color imagery?  I.e. a gradient with blue on one end and red on the other, 
 with yellow in the middle?  I have tried topo.colors(n) but that comes out 
 more yellow on the end.  I am looking for something similar to what is found 
 on the CoastWatch web page:

 http://oceanwatch.pifsc.noaa.gov/imagery/GA2009281_2009282_sst_2D_eddy.jpg

 Thanks!

 You could build one yourself with the colorRampPalette function:

satRampP = colorRampPalette(c("black","blue","cyan","yellow","orange","red","black"))

 that looks roughly like the one in the jpg, but I'm not sure about
the black at the far end...anyway, let's see:

image(matrix(seq(0,1,len=100),100,1),col=satRampP(100))
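
If the black ends aren't wanted, the same idea with just the blue-to-red stops looks like this. The choice of stops is my own guess at the CoastWatch scheme, not taken from it, and satRamp2 is just a name for the example:

```r
# Sketch: blue -> cyan -> yellow -> orange -> red, with no black at the ends
satRamp2 <- colorRampPalette(c("blue", "cyan", "yellow", "orange", "red"))

cols <- satRamp2(100)           # 100 interpolated colours as hex strings
head(cols, 3)

# Show the ramp as a strip, lowest value on the left
image(matrix(seq(0, 1, length.out = 100), 100, 1), col = cols)
```

Adding or moving stops in the vector is the easiest way to tune where the yellow sits.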

Or you could try my colour schemes package:

https://r-forge.r-project.org/projects/colourscheme/

Barry



Re: [R] Something wrong with my function Please Help

2009-09-29 Thread Barry Rowlingson
On Tue, Sep 29, 2009 at 4:29 AM, Chunhao Tu tu_chun...@yahoo.com wrote:

 Hi R users,
 I am trying to build a function to compute the odds ratio and relative risk,
 but something is wrong. I have been stuck for many hours and I really don't
 know how to solve it. Would someone please give me a hint?

 OR.RR<-function(x){

 What is this line doing:

      x <- as.matrix(any(dim(x)==2))

 If you run it on your 2x2 matrix you get this:

 > as.matrix(any(dim(tt)==2))
     [,1]
[1,] TRUE

 and then the rest of your code is working with that value of 'x', and
so failing.

 I don't know what that line is there for - what do you think it is doing?

 You shouldn't get stuck for hours over something this simple. Try
each line on its own and check the return value is what you think it
should be. You can either type each line in to the > prompt, or learn
to use the 'debug' function. See help(debug) for more.
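
For what it's worth, here is roughly what I'd expect such a function to look like, with a check on the dimensions instead of overwriting x. It's a sketch under my own assumptions about the layout (rows are exposed/unexposed, columns are event/no event), not a line-by-line repair of the original, and the numbers in tt are made up:

```r
# Sketch: odds ratio and relative risk from a 2x2 table.
# Assumed layout: rows = exposed / unexposed, columns = event / no event.
OR.RR <- function(x) {
  x <- as.matrix(x)
  if (!all(dim(x) == 2)) stop("x must be a 2x2 table")  # check, don't overwrite x
  n11 <- x[1, 1]; n12 <- x[1, 2]
  n21 <- x[2, 1]; n22 <- x[2, 2]
  c(OR = (n11 * n22) / (n12 * n21),
    RR = (n11 / (n11 + n12)) / (n21 / (n21 + n22)))
}

tt <- matrix(c(10, 20, 30, 40), 2, 2, byrow = TRUE)   # made-up counts
OR.RR(tt)
```

Running debug(OR.RR) before the call lets you step through it and print x after each line.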

Barry



Re: [R] vectors levels are carried through to subsets...

2009-09-29 Thread Barry Rowlingson
On Tue, Sep 29, 2009 at 6:47 PM, chipmaney chipma...@hotmail.com wrote:

 I have a dataset.  Initially, it has 25 levels for a certain factor,
 Description.

 However, I then subset it, because I am only interested in 2 of the 25
 factors.  When I subset it, I get the following. The vector lists only the
 two factors, yet there remain 25 levels:

 Quadrats.df$Description
  [1] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
 Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent
 25x75
 [10] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
 Emergent 25x75  Emergent 25x75  Hydroseed 25x75 Hydroseed 25x75 Hydroseed
 25x75
 [19] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed
 25x75
 [28] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
 25 Levels: Black Cottonwood Black Cottonwood Enhanced Emergent Emergent
 25x75 Floodplain 1 Floodplain 2 Floodplain 3 Hydroseed 25x75 ... Western Red
 Cedar Enhanced

 This seems rather innocuous; however, when I run a by statement, it returns
 a list with 25 entries, 23 of which are of course NA... is there a way to
 avoid this?


 Just re-factor() it when you select a subset - and also it's nice if
you give us a simple example - all your Emergent this and Hydroseed
doesn't look very clear!

 Like this:

# make a factor:
 > x=factor(sample(letters,10))
 > x
 [1] z x f i n b y e p c
Levels: b c e f i n p x y z

# a subset:

 > x[1:3]
[1] z x f
Levels: b c e f i n p x y z

# - still has all the levels. So re-factor():

 > factor(x[1:3])
[1] z x f
Levels: f x z

 et voila?
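
Later versions of R (2.12 and up, if I have the number right) also have droplevels(), and the factor subsetting method takes drop=TRUE; both do the same job as re-factoring. A small sketch with a made-up factor:

```r
# Made-up factor with three levels
f <- factor(c("a", "b", "c", "a", "b"))

sub <- f[f != "c"]                   # subset keeps the unused level "c"
nlevels(sub)

nlevels(droplevels(sub))             # unused levels removed after the fact
nlevels(f[f != "c", drop = TRUE])    # or dropped while subsetting
```

droplevels() also works on a whole data frame, which saves re-factoring each column after a subset.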

Barry



Re: [R] R and REST API's

2009-09-28 Thread Barry Rowlingson
On Mon, Sep 28, 2009 at 3:01 PM, Gary Lewis gary.m.le...@gmail.com wrote:
 Hi - Many organizations now make their data available as XML via a
 REST web service architecture. Is there any R package or facility to
 access this type of data directly (eg, to make the HTTP GET request
 and have the downloaded data put into an R data  frame)?

 I used several R search sites to look for an answer, but came up with
 very little. Any help would be appreciated.

 You can (in many places) replace a filename with url(address) and R
functions such as read.table will work. If the response is XML then
getting the response into a data frame will be so dependent on the
format that it's unlikely there's a generic solution.

Specific solutions exist - for example my geonames package that
queries the geonames server for geographic information:

http://r-forge.r-project.org/projects/geonames/

 Note that instead of getting XML and having to parse that, it gets
the JSON representation and uses rjson to decode it - which saves a bit
of weight since XML parsing is a bit heavy.

 I suspect that if SOAP and WSDL interfaces were more widely used then
some of the R Omegahat projects might be more useful:

http://www.omegahat.org/download/R/packages/

but it seems that informal service definitions via simple REST
interfaces and XML responses are winning.

Barry



Re: [R] Remove single entries

2009-09-28 Thread Barry Rowlingson
On Mon, Sep 28, 2009 at 5:03 PM, Raymond Danner rdan...@vt.edu wrote:
 Dear Community,

 I have a data set with two columns, bird number and mass.  Individual birds
 were captured 1-13 times and weighed each time.  I would like to remove
 those individuals that were captured only once, so that I can assess mass
 variability per bird.  I've tried many approaches with no success.  Can
 anyone recommend a way to remove individuals that were captured only once?

 Approach this one step at a time. My sample data is:

 > wts
  bird mass
1    1  2.3
2    1  3.2
3    1  2.1
4    2  1.2
5    3  5.4
6    3  4.5
7    3  4.4
8    4  3.2

 how many times was each bird measured? Use table()

  table(wts$bird)

1 2 3 4
3 1 3 1

  The names of the table (which row.names() returns here) are the bird
numbers that were counted, so we want the names where the count is
greater than one:

 > row.names(table(wts$bird))[table(wts$bird)>1]
[1] "1" "3"

 [This calls 'table' twice, so you might want to save the table to a new object]

Now we want all the rows of our original dataframe where the bird
number is in that set, so we select rows using %in%:

 > wts[wts$bird %in% row.names(table(wts$bird))[table(wts$bird)>1],]
  bird mass
1    1  2.3
2    1  3.2
3    1  2.1
5    3  5.4
6    3  4.5
7    3  4.4

 Looks a bit messy, I'm not pleased with myself... Must be a better way...

 Aha! A table-free way of finding the birds caught more than once is:

 > unique(wts$bird[duplicated(wts$bird)])
[1] 1 3

 So you could do:

 > wts[wts$bird %in% unique(wts$bird[duplicated(wts$bird)]),]
  bird mass
1    1  2.3
2    1  3.2
3    1  2.1
5    3  5.4
6    3  4.5
7    3  4.4

 which looks a bit neater! You might want to unravel
unique(wts$bird[duplicated(wts$bird)]) to see what the various bits
do. And read the help pages.

TMTOWTDI, as they say.
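
And one more way, for the record: ave() can put the per-bird count alongside every row, which makes the selection a one-liner. The data frame below just re-creates the sample data from this message; counts, keep, multi and multi2 are names picked for the example:

```r
# Re-create the sample data
wts <- data.frame(bird = c(1, 1, 1, 2, 3, 3, 3, 4),
                  mass = c(2.3, 3.2, 2.1, 1.2, 5.4, 4.5, 4.4, 3.2))

# Table route, with the table saved once instead of computed twice
counts <- table(wts$bird)
keep   <- names(counts)[counts > 1]
multi  <- wts[wts$bird %in% keep, ]

# ave() route: number of captures per bird, repeated on every row
multi2 <- wts[ave(wts$mass, wts$bird, FUN = length) > 1, ]

identical(multi, multi2)
```

Either result can go straight into something like tapply(multi$mass, multi$bird, sd) to get the variability per bird.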

Barry



Re: [R] Is there any month object like LETTERS ?

2009-09-11 Thread Barry Rowlingson
On Fri, Sep 11, 2009 at 8:13 AM, megh megh700...@yahoo.com wrote:

 There is an object LETTERS which displays all letters from A to Z. Is
 there any similar object which displays the months as well in
 chronological order? like jan, feb,...,dec

 You could construct a vector of the first of the month for some year
and then use months() or format() on it:

 > months(ISOdatetime(1960,1:12,1,0,0,0))
  [1] "January"   "February"  "March"     "April"     "May"       "June"
  [7] "July"      "August"    "September" "October"   "November"  "December"

 > format(ISOdatetime(1960,1:12,1,0,0,0),"%b")
  [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"

 Note these are locale-dependent, so les Francais will see something
else. Probably Jan,Fev,Mar,Avr and so on...
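
If English names will do whatever the locale, base R already has two constants for this, month.name and month.abb, documented on the same help page as LETTERS:

```r
month.name             # full English month names, January first
month.abb              # three-letter abbreviations

month.abb[c(1, 12)]    # index them like LETTERS
```

These never change with the locale, so use the ISOdatetime approach when you want the user's own language.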

Barry



Re: [R] Running R in Windows server

2009-09-10 Thread Barry Rowlingson
On Thu, Sep 10, 2009 at 3:07 PM, srpd TCLTKsrpd2...@hotmail.com wrote:

 Hi,



 I'm trying to set up a server which allows different users to use R 
 simultaneously. Is it possible in Windows?
 I know that a LINUX Server is probably a better option, but I had already 
 created a GUI with Tcl/tk in Windows. So some of the events don't work in 
 LINUX.

  You need Windows Terminal Server 2008 (or 2003) plus the right
client access licenses (CALs) depending on how you expect the machine
to be used. A thousand dollars gets you a single server license plus 5
CALs. Then your 5 users connect to it from their Windows PCs using the
Windows Remote Desktop Connection program. So your users will need
Windows XP (or Vista, or 7) licenses as well, unless they have
linux/mac desktops in which case they can use rdesktop to connect.

  Is it a thousand dollars worth of hassle to make your Tcl/Tk program
OS-independent? Plus the time it takes to learn how to set up and
admin a Windows TS 2008 box...

Barry



Re: [R] Data separated by spaces, getting data into R using field lengths

2009-09-08 Thread Barry Rowlingson
On Tue, Sep 8, 2009 at 12:53 PM, Lauri Nikkinenlauri.nikki...@iki.fi wrote:
 I have a text file similar to this (separated by spaces):

 x <- "DF12 This is an example 1 This
 DF12 This is an 1232 This is
 DF14 This is 12334 This is an
 DF15 This 23 This is an example
 "

 and I know the field lengths of each variable (there are 5 variables in
 this data set), which are:

 varlength <- c(2, 2, 18, 5, 18)

 How can I import this kind of data into R, using the varlength
 variable as a field separator indicator?

?read.fwf

Read Fixed Width Format Files

Description:

 Read a table of *f*ixed *w*idth *f*ormatted data into a
 'data.frame'.

Usage:

 read.fwf(file, widths, header = FALSE, sep = "\t",
  skip = 0, row.names, col.names, n = -1,
  buffersize = 2000, ...)
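
A sketch of how that might go with the widths given. The padding in the sample data did not survive the mail, so the lines below are rebuilt with sprintf() purely for illustration; with a real file you would pass its name instead of the textConnection:

```r
varlength <- c(2, 2, 18, 5, 18)

# Rebuild fixed-width lines (illustrative data, padded to the stated widths)
lines <- sprintf("%-2s%-2s%-18s%5d%-18s",
                 "DF",
                 c("12", "12", "14", "15"),
                 c("This is an example", "This is an", "This is", "This"),
                 c(1L, 1232L, 12334L, 23L),
                 c("This", "This is", "This is an", "This is an example"))
nchar(lines)   # each line is sum(varlength) = 45 characters wide

# Read by column widths; strip.white trims the padding from text fields
dat <- read.fwf(textConnection(lines), widths = varlength,
                strip.white = TRUE, stringsAsFactors = FALSE)
str(dat)
```

The columns come back as V1 to V5; col.names= in the same call gives them proper names.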


